Add BAML Language by codeshaunted · Pull Request #7959 · github-linguist/linguist · GitHub

Add BAML Language#7959

Merged
lildude merged 4 commits into
github-linguist:mainfrom
codeshaunted:main
Jun 4, 2026

Conversation

@codeshaunted

@codeshaunted codeshaunted commented May 13, 2026

Contributor

Description

Hello, I'm a member of the BAML programming language team and we'd love for linguist to have support for BAML. This PR looks to add BAML support to linguist and supersedes and fixes the issues in #7333.

Checklist:

  • I am adding a new language.
    • The extension of the new language is used in hundreds of repositories on GitHub.com.
    • I have included a real-world usage sample for all extensions added in this PR:
      • Sample source(s): From internal projects
      • Sample license(s): From internal projects, released here under MIT.
    • I have included a syntax highlighting grammar: https://github.com/boundaryml/textMate-baml
    • I have added a color
      • Hex value: #A855F7
      • Rationale: This is the main color that we use for marketing purposes (appears in the logo).
    • I have updated the heuristics to distinguish my language from others using the same extension.

@codeshaunted codeshaunted requested a review from a team as a code owner May 13, 2026 23:15
@codeshaunted codeshaunted mentioned this pull request May 13, 2026
6 tasks
@sxlijin

sxlijin commented May 14, 2026

@codeshaunted

Contributor Author

Hey @lildude is there still a bar for us to clear popularity-wise? Given your comment on the previous PR I figured we were good on that front.

Also please let me know if there's anything we need to change elsewhere for this to land. Thanks!

@lildude

lildude commented May 15, 2026

Member

The popularity requirements are detailed in the CONTRIBUTING.md file. My comment in the previous PR wasn't to suggest the popularity requirements had been met, it was to indicate I wouldn't even try assessing with each release whilst the PR feedback hasn't been addressed.

@codeshaunted

Contributor Author

The popularity requirements are detailed in the CONTRIBUTING.md file. My comment in the previous PR wasn't to suggest the popularity requirements had been met, it was to indicate I wouldn't even try assessing with each release whilst the PR feedback hasn't been addressed.

Ah, okay, sorry about the misunderstanding. From looking at CONTRIBUTING.md I think we meet the minimum requirements if we search using this pattern: https://github.com/search?type=code&q=NOT+is%3Afork+path%3A*.baml+function

@hellovai

hellovai commented Jun 3, 2026

hi @lildude! appreciate your guidance here, and apologies for any misunderstanding on our end. we'd love to confirm a few things:

Are these good queries to be using for metrics?
https://github.com/search?q=path%3A*.baml+NOT+is%3Afork&type=code (~3.3k files)
https://github.com/search?q=path%3A*.baml+NOT+is%3Afork+NOT+org%3Aboundaryml&type=code (~2.2k files)

We've attempted to model this on similar recently accepted PRs (like #7748), and the results look to be in a similar range.

The primary use case of our language is agentic systems, ranging from building coding agents to systems doing data processing by way of LLMs.

I'd love to share more data / update queries accordingly to see how / if we can make progress towards Linguist's standards for a new language.
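
For anyone who wants to reproduce or vary the two search queries in the comment above, the URLs can be assembled from their individual search terms. This is a minimal sketch using only the Python standard library; the helper name `search_url` is hypothetical and not part of any GitHub or Linguist tooling, and the result counts quoted above are not something this code retrieves.

```python
from urllib.parse import urlencode


def search_url(*terms: str) -> str:
    """Build a GitHub code-search URL from individual query terms.

    Hypothetical helper for illustration only.
    """
    query = " ".join(terms)
    # safe="*" keeps the glob in `path:*.baml` unescaped, as in the URLs above.
    params = urlencode({"q": query, "type": "code"}, safe="*")
    return "https://github.com/search?" + params


# All non-fork .baml files.
all_files = search_url("path:*.baml", "NOT", "is:fork")

# The same query, excluding the boundaryml organisation's own repositories.
external_files = search_url(
    "path:*.baml", "NOT", "is:fork", "NOT", "org:boundaryml"
)

print(all_files)
print(external_files)
```

Keeping the terms as separate arguments makes it easy to add or drop a qualifier (such as the `org:` exclusion) without editing an encoded URL by hand.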

@Alhadis

Alhadis commented Jun 4, 2026

Collaborator

The primary use case of our language is agentic systems, ranging from building coding agents to systems doing data processing by way of LLMs.

Purely out of general interest, may I ask why you're creating a programming language that generates natural language to automate tasks and behaviour that can already be automated by programming and scripting languages?

@hellovai

hellovai commented Jun 4, 2026

hi @Alhadis! BAML is actually the opposite of generating natural language. it attempts to add type-safety to usages of natural language and helps the model conform better to what a type-system expects w/o requiring developers to update the model.

This probably explains it best!
https://youtu.be/wD3zieaV0Yc?t=539 (10 mins)

The core principle is: people are going to write prompts. Prompts have no discipline. We can model prompts as functions and then bring the last 30 years of software discipline to what would otherwise be English.

Probably one of the most useful features of the type system is our JSON/Tool Calling error correction, where if the model outputs some text, we can correct it to match the type system the user expected. (no llms, pure algorithm design, so it's fast and cheap).
https://www.youtube.com/watch?v=Z9nwmtHUggY (4 mins)

does that help answer?
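
To make the idea in the comment above concrete, here is a toy illustration of coercing loosely formatted model output into a declared type without calling a model again. It is only a sketch under stated assumptions: the `Resume` schema, the `coerce_resume` helper, and the sample text are all hypothetical, and this is not BAML's actual correction algorithm, which the thread does not describe.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Resume:
    """Hypothetical target type the caller expects back."""
    name: str
    years_experience: int


def coerce_resume(raw: str) -> Resume:
    """Pull a JSON object out of free-form text and coerce it to Resume.

    Toy logic for illustration; not BAML's implementation.
    """
    # Take everything from the first "{" to the last "}", ignoring chatter.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    # Drop trailing commas before a closing brace or bracket.
    body = re.sub(r",\s*([}\]])", r"\1", match.group(0))
    data = json.loads(body)
    # Coerce each field to the declared type (e.g. "7" becomes 7).
    return Resume(
        name=str(data["name"]),
        years_experience=int(data["years_experience"]),
    )


raw_output = (
    "Sure! Here is the result:\n"
    '{"name": "Ada", "years_experience": "7",}\n'
    "Let me know if you need anything else."
)
resume = coerce_resume(raw_output)
print(resume)
```

The sample input has three defects a strict JSON parser would reject or mistype: surrounding prose, a trailing comma, and a number encoded as a string. The sketch repairs each with plain string handling and type conversion.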

@Alhadis

Alhadis commented Jun 4, 2026

Collaborator

Uh yes, it does answer my question. Though this bit raises another:

We can model prompts as functions and then bring the last 30 years of software discipline to what would otherwise be English.

Doesn't the mere existence of this language (necessitated by the need to enforce "type-safety" of prosaic input) indicate that using LLMs for tasks nominally left to human programmers is a misapplication of AI technology? That is, it introduces more problems than it solves: the excuse that people can interface with a front-end in natural language kind of goes out the window once another layer of abstraction is heaped atop of it.

Moreover, the "discipline" of the past 30 years of software development practices has tended to trend towards "being lazy, time-poor, and overly-reliant on overly-complex systems that overcomplicate the problems they exist to solve".

So there's that.

@hellovai

hellovai commented Jun 4, 2026

I'd love to follow up over email/zoom to discuss the purpose and the use case of BAML! my email is vbv@boundaryml.com and i'm pretty much always around.

If possible, can we keep this specific discussion focused on clarifying for us (as maintainers of BAML) which community usage requirements BAML should demonstrate for Linguist's maintainers?

We've seen our community grow pretty substantially over the past two years (2.5k+ devs, 8k+ GH stars), and are seeing larger teams adopting BAML (some with 100+ baml files). We're just trying to make developers using our language happy :)

for example similar to what jeremy had posted here:
#7333 (comment)

... Im only here because I have 100 .baml files and when I share these to colleagues its an ugly colorless wall of text. Some syntax highlighting would make my day

@Alhadis

Alhadis commented Jun 4, 2026

Collaborator

@hellovai You're right, I apologise for the diatribe. This is a topic I feel strongly about, so I tend to bring it up in discussions that're only peripherally-related.

Comment thread lib/linguist/languages.yml Outdated
@hellovai

hellovai commented Jun 4, 2026

No worries at all :) with all the ai topics going around, such conversations should be more common! I genuinely would enjoy sharing what we're doing and hearing your thoughts on it as a language over a call at some point.

codeshaunted and others added 2 commits June 3, 2026 22:58
Co-authored-by: John Gardner <gardnerjohng@gmail.com>
@Alhadis

Alhadis commented Jun 4, 2026

Collaborator

@hellovai Mate, trust me. You really don't want to hear my unfiltered thoughts on AI. 😂

@sxlijin

sxlijin commented Jun 4, 2026

Huzzah!! 🎉

Thanks for the quick turnaround on this, y'all!

@hellovai

hellovai commented Jun 4, 2026

appreciate the approval! three quick follow up questions:

  1. what is the release process for linguist? It seems like the most recent one was March 13, curious if we'll be making this upcoming one!
  2. any other actions required on our end?
  3. what is the process for syntax upgrades as we add new syntax? from what i read, we just push to the repo we have, and whatever github release is there will just git pull the latest submodule and it'll just update?

haha. hit me with them :) if you're game, i'm around for a bit tonight on discord: boundaryml.com/discord (just tag me and i'll be around our office hours channel). I feel like i learn a lot from different engineers on different ends of the spectrum!

@lildude

lildude commented Jun 4, 2026

Member

  1. what is the release process for linguist? It seems like the most recent one was March 13, curious if we'll be making this upcoming one!

You don't really want the process, but the frequency, documented here.

  2. any other actions required on our end?

Nope, looks good now.

  3. what is the process for syntax upgrades as we add new syntax? from what i read, we just push to the repo we have, and whatever github release is there will just git pull the latest submodule and it'll just update?

Yup, also documented.

@lildude lildude left a comment

Member

LGTM. Thanks.

Important

The changes in this PR will not appear on GitHub until the next release has been made and deployed. See here for more details.

@lildude lildude added this pull request to the merge queue Jun 4, 2026
Merged via the queue into github-linguist:main with commit cc679b0 Jun 4, 2026
5 checks passed
@lildude

lildude commented Jun 4, 2026

Member

You don't really want the process, but the frequency, documented here.

And now I check the Enterprise server release schedule, I'll kick off the release process today in time for next week's feature freeze for GHES 3.22.

@Alhadis

Alhadis commented Jun 4, 2026

Collaborator

And now I check the Enterprise server release schedule, I'll kick off the release process today in time for next week's feature freeze for GHES 3.22.

Awh man, I had a bunch of other languages that were blocked by this PR. 😫 I guess GtkRC and GIMP configuration files will have to wait until after ${NEXT_LINGUIST_VERSION_STRING}.

@hellovai

hellovai commented Jun 4, 2026

I guess it's not surprising how complicated the release for linguist is. that process guide was wild to read through 😂

@lildude

lildude commented Jun 4, 2026

Member

And now I check the Enterprise server release schedule, I'll kick off the release process today in time for next week's feature freeze for GHES 3.22.

Awh man, I had a bunch of other languages that were blocked by this PR. 😫 I guess GtkRC and GIMP configuration files will have to wait until after ${NEXT_LINGUIST_VERSION_STRING}.

@Alhadis If you can get them in before about 10am UTC tomorrow, I can roll them into this release. I'm holding off until tomorrow to see if #6470 will be updated and the conflicts resolved so it can finally be merged.

@Alhadis

Alhadis commented Jun 4, 2026

Collaborator

@lildude Challenge accepted. That just leaves me with less than 17.5 hours to prep some pull-requests (at the expense of an all-nighter), but hey, this wouldn't be the first time I've crammed to get a grammar through the door at the eleventh hour.

@Alhadis

Alhadis commented Jun 5, 2026

Collaborator

5 participants