add markdown files for agentic models by Adel-Moumen · Pull Request #3048 · speechbrain/speechbrain · GitHub

add markdown files for agentic models #3048

Open
Adel-Moumen wants to merge 8 commits into develop from add_agents_instructions

Conversation

@Adel-Moumen
Collaborator

This PR adds Markdown instructions for agentic models (e.g. Claude Code, Codex) to help them navigate the codebase.

These files are meant to evolve over time and gradually reflect the common issues LLMs may encounter when working with the SpeechBrain codebase. This first PR is intended as a prototype in that direction.

Contributor

Copilot AI left a comment

Pull request overview

This PR adds repository guidance Markdown aimed at helping agentic coding tools (e.g., Claude Code, Codex) understand SpeechBrain’s structure, conventions, and workflows.

Changes:

  • Added a new AGENTS.md guide describing project structure, core architecture concepts, recipe conventions, and common pitfalls.
  • Added CLAUDE.md intended to point Claude-based agents at the main instructions.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| CLAUDE.md | Adds a pointer to the agent guidance document. |
| AGENTS.md | Introduces a consolidated “how this repo works” guide for agentic models (structure, recipes, tests, workflow). |

Comment thread CLAUDE.md Outdated
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Comment thread AGENTS.md Outdated
Adel-Moumen and others added 6 commits April 3, 2026 16:35
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Adel-Moumen requested a review from pplantinga on April 3, 2026 15:44
@Adel-Moumen
Collaborator Author

PS: I am not a prompt engineer, so I expect these files can be improved (e.g. should they be more concise?), but I believe the only way to find out is to start somewhere and then slowly update and build on top.

Collaborator

@pplantinga pplantinga left a comment

Looks like a good start. My main comment is that we may want to start thinking about specific agentic workflows and design dedicated files for them. For example, "adding a new feature to speechbrain core" could be covered by an AGENTS.md file in the speechbrain/ folder, with more detailed instructions on how to run tests, write unit tests, and ensure everything is working. Another example is a "how to write a new recipe" file for the recipes folder. I am not sure this is necessary now, but it could be nice to do while we are thinking about it.

Comment thread AGENTS.md

## Recipe conventions

Every recipe lives at `recipes/{dataset}/{task}/{mdeol}` and follows this structure:
Collaborator

Suggested change
Every recipe lives at `recipes/{dataset}/{task}/{mdeol}` and follows this structure:
Every recipe lives at `recipes/{dataset}/{task}/{model}` and follows this structure:

Comment thread AGENTS.md
pytest tests/integration/ -x
```

Pre-commit hooks are configured in `.pre-commit-config.yaml` and enforce formatting/linting automatically. Always run `pre-commit run -a` before opening a PR.
Collaborator

I am not sure how installing the pre-commit hooks interacts with agents here. This seems to assume the agent will always run the checks manually rather than installing the hook. I just wanted to check that this is what we want, as there is some risk of the agent forgetting this step (not the end of the world, but it could be annoying).

Comment thread AGENTS.md
- **Stage-dependent logic**: always check `stage` before computing validation-only metrics or applying train-only augmentation. Forgetting this causes training-time metric computation (slow) or test-time augmentation (wrong results).
- **Batch format**: batch objects from the dataio pipeline are `PaddedBatch` instances. Access signals as `batch.sig` which returns `(tensor, lengths)` tuples. Do not index batch like a plain dict.
- **Checkpointing**: Brain's checkpointer saves/loads modules, optimizers, schedulers, and epoch counters. If you add a new trainable module, register it with the checkpointer or it won't be saved/restored.
- **Soundfile vs torchaudio**: SpeechBrain is migrating audio I/O from torchaudio to soundfile. Use `speechbrain.dataio.dataio.read_audio` for reading audio, not raw torchaudio calls.
Collaborator

Suggested change
- **Soundfile vs torchaudio**: SpeechBrain is migrating audio I/O from torchaudio to soundfile. Use `speechbrain.dataio.dataio.read_audio` for reading audio, not raw torchaudio calls.
- **Soundfile vs torchaudio**: SpeechBrain uses a soundfile backend for audio I/O. Use `speechbrain.dataio.dataio.read_audio` for reading audio, not raw torchaudio calls.
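
The "stage-dependent logic" pitfall quoted in this thread can be illustrated with a minimal sketch. It is not SpeechBrain code: the `Stage` enum below is only a stand-in for the library's TRAIN / VALID / TEST stages, and `augment`, `error_rate`, and `run_step` are hypothetical helpers invented for illustration.

```python
from enum import Enum, auto


class Stage(Enum):
    """Stand-in for the training stages (assumption: TRAIN / VALID / TEST)."""
    TRAIN = auto()
    VALID = auto()
    TEST = auto()


def augment(features):
    # Hypothetical train-only augmentation: here it simply scales the values.
    return [x * 0.9 for x in features]


def error_rate(predictions, targets):
    # Hypothetical evaluation metric: fraction of mismatched items.
    wrong = sum(p != t for p, t in zip(predictions, targets))
    return wrong / len(targets)


def run_step(features, predictions, targets, stage):
    """Augment only during TRAIN; compute metrics only outside TRAIN."""
    if stage == Stage.TRAIN:
        features = augment(features)
    metrics = {}
    if stage != Stage.TRAIN:
        metrics["error_rate"] = error_rate(predictions, targets)
    return features, metrics
```

With this guard in place, a training step skips the (potentially slow) metric computation, and validation or test steps never see augmented inputs.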

Comment thread AGENTS.md
- PyTorch (core)
- HyperPyYAML (`hyperpyyaml` package — SpeechBrain's extended YAML, separate repo at `speechbrain/HyperPyYAML`)
- soundfile (audio I/O)
- torchaudio (some legacy audio I/O, being phased out)
Collaborator

Suggested change
- torchaudio (some legacy audio I/O, being phased out)
- torchaudio (basic feature transforms, resampling, etc.)

Comment thread AGENTS.md

Every recipe wires this together in a `dataio_prep(hparams)` function — follow this pattern for new recipes.

## Recipe conventions
Collaborator

Should we actually have additional AGENTS.md files in key top-level folders as well? Like one for recipes, one for tests, one for speechbrain folder itself? Just wondering if it might be helpful to have more specific instructions depending on what the agent is trying to do.

Comment thread AGENTS.md

- **HyperPyYAML is not plain YAML**: do not treat `.yaml` files as simple config. `!new:` instantiates objects, `!ref` resolves references. Editing these files requires understanding the tag system. If you break a `!ref` chain, training will crash at load time.
- **Relative lengths, not absolute**: SpeechBrain passes relative lengths (0 to 1) for masking/padding. Do not pass absolute sample counts where relative lengths are expected.
- **modules vs hparams**: objects listed under `modules:` in the YAML are registered as `nn.Module`s on the Brain (moved to device, included in DDP, saved in checkpoints). Objects accessed via `self.hparams.*` are not. Putting a trainable module only in hparams means it won't be on the right device or saved properly.
Collaborator

Suggested change

I think the "saved in checkpoints" part is unrelated to `modules`.
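
The "relative lengths, not absolute" pitfall quoted in this thread can be made concrete with a small pure-Python sketch. It only illustrates the idea of expressing each item's length as a fraction (0 to 1) of the longest item in the batch; it is not SpeechBrain's implementation, and `to_relative_lengths` and `padding_mask` are hypothetical names.

```python
def to_relative_lengths(abs_lengths):
    """Convert absolute lengths (e.g. sample counts) to fractions of the longest item."""
    longest = max(abs_lengths)
    return [n / longest for n in abs_lengths]


def padding_mask(rel_lengths, max_len):
    """Build one mask per item: True for real positions, False for padding."""
    masks = []
    for rel in rel_lengths:
        valid = round(rel * max_len)  # recover the absolute length
        masks.append([i < valid for i in range(max_len)])
    return masks


abs_lengths = [4, 2, 3]
rel_lengths = to_relative_lengths(abs_lengths)  # [1.0, 0.5, 0.75]
masks = padding_mask(rel_lengths, max_len=4)
```

Passing `abs_lengths` directly to `padding_mask` would mark every position as valid, which is the kind of silent error the quoted bullet warns about.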

3 participants