Conversation
Pull request overview
This PR adds repository guidance Markdown aimed at helping agentic coding tools (e.g., Claude Code, Codex) understand SpeechBrain’s structure, conventions, and workflows.
Changes:
- Added a new `AGENTS.md` guide describing project structure, core architecture concepts, recipe conventions, and common pitfalls.
- Added `CLAUDE.md`, intended to point Claude-based agents at the main instructions.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
PS: I am not a prompt engineer, so I believe things could be improved (e.g., should it be more concise?), but I do believe the only way to know is to start somewhere and then slowly update and build on top.
pplantinga
left a comment
Looks like a good start. My main comment is that we may want to start thinking about specific agentic workflows and design dedicated files for them, such as "adding a new feature to SpeechBrain core", which could be covered by an AGENTS.md file in the speechbrain/ folder with more detailed instructions on how to run tests, write unit tests, and ensure everything is working. Or a "how to write a new recipe" file for the recipes folder, etc. Not sure if it is necessary now, but it could be nice to do while we are thinking about it.
> ## Recipe conventions
>
> Every recipe lives at `recipes/{dataset}/{task}/{mdeol}` and follows this structure:
Suggested change:
```diff
- Every recipe lives at `recipes/{dataset}/{task}/{mdeol}` and follows this structure:
+ Every recipe lives at `recipes/{dataset}/{task}/{model}` and follows this structure:
```
> ```
> pytest tests/integration/ -x
> ```
>
> Pre-commit hooks are configured in `.pre-commit-config.yaml` and enforce formatting/linting automatically. Always run `pre-commit run -a` before opening a PR.
Not sure how installing the pre-commit hooks interacts with agents here. It looks like this assumes the agent will always run the checks manually rather than installing the hook. Just wanted to check that this is what we want to do, as there is some risk of the agent forgetting this step (not the end of the world, but it could be annoying).
> - **Stage-dependent logic**: always check `stage` before computing validation-only metrics or applying train-only augmentation. Forgetting this causes training-time metric computation (slow) or test-time augmentation (wrong results).
> - **Batch format**: batch objects from the dataio pipeline are `PaddedBatch` instances. Access signals as `batch.sig`, which returns a `(tensor, lengths)` tuple. Do not index the batch like a plain dict.
> - **Checkpointing**: Brain's checkpointer saves/loads modules, optimizers, schedulers, and epoch counters. If you add a new trainable module, register it with the checkpointer or it won't be saved/restored.
> - **Soundfile vs torchaudio**: SpeechBrain is migrating audio I/O from torchaudio to soundfile. Use `speechbrain.dataio.dataio.read_audio` for reading audio, not raw torchaudio calls.
Suggested change:
```diff
- - **Soundfile vs torchaudio**: SpeechBrain is migrating audio I/O from torchaudio to soundfile. Use `speechbrain.dataio.dataio.read_audio` for reading audio, not raw torchaudio calls.
+ - **Soundfile vs torchaudio**: SpeechBrain uses a soundfile backend for audio I/O. Use `speechbrain.dataio.dataio.read_audio` for reading audio, not raw torchaudio calls.
```
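The stage-check pitfall quoted above can be sketched without SpeechBrain installed. This is a hypothetical, dependency-free mock: the real enum is `speechbrain.Stage` and the real hook lives on a `Brain` subclass; only the pattern is shown here.

```python
from enum import Enum, auto

# Stand-in for speechbrain.Stage (mock for illustration only).
class Stage(Enum):
    TRAIN = auto()
    VALID = auto()
    TEST = auto()

def compute_objectives(predictions, targets, stage, error_metric=None):
    """Toy loss (mean absolute error) demonstrating the stage-check pattern."""
    loss = sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)
    # Only accumulate extra metrics outside training: computing them on every
    # training step is the "slow" pitfall the guide warns about.
    if stage != Stage.TRAIN and error_metric is not None:
        error_metric.append(loss)
    return loss

errors = []
train_loss = compute_objectives([1.0, 2.0], [1.0, 4.0], Stage.TRAIN, errors)
valid_loss = compute_objectives([1.0, 2.0], [1.0, 4.0], Stage.VALID, errors)
# Metrics were recorded only for the VALID pass.
```

The same guard works in reverse for train-only augmentation (`if stage == Stage.TRAIN: ...`), which is the other half of the pitfall.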
> - PyTorch (core)
> - HyperPyYAML (`hyperpyyaml` package — SpeechBrain's extended YAML, separate repo at `speechbrain/HyperPyYAML`)
> - soundfile (audio I/O)
> - torchaudio (some legacy audio I/O, being phased out)
Suggested change:
```diff
- - torchaudio (some legacy audio I/O, being phased out)
+ - torchaudio (basic feature transforms, resampling, etc.)
```
> Every recipe wires this together in a `dataio_prep(hparams)` function — follow this pattern for new recipes.
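The `dataio_prep(hparams)` shape can be sketched in pure Python. This is only a mock of the pattern: real recipes build `DynamicItemDataset` objects and decorate pipelines with `speechbrain.utils.data_pipeline` helpers, and the hparams keys below (`train_annotation`, `valid_annotation`) are assumptions, not a fixed API.

```python
# Hypothetical, dependency-free sketch of the dataio_prep(hparams) pattern.
# Plain dicts and lists stand in for SpeechBrain's dataset objects.

def dataio_prep(hparams):
    """Build per-split 'datasets' from annotation manifests in hparams."""
    def audio_pipeline(path):
        # Placeholder for speechbrain.dataio.dataio.read_audio(path).
        return f"signal<{path}>"

    datasets = {}
    for split in ("train", "valid"):
        manifest = hparams[f"{split}_annotation"]  # assumed key names
        datasets[split] = [
            {"id": utt_id, "sig": audio_pipeline(path)}
            for utt_id, path in manifest.items()
        ]
    return datasets

hparams = {
    "train_annotation": {"utt1": "a.wav"},
    "valid_annotation": {"utt2": "b.wav"},
}
datasets = dataio_prep(hparams)
```

The point of the convention is that all data loading and transformation is wired in one function that receives the already-loaded hparams, so recipes stay uniform.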
> ## Recipe conventions
Should we actually have additional AGENTS.md files in key top-level folders as well? Like one for recipes, one for tests, one for speechbrain folder itself? Just wondering if it might be helpful to have more specific instructions depending on what the agent is trying to do.
> - **HyperPyYAML is not plain YAML**: do not treat `.yaml` files as simple config. `!new:` instantiates objects, `!ref` resolves references. Editing these files requires understanding the tag system. If you break a `!ref` chain, training will crash at load time.
> - **Relative lengths, not absolute**: SpeechBrain passes relative lengths (0 to 1) for masking/padding. Do not pass absolute sample counts where relative lengths are expected.
> - **modules vs hparams**: objects listed under `modules:` in the YAML are registered as `nn.Module`s on the Brain (moved to device, included in DDP, saved in checkpoints). Objects accessed via `self.hparams.*` are not. Putting a trainable module only in hparams means it won't be on the right device or saved properly.
The "saved in checkpoints" is unrelated to "modules" I think.
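The relative-lengths convention flagged in the bullets above can be illustrated without torch. This is a pure-Python stand-in for what `PaddedBatch` computes (real code returns tensors and the exact padding logic may differ):

```python
# Sketch of SpeechBrain's relative-length convention: lengths are fractions
# of the padded (max) length, not absolute sample counts.

def pad_with_relative_lengths(signals):
    """Zero-pad variable-length signals; return padded batch + relative lengths."""
    max_len = max(len(s) for s in signals)
    padded = [s + [0.0] * (max_len - len(s)) for s in signals]
    rel_lens = [len(s) / max_len for s in signals]
    return padded, rel_lens

def length_mask(rel_lens, max_len):
    """Rebuild a boolean validity mask from relative lengths, as masking code does."""
    return [[t < round(r * max_len) for t in range(max_len)] for r in rel_lens]

padded, rel = pad_with_relative_lengths([[0.1, 0.2, 0.3, 0.4], [0.5, 0.6]])
mask = length_mask(rel, len(padded[0]))
```

Here `rel` is `[1.0, 0.5]`: passing the absolute count `2` where `0.5` is expected would make every downstream mask wrong, which is exactly the pitfall the guide warns about.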

This PR adds Markdown instructions for agentic models (e.g., Claude Code, Codex) to help them navigate the codebase.
These files are meant to evolve over time and gradually reflect the common issues LLMs may encounter when working with the SpeechBrain codebase. This first PR is intended as a prototype in that direction.