oliver0006 · 2026-07-03T14:17:14Z

What this does

The LoRA adapter class claimed to support nn.Conv layers but assumed nn.Linear semantics, crashing with RuntimeError: mat1 and mat2 shapes cannot be multiplied on any convolutional layer (e.g. the wav2vec2-style repro in the issue).
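
A minimal sketch of how this kind of failure can arise, assuming the adapter reads the conv weight as a 2-D matrix and builds nn.Linear projections from it. This is an illustration only, not the SpeechBrain code; the layer sizes and the names `down` and `error` are made up for the example.

```python
import torch
from torch import nn

# A pretrained-style convolution and a (batch, channels, time) input.
conv = nn.Conv1d(in_channels=8, out_channels=16, kernel_size=3)
x = torch.randn(2, 8, 50)

# Linear-style reading of the weight: shape[0] is "out", shape[1] is "in".
# For a conv the weight is (out_channels, in_channels, kernel), so this
# silently ignores the kernel dimension.
out_features, in_features = conv.weight.shape[0], conv.weight.shape[1]
down = nn.Linear(in_features, 4, bias=False)

# nn.Linear multiplies along the LAST dimension (time, size 50), not the
# channel dimension (size 8), so the matmul shapes do not line up.
try:
    down(x)
    error = None
except RuntimeError as exc:
    error = str(exc)

print(error)
```

The printed message is expected to be a shape-mismatch error of the same family as the one quoted above; the exact wording may vary between PyTorch versions.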

This PR implements real convolutional support instead of removing the claim, following the approach used by HF peft:

Down projection: a conv of the same type mirroring the pretrained layer's geometry (kernel_size, stride, padding, dilation, padding_mode), mapping onto rank channels — so both branches produce outputs of identical shape
Up projection: a pointwise (1×1) conv from rank to out_channels, zero-initialized (standard LoRA init)
Grouped convolutions (groups != 1): clear ValueError instead of a cryptic crash
No behavior change for nn.Linear (and other weight-matrix modules); checkpoint attribute names (adapter_down_proj/adapter_up_proj) are preserved so existing checkpoints keep loading
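
The points above can be sketched roughly as follows. This is a simplified stand-in written for illustration, not the class from this PR: `ConvLoRASketch`, its constructor arguments and the `scaling` attribute are hypothetical, and only the attribute names `adapter_down_proj` / `adapter_up_proj` are taken from the description.

```python
import torch
from torch import nn


class ConvLoRASketch(nn.Module):
    """Hypothetical LoRA wrapper for nn.Conv1d/2d/3d (illustration only)."""

    def __init__(self, base: nn.Module, rank: int = 4, alpha: float = 1.0):
        super().__init__()
        if not isinstance(base, (nn.Conv1d, nn.Conv2d, nn.Conv3d)):
            raise TypeError("this sketch only handles nn.Conv1d/2d/3d")
        if base.groups != 1:
            # Fail early with a readable message instead of a shape error.
            raise ValueError("grouped convolutions (groups != 1) are not supported")

        # Freeze the pretrained layer; only the adapter is trainable.
        self.base = base
        for param in self.base.parameters():
            param.requires_grad = False

        conv_cls = type(base)
        # Down projection: same conv type and geometry, in_channels -> rank,
        # so its output has the same spatial shape as the pretrained output.
        self.adapter_down_proj = conv_cls(
            base.in_channels,
            rank,
            kernel_size=base.kernel_size,
            stride=base.stride,
            padding=base.padding,
            dilation=base.dilation,
            padding_mode=base.padding_mode,
            bias=False,
        )
        # Up projection: pointwise conv, rank -> out_channels, zero-initialized
        # so the adapted layer starts out identical to the pretrained one.
        self.adapter_up_proj = conv_cls(rank, base.out_channels, kernel_size=1, bias=False)
        nn.init.zeros_(self.adapter_up_proj.weight)
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        update = self.adapter_up_proj(self.adapter_down_proj(x))
        return self.base(x) + update * self.scaling


base = nn.Conv1d(8, 16, kernel_size=3, stride=2, padding=1)
layer = ConvLoRASketch(base, rank=4, alpha=8.0)
x = torch.randn(2, 8, 50)
print(layer(x).shape == base(x).shape)
```

Because the down projection carries the stride, padding and dilation, the pointwise up projection only has to change the channel count, which is why the two branches can be added without any reshaping.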

Testing

New tests/unittests/test_adapters.py with 9 tests verifying the core LoRA properties:

Identity at init (zero up-projection ⇒ adapted output == pretrained output), for Linear, Conv1d, Conv2d and Conv3d
Exact match with the manual formula base(x) + up(down(x)) * alpha/rank
Pretrained weights frozen (no grads) while adapter weights receive gradients and actually learn (one SGD step changes the output, pretrained output unchanged)
Shape equality across geometries (stride=5, padding='same', dilation=2)
groups != 1 raises ValueError
The exact AdaptedModel(all_conv=True) repro from the issue, forward + backward, with only adapter params trainable
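
For readers who want to see what these properties look like in practice, here is a self-contained sketch of a few of the checks. It does not reproduce tests/unittests/test_adapters.py: `build_adapter`, `adapted` and the layer sizes are hypothetical, and the adapter is rebuilt inline rather than imported from SpeechBrain.

```python
import torch
from torch import nn


def build_adapter(base, rank):
    """Hypothetical helper: down conv mirrors `base`, up conv is 1x1 and zeroed."""
    conv_cls = type(base)
    down = conv_cls(
        base.in_channels, rank, kernel_size=base.kernel_size, stride=base.stride,
        padding=base.padding, dilation=base.dilation,
        padding_mode=base.padding_mode, bias=False,
    )
    up = conv_cls(rank, base.out_channels, kernel_size=1, bias=False)
    nn.init.zeros_(up.weight)
    return down, up


torch.manual_seed(0)
rank, alpha = 4, 8.0
x = torch.randn(2, 8, 100)

# Shape equality across geometries (stride, 'same' padding, dilation).
shapes_match = []
for geometry in (dict(stride=5), dict(padding="same"), dict(dilation=2)):
    conv = nn.Conv1d(8, 16, kernel_size=3, **geometry)
    d, u = build_adapter(conv, rank)
    shapes_match.append(u(d(x)).shape == conv(x).shape)

# Frozen pretrained layer plus a trainable adapter.
base = nn.Conv1d(8, 16, kernel_size=3, stride=5, dilation=2)
for param in base.parameters():
    param.requires_grad = False
down, up = build_adapter(base, rank)


def adapted(inp):
    # Manual LoRA formula: base(x) + up(down(x)) * alpha / rank
    return base(inp) + up(down(inp)) * alpha / rank


pretrained_out = base(x).clone()

# Identity at init: the zero up-projection contributes nothing.
identity_at_init = torch.allclose(adapted(x), pretrained_out)

# One SGD step on the adapter parameters only.
optimizer = torch.optim.SGD(list(down.parameters()) + list(up.parameters()), lr=0.1)
adapted(x).pow(2).mean().backward()
optimizer.step()

base_has_no_grads = all(p.grad is None for p in base.parameters())
output_changed = not torch.equal(adapted(x), pretrained_out)
base_unchanged = torch.allclose(base(x), pretrained_out)

print(shapes_match, identity_at_init, base_has_no_grads, output_changed, base_unchanged)
```

Note that with a zero-initialized up projection, the first step only moves the up projection: the gradient reaching the down projection passes through the zero weights and is therefore zero at that point.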

Also added a Conv1d doctest to the LoRA docstring. All 9 tests + 3 doctests pass; ruff check, ruff format and codespell pass on the changed files.

Co-written by a human (@oliver0006) and Claude AI (Fable 5) working together.

🤖 Generated with Claude Code

The LoRA class claimed to work with nn.Conv layers but assumed nn.Linear semantics (weight shape and linear projections), crashing on any convolutional layer. The adapter projections now mirror the geometry of the pretrained convolution (following the approach of HF peft): the down projection reuses kernel/stride/padding/dilation to map onto rank channels, and the up projection is a pointwise convolution initialized to zero. Grouped convolutions raise a clear error. Linear behavior and checkpoint attribute names are unchanged.

Fixes speechbrain#3056

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Fix LoRA adapter support for convolutional layers (fixes #3056) #3064
oliver0006 wants to merge 1 commit into speechbrain:develop from oliver0006:fix/lora-conv-support