Fix GEGLU docstring: Sigmoid -> GELU#8696
Conversation
The docstring incorrectly stated GEGLU uses Sigmoid, but the code correctly uses GELU. Per the original paper (Shazeer, 2020): - GLU uses Sigmoid: GLU(x) = σ(xW) ⊗ xV - GEGLU uses GELU: GEGLU(x) = GELU(xW) ⊗ xV Reference: https://arxiv.org/abs/2002.05202 Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk>
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
monai/networks/blocks/activation.py (1)
167-184: LGTM! Docstring now matches implementation.The fix correctly aligns the documentation with the actual code (
nn.functional.gelu(gate)on line 184).Optional: For consistency with other activation classes in this file, consider adding an
Examplessection.📝 Optional: Add Examples section
Shape: - Input: :math:`(N, *, 2 * D)` - Output: :math:`(N, *, D)`, where `*` means, any number of additional dimensions + + + Examples:: + + >>> import torch + >>> m = GEGLU() + >>> input = torch.randn(2, 8) # last dim must be even + >>> output = m(input) """
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
monai/networks/blocks/activation.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
monai/networks/blocks/activation.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: min-dep-py3 (3.12)
- GitHub Check: min-dep-pytorch (2.6.0)
- GitHub Check: quick-py3 (ubuntu-latest)
- GitHub Check: min-dep-pytorch (2.8.0)
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: min-dep-pytorch (2.7.1)
- GitHub Check: min-dep-os (macOS-latest)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: min-dep-py3 (3.11)
- GitHub Check: min-dep-py3 (3.10)
- GitHub Check: build-docs
- GitHub Check: min-dep-os (ubuntu-latest)
- GitHub Check: min-dep-py3 (3.9)
- GitHub Check: packaging
- GitHub Check: min-dep-os (windows-latest)
- GitHub Check: min-dep-pytorch (2.5.1)
- GitHub Check: flake8-py3 (mypy)
- GitHub Check: flake8-py3 (codeformat)
- GitHub Check: flake8-py3 (pytype)
|
"Optional: For consistency with other activation classes in this file, consider adding an Examples section." -- Happy to add this as well if required. |
Add example showing usage of GEGLU to align docstring style with other activation classes. No functional changes. Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk>
805f33f to
f5a63ce
Compare
## Summary - Fixed GEGLU docstring which incorrectly stated the activation function was Sigmoid - The code correctly uses GELU, as specified in the original GEGLU paper ## Details - GLU uses Sigmoid: GLU(x) = σ(xW) ⊗ xV - GEGLU uses GELU: GEGLU(x) = GELU(xW) ⊗ xV Reference: https://arxiv.org/abs/2002.05202 --------- Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk> Co-authored-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>

Summary
Details
Reference: https://arxiv.org/abs/2002.05202