NeMo Framework Megatron Backend | NVIDIA NGC

NVIDIA NeMo™ framework Megatron backend supports pre-training, post-training, and reinforcement learning of LLMs and multi-modal generative AI models with state-of-the-art data processing, model training techniques, and flexible deployment options.

What is the NeMo Framework Megatron Backend Container?

NVIDIA NeMo™ is a comprehensive software suite to build, monitor, and optimize AI agents across their lifecycle at enterprise scale.

NVIDIA NeMo framework, part of the NVIDIA NeMo suite, is an end-to-end platform for pre-training, post-training, and reinforcement learning of multi-modal generative AI models across cloud, on-prem, and hybrid setups. It is designed to support both flexible experimentation and research, and production-grade training at scale and speed. It uses NVIDIA technologies to cover a complete workflow: distributed data processing, training of large-scale bespoke models with 4D parallelism techniques, and deployment on the infrastructure of your choice.

For enterprises running their business on AI, NVIDIA AI Enterprise provides a production-grade, secure, end-to-end software platform that includes NeMo as well as generative AI reference applications and enterprise support to streamline adoption. Now organizations can integrate AI into their operations, streamlining processes, enhancing decision-making capabilities, and ultimately driving greater value.

What You Get with NVIDIA NeMo Framework Megatron Backend Container

At the heart of the NeMo framework lies the unification of distributed training and advanced parallelism. NeMo distributes computation and memory across GPUs and nodes. By partitioning the model and the training data across devices, it enables multi-node, multi-GPU training, which reduces training time. NeMo incorporates the following parallelism and memory-saving techniques:

Parallelism Techniques

  • Data Parallelism
  • Fully Sharded Data Parallelism (FSDP)
  • Tensor Parallelism
  • Pipeline Parallelism
  • Sequence Parallelism
  • Expert Parallelism
  • Context Parallelism
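
As a rough illustration of how these techniques combine, the sketch below computes the data-parallel degree that remains once tensor, pipeline, and context parallelism have been chosen. It assumes the common Megatron-style convention that the total GPU count factors as tensor × pipeline × context × data parallel size. The function name and arguments are hypothetical and are not part of the NeMo API. Sequence and expert parallelism are left out of the arithmetic as a simplification.

```python
def data_parallel_size(world_size: int, tensor: int = 1,
                       pipeline: int = 1, context: int = 1) -> int:
    """Return the data-parallel degree left after model parallelism.

    Assumes world_size = tensor * pipeline * context * data_parallel.
    All names here are illustrative, not NeMo configuration keys.
    """
    for name, value in (("world_size", world_size), ("tensor", tensor),
                        ("pipeline", pipeline), ("context", context)):
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")

    model_parallel = tensor * pipeline * context
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor*pipeline*context={model_parallel}"
        )
    return world_size // model_parallel


if __name__ == "__main__":
    # 64 GPUs split as TP=4, PP=2, CP=2 leaves 4 data-parallel replicas.
    print(data_parallel_size(64, tensor=4, pipeline=2, context=2))
```

A configuration whose product does not divide the GPU count raises an error, which mirrors the kind of validation a launcher would typically perform before training starts.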

Memory-Saving Techniques

  • Selective Activation Recompute (SAR)
  • CPU offloading (Activation, Weights)
  • Attention: Flash Attention (FA), Grouped Query Attention (GQA), Multi-Query Attention (MQA), Sliding Window Attention (SWA)
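
To make the attention variants above more concrete, the following sketch shows how Grouped Query Attention and Multi-Query Attention share key/value heads among query heads, and how that reduces the number of key/value elements stored per token. It is a plain arithmetic illustration under the usual definitions of these variants. The function names and the example head counts are assumptions, not NeMo code or settings.

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Map a query head index to the key/value head it shares.

    n_kv_heads == n_q_heads is standard multi-head attention,
    1 < n_kv_heads < n_q_heads is GQA, and n_kv_heads == 1 is MQA.
    """
    if n_kv_heads < 1 or n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a positive multiple of n_kv_heads")
    if not 0 <= q_head < n_q_heads:
        raise ValueError("q_head out of range")
    group_size = n_q_heads // n_kv_heads  # query heads per shared K/V head
    return q_head // group_size


def kv_elements_per_token(n_kv_heads: int, head_dim: int) -> int:
    """Number of key plus value elements stored per token, per layer."""
    return 2 * n_kv_heads * head_dim


if __name__ == "__main__":
    n_q_heads, head_dim = 32, 128  # illustrative sizes only
    for label, n_kv in (("MHA", 32), ("GQA", 8), ("MQA", 1)):
        print(label, kv_elements_per_token(n_kv, head_dim))
```

With these example sizes, moving from 32 key/value heads to 8 cuts the stored key/value elements per token by a factor of four, and a single shared head cuts them by a factor of 32.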

The NeMo framework container supports multimodal training at scale. It covers language and multimodal models including Llama 2, Falcon, CLIP, Stable Diffusion, and LLaVA, as well as text-based generative AI architectures such as GPT, T5, BERT, MoE, and RETRO. In addition to large language models (LLMs), NeMo supports several pretrained models for Computer Vision (CV), Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text to Speech (TTS).

The NeMo framework container offers an array of techniques to refine pretrained LLMs for specialized use cases, including p-tuning, LoRA, supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), SteerLM, and more. These customization options give NeMo the flexibility to meet varying business requirements.
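
To give a sense of why a technique such as LoRA is attractive for customization, the sketch below compares the trainable parameter count of a LoRA adapter with that of the full weight matrix it adapts. It relies on the standard LoRA formulation, in which a weight of shape d_out × d_in is frozen and a low-rank update B·A of rank r is trained instead. The function names and the example dimensions are hypothetical and do not reflect any particular NeMo recipe.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameter count of a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of a LoRA adapter for the same matrix.

    A has shape (rank, d_in) and B has shape (d_out, rank); the
    original weight stays frozen and is not counted.
    """
    if rank < 1:
        raise ValueError("rank must be a positive integer")
    return rank * (d_in + d_out)


if __name__ == "__main__":
    d_in = d_out = 4096  # illustrative layer size
    rank = 8             # illustrative LoRA rank
    adapter = lora_trainable_params(d_in, d_out, rank)
    dense = full_params(d_in, d_out)
    print(f"adapter: {adapter}, dense: {dense}, ratio: {adapter / dense:.4%}")
```

For this example layer the adapter trains well under one percent of the parameters that full fine-tuning of the same matrix would update.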

Getting Started With NVIDIA NeMo

Refer to the NVIDIA NeMo playbooks page for step-by-step instructions on how to get started quickly with the NeMo framework.

Questions? See the current discussions and submit a question.

Found a bug? You can report a Dev container bug.

Documentation

More detailed documentation is available for the NeMo framework.

Known Issues

  • The 25.07.01 container has an incorrect version of nvidia-lm-eval that breaks evaluation. This will be addressed in an upcoming patch. The 25.07 tag was previously re-pointed to 25.07.01, but it has since been changed back to refer to 25.07.00. If you encountered evaluation errors, please re-download 25.07 or 25.07.00.
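
If you are unsure which nvidia-lm-eval build a given container ships, one way to inspect it is to query the installed package metadata from inside the container. The sketch below uses only the Python standard library. It assumes the distribution is registered under the name nvidia-lm-eval, as written in the note above, and it only reports the version it finds, since the known-good version number is not stated here.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package name taken from the known-issue note; adjust if it differs.
    name = "nvidia-lm-eval"
    found = installed_version(name)
    if found is None:
        print(f"{name} is not installed in this environment")
    else:
        print(f"{name} version: {found}")
```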

License

Governing Terms: Your use of the NeMo Framework is governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products.

This container contains Llama Materials governed by the Meta Llama 3 Community License Agreement, and is Built with Meta Llama 3.

Get Help

Enterprise Support

Get access to knowledge base articles and support cases or submit a ticket.

Publisher: NVIDIA
Latest Tag: 26.06
Updated: June 22, 2026 UTC
Compressed Size: 16.99 GB
Multinode Support: Yes
Multi-Arch Support: Yes
