Releases: NVIDIA/NeMo
NVIDIA Neural Modules 2.2.0
Highlights
- Training
  - Blackwell and Grace Blackwell support
  - Pipeline parallel support for distillation
  - Improved NeMo Framework installation
- Export & Deploy
  - vLLM export for NeMo 2.0 (a usage sketch follows the Export changelog below)
- Evaluations
  - Integrate lm-eval-harness
- Collections
  - LLM
    - DAPT example and best practices in NeMo 2.0
    - [NeMo 2.0] Enable Tool Learning and add a tutorial
    - Support GPT Embedding Model (Llama 3.2 1B/3B)
    - Qwen2.5, Phi-4 (via AutoModel)
    - SFT for Llama 3.3 model (via AutoModel)
    - Support BERT Embedding Model with NeMo 2.0
    - DeepSeek SFT & PEFT support
  - MultiModal
    - CLIP
    - Sequence parallelism (SP) for NeVA
    - Context parallelism (CP) for NeVA
    - InternViT
  - LLM
    - AutoModel
      - Preview release.
      - PEFT and SFT support for LLMs available via Hugging Face's AutoModelForCausalLM (see the sketch after this list).
      - Support for Hugging Face-native checkpoints (full model and adapter only).
      - Support for distributed training via DDP and FSDP2.
  - ASR/TTS
    - Lhotse: TPS-free 2D bucket estimation and filtering
    - Standardized ASR model outputs so all models return a consistent format
    - Sortformer diarization model release
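
The AutoModel preview path listed above lets a Hugging Face causal LM be fine-tuned through NeMo's recipe-style API. Below is a minimal sketch, assuming the `llm.HFAutoModelForCausalLM` wrapper, the `llm.finetune` entry point, and `llm.peft.LoRA` as exposed in NeMo 2.x; the model name, data module, and trainer settings are placeholder assumptions, not prescribed values.

```python
# Hedged sketch of AutoModel PEFT fine-tuning (NeMo 2.2 preview). Exact class and
# keyword names should be verified against the installed NeMo version.
import nemo.lightning as nl
from nemo.collections import llm

# Wrap a Hugging Face checkpoint; the model name below is only an example.
model = llm.HFAutoModelForCausalLM(model_name="meta-llama/Llama-3.2-1B")

trainer = nl.Trainer(
    devices=2,
    accelerator="gpu",
    strategy="ddp",   # the release also lists FSDP2 as supported for this path
    max_steps=100,
)

llm.finetune(
    model=model,
    data=llm.SquadDataModule(seq_length=512, micro_batch_size=1),  # placeholder datamodule
    trainer=trainer,
    peft=llm.peft.LoRA(),  # omit for full-parameter SFT
)
```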
Detailed Changelogs:
ASR
Changelog
- removed the line which caused a problem in nfa_tutorial by @Ssofja :: PR: #11710
- TPS-free 2D bucket estimation and filtering by @pzelasko :: PR: #11738
- Update transcribe_utils.py by @stevehuang52 :: PR: #11984
- Sortformer Diarizer 4spk v1 model PR Part 4: Sortformer Documents and Notebook Tutorials by @tango4j :: PR: #11707
- fix the issue during batched inference of Sortformer diarizer by @tango4j :: PR: #12047
- changed asr models outputs to be consistent by @Ssofja :: PR: #11818
- chore: Update notebooks by @ko3n1g :: PR: #12161
- add ctc segmentation by @ko3n1g :: PR: #12312
- clean up VAD tutorial by @stevehuang52 :: PR: #12410
- copy from main by @nithinraok :: PR: #12423
- ci: Disable ASR tests for now (#12443) by @ko3n1g :: PR: #12466
- ASR_CTC_Language_Finetuning.ipynb bugfix by @lilithgrigoryan :: PR: #12538
TTS
Changelog
NLP / NMT
Changelog
- Use explicit imports from megatronllm_deployable.py by @janekl :: PR: #11705
- Bug fix minor bug in TRT-LLM deployment by @oyilmaz-nvidia :: PR: #11714
- gpt moe perf scripts by @malay-nagda :: PR: #11760
- Bump mcore by @ko3n1g :: PR: #11740
- Enable packed seqs for validation by @jiemingz :: PR: #11748
- Revert Mcore update since it caused regression by @pablo-garay :: PR: #11791
- Fix Gemma2 Attention Init Args by @suiyoubi :: PR: #11792
- Add null tokenizer by @erhoo82 :: PR: #11789
- Fix DistCP inference issue by @suiyoubi :: PR: #11801
- Add BERT Embedding Models E5 Recipe by @suiyoubi :: PR: #11787
- Add rope scaling configs for NeMo 1 by @BoxiangW :: PR: #11807
- Fix calculating num_available_samples by @huvunvidia :: PR: #11830
- fix sentencepiece tokenizer special tokens by @akoumpa :: PR: #11811
- add chat sft dataset to support agent tool calling by @chenrui17 :: PR: #11759
- Revert "Revert Mcore update since it caused regression (#11791)" by @ko3n1g :: PR: #11799
- fix checkpoint load issue by @dimapihtar :: PR: #11859
- Fix nemo 1 packed sequence TE version error by @cuichenx :: PR: #11874
- enable loading older TE checkpoints by @dimapihtar :: PR: #11930
- ci: Use single runner machines for unit tests by @ko3n1g :: PR: #11937
- llm performance scripts by @malay-nagda :: PR: #11736
- [MoE] add expert tensor parallelism support for NeMo2.0 MoE by @gdengk :: PR: #11880
- add exception when loading ckpt saved by TE < 1.13 by @dimapihtar :: PR: #11988
- remove renormalize_blend_weights flag by @dimapihtar :: PR: #11975
- Llama3.2 1B Embedding Model Support by @suiyoubi :: PR: #11909
- Weekly bump by @ko3n1g :: PR: #11896
- Debug Apex distributed optimizer to handle Transformer Engine 2.0 by @timmoon10 :: PR: #12004
- throw MegatronOptimizerModule warning only with mcore models by @akoumpa :: PR: #12085
- fix nmt dataclass issue by @dimapihtar :: PR: #12081
- Propogate dp last changes from mcore by @ryantwolf :: PR: #12012
- Add error message when downloading failed. by @yuanzhedong :: PR: #12139
- interface for asymmetric pipeline schedule by @erhoo82 :: PR: #12039
- chore: Update notebooks by @ko3n1g :: PR: #12161
- Cherrypick #12382, #12415 and #12424 by @cuichenx :: PR: #12425
- ASR_CTC_Language_Finetuning.ipynb bugfix by @lilithgrigoryan :: PR: #12538
Text Normalization / Inverse Text Normalization
Changelog
NeMo Tools
Changelog
Export
Changelog
- Bug fix minor bug in TRT-LLM deployment by @oyilmaz-nvidia :: PR: #11714
- In-framework deployment NeMo 2.0 nemo_export.py test by @janekl :: PR: #11749
- Fix starcoder2 missing bias in nemo2 config for TRTLLM by @meatybobby :: PR: #11809
- Autodetect dtype on exporting to TensorRT-LLM by @janekl :: PR: #11907
- PTQ & TRT-LLM updates related to upcoming PyTorch 25.01 bump by @janekl :: PR: #11941
- Run Flake8 for nemo.export module by @janekl :: PR: #11728
- Skip initialization in hf export by @cuichenx :: PR: #12136
- update export io call by @akoumpa :: PR: #12144
- add default kwargs for trtllm model runner by @pablo-garay :: PR: #12248
- cherry-pick: fix[export]: reshard model correctly handles extra_state when it's a tensor (#12132) by @terrykong :: PR: #12335
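
Related to the "vLLM export for NeMo 2.0" highlight, the flow is roughly: convert a NeMo 2.0 checkpoint into a vLLM-servable model directory, then run generation against it. A minimal sketch follows, assuming a `vLLMExporter` in `nemo.export` with `export()`/`forward()` methods analogous to the TensorRT-LLM exporter; paths, model type, and keyword names here are assumptions.

```python
# Hedged sketch of vLLM export for a NeMo 2.0 checkpoint. Keyword names and the
# exporter's module path are assumptions used to illustrate the flow, not a contract.
from nemo.export.vllm_exporter import vLLMExporter

exporter = vLLMExporter()
exporter.export(
    nemo_checkpoint="/checkpoints/llama3_8b_nemo2",  # placeholder NeMo 2.0 checkpoint dir
    model_dir="/tmp/vllm_export",                    # scratch dir for converted weights
    model_type="llama",
    tensor_parallel_size=1,
)

# Smoke-test the exported model with a single prompt.
print(exporter.forward(["Why is the sky blue?"], max_output_len=64))
```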
Bugfixes
Changelog
Uncategorized:
Changelog
- Allow using vocab size from config by @shanmugamr1992 :: PR: #11718
- Fix baseline recipes by @erhoo82 :: PR: #11725
- Update changelog for r2.1.0 by @github-actions[bot] :: PR: #11745
- ci: Fix changelog generator by @ko3n1g :: PR: #11744
- Fix 'http_port' parameter name in DeployPyTriton usages and update .qnemo compress=True path by @janekl :: PR: #11747
- Conversion NeMo and HF checkpoint script for T5 by @huvunvidia :: PR: #11739
- Add BERT Embedding Models by @suiyoubi :: PR: #11737
- Add server ready check before starting evaluation by @athitten :: PR: #11731
- only install bitsandbytes on x86 by @akoumpa :: PR: #11781
- [Bugfix] Skip processing if extra_state loads as None by @janekl :: PR: #11778
- chore(beep boop 🤖): Bump MCORE_TAG=4dc8977... (2025-01-07) by @ko3n1g :: PR: #11768
- make progress printer compatible with PTL v2.5.0 by @ashors1 :: PR: #11779
- Fix Mistral Conversion Issue by @suiyoubi :: PR: #11786
- build: Fix build-arg by @ko3n1g :: PR: #11815
- Lora ckpt in HF format for NeMo AutoModel by @oyilmaz-nvidia :: PR: #11712
- 8x22b seq len by @malay-nagda :: PR: #11788
- Bugfix for output_generation_logits in tensorrtllm by @athitten :: PR: #11820
- handle mistralai/Mistral-7B-Instruct-v0.3 tokenizer correctly by @akoumpa :: PR: #11839
- remove tensorstore pin in requirements*.txt by @pstjohn :: PR: #11777
- Do not load context for model transform in llm inference by @hemildesai :: PR: #11751
- update nemo2sftpeft tutorial container verison by @HuiyingLi :: PR: #11832
- Latest News updated for Cosmos by @lbliii :: PR: #11806
- Removes tensorstore 0.1.45 pin from requirements_deploy.txt by @pstjohn :: PR: #11858
- ci: Prune dangling images by @ko3n1g :: PR: #11885
- Disable tests that download datasets from web by @akoumpa :: PR: #11878
- Add context_logits for eval accuracy calculation in case of multi token prediction tasks by @athitten :: PR: #11753
- add dataset_root to SpecterDataModule by @suiyoubi :: PR: #11837
- Support both Path and str for APIs by @maanug-nv :: PR: #11865
- Run nsys callback on GBS not on MBS by @akoumpa :: PR: #11861
- ci: Set bump-branch to weekly by @ko3n1g :: PR: #11889
- chore: Update mcore-tag-bump-bot.yml by @ko3n1g :: PR: #11891
- ci: Bump Mcore in weekly PR by @ko3n1g :: PR: #11897
- check restore_config first by @akoumpa :: PR: #11890
- LinearAdapter: propagate args to _init_adapter by @akoumpa :: PR: #11902
- NeMo 2.0 fp8 conversion by @Laplasjan107 :: PR: #11845
- nemo ux expert tensor parallel by @akoumpa :: PR: #11903
- Add CP support to Neva in NeMo2 by @yaoyu-33 :: PR: #11850
- build: Move dependencies by @ko3n1g :: PR: #11790
- Add Flux and Flux Controlnet Support to Diffusion folder by @Victor49152 :: PR: #11794
- ci: Adjust bump mcore workflow by @ko3n1g :: PR: #11918
- ci: Small fix to bump workflow by @ko3n1g :: PR: #11919
- Revert #11890 and add a test that would have caught the error by @cuichenx :: PR: #11914
- ci: Adjust input argument by @ko3n1g :: PR: #11921
- Create test_phi3.py by @mayani-nv :: PR: #11843
- Enable NeMo importer and loading dist CKPT for training by @Victor49152 :: PR: #11927
- build: Pin triton by @ko3n1g :: PR: #11938
- Add sharding for speechlm and vlm by @BoxiangW :: PR: #11876
- Update torch load for load from disk by @thomasdhc :: PR: #11963
- Add options to add mp_policy and parallel_fn for NeMo automodel fsdp2 by @BoxiangW :: PR: #11956
- ci: Add coverage reports by @ko3n1g :: PR: #11912
- Add batching support for evaluation by @athitten :: PR: #11934
- add use_fast option by @akoumpa :: PR: #11976
- improve error and debug messages in model connector by @cuichenx :: PR: #11979
- [checkpoint][docs] Fix typos in dist checkpointing docs by @ananthsub :: PR: #1...
NVIDIA Neural Modules 2.2.0rc3
Prerelease: NVIDIA Neural Modules 2.2.0rc3 (2025-02-25)
NVIDIA Neural Modules 2.2.0rc2
Prerelease: NVIDIA Neural Modules 2.2.0rc2 (2025-02-17)
NVIDIA Neural Modules 2.2.0rc1
Prerelease: NVIDIA Neural Modules 2.2.0rc1 (2025-02-04)
NVIDIA Neural Modules 2.2.0rc0
Prerelease: NVIDIA Neural Modules 2.2.0rc0 (2025-02-02)
NVIDIA Neural Modules 2.1.0
Highlights
- Training
  - Fault Tolerance
    - Straggler Detection
    - Auto Relaunch
- LLM & MM
  - MM models
    - Llava-next
    - Llama 3.2
    - Sequence Model Parallel for NeVa
    - Enable Energon
    - SigLIP (NeMo 1.0 only)
  - LLM 2.0 migration
    - Starcoder2
    - Gemma 2
    - T5
    - Baichuan
    - BERT
    - Mamba
    - ChatGLM
  - DoRA support
    - MM models
- Export
  - NeMo 2.0 base model export path for NIM
  - PTQ in NeMo 2.0
- ASR
  - Timestamps with TDT decoder
  - Timestamps option with .transcribe() (see the sketch after this list)
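
As a companion to the timestamp highlights above, here is a minimal sketch of the `.transcribe()` option. The model name is only an example, and the exact shape of the returned hypotheses (attribute names, timestamp keys) is an assumption that may vary by decoder type.

```python
# Hedged sketch of transcription with timestamps (NeMo 2.1 ASR highlight).
from nemo.collections.asr.models import ASRModel

asr_model = ASRModel.from_pretrained(model_name="nvidia/parakeet-tdt-1.1b")  # example model

# timestamps=True requests word/segment timing alongside the transcript.
hypotheses = asr_model.transcribe(["audio_sample.wav"], timestamps=True)

print(hypotheses[0].text)
# Assumption: word-level offsets are exposed on the hypothesis, e.g. under a
# timestamp mapping keyed by "word"; check your NeMo version for the exact field.
print(hypotheses[0].timestamp["word"])
```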
Detailed Changelogs:
ASR
Changelog
- [Fix] Fixed sampler override and audio_key in prepare_audio_data by @anteju :: PR: #10980
- Akoumparouli/mixtral recipe fix r2.0.0 by @akoumpa :: PR: #10994
- TDT compute timestamps option and Extra Whitespace handling for SPE by @monica-sekoyan :: PR: #10875
- ci: Switch to CPU only runner by @ko3n1g :: PR: #11035
- Fix timestamps tests by @monica-sekoyan :: PR: #11053
- ci: Pin release freeze by @ko3n1g :: PR: #11143
- Fix RNN-T loss memory usage by @artbataev :: PR: #11144
- Added deprecation notice by @Ssofja :: PR: #11133
- Fixes for Canary adapters tutorial by @pzelasko :: PR: #11184
- add ipython import guard by @nithinraok :: PR: #11191
- Self Supervised Pre-Training tutorial Fix by @monica-sekoyan :: PR: #11206
- update the return type by @nithinraok :: PR: #11210
- Timestamps to transcribe by @nithinraok :: PR: #10950
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- Beam search algorithm implementation for TDT models by @lilithgrigoryan :: PR: #10903
- Update import 'pytorch_lightning' -> 'lightning.pytorch' by @maanug-nv :: PR: #11252
- Remove pytorch-lightning by @maanug-nv :: PR: #11306
- update hypothesis when passed through cfg by @nithinraok :: PR: #11366
- Revert "update hypothesis when passed through cfg" by @pablo-garay :: PR: #11373
- Fix transcribe speech by @nithinraok :: PR: #11379
- Lhotse support for transcribe_speech_parallel by @nune-tadevosyan :: PR: #11249
- Sortformer Diarizer 4spk v1 model PR Part 1: models, modules and dataloaders by @tango4j :: PR: #11282
- Removing unnecessary lines by @nune-tadevosyan :: PR: #11408
- Support for initializing lhotse shar dataloader via field: list[path] mapping by @pzelasko :: PR: #11460
- New extended prompt format for Canary, short utterances inference fix, and training micro-optimizations by @pzelasko :: PR: #11058
- Fixing Multi_Task_Adapters.ipynb by replacing canary2 with canary_custom by @weiqingw4ng :: PR: #11636
TTS
Changelog
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- Add T5TTS by @blisc :: PR: #11193
- Update import 'pytorch_lightning' -> 'lightning.pytorch' by @maanug-nv :: PR: #11252
- Remove pytorch-lightning by @maanug-nv :: PR: #11306
- Add nvidia/low-frame-rate-speech-codec-22khz model on docs by @Edresson :: PR: #11457
NLP / NMT
Changelog
- Move collectiob.nlp imports inline for t5 by @marcromeyn :: PR: #10877
- Use a context-manager when opening files by @akoumpa :: PR: #10895
- Packed sequence bug fixes by @cuichenx :: PR: #10898
- ckpt convert bug fixes by @dimapihtar :: PR: #10878
- remove deprecated ci tests by @dimapihtar :: PR: #10922
- Update T5 tokenizer (adding additional tokens to tokenizer config) by @huvunvidia :: PR: #10972
- Add support and recipes for HF models via AutoModelForCausalLM by @akoumpa :: PR: #10962
- gpt3 175b cli by @malay-nagda :: PR: #10985
- Fix for crash with LoRA + tp_overlap_comm=false + sequence_parallel=true by @vysarge :: PR: #10920
- Update BaseMegatronSampler for compatibility with PTL's _BatchProgress by @ashors1 :: PR: #11016
- add deprecation note by @dimapihtar :: PR: #11024
- Update ModelOpt Width Pruning example defaults by @kevalmorabia97 :: PR: #10902
- switch to NeMo 2.0 recipes by @dimapihtar :: PR: #10948
- NeMo 1.0: upcycle dense to moe by @akoumpa :: PR: #11002
- Gemma2 in Nemo2 with Recipes by @suiyoubi :: PR: #11037
- Add Packed Seq option to GPT based models by @suiyoubi :: PR: #11100
- Fix MCoreGPTModel import in llm.gpt.model.base by @hemildesai :: PR: #11109
- TP+MoE peft fix by @akoumpa :: PR: #11114
- GPT recipes to use full te spec by @JimmyZhang12 :: PR: #11119
- Virtual pipeline parallel support for LoRA in NLPAdapterModelMixin by @vysarge :: PR: #11128
- update nemo args for mcore flash decode arg change by @HuiyingLi :: PR: #11138
- Call ckpt_to_weights_subdir from MegatronCheckpointIO by @ashors1 :: PR: #10897
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- fix(export): GPT models w/ bias=False convert properly by @terrykong :: PR: #11255
- Use MegatronDataSampler in HfDatasetDataModule by @akoumpa :: PR: #11274
- Add T5TTS by @blisc :: PR: #11193
- ci: Exclude CPU machines from scan by @ko3n1g :: PR: #11300
- Revert "fix(export): GPT models w/ bias=False convert properly" by @terrykong :: PR: #11301
- remove redundant docs by @sharathts :: PR: #11302
- Update import 'pytorch_lightning' -> 'lightning.pytorch' by @maanug-nv :: PR: #11252
- Add attention_bias argument in transformer block and transformer layer modules, addressing change in MCore by @yaoyu-33 :: PR: #11289
- Remove pytorch-lightning by @maanug-nv :: PR: #11306
- Update T5 attention-mask shapes to be compatible with all attention-backend in new TE versions by @huvunvidia :: PR: #11059
- Add support for restoring from 2.0 checkpoint in 1.0 by @hemildesai :: PR: #11347
- Fix Gemma2 Attention Args by @suiyoubi :: PR: #11365
- mlm conversion & tiktokenizer support by @dimapihtar :: PR: #11349
- [Nemo1] Generate sharded optimizer state dicts only if needed for saving by @ananthsub :: PR: #11451
- add hindi tn/itn coverage by @mgrafu :: PR: #11382
- chore(beep boop 🤖): Bump MCORE_TAG=67a50f2... (2024-11-28) by @ko3n1g :: PR: #11427
- Handle exception when importing RetroGPTChunkDatasets by @guyueh1 :: PR: #11415
- Update restore from config for gpt type continual training in NeMo1 by @yaoyu-33 :: PR: #11471
- ci: Re-enable L2_Megatron_LM_To_NeMo_Conversion by @ko3n1g :: PR: #11484
- Apply packed sequence params change for fused rope compatibility by @ananthsub :: PR: #11506
- Huvu/tiktoken tokenizer update by @huvunvidia :: PR: #11494
Text Normalization / Inverse Text Normalization
Changelog
- Adding support for LightningDataModule inside Fabric-API by @marcromeyn :: PR: #10879
- Add registry to register all needed classes with artifacts in nemo.lightning.io by @hemildesai :: PR: #10861
- Update import 'pytorch_lightning' -> 'lightning.pytorch' by @maanug-nv :: PR: #11252
- Remove pytorch-lightning by @maanug-nv :: PR: #11306
- add hindi tn/itn coverage by @mgrafu :: PR: #11382
Export
Changelog
- Update engine build step for TRT-LLM 0.13.0 by @janekl :: PR: #10880
- Nemo 2.0 ckpt support in TRT-LLM export by @oyilmaz-nvidia :: PR: #10891
- Fix TRTLLM parallel_embedding by @meatybobby :: PR: #10975
- Export & deploy updates (part I) by @janekl :: PR: #10941
- Add doc-strings to import & export + improve logging by @marcromeyn :: PR: #11078
- NeMo-UX: fix nemo-ux export path by @akoumpa :: PR: #11081
- Fix TRTLLM nemo2 activation parsing by @meatybobby :: PR: #11062
- Support exporting Nemotron-340B for TensorRT-LLM by @jinyangyuan-nvidia :: PR: #11015
- vLLM Hugging Face exporter by @oyilmaz-nvidia :: PR: #11124
- Fix export of configuration parameters to Weights and Biases by @soluwalana :: PR: #10995
- Change activation parsing in TRTLLM by @meatybobby :: PR: #11173
- Remove builder_opt param from trtllm-build for TensorRT-LLM >= 0.14.0 by @janekl :: PR: #11259
- fix(export): GPT models w/ bias=False convert properly by @terrykong :: PR: #11255
- fix(export): update API for disabling device reassignment in TRTLLM for Aligner by @terrykong :: PR: #10863
- Add openai-gelu in gated activation for TRTLLM export by @meatybobby :: PR: #11293
- Revert "fix(export): GPT models w/ bias=False convert properly" by @terrykong :: PR: #11301
- Adding alinger export by @shanmugamr1992 :: PR: #11269
- Export & deploy updates (part II) by @janekl :: PR: #11344
- Introducing TensorRT lazy export and caching option with trt_compile() by @borisfom :: PR: #11266
- fix: export converts properly if no model_prefix by @terrykong :: PR: #11477
Bugfixes
Changelog
- Change default ckpt name by @maanug-nv :: PR: #11277
- Fix patching of NeMo tokenizers for correct Lambada evaluation by @janekl :: PR: #11326
Uncategorized:
Changelog
- ci: Use Slack group by @ko3n1g :: PR: #10866
- Bump Dockerfile.ci (2024-10-14) by @ko3n1g :: PR: #10871
- Fix peft resume by @cuichenx :: PR: #10887
- call post_init after altering config values by @akoumpa :: PR: #10885
- Late import prettytable by @maanug-nv :: PR: #10912
- Bump Dockerfile.ci (2024-10-17) by @ko3n1g :: PR: #10919
- Warning for missing FP8 checkpoint support for vLLM deployment by @janekl :: PR: #10906
- Fix artifact saving by @hemildesai :: PR: #10914
- Lora improvement by @cuichenx :: PR: #10918
- Huvu/t5 nemo2.0 peft by @huvunvidia :: PR: #10916
- perf recipes and Mcore DistOpt params by @malay-nagda :: PR: #10883
- ci: Fix cherry pick team by @ko3n1g :: PR: #10945
- Fix requirements for MacOS by @artbataev :: PR: #10930
- Fix nemo 2.0 recipes by @BoxiangW :: PR: #10915
- Akoumparouli/nemo ux fix dir or string artifact by @akoumpa :: PR: #10936
- Fix typo in docstring by @ashors1 :: PR: #10955
- [Nemo CICD] Remove deprecated tests by @pablo-garay :: PR: #10960
- Restore NeMo 2.0 T5 pretraining CICD test by @huvunvidia :: PR: #10952
- Convert perf plugin env vars to strings by @hemildesai :: PR: #10947
- disable ...
NVIDIA Neural Modules 2.1.0rc2
Prerelease: NVIDIA Neural Modules 2.1.0rc2 (2024-12-21)
NVIDIA Neural Modules 2.1.0rc1
Prerelease: NVIDIA Neural Modules 2.1.0rc1 (2024-12-20)
NVIDIA Neural Modules 2.1.0rc0
[🤠]: Howdy folks, let's release NeMo `r2.1.0`! (#11556)
NVIDIA Neural Modules 2.0.0
Highlights
Large language models & Multimodal
- Training
  - Long context recipe
  - PyTorch Native FSDP 1
- Models
  - Llama 3
  - Mixtral
  - Nemotron
- NeMo 1.0
Export
- TensorRT-LLM v0.12 integration (see the sketch after this highlights section)
- LoRA support for vLLM
- FP8 checkpoint
ASR
- Parakeet large (ASR with punctuation and capitalization)
- Added Uzbek offline and Georgian streaming models
- Optimized bucketing to improve batch-size utilization on GPUs
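
The TensorRT-LLM integration called out under Export above is typically driven through `nemo.export.tensorrt_llm`. A minimal sketch follows, assuming the `TensorRTLLM` exporter with `export()` and `forward()`; the checkpoint path, model type, and GPU count are placeholders, and keyword names should be verified against the installed version.

```python
# Hedged sketch of a TensorRT-LLM export + generation round trip.
from nemo.export.tensorrt_llm import TensorRTLLM

trt_llm_exporter = TensorRTLLM(model_dir="/tmp/trtllm_engine")  # engines are written here
trt_llm_exporter.export(
    nemo_checkpoint_path="/checkpoints/llama3-8b.nemo",  # placeholder checkpoint
    model_type="llama",
    n_gpus=1,
)

# Generate against the freshly built engine.
print(trt_llm_exporter.forward(["Tell me about NVIDIA NeMo."], max_output_len=64))
```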
Detailed Changelogs
ASR
Changelog
- add parakeet-tdt_ctc-110m model by @nithinraok :: PR: #10461
- fix asr finetune by @stevehuang52 :: PR: #10508
- replace unbiased with correction by @nithinraok :: PR: #10555
- Update Multi_Task_Adapters.ipynb by @pzelasko :: PR: #10600
- Fix asr warnings by @nithinraok :: PR: #10469
- Fix typo in ASR RNNT BPE model by @pzelasko :: PR: #10742
- TestEncDecMultiTaskModel for canary parallel by @karpnv :: PR: #10740
- fix chunked infer by @stevehuang52 :: PR: #10581
- training code for hybrid-autoregressive inference model by @hainan-xv :: PR: #10841
- remove stacking operation from batched functions by @lilithgrigoryan :: PR: #10524
- Add lhotse fixes for rnnt model training and WER hanging issue with f… by @nithinraok :: PR: #10821
- Fix ASR tests by @artbataev :: PR: #10794
- [Fix] Fixed sampler override and audio_key in prepare_audio_data by @anteju :: PR: #10980
- [WIP] Add docs for NEST SSL by @stevehuang52 :: PR: #10804
- Akoumparouli/mixtral recipe fix r2.0.0 by @akoumpa :: PR: #10994
- TDT compute timestamps option and Extra Whitespace handling for SPE by @monica-sekoyan :: PR: #10875
- ci: Switch to CPU only runner by @ko3n1g :: PR: #11035
- Fix timestamps tests by @monica-sekoyan :: PR: #11053
- ci: Pin release freeze by @ko3n1g :: PR: #11143
- Fix RNN-T loss memory usage by @artbataev :: PR: #11144
- Added deprecation notice by @Ssofja :: PR: #11133
- Fixes for Canary adapters tutorial by @pzelasko :: PR: #11184
- add ipython import guard by @nithinraok :: PR: #11191
- Self Supervised Pre-Training tutorial Fix by @monica-sekoyan :: PR: #11206
- update the return type by @nithinraok :: PR: #11210
- Timestamps to transcribe by @nithinraok :: PR: #10950
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- Beam search algorithm implementation for TDT models by @lilithgrigoryan :: PR: #10903
TTS
Changelog
- Fix asr warnings by @nithinraok :: PR: #10469
- Make nemo text processing optional in TTS by @blisc :: PR: #10584
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
NLP / NMT
Changelog
- MCORE interface for TP-only FP8 AMAX reduction by @erhoo82 :: PR: #10437
- Remove Apex dependency if not using MixedFusedLayerNorm by @cuichenx :: PR: #10468
- Add missing import guards for causal_conv1d and mamba_ssm dependencies by @janekl :: PR: #10429
- Update doc for fp8 trt-llm export by @Laplasjan107 :: PR: #10444
- Remove running validating after finetuning by @huvunvidia :: PR: #10560
- Extending modelopt spec for TEDotProductAttention by @janekl :: PR: #10523
- Fix mb_calculator import in lora tutorial by @BoxiangW :: PR: #10624
- .nemo conversion bug fix by @dimapihtar :: PR: #10598
- Require setuptools>=70 and update deprecated api by @thomasdhc :: PR: #10659
- Akoumparouli/fix get tokenizer list by @akoumpa :: PR: #10596
- [McoreDistOptim] fix the naming to match apex.dist by @gdengk :: PR: #10707
- [fix] Ensures disabling exp_manager with exp_manager=null does not error by @terrykong :: PR: #10651
- [feat] Update get_model_parallel_src_rank to support tp-pp-dp ordering by @terrykong :: PR: #10652
- feat: Migrate GPTSession refit path in Nemo export to ModelRunner for Aligner by @terrykong :: PR: #10654
- [MCoreDistOptim] Add assertions for McoreDistOptim and fix fp8 arg specs by @gdengk :: PR: #10748
- Fix for crashes with tensorboard_logger=false and VP + LoRA by @vysarge :: PR: #10792
- Adding init_model_parallel to FabricMegatronStrategy by @marcromeyn :: PR: #10733
- Moving steps to MegatronParallel to improve UX for Fabric by @marcromeyn :: PR: #10732
- Adding setup_megatron_optimizer to FabricMegatronStrategy by @marcromeyn :: PR: #10833
- Make FabricMegatronMixedPrecision match MegatronMixedPrecision by @marcromeyn :: PR: #10835
- Fix VPP bug in MegatronStep by @marcromeyn :: PR: #10847
- Expose drop_last in MegatronDataSampler by @farhadrgh :: PR: #10837
- Move collectiob.nlp imports inline for t5 by @marcromeyn :: PR: #10877
- Use a context-manager when opening files by @akoumpa :: PR: #10895
- ckpt convert bug fixes by @dimapihtar :: PR: #10878
- remove deprecated ci tests by @dimapihtar :: PR: #10922
- Update T5 tokenizer (adding additional tokens to tokenizer config) by @huvunvidia :: PR: #10972
- Add support and recipes for HF models via AutoModelForCausalLM by @akoumpa :: PR: #10962
- gpt3 175b cli by @malay-nagda :: PR: #10985
- Fix for crash with LoRA + tp_overlap_comm=false + sequence_parallel=true by @vysarge :: PR: #10920
- Update BaseMegatronSampler for compatibility with PTL's _BatchProgress by @ashors1 :: PR: #11016
- add deprecation note by @dimapihtar :: PR: #11024
- Update ModelOpt Width Pruning example defaults by @kevalmorabia97 :: PR: #10902
- switch to NeMo 2.0 recipes by @dimapihtar :: PR: #10948
- NeMo 1.0: upcycle dense to moe by @akoumpa :: PR: #11002
- Update mcore parallelism initialization in nemo2 by @yaoyu-33 :: PR: #10643
- Gemma2 in Nemo2 with Recipes by @suiyoubi :: PR: #11037
- Add Packed Seq option to GPT based models by @suiyoubi :: PR: #11100
- Fix MCoreGPTModel import in llm.gpt.model.base by @hemildesai :: PR: #11109
- TP+MoE peft fix by @akoumpa :: PR: #11114
- GPT recipes to use full te spec by @JimmyZhang12 :: PR: #11119
- Virtual pipeline parallel support for LoRA in NLPAdapterModelMixin by @vysarge :: PR: #11128
- update nemo args for mcore flash decode arg change by @HuiyingLi :: PR: #11138
- Call ckpt_to_weights_subdir from MegatronCheckpointIO by @ashors1 :: PR: #10897
- fix typo by @dimapihtar :: PR: #11234
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- fix(export): GPT models w/ bias=False convert properly by @terrykong :: PR: #11255