Releases: NVIDIA/NeMo
NVIDIA Neural Modules 2.2.0
Highlights
- Training
  - Blackwell and Grace Blackwell support
  - Pipeline parallel support for distillation
  - Improved NeMo Framework installation
- Export & Deploy
  - vLLM export for NeMo 2.0 (a usage sketch follows the Export changelog below)
- Evaluations
  - Integrate lm-eval-harness
- Collections
  - LLM
    - DAPT example and best practices in NeMo 2.0
    - [NeMo 2.0] Enable Tool Learning and add a tutorial
    - Support GPT Embedding Model (Llama 3.2 1B/3B)
    - Qwen2.5, Phi-4 (via AutoModel)
    - SFT for Llama 3.3 model (via AutoModel)
    - Support BERT Embedding Model with NeMo 2.0
    - DeepSeek SFT & PEFT support
  - MultiModal
    - CLIP
    - Sequence parallelism (SP) for NeVA
    - Context parallelism (CP) for NeVA
    - InternViT
  - LLM
    - AutoModel
      - Preview release.
      - PEFT and SFT support for LLMs available via Hugging Face's AutoModelForCausalLM (see the sketch after this list).
      - Support for Hugging Face-native checkpoints (full model and adapter only).
      - Support for distributed training via DDP and FSDP2.
  - ASR/TTS
    - Lhotse: TPS-free 2D bucket estimation and filtering
    - Standardized ASR model outputs so all models return a consistent format
    - Sortformer diarization model release
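
The AutoModel preview path listed above lets a Hugging Face causal LM be fine-tuned through NeMo's recipe-style API. Below is a minimal sketch, assuming the `llm.HFAutoModelForCausalLM` wrapper, the `llm.finetune` entry point, and `llm.peft.LoRA` as exposed in NeMo 2.x; the model name, data module, and trainer settings are placeholder assumptions, not prescribed values.

```python
# Hedged sketch of AutoModel PEFT fine-tuning (NeMo 2.2 preview). Exact class and
# keyword names should be verified against the installed NeMo version.
import nemo.lightning as nl
from nemo.collections import llm

# Wrap a Hugging Face checkpoint; the model name below is only an example.
model = llm.HFAutoModelForCausalLM(model_name="meta-llama/Llama-3.2-1B")

trainer = nl.Trainer(
    devices=2,
    accelerator="gpu",
    strategy="ddp",   # the release also lists FSDP2 as supported for this path
    max_steps=100,
)

llm.finetune(
    model=model,
    data=llm.SquadDataModule(seq_length=512, micro_batch_size=1),  # placeholder datamodule
    trainer=trainer,
    peft=llm.peft.LoRA(),  # omit for full-parameter SFT
)
```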
Detailed Changelogs:
ASR
Changelog
- removed the line which caused a problem in nfa_tutorial by @Ssofja :: PR: #11710
- TPS-free 2D bucket estimation and filtering by @pzelasko :: PR: #11738
- Update transcribe_utils.py by @stevehuang52 :: PR: #11984
- Sortformer Diarizer 4spk v1 model PR Part 4: Sortformer Documents and Notebook Tutorials by @tango4j :: PR: #11707
- fix the issue during batched inference of Sortformer diarizer by @tango4j :: PR: #12047
- changed asr models outputs to be consistent by @Ssofja :: PR: #11818
- chore: Update notebooks by @ko3n1g :: PR: #12161
- add ctc segmentation by @ko3n1g :: PR: #12312
- clean up VAD tutorial by @stevehuang52 :: PR: #12410
- copy from main by @nithinraok :: PR: #12423
- ci: Disable ASR tests for now (#12443) by @ko3n1g :: PR: #12466
- ASR_CTC_Language_Finetuning.ipynb bugfix by @lilithgrigoryan :: PR: #12538
TTS
Changelog
NLP / NMT
Changelog
- Use explicit imports from megatronllm_deployable.py by @janekl :: PR: #11705
- Bug fix minor bug in TRT-LLM deployment by @oyilmaz-nvidia :: PR: #11714
- gpt moe perf scripts by @malay-nagda :: PR: #11760
- Bump mcore by @ko3n1g :: PR: #11740
- Enable packed seqs for validation by @jiemingz :: PR: #11748
- Revert Mcore update since it caused regression by @pablo-garay :: PR: #11791
- Fix Gemma2 Attention Init Args by @suiyoubi :: PR: #11792
- Add null tokenizer by @erhoo82 :: PR: #11789
- Fix DistCP inference issue by @suiyoubi :: PR: #11801
- Add BERT Embedding Models E5 Recipe by @suiyoubi :: PR: #11787
- Add rope scaling configs for NeMo 1 by @BoxiangW :: PR: #11807
- Fix calculating num_available_samples by @huvunvidia :: PR: #11830
- fix sentencepiece tokenizer special tokens by @akoumpa :: PR: #11811
- add chat sft dataset to support agent tool calling by @chenrui17 :: PR: #11759
- Revert "Revert Mcore update since it caused regression (#11791)" by @ko3n1g :: PR: #11799
- fix checkpoint load issue by @dimapihtar :: PR: #11859
- Fix nemo 1 packed sequence TE version error by @cuichenx :: PR: #11874
- enable loading older TE checkpoints by @dimapihtar :: PR: #11930
- ci: Use single runner machines for unit tests by @ko3n1g :: PR: #11937
- llm performance scripts by @malay-nagda :: PR: #11736
- [MoE] add expert tensor parallelism support for NeMo2.0 MoE by @gdengk :: PR: #11880
- add exception when loading ckpt saved by TE < 1.13 by @dimapihtar :: PR: #11988
- remove renormalize_blend_weights flag by @dimapihtar :: PR: #11975
- Llama3.2 1B Embedding Model Support by @suiyoubi :: PR: #11909
- Weekly bump by @ko3n1g :: PR: #11896
- Debug Apex distributed optimizer to handle Transformer Engine 2.0 by @timmoon10 :: PR: #12004
- throw MegatronOptimizerModule warning only with mcore models by @akoumpa :: PR: #12085
- fix nmt dataclass issue by @dimapihtar :: PR: #12081
- Propogate dp last changes from mcore by @ryantwolf :: PR: #12012
- Add error message when downloading failed. by @yuanzhedong :: PR: #12139
- interface for asymmetric pipeline schedule by @erhoo82 :: PR: #12039
- chore: Update notebooks by @ko3n1g :: PR: #12161
- Cherrypick #12382, #12415 and #12424 by @cuichenx :: PR: #12425
- ASR_CTC_Language_Finetuning.ipynb bugfix by @lilithgrigoryan :: PR: #12538
Text Normalization / Inverse Text Normalization
Changelog
NeMo Tools
Changelog
Export
Changelog
- Bug fix minor bug in TRT-LLM deployment by @oyilmaz-nvidia :: PR: #11714
- In-framework deployment NeMo 2.0 nemo_export.py test by @janekl :: PR: #11749
- Fix starcoder2 missing bias in nemo2 config for TRTLLM by @meatybobby :: PR: #11809
- Autodetect dtype on exporting to TensorRT-LLM by @janekl :: PR: #11907
- PTQ & TRT-LLM updates related to upcoming PyTorch 25.01 bump by @janekl :: PR: #11941
- Run Flake8 for nemo.export module by @janekl :: PR: #11728
- Skip initialization in hf export by @cuichenx :: PR: #12136
- update export io call by @akoumpa :: PR: #12144
- add default kwargs for trtllm model runner by @pablo-garay :: PR: #12248
- cherry-pick: fix[export]: reshard model correctly handles extra_state when it's a tensor (#12132) by @terrykong :: PR: #12335
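
Related to the "vLLM export for NeMo 2.0" highlight, the flow is roughly: convert a NeMo 2.0 checkpoint into a vLLM-servable model directory, then run generation against it. A minimal sketch follows, assuming a `vLLMExporter` in `nemo.export` with `export()`/`forward()` methods analogous to the TensorRT-LLM exporter; paths, model type, and keyword names here are assumptions.

```python
# Hedged sketch of vLLM export for a NeMo 2.0 checkpoint. Keyword names and the
# exporter's module path are assumptions used to illustrate the flow, not a contract.
from nemo.export.vllm_exporter import vLLMExporter

exporter = vLLMExporter()
exporter.export(
    nemo_checkpoint="/checkpoints/llama3_8b_nemo2",  # placeholder NeMo 2.0 checkpoint dir
    model_dir="/tmp/vllm_export",                    # scratch dir for converted weights
    model_type="llama",
    tensor_parallel_size=1,
)

# Smoke-test the exported model with a single prompt.
print(exporter.forward(["Why is the sky blue?"], max_output_len=64))
```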
Bugfixes
Changelog
Uncategorized:
Changelog
- Allow using vocab size from config by @shanmugamr1992 :: PR: #11718
- Fix baseline recipes by @erhoo82 :: PR: #11725
- Update changelog for r2.1.0 by @github-actions[bot] :: PR: #11745
- ci: Fix changelog generator by @ko3n1g :: PR: #11744
- Fix 'http_port' parameter name in DeployPyTriton usages and update .qnemo compress=True path by @janekl :: PR: #11747
- Conversion NeMo and HF checkpoint script for T5 by @huvunvidia :: PR: #11739
- Add BERT Embedding Models by @suiyoubi :: PR: #11737
- Add server ready check before starting evaluation by @athitten :: PR: #11731
- only install bitsandbytes on x86 by @akoumpa :: PR: #11781
- [Bugfix] Skip processing if extra_state loads as None by @janekl :: PR: #11778
- chore(beep boop 🤖): Bump MCORE_TAG=4dc8977... (2025-01-07) by @ko3n1g :: PR: #11768
- make progress printer compatible with PTL v2.5.0 by @ashors1 :: PR: #11779
- Fix Mistral Conversion Issue by @suiyoubi :: PR: #11786
- build: Fix build-arg by @ko3n1g :: PR: #11815
- Lora ckpt in HF format for NeMo AutoModel by @oyilmaz-nvidia :: PR: #11712
- 8x22b seq len by @malay-nagda :: PR: #11788
- Bugfix for output_generation_logits in tensorrtllm by @athitten :: PR: #11820
- handle mistralai/Mistral-7B-Instruct-v0.3 tokenizer correctly by @akoumpa :: PR: #11839
- remove tensorstore pin in requirements*.txt by @pstjohn :: PR: #11777
- Do not load context for model transform in llm inference by @hemildesai :: PR: #11751
- update nemo2sftpeft tutorial container verison by @HuiyingLi :: PR: #11832
- Latest News updated for Cosmos by @lbliii :: PR: #11806
- Removes tensorstore 0.1.45 pin from requirements_deploy.txt by @pstjohn :: PR: #11858
- ci: Prune dangling images by @ko3n1g :: PR: #11885
- Disable tests that download datasets from web by @akoumpa :: PR: #11878
- Add context_logits for eval accuracy calculation in case of multi token prediction tasks by @athitten :: PR: #11753
- add dataset_root to SpecterDataModule by @suiyoubi :: PR: #11837
- Support both Path and str for APIs by @maanug-nv :: PR: #11865
- Run nsys callback on GBS not on MBS by @akoumpa :: PR: #11861
- ci: Set bump-branch to weekly by @ko3n1g :: PR: #11889
- chore: Update mcore-tag-bump-bot.yml by @ko3n1g :: PR: #11891
- ci: Bump Mcore in weekly PR by @ko3n1g :: PR: #11897
- check restore_config first by @akoumpa :: PR: #11890
- LinearAdapter: propagate args to _init_adapter by @akoumpa :: PR: #11902
- NeMo 2.0 fp8 conversion by @Laplasjan107 :: PR: #11845
- nemo ux expert tensor parallel by @akoumpa :: PR: #11903
- Add CP support to Neva in NeMo2 by @yaoyu-33 :: PR: #11850
- build: Move dependencies by @ko3n1g :: PR: #11790
- Add Flux and Flux Controlnet Support to Diffusion folder by @Victor49152 :: PR: #11794
- ci: Adjust bump mcore workflow by @ko3n1g :: PR: #11918
- ci: Small fix to bump workflow by @ko3n1g :: PR: #11919
- Revert #11890 and add a test that would have caught the error by @cuichenx :: PR: #11914
- ci: Adjust input argument by @ko3n1g :: PR: #11921
- Create test_phi3.py by @mayani-nv :: PR: #11843
- Enable NeMo importer and loading dist CKPT for training by @Victor49152 :: PR: #11927
- build: Pin triton by @ko3n1g :: PR: #11938
- Add sharding for speechlm and vlm by @BoxiangW :: PR: #11876
- Update torch load for load from disk by @thomasdhc :: PR: #11963
- Add options to add mp_policy and parallel_fn for NeMo automodel fsdp2 by @BoxiangW :: PR: #11956
- ci: Add coverage reports by @ko3n1g :: PR: #11912
- Add batching support for evaluation by @athitten :: PR: #11934
- add use_fast option by @akoumpa :: PR: #11976
- improve error and debug messages in model connector by @cuichenx :: PR: #11979
- [checkpoint][docs] Fix typos in dist checkpointing docs by @ananthsub :: PR: #1...
NVIDIA Neural Modules 2.2.0rc3
Prerelease: NVIDIA Neural Modules 2.2.0rc3 (2025-02-25)
NVIDIA Neural Modules 2.2.0rc2
Prerelease: NVIDIA Neural Modules 2.2.0rc2 (2025-02-17)
NVIDIA Neural Modules 2.2.0rc1
Prerelease: NVIDIA Neural Modules 2.2.0rc1 (2025-02-04)
NVIDIA Neural Modules 2.2.0rc0
Prerelease: NVIDIA Neural Modules 2.2.0rc0 (2025-02-02)
NVIDIA Neural Modules 2.1.0
Highlights
- Training
  - Fault Tolerance
    - Straggler Detection
    - Auto Relaunch
- LLM & MM
  - MM models
    - Llava-next
    - Llama 3.2
    - Sequence Model Parallel for NeVa
    - Enable Energon
    - SigLIP (NeMo 1.0 only)
  - LLM 2.0 migration
    - Starcoder2
    - Gemma 2
    - T5
    - Baichuan
    - BERT
    - Mamba
    - ChatGLM
  - DoRA support
    - MM models
- Export
  - NeMo 2.0 base model export path for NIM
  - PTQ in NeMo 2.0
- ASR
  - Timestamps with TDT decoder
  - Timestamps option with .transcribe() (see the sketch after this list)
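
As a companion to the timestamp highlights above, here is a minimal sketch of the `.transcribe()` option. The model name is only an example, and the exact shape of the returned hypotheses (attribute names, timestamp keys) is an assumption that may vary by decoder type.

```python
# Hedged sketch of transcription with timestamps (NeMo 2.1 ASR highlight).
from nemo.collections.asr.models import ASRModel

asr_model = ASRModel.from_pretrained(model_name="nvidia/parakeet-tdt-1.1b")  # example model

# timestamps=True requests word/segment timing alongside the transcript.
hypotheses = asr_model.transcribe(["audio_sample.wav"], timestamps=True)

print(hypotheses[0].text)
# Assumption: word-level offsets are exposed on the hypothesis, e.g. under a
# timestamp mapping keyed by "word"; check your NeMo version for the exact field.
print(hypotheses[0].timestamp["word"])
```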
Detailed Changelogs:
ASR
Changelog
- [Fix] Fixed sampler override and audio_key in prepare_audio_data by @anteju :: PR: #10980
- Akoumparouli/mixtral recipe fix r2.0.0 by @akoumpa :: PR: #10994
- TDT compute timestamps option and Extra Whitespace handling for SPE by @monica-sekoyan :: PR: #10875
- ci: Switch to CPU only runner by @ko3n1g :: PR: #11035
- Fix timestamps tests by @monica-sekoyan :: PR: #11053
- ci: Pin release freeze by @ko3n1g :: PR: #11143
- Fix RNN-T loss memory usage by @artbataev :: PR: #11144
- Added deprecation notice by @Ssofja :: PR: #11133
- Fixes for Canary adapters tutorial by @pzelasko :: PR: #11184
- add ipython import guard by @nithinraok :: PR: #11191
- Self Supervised Pre-Training tutorial Fix by @monica-sekoyan :: PR: #11206
- update the return type by @nithinraok :: PR: #11210
- Timestamps to transcribe by @nithinraok :: PR: #10950
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- Beam search algorithm implementation for TDT models by @lilithgrigoryan :: PR: #10903
- Update import 'pytorch_lightning' -> 'lightning.pytorch' by @maanug-nv :: PR: #11252
- Remove pytorch-lightning by @maanug-nv :: PR: #11306
- update hypothesis when passed through cfg by @nithinraok :: PR: #11366
- Revert "update hypothesis when passed through cfg" by @pablo-garay :: PR: #11373
- Fix transcribe speech by @nithinraok :: PR: #11379
- Lhotse support for transcribe_speech_parallel by @nune-tadevosyan :: PR: #11249
- Sortformer Diarizer 4spk v1 model PR Part 1: models, modules and dataloaders by @tango4j :: PR: #11282
- Removing unnecessary lines by @nune-tadevosyan :: PR: #11408
- Support for initializing lhotse shar dataloader via field: list[path] mapping by @pzelasko :: PR: #11460
- New extended prompt format for Canary, short utterances inference fix, and training micro-optimizations by @pzelasko :: PR: #11058
- Fixing Multi_Task_Adapters.ipynb by replacing canary2 with canary_custom by @weiqingw4ng :: PR: #11636
TTS
Changelog
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- Add T5TTS by @blisc :: PR: #11193
- Update import 'pytorch_lightning' -> 'lightning.pytorch' by @maanug-nv :: PR: #11252
- Remove pytorch-lightning by @maanug-nv :: PR: #11306
- Add nvidia/low-frame-rate-speech-codec-22khz model on docs by @Edresson :: PR: #11457
NLP / NMT
Changelog
- Move collectiob.nlp imports inline for t5 by @marcromeyn :: PR: #10877
- Use a context-manager when opening files by @akoumpa :: PR: #10895
- Packed sequence bug fixes by @cuichenx :: PR: #10898
- ckpt convert bug fixes by @dimapihtar :: PR: #10878
- remove deprecated ci tests by @dimapihtar :: PR: #10922
- Update T5 tokenizer (adding additional tokens to tokenizer config) by @huvunvidia :: PR: #10972
- Add support and recipes for HF models via AutoModelForCausalLM by @akoumpa :: PR: #10962
- gpt3 175b cli by @malay-nagda :: PR: #10985
- Fix for crash with LoRA + tp_overlap_comm=false + sequence_parallel=true by @vysarge :: PR: #10920
- Update BaseMegatronSampler for compatibility with PTL's _BatchProgress by @ashors1 :: PR: #11016
- add deprecation note by @dimapihtar :: PR: #11024
- Update ModelOpt Width Pruning example defaults by @kevalmorabia97 :: PR: #10902
- switch to NeMo 2.0 recipes by @dimapihtar :: PR: #10948
- NeMo 1.0: upcycle dense to moe by @akoumpa :: PR: #11002
- Gemma2 in Nemo2 with Recipes by @suiyoubi :: PR: #11037
- Add Packed Seq option to GPT based models by @suiyoubi :: PR: #11100
- Fix MCoreGPTModel import in llm.gpt.model.base by @hemildesai :: PR: #11109
- TP+MoE peft fix by @akoumpa :: PR: #11114
- GPT recipes to use full te spec by @JimmyZhang12 :: PR: #11119
- Virtual pipeline parallel support for LoRA in NLPAdapterModelMixin by @vysarge :: PR: #11128
- update nemo args for mcore flash decode arg change by @HuiyingLi :: PR: #11138
- Call ckpt_to_weights_subdir from MegatronCheckpointIO by @ashors1 :: PR: #10897
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- fix(export): GPT models w/ bias=False convert properly by @terrykong :: PR: #11255
- Use MegatronDataSampler in HfDatasetDataModule by @akoumpa :: PR: #11274
- Add T5TTS by @blisc :: PR: #11193
- ci: Exclude CPU machines from scan by @ko3n1g :: PR: #11300
- Revert "fix(export): GPT models w/ bias=False convert properly" by @terrykong :: PR: #11301
- remove redundant docs by @sharathts :: PR: #11302
- Update import 'pytorch_lightning' -> 'lightning.pytorch' by @maanug-nv :: PR: #11252
- Add attention_bias argument in transformer block and transformer layer modules, addressing change in MCore by @yaoyu-33 :: PR: #11289
- Remove pytorch-lightning by @maanug-nv :: PR: #11306
- Update T5 attention-mask shapes to be compatible with all attention-backend in new TE versions by @huvunvidia :: PR: #11059
- Add support for restoring from 2.0 checkpoint in 1.0 by @hemildesai :: PR: #11347
- Fix Gemma2 Attention Args by @suiyoubi :: PR: #11365
- mlm conversion & tiktokenizer support by @dimapihtar :: PR: #11349
- [Nemo1] Generate sharded optimizer state dicts only if needed for saving by @ananthsub :: PR: #11451
- add hindi tn/itn coverage by @mgrafu :: PR: #11382
- chore(beep boop 🤖): Bump MCORE_TAG=67a50f2... (2024-11-28) by @ko3n1g :: PR: #11427
- Handle exception when importing RetroGPTChunkDatasets by @guyueh1 :: PR: #11415
- Update restore from config for gpt type continual training in NeMo1 by @yaoyu-33 :: PR: #11471
- ci: Re-enable L2_Megatron_LM_To_NeMo_Conversion by @ko3n1g :: PR: #11484
- Apply packed sequence params change for fused rope compatibility by @ananthsub :: PR: #11506
- Huvu/tiktoken tokenizer update by @huvunvidia :: PR: #11494
Text Normalization / Inverse Text Normalization
Changelog
- Adding support for LightningDataModule inside Fabric-API by @marcromeyn :: PR: #10879
- Add registry to register all needed classes with artifacts in nemo.lightning.io by @hemildesai :: PR: #10861
- Update import 'pytorch_lightning' -> 'lightning.pytorch' by @maanug-nv :: PR: #11252
- Remove pytorch-lightning by @maanug-nv :: PR: #11306
- add hindi tn/itn coverage by @mgrafu :: PR: #11382
Export
Changelog
- Update engine build step for TRT-LLM 0.13.0 by @janekl :: PR: #10880
- Nemo 2.0 ckpt support in TRT-LLM export by @oyilmaz-nvidia :: PR: #10891
- Fix TRTLLM parallel_embedding by @meatybobby :: PR: #10975
- Export & deploy updates (part I) by @janekl :: PR: #10941
- Add doc-strings to import & export + improve logging by @marcromeyn :: PR: #11078
- NeMo-UX: fix nemo-ux export path by @akoumpa :: PR: #11081
- Fix TRTLLM nemo2 activation parsing by @meatybobby :: PR: #11062
- Support exporting Nemotron-340B for TensorRT-LLM by @jinyangyuan-nvidia :: PR: #11015
- vLLM Hugging Face exporter by @oyilmaz-nvidia :: PR: #11124
- Fix export of configuration parameters to Weights and Biases by @soluwalana :: PR: #10995
- Change activation parsing in TRTLLM by @meatybobby :: PR: #11173
- Remove builder_opt param from trtllm-build for TensorRT-LLM >= 0.14.0 by @janekl :: PR: #11259
- fix(export): GPT models w/ bias=False convert properly by @terrykong :: PR: #11255
- fix(export): update API for disabling device reassignment in TRTLLM for Aligner by @terrykong :: PR: #10863
- Add openai-gelu in gated activation for TRTLLM export by @meatybobby :: PR: #11293
- Revert "fix(export): GPT models w/ bias=False convert properly" by @terrykong :: PR: #11301
- Adding alinger export by @shanmugamr1992 :: PR: #11269
- Export & deploy updates (part II) by @janekl :: PR: #11344
- Introducing TensorRT lazy export and caching option with trt_compile() by @borisfom :: PR: #11266
- fix: export converts properly if no model_prefix by @terrykong :: PR: #11477
Bugfixes
Changelog
- Change default ckpt name by @maanug-nv :: PR: #11277
- Fix patching of NeMo tokenizers for correct Lambada evaluation by @janekl :: PR: #11326
Uncategorized:
Changelog
- ci: Use Slack group by @ko3n1g :: PR: #10866
- Bump Dockerfile.ci (2024-10-14) by @ko3n1g :: PR: #10871
- Fix peft resume by @cuichenx :: PR: #10887
- call post_init after altering config values by @akoumpa :: PR: #10885
- Late import prettytable by @maanug-nv :: PR: #10912
- Bump Dockerfile.ci (2024-10-17) by @ko3n1g :: PR: #10919
- Warning for missing FP8 checkpoint support for vLLM deployment by @janekl :: PR: #10906
- Fix artifact saving by @hemildesai :: PR: #10914
- Lora improvement by @cuichenx :: PR: #10918
- Huvu/t5 nemo2.0 peft by @huvunvidia :: PR: #10916
- perf recipes and Mcore DistOpt params by @malay-nagda :: PR: #10883
- ci: Fix cherry pick team by @ko3n1g :: PR: #10945
- Fix requirements for MacOS by @artbataev :: PR: #10930
- Fix nemo 2.0 recipes by @BoxiangW :: PR: #10915
- Akoumparouli/nemo ux fix dir or string artifact by @akoumpa :: PR: #10936
- Fix typo in docstring by @ashors1 :: PR: #10955
- [Nemo CICD] Remove deprecated tests by @pablo-garay :: PR: #10960
- Restore NeMo 2.0 T5 pretraining CICD test by @huvunvidia :: PR: #10952
- Convert perf plugin env vars to strings by @hemildesai :: PR: #10947
- disable ...
NVIDIA Neural Modules 2.1.0rc2
Prerelease: NVIDIA Neural Modules 2.1.0rc2 (2024-12-21)
NVIDIA Neural Modules 2.1.0rc1
Prerelease: NVIDIA Neural Modules 2.1.0rc1 (2024-12-20)
NVIDIA Neural Modules 2.1.0rc0
[🤠]: Howdy folks, let's release NeMo `r2.1.0`! (#11556)
NVIDIA Neural Modules 2.0.0
Highlights
Large language models & Multimodal
- Training
  - Long context recipe
  - PyTorch Native FSDP 1
- Models
  - Llama 3
  - Mixtral
  - Nemotron
- NeMo 1.0
Export
- TensorRT-LLM v0.12 integration (see the sketch after this highlights section)
- LoRA support for vLLM
- FP8 checkpoint
ASR
- Parakeet large (ASR with punctuation and capitalization)
- Added Uzbek offline and Georgian streaming models
- Optimized bucketing to improve batch-size utilization on GPUs
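
The TensorRT-LLM integration called out under Export above is typically driven through `nemo.export.tensorrt_llm`. A minimal sketch follows, assuming the `TensorRTLLM` exporter with `export()` and `forward()`; the checkpoint path, model type, and GPU count are placeholders, and keyword names should be verified against the installed version.

```python
# Hedged sketch of a TensorRT-LLM export + generation round trip.
from nemo.export.tensorrt_llm import TensorRTLLM

trt_llm_exporter = TensorRTLLM(model_dir="/tmp/trtllm_engine")  # engines are written here
trt_llm_exporter.export(
    nemo_checkpoint_path="/checkpoints/llama3-8b.nemo",  # placeholder checkpoint
    model_type="llama",
    n_gpus=1,
)

# Generate against the freshly built engine.
print(trt_llm_exporter.forward(["Tell me about NVIDIA NeMo."], max_output_len=64))
```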
Detailed Changelogs
ASR
Changelog
- add parakeet-tdt_ctc-110m model by @nithinraok :: PR: #10461
- fix asr finetune by @stevehuang52 :: PR: #10508
- replace unbiased with correction by @nithinraok :: PR: #10555
- Update Multi_Task_Adapters.ipynb by @pzelasko :: PR: #10600
- Fix asr warnings by @nithinraok :: PR: #10469
- Fix typo in ASR RNNT BPE model by @pzelasko :: PR: #10742
- TestEncDecMultiTaskModel for canary parallel by @karpnv :: PR: #10740
- fix chunked infer by @stevehuang52 :: PR: #10581
- training code for hybrid-autoregressive inference model by @hainan-xv :: PR: #10841
- remove stacking operation from batched functions by @lilithgrigoryan :: PR: #10524
- Add lhotse fixes for rnnt model training and WER hanging issue with f… by @nithinraok :: PR: #10821
- Fix ASR tests by @artbataev :: PR: #10794
- [Fix] Fixed sampler override and audio_key in prepare_audio_data by @anteju :: PR: #10980
- [WIP] Add docs for NEST SSL by @stevehuang52 :: PR: #10804
- Akoumparouli/mixtral recipe fix r2.0.0 by @akoumpa :: PR: #10994
- TDT compute timestamps option and Extra Whitespace handling for SPE by @monica-sekoyan :: PR: #10875
- ci: Switch to CPU only runner by @ko3n1g :: PR: #11035
- Fix timestamps tests by @monica-sekoyan :: PR: #11053
- ci: Pin release freeze by @ko3n1g :: PR: #11143
- Fix RNN-T loss memory usage by @artbataev :: PR: #11144
- Added deprecation notice by @Ssofja :: PR: #11133
- Fixes for Canary adapters tutorial by @pzelasko :: PR: #11184
- add ipython import guard by @nithinraok :: PR: #11191
- Self Supervised Pre-Training tutorial Fix by @monica-sekoyan :: PR: #11206
- update the return type by @nithinraok :: PR: #11210
- Timestamps to transcribe by @nithinraok :: PR: #10950
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- Beam search algorithm implementation for TDT models by @lilithgrigoryan :: PR: #10903
TTS
Changelog
- Fix asr warnings by @nithinraok :: PR: #10469
- Make nemo text processing optional in TTS by @blisc :: PR: #10584
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
NLP / NMT
Changelog
- MCORE interface for TP-only FP8 AMAX reduction by @erhoo82 :: PR: #10437
- Remove Apex dependency if not using MixedFusedLayerNorm by @cuichenx :: PR: #10468
- Add missing import guards for causal_conv1d and mamba_ssm dependencies by @janekl :: PR: #10429
- Update doc for fp8 trt-llm export by @Laplasjan107 :: PR: #10444
- Remove running validating after finetuning by @huvunvidia :: PR: #10560
- Extending modelopt spec for TEDotProductAttention by @janekl :: PR: #10523
- Fix mb_calculator import in lora tutorial by @BoxiangW :: PR: #10624
- .nemo conversion bug fix by @dimapihtar :: PR: #10598
- Require setuptools>=70 and update deprecated api by @thomasdhc :: PR: #10659
- Akoumparouli/fix get tokenizer list by @akoumpa :: PR: #10596
- [McoreDistOptim] fix the naming to match apex.dist by @gdengk :: PR: #10707
- [fix] Ensures disabling exp_manager with exp_manager=null does not error by @terrykong :: PR: #10651
- [feat] Update get_model_parallel_src_rank to support tp-pp-dp ordering by @terrykong :: PR: #10652
- feat: Migrate GPTSession refit path in Nemo export to ModelRunner for Aligner by @terrykong :: PR: #10654
- [MCoreDistOptim] Add assertions for McoreDistOptim and fix fp8 arg specs by @gdengk :: PR: #10748
- Fix for crashes with tensorboard_logger=false and VP + LoRA by @vysarge :: PR: #10792
- Adding init_model_parallel to FabricMegatronStrategy by @marcromeyn :: PR: #10733
- Moving steps to MegatronParallel to improve UX for Fabric by @marcromeyn :: PR: #10732
- Adding setup_megatron_optimizer to FabricMegatronStrategy by @marcromeyn :: PR: #10833
- Make FabricMegatronMixedPrecision match MegatronMixedPrecision by @marcromeyn :: PR: #10835
- Fix VPP bug in MegatronStep by @marcromeyn :: PR: #10847
- Expose drop_last in MegatronDataSampler by @farhadrgh :: PR: #10837
- Move collectiob.nlp imports inline for t5 by @marcromeyn :: PR: #10877
- Use a context-manager when opening files by @akoumpa :: PR: #10895
- ckpt convert bug fixes by @dimapihtar :: PR: #10878
- remove deprecated ci tests by @dimapihtar :: PR: #10922
- Update T5 tokenizer (adding additional tokens to tokenizer config) by @huvunvidia :: PR: #10972
- Add support and recipes for HF models via AutoModelForCausalLM by @akoumpa :: PR: #10962
- gpt3 175b cli by @malay-nagda :: PR: #10985
- Fix for crash with LoRA + tp_overlap_comm=false + sequence_parallel=true by @vysarge :: PR: #10920
- Update BaseMegatronSampler for compatibility with PTL's _BatchProgress by @ashors1 :: PR: #11016
- add deprecation note by @dimapihtar :: PR: #11024
- Update ModelOpt Width Pruning example defaults by @kevalmorabia97 :: PR: #10902
- switch to NeMo 2.0 recipes by @dimapihtar :: PR: #10948
- NeMo 1.0: upcycle dense to moe by @akoumpa :: PR: #11002
- Update mcore parallelism initialization in nemo2 by @yaoyu-33 :: PR: #10643
- Gemma2 in Nemo2 with Recipes by @suiyoubi :: PR: #11037
- Add Packed Seq option to GPT based models by @suiyoubi :: PR: #11100
- Fix MCoreGPTModel import in llm.gpt.model.base by @hemildesai :: PR: #11109
- TP+MoE peft fix by @akoumpa :: PR: #11114
- GPT recipes to use full te spec by @JimmyZhang12 :: PR: #11119
- Virtual pipeline parallel support for LoRA in NLPAdapterModelMixin by @vysarge :: PR: #11128
- update nemo args for mcore flash decode arg change by @HuiyingLi :: PR: #11138
- Call ckpt_to_weights_subdir from MegatronCheckpointIO by @ashors1 :: PR: #10897
- fix typo by @dimapihtar :: PR: #11234
- [Doc fixes] update file names, installation instructions, bad links by @erastorgueva-nv :: PR: #11045
- fix(export): GPT models w/ bias=False convert properly by @terrykong :: PR: #11255