Commit graph

361 commits

Author SHA1 Message Date
Samuel Kriman b7a175b7b9
Self-supervised pre-training for speech models (#3139)
* self-supervised training

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* test

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove imports

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* sort imports

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix audio_to_text

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* manifest handle no text

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* loss init

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* style

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove tokenizer from config

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* config changes

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* remove hydra import

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* always spec augment

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fixes

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* copyright

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix cosine sim

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix cosine sim

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* fix cosine sim

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* changes based on comments

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* changes based on comments

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* configs

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* name fix

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* ci config changes

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* renamed to num_negatives

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* minor changes

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* name changes, type annotations

Signed-off-by: sam1373 <samuelkriman@gmail.com>

Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
2021-11-10 15:33:11 -08:00
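The PR above adds contrastive self-supervised pre-training (note the "fix cosine sim" and "renamed to num_negatives" commits). As a minimal, dependency-free sketch of the underlying idea only — this is illustrative and not NeMo's actual implementation — an InfoNCE-style loss pulls an anchor representation toward a positive and away from sampled negatives, scored by cosine similarity:

```python
import math
import random

def cosine_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def contrastive_loss(anchor, positive, candidates, num_negatives=2, temperature=0.1):
    """InfoNCE-style loss: pull `anchor` toward `positive`, push it away
    from `num_negatives` vectors sampled from `candidates`."""
    negatives = random.sample(candidates, num_negatives)
    pos = math.exp(cosine_sim(anchor, positive) / temperature)
    neg = sum(math.exp(cosine_sim(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

The loss is near zero when the anchor aligns with the positive and large when it aligns with a negative instead; `num_negatives` controls how many distractors enter the denominator.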
Nithin Rao dc9ed88f78
Modify speaker input (#3100)
* initial_commit

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* init diarizer

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* vad+speaker

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* vad update

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* speaker done

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* initial working version

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* compare outputs

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* added uem support

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* pyannote improvements

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* updated config and script name

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* style fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update Jenkins file

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* jenkins fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* jenkins fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update file path in jenkins

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update file path in jenkins

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update file path in jenkins

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* jenkins quote fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update offline speaker diarization notebook

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* initial working asr_with_diarization

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* almost done, revisit scoring part

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* fixed eval in offline diarization with asr

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update write2manifest to consider only up to max audio duration

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* asr with diarization notebook

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Fixed ASR_with_diarization tutorial.ipynb and diarization_utils and edited config yaml file

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Fixed VAD parameters in Speaker_Diarization_Inference.ipynb

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Added Jenkins test, doc strings and updated README

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update jenkins test

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Doc info in offline_diarization_with_asr

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Review comments

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update outdir paths

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Co-authored-by: Taejin Park <tango4j@gmail.com>
2021-11-06 10:55:32 -04:00
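The speaker-input changes above (and "manifest handle no text" in the previous PR) revolve around NeMo-style manifest files: one JSON object per line describing an audio file, where fields such as `text` may be absent for inference-only data. A hedged stdlib sketch of reading such a manifest — the exact set of fields beyond `audio_filepath`/`duration`/`text` is an assumption here:

```python
import json

def read_manifest(lines):
    """Parse JSON-lines manifest entries, tolerating a missing `text` field."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        item = json.loads(line)
        item.setdefault("text", None)  # inference-only entries carry no transcript
        entries.append(item)
    return entries
```

Writing is symmetric: one `json.dumps(entry)` per line, newline-separated.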
Yang Zhang 3fe7308a37
Tn add nn wfst and doc (#3135)
* made tagger exportable

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* added whitelist wfst for nn

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* updated documentation

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* remove experimental

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* updated doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* made tagger exportable

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* added whitelist wfst for nn

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* updated documentation

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* remove experimental

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* updated doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* preserve punct after nn wfst

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
2021-11-04 11:48:26 -07:00
Eric Harper aaacc4b089
Merge r1.5.0 bugfixes and doc updates to main (#3133)
* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Always save last checkpoint on train end even if folder does not exist (#2976)

* add fix for no checkpoint folder when training ends

Signed-off-by: Jason <jasoli@nvidia.com>

* update

Signed-off-by: Jason <jasoli@nvidia.com>

* fix test

Signed-off-by: Jason <jasoli@nvidia.com>

* fixes

Signed-off-by: Jason <jasoli@nvidia.com>

* typo

Signed-off-by: Jason <jasoli@nvidia.com>

* change check

Signed-off-by: Jason <jasoli@nvidia.com>

* [NLP] Add Apex import guard (#3041)

* add apex import guard

Signed-off-by: ericharper <complex451@gmail.com>

* add apex import guard

Signed-off-by: ericharper <complex451@gmail.com>

* add apex import guard

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* remove from init add logging to constructor

Signed-off-by: ericharper <complex451@gmail.com>

* remove from init add logging to constructor

Signed-off-by: ericharper <complex451@gmail.com>

* remove import from init

Signed-off-by: ericharper <complex451@gmail.com>

* remove megatron bert encoder logic from NLPModel

Signed-off-by: ericharper <complex451@gmail.com>

* remove megatron bert from init

Signed-off-by: ericharper <complex451@gmail.com>

* remove megatron bert from init

Signed-off-by: ericharper <complex451@gmail.com>

* remove megatron bert from init

Signed-off-by: ericharper <complex451@gmail.com>

* remove megatron bert from init

Signed-off-by: ericharper <complex451@gmail.com>

* remove megatron bert from init

Signed-off-by: ericharper <complex451@gmail.com>

* remove megatron bert from init

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* Exp manager small refactor (#3067)

* Exp manager small refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* move super() call earlier in the function

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* Change container (#3087)

Signed-off-by: smajumdar <titu1994@gmail.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* Training of machine translation model fails if config parameter `trainer.max_epochs` is used instead of `trainer.max_steps`. (#3112)

* fix: replace distributed_backend with accelerator

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add debug script

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove debug script

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* update (#3113)

Signed-off-by: Jason <jasoli@nvidia.com>

* Fix: punctuation capitalization inference on short queries (#3111)

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

Co-authored-by: Eric Harper <complex451@gmail.com>

* Multiple ASR Fixes to SPE tokenization (#3119)

* Reduce num workers for transcribe

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix SPE tokenizer vocabulary construction

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update tokenizer building script

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove logs

Signed-off-by: smajumdar <titu1994@gmail.com>

* Megatron GPT training in BCP (#3095)

* BCP megatron training

Signed-off-by: madhukar <madhukar@penguin>

* Add quotes

Signed-off-by: madhukar <madhukar@penguin>

* Style fix

Signed-off-by: madhukar <madhukar@penguin>

Co-authored-by: madhukar <madhukar@penguin>

* Upgrade to PTL 1.5.0 (#3127)

* update for ptl 1.5.0

Signed-off-by: ericharper <complex451@gmail.com>

* update trainer config

Signed-off-by: ericharper <complex451@gmail.com>

* limit cuda visible devices to the first two gpus on check for ranks CI test

Signed-off-by: ericharper <complex451@gmail.com>

* remove comments

Signed-off-by: ericharper <complex451@gmail.com>

* make datasets larger for test

Signed-off-by: ericharper <complex451@gmail.com>

* make datasets larger for test

Signed-off-by: ericharper <complex451@gmail.com>

* update compute_max_steps

Signed-off-by: ericharper <complex451@gmail.com>

* update compute_max_steps

Signed-off-by: ericharper <complex451@gmail.com>

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* remove duplicate code

Signed-off-by: ericharper <complex451@gmail.com>

* remove comment

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Madhukar K <26607911+madhukarkm@users.noreply.github.com>
Co-authored-by: madhukar <madhukar@penguin>
2021-11-04 10:26:58 -06:00
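Several commits in the merge above ("update compute_max_steps", and the machine-translation fix for `trainer.max_epochs` versus `trainer.max_steps`) concern deriving a global step budget from epoch-based settings. A minimal sketch of that arithmetic, under the assumption that steps per epoch is the dataset size divided by the effective batch size (micro-batch × gradient accumulation × data-parallel world size) — the actual NeMo helper may differ:

```python
def compute_max_steps(max_epochs, dataset_len, micro_batch,
                      accum_steps=1, world_size=1, drop_last=True):
    """Derive a global training-step budget from an epoch budget."""
    effective_batch = micro_batch * accum_steps * world_size
    if drop_last:
        steps_per_epoch = dataset_len // effective_batch      # partial batch dropped
    else:
        steps_per_epoch = -(-dataset_len // effective_batch)  # ceiling division
    return max_epochs * steps_per_epoch
```

Passing only `max_epochs` to a trainer that schedules by steps requires exactly this kind of conversion, which is why the two settings can silently disagree.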
Eric Harper 574b1014fd
Merge r1.5.0 bugfixes and doc updates to main (#3093)
* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Fix quantization bug in Asr (#3062)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update reinstall and cherry-pick bignlp commits (#3065)

* add ptl install to reinstall and update jenkins install

Signed-off-by: ericharper <complex451@gmail.com>

* Add a stateless timer to specify max_time per run instead of global m… (#3056)

* Add a stateless timer to specify max_time per run instead of global max_time across runs

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* (1) reduce the validation loss within an epoch, (2) convert global-batch-based iteration counts to micro-batch-based (#3055)

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Timer class monitors total time (train + validation + testing) to monitor when to end training (#3061)

* Check total time in train/validation to exit

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>

* Add PUBLICATIONS.md (#3051)

* Add PUBLICATIONS.md

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add NLP

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update PUBLICATIONS.md

* Update PUBLICATIONS.md

* Fix links

Signed-off-by: smajumdar <titu1994@gmail.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* fix uninstall

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* fix File Load Error (#3069)

Signed-off-by: fayejf <fayejf07@gmail.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* Update hyper parameter saving (#3058)

Signed-off-by: smajumdar <titu1994@gmail.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* Exp manager small refactor (#3067)

* Exp manager small refactor

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* move super() call earlier in the function

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* Fix FastPitch Pitch Duration Notebook (#3068)

* bugfix

Signed-off-by: Jason <jasoli@nvidia.com>

* bugfix2

Signed-off-by: Jason <jasoli@nvidia.com>

* better check

Signed-off-by: Jason <jasoli@nvidia.com>

* confusionmatrix (#3085)

Signed-off-by: fayejf <fayejf07@gmail.com>

* typo and fix link (#3086)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* inf cross-checking across tensor-parallel ranks (#3088)

* inf cross-checking across tensor-parallel ranks

* style

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* Fix save top k (#3075)

* inject mp_rank for checkpoint paths in NLPDDPPlugin

Signed-off-by: ericharper <complex451@gmail.com>

* == instead of i

Signed-off-by: ericharper <complex451@gmail.com>

* when checking previous run account for mp

Signed-off-by: ericharper <complex451@gmail.com>

* uninject mp ranks when needed

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
2021-10-29 20:15:37 -06:00
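The "stateless timer" commits merged above (#3056, #3061) address ending a run after a per-run wall-clock budget instead of a global `max_time` shared across restarts. A dependency-free sketch of the idea — the real change lives in a PyTorch Lightning callback, which this deliberately omits:

```python
import time

class StatelessTimer:
    """Wall-clock budget measured from object creation, so each restarted
    run gets the full `max_seconds` rather than sharing a global budget."""

    def __init__(self, max_seconds):
        self.max_seconds = max_seconds
        self.start = time.monotonic()  # monotonic clock is immune to wall-clock jumps

    def elapsed(self):
        return time.monotonic() - self.start

    def should_stop(self):
        return self.elapsed() >= self.max_seconds
```

In a training loop, `should_stop()` would be polled after each train/validation step to exit cleanly before a cluster job's time limit.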
Yang Zhang 42b167ee2e
Hg cache (#3080)
* add cache for huggingface

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* change cache location

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Co-authored-by: Eric Harper <complex451@gmail.com>
2021-10-28 18:04:45 -06:00
Somshubra Majumdar 1f36f32ee6
Remove STFT checks due to min PT version of 1.10 (#3034)
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-10-21 15:29:21 -07:00
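The commit above removes STFT workarounds because the minimum supported PyTorch version is now 1.10. Version-gated code of that kind typically compares a parsed version tuple against a floor; a hedged stdlib sketch (real code would more likely use `packaging.version`, and the parsing of suffixes like `a0+git...` here is a simplification):

```python
def version_tuple(v):
    """'1.10.0a0+gitabc' -> (1, 10, 0): keep leading numeric components only."""
    parts = []
    for piece in v.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break  # stop at the first non-numeric suffix, e.g. 'a0'
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def meets_minimum(current, minimum="1.10"):
    """True when `current` is at or above the `minimum` version floor."""
    return version_tuple(current) >= version_tuple(minimum)
```

Once the floor rises (here, to 1.10), the guard and its fallback branch can both be deleted, which is exactly what this commit does for the STFT checks.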
Eric Harper 32fa5cfaf3
[BigNLP] Merge Megatron GPT to main (#2975)
* fix gpu init after removing debug print in mpu

Signed-off-by: ericharper <complex451@gmail.com>

* add fused_adam

Signed-off-by: ericharper <complex451@gmail.com>

* check ds is not none before logging len

Signed-off-by: ericharper <complex451@gmail.com>

* set fp16 arg to true and fix enum conflict

Signed-off-by: ericharper <complex451@gmail.com>

* make fp16 arg configurable

Signed-off-by: ericharper <complex451@gmail.com>

* add grad clip from megatron

Signed-off-by: ericharper <complex451@gmail.com>

* Linear warmup with cosine annealing and constant holding (#2846)

* Testing cosine schedule

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* More fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update config for constant steps in schedule

Signed-off-by: ericharper <complex451@gmail.com>

* temporarily import enum from megatron

Signed-off-by: ericharper <complex451@gmail.com>

* add grad clip for fp32

Signed-off-by: ericharper <complex451@gmail.com>

* update check for _del_model_without_trainer

Signed-off-by: ericharper <complex451@gmail.com>

* updating restore for model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* add predict script

Signed-off-by: ericharper <complex451@gmail.com>

* update test iters

Signed-off-by: ericharper <complex451@gmail.com>

* add barrier

Signed-off-by: ericharper <complex451@gmail.com>

* return if clip_val is 0 or None

Signed-off-by: ericharper <complex451@gmail.com>

* when using amp clip grads after they are unscaled

Signed-off-by: ericharper <complex451@gmail.com>

* make native amp scaler hyperparams configurable

Signed-off-by: ericharper <complex451@gmail.com>

* (1) nvfuser, (2) amp-casting decoration (#2894)

* (1) nvfuser, (2) amp-casting decoration

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* support bf16

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* add set device to constructor

Signed-off-by: ericharper <complex451@gmail.com>

* set_device in constructor

Signed-off-by: ericharper <complex451@gmail.com>

* [BigNLP] Remove megatron-lm dependency. (#2910)

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* add load_fused_kernels

Signed-off-by: ericharper <complex451@gmail.com>

* add load_fused_kernels

Signed-off-by: ericharper <complex451@gmail.com>

* update megatron_init

Signed-off-by: ericharper <complex451@gmail.com>

* add fused kernels

Signed-off-by: ericharper <complex451@gmail.com>

* add fused kernels

Signed-off-by: ericharper <complex451@gmail.com>

* update process batch

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* add megatron clip_grad

Signed-off-by: ericharper <complex451@gmail.com>

* trying to resolve circular import error

Signed-off-by: ericharper <complex451@gmail.com>

* rename file

Signed-off-by: ericharper <complex451@gmail.com>

* remove non-gpt models and datasets from __init__ files

Signed-off-by: ericharper <complex451@gmail.com>

* set device in constructor for gpu init

Signed-off-by: ericharper <complex451@gmail.com>

* set device in constructor for gpu init

Signed-off-by: ericharper <complex451@gmail.com>

* set_device in constructor

Signed-off-by: ericharper <complex451@gmail.com>

* clean config

Signed-off-by: ericharper <complex451@gmail.com>

* update MegatronDataset

Signed-off-by: ericharper <complex451@gmail.com>

* clean up MegatronModule

Signed-off-by: ericharper <complex451@gmail.com>

* clean up MegatronModule

Signed-off-by: ericharper <complex451@gmail.com>

* rename fp16 and bf16 flags to fused_softmax_input_in_fp16/bf16

Signed-off-by: ericharper <complex451@gmail.com>

* rename to fused_fp16

Signed-off-by: ericharper <complex451@gmail.com>

* add fused_fp16 arg to LayerNorm calls

Signed-off-by: ericharper <complex451@gmail.com>

* fix arg name

Signed-off-by: ericharper <complex451@gmail.com>

* fix arg name

Signed-off-by: ericharper <complex451@gmail.com>

* fix import

Signed-off-by: ericharper <complex451@gmail.com>

* update arg

Signed-off-by: ericharper <complex451@gmail.com>

* skip warmup default to True

Signed-off-by: ericharper <complex451@gmail.com>

* skip warmup default to True

Signed-off-by: ericharper <complex451@gmail.com>

* Adding complete method to MegatronGPTModel (#2935)

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* make ffn_hidden_size mandatory

Signed-off-by: ericharper <complex451@gmail.com>

* Manually migrating timing of step into branch (#2937)

* 1. Manually migrating timing of step into branch.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated file name and content.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated to latest code.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

Co-authored-by: Micha Livne <mlivne@nvidia.com>

* remove unused imports

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* check fused_fp16 and fused_bf16 are not both True

Signed-off-by: ericharper <complex451@gmail.com>

* update predict script for model parallel .nemo

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>

* NVfuser (#2943)

* activation checkpoint recompute

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* selective nvfuser setup

* Megatron gpt bfloat support (#2926)

* Save/restore fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Another merge

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Bf16 args in init

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Set precision

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove debug stuff

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* add bf16 casting decorator

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* Bfloat layernorm propagation

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* activation checkpoint recompute

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* selective nvfuser setup

* More arg removal

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove BERTDataset

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update to latest apex and patch transformer autocast

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>

* don't set jit for bf16

Signed-off-by: ericharper <complex451@gmail.com>

* replace apex.mpu

Signed-off-by: ericharper <complex451@gmail.com>

* fix grad clip

Signed-off-by: ericharper <complex451@gmail.com>

* NVFuser fixes (#2951)

* Fuser fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove dummy handler

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove PTL plugin based logic for fusion

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* remove duplicated file

Signed-off-by: ericharper <complex451@gmail.com>

* typo (#2960)

Signed-off-by: ericharper <complex451@gmail.com>

* [BigNLP] Script to convert GPT checkpoint to .nemo (#2958)

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* add load_fused_kernels

Signed-off-by: ericharper <complex451@gmail.com>

* add load_fused_kernels

Signed-off-by: ericharper <complex451@gmail.com>

* update megatron_init

Signed-off-by: ericharper <complex451@gmail.com>

* add fused kernels

Signed-off-by: ericharper <complex451@gmail.com>

* add fused kernels

Signed-off-by: ericharper <complex451@gmail.com>

* update process batch

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* add megatron clip_grad

Signed-off-by: ericharper <complex451@gmail.com>

* trying to resolve circular import error

Signed-off-by: ericharper <complex451@gmail.com>

* rename file

Signed-off-by: ericharper <complex451@gmail.com>

* remove non-gpt models and datasets from __init__ files

Signed-off-by: ericharper <complex451@gmail.com>

* set device in constructor for gpu init

Signed-off-by: ericharper <complex451@gmail.com>

* set device in constructor for gpu init

Signed-off-by: ericharper <complex451@gmail.com>

* set_device in constructor

Signed-off-by: ericharper <complex451@gmail.com>

* clean config

Signed-off-by: ericharper <complex451@gmail.com>

* update MegatronDataset

Signed-off-by: ericharper <complex451@gmail.com>

* clean up MegatronModule

Signed-off-by: ericharper <complex451@gmail.com>

* clean up MegatronModule

Signed-off-by: ericharper <complex451@gmail.com>

* rename fp16 and bf16 flags to fused_softmax_input_in_fp16/bf16

Signed-off-by: ericharper <complex451@gmail.com>

* rename to fused_fp16

Signed-off-by: ericharper <complex451@gmail.com>

* add fused_fp16 arg to LayerNorm calls

Signed-off-by: ericharper <complex451@gmail.com>

* fix arg name

Signed-off-by: ericharper <complex451@gmail.com>

* fix arg name

Signed-off-by: ericharper <complex451@gmail.com>

* fix import

Signed-off-by: ericharper <complex451@gmail.com>

* update arg

Signed-off-by: ericharper <complex451@gmail.com>

* skip warmup default to True

Signed-off-by: ericharper <complex451@gmail.com>

* skip warmup default to True

Signed-off-by: ericharper <complex451@gmail.com>

* Adding complete method to MegatronGPTModel (#2935)

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* make ffn_hidden_size mandatory

Signed-off-by: ericharper <complex451@gmail.com>

* Manually migrating timing of step into branch (#2937)

* 1. Manually migrating timing of step into branch.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated file name and content.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated to latest code.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

Co-authored-by: Micha Livne <mlivne@nvidia.com>

* remove unused imports

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* check fused_fp16 and fused_bf16 are not both True

Signed-off-by: ericharper <complex451@gmail.com>

* update predict script for model parallel .nemo

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* add script to convert .ckpt to .nemo

Signed-off-by: ericharper <complex451@gmail.com>

* in progress

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* convert mp checkpoints to nemo

Signed-off-by: ericharper <complex451@gmail.com>

* update help

Signed-off-by: ericharper <complex451@gmail.com>

* add safeguard for model parallel save_to

Signed-off-by: ericharper <complex451@gmail.com>

* adjust NLPModel save_to to be safer for model parallel

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* [BigNLP] Update GPT evaluation to work with tensor model parallel (#2959)

* in progress

Signed-off-by: ericharper <complex451@gmail.com>

* update args

Signed-off-by: ericharper <complex451@gmail.com>

* add request dataset

Signed-off-by: ericharper <complex451@gmail.com>

* tokenize request

Signed-off-by: ericharper <complex451@gmail.com>

* in progress

Signed-off-by: ericharper <complex451@gmail.com>

* able to run

Signed-off-by: ericharper <complex451@gmail.com>

* reduce logits

Signed-off-by: ericharper <complex451@gmail.com>

* capture response

Signed-off-by: ericharper <complex451@gmail.com>

* squeeze and unsqueeze

Signed-off-by: ericharper <complex451@gmail.com>

* handle non model parallel case

Signed-off-by: ericharper <complex451@gmail.com>

* clean imports

Signed-off-by: ericharper <complex451@gmail.com>

* add file

Signed-off-by: ericharper <complex451@gmail.com>

* convert logits to log_probs

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* rename logits to log_probs

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* add megatron gpt pretraining

Signed-off-by: ericharper <complex451@gmail.com>

* add megatron gpt pretraining

Signed-off-by: ericharper <complex451@gmail.com>

* add megatron gpt pretraining

Signed-off-by: ericharper <complex451@gmail.com>

* updating to work with latest megatron

Signed-off-by: ericharper <complex451@gmail.com>

* updating to work with latest megatron

Signed-off-by: ericharper <complex451@gmail.com>

* update _del_model

Signed-off-by: ericharper <complex451@gmail.com>

* adding gpt model

Signed-off-by: ericharper <complex451@gmail.com>

* adding gpt model

Signed-off-by: ericharper <complex451@gmail.com>

* adding gpt model

Signed-off-by: ericharper <complex451@gmail.com>

* instantiate GPTmodel

Signed-off-by: ericharper <complex451@gmail.com>

* adding build dataset

Signed-off-by: ericharper <complex451@gmail.com>

* build megatron dataset in .setup

Signed-off-by: ericharper <complex451@gmail.com>

* setup dataloader

Signed-off-by: ericharper <complex451@gmail.com>

* add vocab_file and merge_file to megatron init

Signed-off-by: ericharper <complex451@gmail.com>

* add forward

Signed-off-by: ericharper <complex451@gmail.com>

* add train loss

Signed-off-by: ericharper <complex451@gmail.com>

* add optimizer

Signed-off-by: ericharper <complex451@gmail.com>

* add exp_manager

Signed-off-by: ericharper <complex451@gmail.com>

* multi-gpu is working

Signed-off-by: ericharper <complex451@gmail.com>

* adding val loop

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* adding val loop

Signed-off-by: ericharper <complex451@gmail.com>

* fix ranks

Signed-off-by: ericharper <complex451@gmail.com>

* fix model parallel checkpoint saving

Signed-off-by: ericharper <complex451@gmail.com>

* fix _del_model

Signed-off-by: ericharper <complex451@gmail.com>

* added megatron batch sampler

Signed-off-by: ericharper <complex451@gmail.com>

* try to fix num steps

Signed-off-by: ericharper <complex451@gmail.com>

* add wandb to config

Signed-off-by: ericharper <complex451@gmail.com>

* log lr

Signed-off-by: ericharper <complex451@gmail.com>

* add warmup ratio to config

Signed-off-by: ericharper <complex451@gmail.com>

* update configs

Signed-off-by: ericharper <complex451@gmail.com>

* update configs

Signed-off-by: ericharper <complex451@gmail.com>

* add cpu init to args

Signed-off-by: ericharper <complex451@gmail.com>

* update config

Signed-off-by: ericharper <complex451@gmail.com>

* update config

Signed-off-by: ericharper <complex451@gmail.com>

* Initial megatron dataset port

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix merge conflicts

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* License fixes and megatron model porting

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* More fixes to import from nemo rather than megatron

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix circular imports

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Revert config file

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Restructure further to avoid circular imports

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* add Makefile

Signed-off-by: ericharper <complex451@gmail.com>

* Add megatron modules

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* add license

Signed-off-by: ericharper <complex451@gmail.com>

* Port from latest megatron

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update cfg

Signed-off-by: ericharper <complex451@gmail.com>

* update config

Signed-off-by: ericharper <complex451@gmail.com>

* add _del_model_without_trainer

Signed-off-by: ericharper <complex451@gmail.com>

* add data preprocessing script

Signed-off-by: ericharper <complex451@gmail.com>

* update config

Signed-off-by: ericharper <complex451@gmail.com>

* use apex mpu

Signed-off-by: ericharper <complex451@gmail.com>

* replace print_rank_0 with nemo utils logging

Signed-off-by: ericharper <complex451@gmail.com>

* use apex mpu

Signed-off-by: ericharper <complex451@gmail.com>

* use apex mpu

Signed-off-by: ericharper <complex451@gmail.com>

* add use_cpu_initialization

Signed-off-by: ericharper <complex451@gmail.com>

* fixing autoresume in progress

Signed-off-by: ericharper <complex451@gmail.com>

* properly removing last checkpoint

Signed-off-by: ericharper <complex451@gmail.com>

* log consumed samples

Signed-off-by: ericharper <complex451@gmail.com>

* fix mp autoresume

Signed-off-by: ericharper <complex451@gmail.com>

* add NLPSaveRestoreConnector

Signed-off-by: ericharper <complex451@gmail.com>

* Megatron GPT training with NeMo tokenizers (#2818)

* Update files from megatron repo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove non NLP data related files from megatron

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Merge megatron and nemo tokenizers

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove get_tokenizer() calls from gpt model

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update tokenizer yaml config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* add todo

Signed-off-by: ericharper <complex451@gmail.com>

* update config

Signed-off-by: ericharper <complex451@gmail.com>

* make init_method_std configurable

Signed-off-by: ericharper <complex451@gmail.com>

* make gpu init work by setting random seed earlier

Signed-off-by: ericharper <complex451@gmail.com>

* fix gpu init after removing debug print in mpu

Signed-off-by: ericharper <complex451@gmail.com>

* add fused_adam

Signed-off-by: ericharper <complex451@gmail.com>

* check ds is not none before logging len

Signed-off-by: ericharper <complex451@gmail.com>

* set fp16 arg to true and fix enum conflict

Signed-off-by: ericharper <complex451@gmail.com>

* make fp16 arg configurable

Signed-off-by: ericharper <complex451@gmail.com>

* add grad clip from megatron

Signed-off-by: ericharper <complex451@gmail.com>

* Linear warmup with cosine annealing and constant holding (#2846)

* Testing cosine schedule

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* More fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update config for constant steps in schedule

Signed-off-by: ericharper <complex451@gmail.com>

* temporarily import enum from megatron

Signed-off-by: ericharper <complex451@gmail.com>

* add grad clip for fp32

Signed-off-by: ericharper <complex451@gmail.com>

* update check for _del_model_without_trainer

Signed-off-by: ericharper <complex451@gmail.com>

* updating restore for model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* add predict script

Signed-off-by: ericharper <complex451@gmail.com>

* update test iters

Signed-off-by: ericharper <complex451@gmail.com>

* add barrier

Signed-off-by: ericharper <complex451@gmail.com>

* return if clip_val is 0 or None

Signed-off-by: ericharper <complex451@gmail.com>

* when using amp clip grads after they are unscaled

Signed-off-by: ericharper <complex451@gmail.com>

* make native amp scaler hyperparams configurable

Signed-off-by: ericharper <complex451@gmail.com>

* (1) nvfuser, (2) amp-casting decoration (#2894)

* (1) nvfuser, (2) amp-casting decoration

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* support bf16

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* add set device to constructor

Signed-off-by: ericharper <complex451@gmail.com>

* set_device in constructor

Signed-off-by: ericharper <complex451@gmail.com>

* [BigNLP] Remove megatron-lm dependency. (#2910)

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* add load_fused_kernels

Signed-off-by: ericharper <complex451@gmail.com>

* add load_fused_kernels

Signed-off-by: ericharper <complex451@gmail.com>

* update megatron_init

Signed-off-by: ericharper <complex451@gmail.com>

* add fused kernels

Signed-off-by: ericharper <complex451@gmail.com>

* add fused kernels

Signed-off-by: ericharper <complex451@gmail.com>

* update process batch

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* add megatron clip_grad

Signed-off-by: ericharper <complex451@gmail.com>

* trying to resolve circular import error

Signed-off-by: ericharper <complex451@gmail.com>

* rename file

Signed-off-by: ericharper <complex451@gmail.com>

* remove non-gpt models and datasets from __init__ files

Signed-off-by: ericharper <complex451@gmail.com>

* set device in constructor for gpu init

Signed-off-by: ericharper <complex451@gmail.com>

* set device in constructor for gpu init

Signed-off-by: ericharper <complex451@gmail.com>

* set_device in constructor

Signed-off-by: ericharper <complex451@gmail.com>

* clean config

Signed-off-by: ericharper <complex451@gmail.com>

* update MegatronDataset

Signed-off-by: ericharper <complex451@gmail.com>

* clean up MegatronModule

Signed-off-by: ericharper <complex451@gmail.com>

* clean up MegatronModule

Signed-off-by: ericharper <complex451@gmail.com>

* rename fp16 and bf16 flags to fused_softmax_input_in_fp16/bf16

Signed-off-by: ericharper <complex451@gmail.com>

* rename to fused_fp16

Signed-off-by: ericharper <complex451@gmail.com>

* add fused_fp16 arg to LayerNorm calls

Signed-off-by: ericharper <complex451@gmail.com>

* fix arg name

Signed-off-by: ericharper <complex451@gmail.com>

* fix arg name

Signed-off-by: ericharper <complex451@gmail.com>

* fix import

Signed-off-by: ericharper <complex451@gmail.com>

* update arg

Signed-off-by: ericharper <complex451@gmail.com>

* skip warmup default to True

Signed-off-by: ericharper <complex451@gmail.com>

* skip warmup default to True

Signed-off-by: ericharper <complex451@gmail.com>

* Adding complete method to MegatronGPTModel (#2935)

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* make ffn_hidden_size mandatory

Signed-off-by: ericharper <complex451@gmail.com>

* Manually migrating timing of step into branch (#2937)

* 1. Manually migrating timing of step into branch.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated file name and content.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated to latest code.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

Co-authored-by: Micha Livne <mlivne@nvidia.com>

* remove unused imports

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* check fused_fp16 and fused_bf16 are not both True

Signed-off-by: ericharper <complex451@gmail.com>

* update predict script for model parallel .nemo

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>

* NVfuser (#2943)

* activation checkpoint recompute

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* selective nvfuser setup

* Megatron gpt bfloat support (#2926)

* Save/restore fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Another merge

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Bf16 args in init

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Set precision

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove debug stuff

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* add bf16 casting decorator

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* Bfloat layernorm propagation

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* activation checkpoint recompute

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* selective nvfuser setup

* More arg removal

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove BERTDataset

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update to latest apex and patch transformer autocast

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>

* don't set jit for bf16

Signed-off-by: ericharper <complex451@gmail.com>

* replace apex.mpu

Signed-off-by: ericharper <complex451@gmail.com>

* fix grad clip

Signed-off-by: ericharper <complex451@gmail.com>

* NVFuser fixes (#2951)

* Fuser fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove dummy handler

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Remove PTL plugin based logic for fusion

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* remove duplicated file

Signed-off-by: ericharper <complex451@gmail.com>

* typo (#2960)

Signed-off-by: ericharper <complex451@gmail.com>

* [BigNLP] Script to convert GPT checkpoint to .nemo (#2958)

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* remove args in progress

Signed-off-by: ericharper <complex451@gmail.com>

* add load_fused_kernels

Signed-off-by: ericharper <complex451@gmail.com>

* add load_fused_kernels

Signed-off-by: ericharper <complex451@gmail.com>

* update megatron_init

Signed-off-by: ericharper <complex451@gmail.com>

* add fused kernels

Signed-off-by: ericharper <complex451@gmail.com>

* add fused kernels

Signed-off-by: ericharper <complex451@gmail.com>

* update process batch

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* remove erroneous import

Signed-off-by: ericharper <complex451@gmail.com>

* add megatron clip_grad

Signed-off-by: ericharper <complex451@gmail.com>

* trying to resolve circular import error

Signed-off-by: ericharper <complex451@gmail.com>

* rename file

Signed-off-by: ericharper <complex451@gmail.com>

* remove non-gpt models and datasets from __init__ files

Signed-off-by: ericharper <complex451@gmail.com>

* set device in constructor for gpu init

Signed-off-by: ericharper <complex451@gmail.com>

* set device in constructor for gpu init

Signed-off-by: ericharper <complex451@gmail.com>

* set_device in constructor

Signed-off-by: ericharper <complex451@gmail.com>

* clean config

Signed-off-by: ericharper <complex451@gmail.com>

* update MegatronDataset

Signed-off-by: ericharper <complex451@gmail.com>

* clean up MegatronModule

Signed-off-by: ericharper <complex451@gmail.com>

* clean up MegatronModule

Signed-off-by: ericharper <complex451@gmail.com>

* rename fp16 and bf16 flags to fused_softmax_input_in_fp16/bf16

Signed-off-by: ericharper <complex451@gmail.com>

* rename to fused_fp16

Signed-off-by: ericharper <complex451@gmail.com>

* add fused_fp16 arg to LayerNorm calls

Signed-off-by: ericharper <complex451@gmail.com>

* fix arg name

Signed-off-by: ericharper <complex451@gmail.com>

* fix arg name

Signed-off-by: ericharper <complex451@gmail.com>

* fix import

Signed-off-by: ericharper <complex451@gmail.com>

* update arg

Signed-off-by: ericharper <complex451@gmail.com>

* skip warmup default to True

Signed-off-by: ericharper <complex451@gmail.com>

* skip warmup default to True

Signed-off-by: ericharper <complex451@gmail.com>

* Adding complete method to MegatronGPTModel (#2935)

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* make ffn_hidden_size mandatory

Signed-off-by: ericharper <complex451@gmail.com>

* Manually migrating timing of step into branch (#2937)

* 1. Manually migrating timing of step into branch.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated file name and content.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated to latest code.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

Co-authored-by: Micha Livne <mlivne@nvidia.com>

* remove unused imports

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused import

Signed-off-by: ericharper <complex451@gmail.com>

* check fused_fp16 and fused_bf16 are not both True

Signed-off-by: ericharper <complex451@gmail.com>

* update predict script for model parallel .nemo

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* add script to convert .ckpt to .nemo

Signed-off-by: ericharper <complex451@gmail.com>

* in progress

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* convert mp checkpoints to nemo

Signed-off-by: ericharper <complex451@gmail.com>

* update help

Signed-off-by: ericharper <complex451@gmail.com>

* add safeguard for model parallel save_to

Signed-off-by: ericharper <complex451@gmail.com>

* adjust NLPModel save_to to be safer for model parallel

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* [BigNLP] Update GPT evaluation to work with tensor model parallel  (#2959)

* in progress

Signed-off-by: ericharper <complex451@gmail.com>

* update args

Signed-off-by: ericharper <complex451@gmail.com>

* add request dataset

Signed-off-by: ericharper <complex451@gmail.com>

* tokenize request

Signed-off-by: ericharper <complex451@gmail.com>

* in progress

Signed-off-by: ericharper <complex451@gmail.com>

* able to run

Signed-off-by: ericharper <complex451@gmail.com>

* reduce logits

Signed-off-by: ericharper <complex451@gmail.com>

* capture response

Signed-off-by: ericharper <complex451@gmail.com>

* squeeze and unsqueeze

Signed-off-by: ericharper <complex451@gmail.com>

* handle non model parallel case

Signed-off-by: ericharper <complex451@gmail.com>

* clean imports

Signed-off-by: ericharper <complex451@gmail.com>

* add file

Signed-off-by: ericharper <complex451@gmail.com>

* convert logits to log_probs

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* rename logits to log_probs

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* fix copyright headers

Signed-off-by: ericharper <complex451@gmail.com>

* fix copyright headers

Signed-off-by: ericharper <complex451@gmail.com>

* remove old TimingCallback

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins to use latest apex and sandeep's fork

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins

Signed-off-by: ericharper <complex451@gmail.com>

* try 2109 container

Signed-off-by: ericharper <complex451@gmail.com>

* try cuda container

Signed-off-by: ericharper <complex451@gmail.com>

* use internal container

Signed-off-by: ericharper <complex451@gmail.com>

* update checkpoint tests

Signed-off-by: ericharper <complex451@gmail.com>

* fix scheduler args

Signed-off-by: ericharper <complex451@gmail.com>

* update eval

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins to use ptl 1.5 rc

Signed-off-by: ericharper <complex451@gmail.com>

* add import guard to jenkins

Signed-off-by: ericharper <complex451@gmail.com>

* add import guard to jenkins

Signed-off-by: ericharper <complex451@gmail.com>

* remove deterministic

Signed-off-by: ericharper <complex451@gmail.com>

* install numba .53

Signed-off-by: ericharper <complex451@gmail.com>

* allow for more variance

Signed-off-by: ericharper <complex451@gmail.com>

* update trainer config dataclass

Signed-off-by: ericharper <complex451@gmail.com>

* test_get_optimizer on gpu

Signed-off-by: ericharper <complex451@gmail.com>

* revert comment

Signed-off-by: ericharper <complex451@gmail.com>

* change trainer config default to 32

Signed-off-by: ericharper <complex451@gmail.com>

* [BigNLP] Remove fused kernel code instead use Apex (#2984)

* remove fused_kernels

Signed-off-by: ericharper <complex451@gmail.com>

* remove fused_kernels

Signed-off-by: ericharper <complex451@gmail.com>

* remove fused layer norm and fused softmax and use apex instead

Signed-off-by: ericharper <complex451@gmail.com>

* update imports

Signed-off-by: ericharper <complex451@gmail.com>

* remove comment

Signed-off-by: ericharper <complex451@gmail.com>

* use apex enums

Signed-off-by: ericharper <complex451@gmail.com>

* use apex enums

Signed-off-by: ericharper <complex451@gmail.com>

* add tab

Signed-off-by: ericharper <complex451@gmail.com>

* Timer with sliding window (#3002)

Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>

* revert tab

Signed-off-by: ericharper <complex451@gmail.com>

* check for rank zero

Signed-off-by: ericharper <complex451@gmail.com>

* check for rank zero

Signed-off-by: ericharper <complex451@gmail.com>

* try explicit log dir

Signed-off-by: ericharper <complex451@gmail.com>

* add +

Signed-off-by: ericharper <complex451@gmail.com>

* don't rm

Signed-off-by: ericharper <complex451@gmail.com>

* make dir if it doesn't exist

Signed-off-by: ericharper <complex451@gmail.com>

* create mp nemo file in temp directory

Signed-off-by: ericharper <complex451@gmail.com>

* simplify mp save_to

Signed-off-by: ericharper <complex451@gmail.com>

* handle mp 1 case

Signed-off-by: ericharper <complex451@gmail.com>

* style fix

Signed-off-by: ericharper <complex451@gmail.com>

* remove files

Signed-off-by: ericharper <complex451@gmail.com>

* fix consumed_samples when resuming

Signed-off-by: ericharper <complex451@gmail.com>

* fix reinstall.sh

Signed-off-by: ericharper <complex451@gmail.com>

* update req

Signed-off-by: ericharper <complex451@gmail.com>

* add more detailed log for dataloaders

Signed-off-by: ericharper <complex451@gmail.com>

* check if cuda is available before using fused_adam

Signed-off-by: ericharper <complex451@gmail.com>

* revert comment

Signed-off-by: ericharper <complex451@gmail.com>

* update eval script to use model.freeze

Signed-off-by: ericharper <complex451@gmail.com>

* log train loss averaged over gradient accumulation steps

Signed-off-by: ericharper <complex451@gmail.com>

* check copyright earlier

Signed-off-by: ericharper <complex451@gmail.com>

* todo

Signed-off-by: ericharper <complex451@gmail.com>

* override SaveRestoreConnector in NLPModel init

Signed-off-by: ericharper <complex451@gmail.com>

* move to scripts

Signed-off-by: ericharper <complex451@gmail.com>

* remove star import

Signed-off-by: ericharper <complex451@gmail.com>

* remove comments

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused dataset

Signed-off-by: ericharper <complex451@gmail.com>

* removed barrier

Signed-off-by: ericharper <complex451@gmail.com>

* check cfg

Signed-off-by: ericharper <complex451@gmail.com>

* remove logging

Signed-off-by: ericharper <complex451@gmail.com>

* freeze, unfreeze

Signed-off-by: ericharper <complex451@gmail.com>

* return None

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused imports

Signed-off-by: ericharper <complex451@gmail.com>

* add TODO

Signed-off-by: ericharper <complex451@gmail.com>

* typecheck

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* todo

Signed-off-by: ericharper <complex451@gmail.com>

* add common native plugin

Signed-off-by: ericharper <complex451@gmail.com>

* restore with trainer

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* deprecate megatron-lm bert

Signed-off-by: ericharper <complex451@gmail.com>

* deprecate megatron-lm bert

Signed-off-by: ericharper <complex451@gmail.com>

* compile helpers on the fly

Signed-off-by: ericharper <complex451@gmail.com>

* remove amp_level

Signed-off-by: ericharper <complex451@gmail.com>

* remove amp_level from configs

Signed-off-by: ericharper <complex451@gmail.com>

* add missing import

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* remove amp_level

Signed-off-by: ericharper <complex451@gmail.com>

* use fast huggingface tokenizers by default

Signed-off-by: ericharper <complex451@gmail.com>

* deal with huggingface tokenizer positional args

Signed-off-by: ericharper <complex451@gmail.com>

* deal with huggingface tokenizer positional args

Signed-off-by: ericharper <complex451@gmail.com>

* deal with huggingface tokenizer positional args

Signed-off-by: ericharper <complex451@gmail.com>

* revert use_fast default to False

Signed-off-by: ericharper <complex451@gmail.com>

* return super training_epoch_end

Signed-off-by: ericharper <complex451@gmail.com>

* remove optimizer_idx arg from training_step

Signed-off-by: ericharper <complex451@gmail.com>

* remove unused arg from on_train_epoch_end

Signed-off-by: ericharper <complex451@gmail.com>

* add restore_from_path to nemo config

Signed-off-by: ericharper <complex451@gmail.com>

* add comment

Signed-off-by: ericharper <complex451@gmail.com>

* revert

Signed-off-by: ericharper <complex451@gmail.com>

* override connector if not subclassing NLPSaveRestoreConnector for model parallel save

Signed-off-by: ericharper <complex451@gmail.com>

* update test optimizer

Signed-off-by: ericharper <complex451@gmail.com>

* clean up

Signed-off-by: ericharper <complex451@gmail.com>

* clean up

Signed-off-by: ericharper <complex451@gmail.com>

* clean up

Signed-off-by: ericharper <complex451@gmail.com>

* clean up

Signed-off-by: ericharper <complex451@gmail.com>

* make data_prefix mandatory in config

Signed-off-by: ericharper <complex451@gmail.com>

* update installation instructions on readme

Signed-off-by: ericharper <complex451@gmail.com>

* update dockerfile

Signed-off-by: ericharper <complex451@gmail.com>

* add todo

Signed-off-by: ericharper <complex451@gmail.com>

* raise error if trying to use always_save_nemo with model parallel model

Signed-off-by: ericharper <complex451@gmail.com>

* remove comment

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-10-20 21:06:37 -06:00
Micha Livne 3d678dbff1
Nmt encoder decoder hidden size fix (#2856)
* 1. Enabled encoder/decoder with different size in bottleneck architecture.
2. Validating encoder/decoder with the same size in non-bottleneck parent class.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Fixed style.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Fixed typo.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Added hidden_size ot error message.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Fixed missing defaults.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Fixing CI tests to have same hidden_size.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated error message.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updating Jenkins CI test.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updating CI to hidden=48

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Fixed missing hidden_size when loading pre-trained huggingface model.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Fixed missing hidden_size in config for pre-trained models.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Fixed style.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated missing hidden_size in config.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Testing encoder and decoder objects' hidden_size instead of config to support pre-trained models.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated Jenkinsfile test values.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Fixed Jenkinsfile test values (NMT Megatron Model Parallel Size 2 Encoder)

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updating missing arguments for Jenkinsfile test.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
2021-09-28 10:08:24 -06:00
Micha Livne 3f6aee0433
1. Updated Jenkinsfile hidden_size. (#2892)
Signed-off-by: Micha Livne <mlivne@nvidia.com>

Co-authored-by: Micha Livne <mlivne@nvidia.com>
2021-09-24 16:10:19 -06:00
Evelina bb39528f4f
tar dataset for TN/ITN (#2826)
* tar dataset added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* typo and ci test

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-16 10:33:56 -07:00
Elena Rastorgueva aced0db13e
ITN Spanish (#2489)
* add Spanish ITN for cardinals and decimals (currently displaces English rules)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Refactor ITN so English and Spanish code is side by side

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Spanish ITN rules for electronic

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Spanish ITN rules for money

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Spanish ITN rules for money

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Apply simple style fixes

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Spanish ITN rules for ordinals

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix 'doscientos' typo

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Spanish ITN rules for telephone numbers

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Apply style fixes

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix bug (NEMO_CHAR was being modified)

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Spanish ITN rules for time

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Apply style fixes

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Move ITN utils to language-specific folder

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Make separate test script folders for each language

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Make Cardinal class not convert numbers less than 10

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Spanish ITN Date rules

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Rename variables in Time rules

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Apply style fixes

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Spanish ITN WhiteList rules

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Apply style fixes

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Word test cases

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Change Ordinal 'suffix' to 'morphosyntactic_features'

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Spanish to Sparrowhawk test scripts

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Remove unused imports

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Allow decimals to have a punto as well as a coma

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix typos

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Spanish ClassifyFst caching

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix Money class bug

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add Es Cardinal rules up to one septillion, still ignoring 'y' in cardinals

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix Cardinal bug which inserted extra zeros

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix decimal rules bug

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add more Ordinal cases and don't convert ordinals less than 10

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add more units to MeasureFst

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Added currencies to Money class

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Make 'er' ending in Ordinals superscript

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add TimeFst tagger comments

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Update headers

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add missing __init__.py file

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* dco fix for Elena's branch

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Fix arg name in docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Update headers

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Update TelephoneFst tagger docstring

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Make ElectronicFst also convert URLs

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix cardinal bug which converted e.g. ,uno to ,1

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Add cache_dir to CI tests

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Install numba=0.54

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Install numba=0.54.0

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Install numba==0.53.1

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix ru -> es typo in CI tests

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

* Fix typo in CI test

Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>

Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Yang Zhang <yangzhang@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: ekmb <ebakhturina@nvidia.com>
2021-09-16 01:01:50 -07:00
Somshubra Majumdar a0dc5b5912
Enforce numba compat (#2823)
* Enforce numba compat

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove all RNNT tests temporarily

Signed-off-by: smajumdar <titu1994@gmail.com>
2021-09-15 14:06:52 -07:00
Somshubra Majumdar a33ec491c2
Temporarily disable numba cuda tests from running (#2820)
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-09-14 13:19:52 -07:00
Yang Zhang f94608ab4d
Tn fix bugs (#2815)
* explicitly set weight to choose deterministic rule, important for SH

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix whitelist test case

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* added more symbols support for itn electronic

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* adding url to itn

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* prevent case where single cardinal, e.g. 4 without suffix is recognized as time

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* date does not accept standalone month anymore

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* style fix

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* add decimalx to measure

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* cardinal times

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* cardinal times

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* add updated en grammars

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix ci

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* comment out tn with audio tests

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Co-authored-by: ekmb <ebakhturina@nvidia.com>
2021-09-14 10:36:27 -07:00
Evelina fb6b3b83b6
non-deterministic norm update (#2787)
* update script for large files

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* write intermediate result to a file

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* file renamed

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* expose n_jobs arg

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* new grammars

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-09-13 18:50:21 -07:00
Somshubra Majumdar 13aef324bb
Update container and Dockerfile (#2799)
* Update container and Dockerfile

Signed-off-by: smajumdar <titu1994@gmail.com>

* Patch dali cpu test

Signed-off-by: smajumdar <titu1994@gmail.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-09-11 12:18:06 -07:00
Micha Livne 59ff2bc53c
Nmt perceiver fix (#2649)
* revert name change

Signed-off-by: ericharper <complex451@gmail.com>

* 1. Added CI for Perceiver and Bridge.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Added CI for VAE and MIM.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Reduced CI model sizes.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Fixed CI to run at most 2 in parallel.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Added missing braces }

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Updated test names.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Debugging.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Debugging.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Debugging.

Signed-off-by: Micha Livne <mlivne@nvidia.com>

* 1. Added CI to Jenkins commented (to be debugged later)

Signed-off-by: Micha Livne <mlivne@nvidia.com>

Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
2021-09-10 17:34:50 -07:00
Evelina 66102b885c
disable_test (#2801)
Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-09 16:42:07 -07:00
Nithin Rao 0aa5b4526a
Move speaker folders (#2777)
* initial push

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

change folder

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

readme

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Create README.md

initial diar readme

scp_manifest

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

rebase and move folders

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

updated scp to manifest script

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

small_fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Update README.md

add recognition README

tutorial update

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

add diarization README

Updated README.md 001

Updated README.md and committing for saving purpose

Update README.md

conf changes

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Update README.md 002

Added examples for input and output.

Added diarization_utils.py and asr_with_diarization.py

Signed-off-by: Taejin Park <tango4j@gmail.com>

slight changes diarization

oracle null and style --fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Reflected LGTM comments.

Signed-off-by: Taejin Park <tango4j@gmail.com>

reflected changes

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

remove duplicate seeds

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Reflected PR review and removed unused variables

Signed-off-by: Taejin Park <tango4j@gmail.com>

Update README.md 003

Added a few titles and revised the descriptions.

Signed-off-by: Taejin Park <tango4j@gmail.com>

scripts and tutorial link fixes

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

LGTM fixes

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Added more docstrings and reused get_DER

Signed-off-by: Taejin Park <tango4j@gmail.com>

style fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update ecapa config

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
2021-09-08 20:58:08 -07:00
Evelina 65bcb68640
fix for HF model restoration (#2772)
* fix for restore

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins path update

Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-03 16:18:56 -06:00
Eric Harper 234e496fcf
Merge final bugfix r1.3.0 (#2749)
* update jenkins branch

Signed-off-by: ericharper <complex451@gmail.com>

* update notebooks branch

Signed-off-by: ericharper <complex451@gmail.com>

* Replaced unfold() with split_view() (#2671)

* Replaced unfold() with split_view()

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* fixed typo

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* Fix issues with ASR notebooks (#2698)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Allow non divisible split_size (#2699)

* bugfix

Signed-off-by: Jason <jasoli@nvidia.com>

* bugfix

Signed-off-by: Jason <jasoli@nvidia.com>

* Fix the feat_out param. (#2714)

* broken link fix (#2720)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* rename (#2721)

Signed-off-by: fayejf <fayejf07@gmail.com>

* apply fix (#2726)

Signed-off-by: Jason <jasoli@nvidia.com>

* [DOCS] Updating adobe and copyright for docs (#2740)

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update notebook branch

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins branch

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins test to use less memory

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins test to use less memory

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
2021-08-31 11:52:03 -06:00
Eric Harper 2ff89fdf56
Merge 1.3 bugfixes into main (#2715)
* update jenkins branch

Signed-off-by: ericharper <complex451@gmail.com>

* update notebooks branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* update nemo version for Dockerfile

Signed-off-by: ericharper <complex451@gmail.com>

* update notebook branch

Signed-off-by: ericharper <complex451@gmail.com>

* Update colab links to Transducer notebooks (#2654)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix nmt grpc server, concatdataset for raw text files (#2656)

* Fix nmt grpc server and concatdataset for raw text files

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Check if lang direction is provided correctly

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* add missing init (#2662)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix qa inference for single example (#2668)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Fix max symbol per step updating for RNNT (#2672)

* Fix max symbol per step updating for RNNT

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix notebooks

Signed-off-by: smajumdar <titu1994@gmail.com>

* Replaced unfold() with split_view() (#2671)

* Replaced unfold() with split_view()

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* fixed typo

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* Correct voice app demo (#2682)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Import guard (#2692)

* add asr and pynini import guard

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove asrmodel type

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove asrmodel type

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fixing branch (#2695)

Signed-off-by: Ghasem Pasandi <gpasandi@nvidia.com>

Co-authored-by: Ghasem Pasandi <gpasandi@nvidia.com>

* fix for emojis (#2675)

* fix for emojis

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove redundant line

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* raise error

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* use app_state

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* Fix issues with ASR notebooks (#2698)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Allow non divisible split_size (#2699)

* bugfix

Signed-off-by: Jason <jasoli@nvidia.com>

* bugfix

Signed-off-by: Jason <jasoli@nvidia.com>

* TN fix for corner cases (#2689)

* serial added, weights to common defaults, decimal bug fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* one failing

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* all tests pass

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove redundant file

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix telephone, add test cases

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* money fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* clean format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix edge case of greedy decoding for greedy_batch mode (#2701)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove time macro (#2703)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Minor FastPitch Fixes (#2697)

* fixes

Signed-off-by: Jason <jasoli@nvidia.com>

* update CI

Signed-off-by: Jason <jasoli@nvidia.com>

* refix

Signed-off-by: Jason <jasoli@nvidia.com>

* Fix ddp error. (#2678)

To avoid "MisconfigurationException: Selected distributed backend ddp is not compatible with an interactive environment." error.

Co-authored-by: ekmb <ebakhturina@nvidia.com>

* update jenkins

Signed-off-by: ericharper <complex451@gmail.com>

* update notebooks

Signed-off-by: ericharper <complex451@gmail.com>

* add split_view back

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Ghasem <35242805+pasandi20@users.noreply.github.com>
Co-authored-by: Ghasem Pasandi <gpasandi@nvidia.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: khcs <khcs@users.noreply.github.com>
Co-authored-by: ekmb <ebakhturina@nvidia.com>
2021-08-24 16:21:59 -06:00
Jason b10bb28c51
Add Small Test (#2711)
* tests

Signed-off-by: Jason <jasoli@nvidia.com>

* update

Signed-off-by: Jason <jasoli@nvidia.com>
2021-08-24 10:38:38 -04:00
Jason 4f2ea4913c
Refactor and Minimize Dependencies (#2643)
* squash

Signed-off-by: Jason <jasoli@nvidia.com>

* add comments

Signed-off-by: Jason <jasoli@nvidia.com>

* style and cleanup

Signed-off-by: Jason <jasoli@nvidia.com>

* cleanup

Signed-off-by: Jason <jasoli@nvidia.com>

* add new test file

Signed-off-by: Jason <jasoli@nvidia.com>

* syntax

Signed-off-by: Jason <jasoli@nvidia.com>

* style

Signed-off-by: Jason <jasoli@nvidia.com>

* typo

Signed-off-by: Jason <jasoli@nvidia.com>

* update

Signed-off-by: Jason <jasoli@nvidia.com>

* update

Signed-off-by: Jason <jasoli@nvidia.com>

* update

Signed-off-by: Jason <jasoli@nvidia.com>

* try again

Signed-off-by: Jason <jasoli@nvidia.com>

* wip

Signed-off-by: Jason <jasoli@nvidia.com>

* style; ci should fail

Signed-off-by: Jason <jasoli@nvidia.com>

* final

Signed-off-by: Jason <jasoli@nvidia.com>
2021-08-17 10:55:43 -04:00
Evelina 36286d04f2
ITN Ru and non-deterministic TN (#2519)
* fix for large cardinals, refactor to use rewrite

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix incorrect test cases

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* ru itn + audio updates wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* wip refactor

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* wip refactor

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* subfolder

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* clean up

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add alternative for one thousand

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add RU TN to audio based

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* test separate TN RU class

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* itn/tn card-or-dec

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* decimal itn update, works

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* tn measure

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* date, electronic, ru-> latin map

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* tn ru electronic

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* move all logic to tagger

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* itn date and electronic update-fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* money class

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* test update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* money update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* money complete

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* merge with main

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* merge with main

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* merge with main

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* merge conflict

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* merge conflict resolved

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* header

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* measure update itn

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* before telephone

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* telephone added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* time added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* date sh

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* date sh pass

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix measure and money for sh

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix sh telephone

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* docstrings

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* delete separate tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* temp time files

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* time wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* time itn fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* adding digit normalization to date

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* all tests pass

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* headers fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* clean up, year corner case added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add whitelist

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* measurement.tsv update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove redundant files

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* files moved, lgtm imports

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* docstrings

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* clean up

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* commenting out ru_normalization_tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* review

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* disable non-deter ru text_norm tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* review

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* enable itn ci tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* enable itn ci tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* enable itn ci

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* enable cache for ru grammars

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add cache to all languages

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* message update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* enable TN/ITN ci tests for *tn* branches

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix import

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* review

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* word correction

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* test case update, header

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* disable itn tests for main

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* lgtm errors

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* enable itn tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove measure from whitelist conversion, enable itn tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update jenkins branch pattern

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update jenkins branch pattern

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update jenkins branch pattern, CPU tests are off

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update jenkins branch pattern, CPU tests are off

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert to main for all tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert to main for all tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert to main for all tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* temp

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add .fst files to setup

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add missing init

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* test ci time

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert uncommented tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Co-authored-by: Eric Harper <complex451@gmail.com>
2021-08-12 22:00:35 -07:00
Eric Harper 7d6ceeb726
revert name change (#2647)
Signed-off-by: ericharper <complex451@gmail.com>
2021-08-12 14:21:37 -06:00
Jason 1a57cec1dd
Update TTS CI Tests (#2639)
* update ci tests

Signed-off-by: Jason <jasoli@nvidia.com>

* typos

Signed-off-by: Jason <jasoli@nvidia.com>

* update

Signed-off-by: Jason <jasoli@nvidia.com>
2021-08-12 09:34:40 -07:00
Evelina 91ac90bb46
TN update (#2612)
* extend tn grammars for nn

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* debug money

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix measure/date

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* url update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* url update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* clean up

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix test cases

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* test remove dash

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* test format fixed

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove unrelated minor currencies

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* money works

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove name tag

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix some tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* default decimal format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* wip money test fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* debug tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* cardinal default value fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* reorder date in tagger

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* money fix corner cases

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* additional date format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* wip

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* all test pass

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* clean up

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* all days added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add .far file to git

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* set cache default to True

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* move generator() to utils

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* enable itn tests on ci

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* reload .far file

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* review, remove .far, update jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove whitelist words from abbreviation

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update test based on abbreviation class changes

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* create .far files first and then run tests, add cache_dir arg for all tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* style

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* debug ci

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add missing __init__ to German itn

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove unused import

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* tests fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* restart ci

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update folder names

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
2021-08-11 21:13:39 -07:00
Somshubra Majumdar f092c7f656
Make ITN tests optional (run only on change) (#2611)
* Make ITN tests optional run only on change

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert Jenkinsfile for RNNT test

Signed-off-by: smajumdar <titu1994@gmail.com>
2021-08-04 10:48:33 -07:00
Somshubra Majumdar 7051487c7e
Update contextnet configs (#2601)
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-08-03 15:01:04 -07:00
Somshubra Majumdar d04c7e9b4e
Integrate NVIDIA DALI 1.4 to NeMo ASR (#2567)
* Initial prototype of ASR DALI integration with DALI 1.4

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update dali support to 1.4

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix docs

Signed-off-by: smajumdar <titu1994@gmail.com>

* Address comments

Signed-off-by: smajumdar <titu1994@gmail.com>

* Apply suggestions from code review

Co-authored-by: Janusz Lisiecki <39967756+JanuszL@users.noreply.github.com>

* Address comments

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct module utils

Signed-off-by: smajumdar <titu1994@gmail.com>

Co-authored-by: Janusz Lisiecki <39967756+JanuszL@users.noreply.github.com>
2021-08-03 11:01:51 -07:00
Eric Harper c527e954b6
Update container version to 21.06 (#2431)
* update container version

Signed-off-by: ericharper <complex451@gmail.com>

* remove conda update from reinstall.sh

Signed-off-by: ericharper <complex451@gmail.com>

* pin numba in reinstall

Signed-off-by: ericharper <complex451@gmail.com>
2021-07-16 11:51:53 -07:00
Jason d445f44dfd
Add copyright headers check (#2490)
* add cpr hdrs

Signed-off-by: Jason <jasoli@nvidia.com>

* update

Signed-off-by: Jason <jasoli@nvidia.com>

* review

Signed-off-by: Jason <jasoli@nvidia.com>
2021-07-15 15:06:45 -04:00
Yang Zhang ed085459c9
refactor text processing ONly code to allow other languages (#2477)
* refactor text processing ONly code to allow other languages

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* refactored test folder structure to divide between languages

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* updated docs

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* add missing file

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix lgtm

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Co-authored-by: ekmb <ebakhturina@nvidia.com>
2021-07-14 08:17:22 -07:00
Yang Zhang ca0918b053
remove failing tests (#2478)
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
2021-07-13 18:00:46 -07:00
Eric Harper c5dbf4508a
Merge r1.1 bugfixes to main. Update dep versions. (#2437)
* Update notebook branch and Jenkinsfile for 1.1.0 testing (#2378)

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* [BUGFIX] NMT Multi-node was incorrectly computing num_replicas (#2380)

* fix property when not using model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* fix property when not using model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* add debug statement

Signed-off-by: ericharper <complex451@gmail.com>

* add debug statement

Signed-off-by: ericharper <complex451@gmail.com>

* instantiate with NLPDDPPlugin with num_nodes from trainer config

Signed-off-by: ericharper <complex451@gmail.com>

* Update ASR scripts for tokenizer building and tarred dataset building (#2381)

* Update ASR scripts for tokenizer building and tarred dataset building

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update container

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add STT Zh Citrinet 1024 Gamma 0.25 model

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update notebook (#2391)

Signed-off-by: smajumdar <titu1994@gmail.com>

* ASR Notebooks fix for 1.1.0 (#2395)

* nb fix for spring clean

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove outdated instruction

Signed-off-by: fayejf <fayejf07@gmail.com>

* Mean normalization (#2397)

* norm embeddings

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* move to utils

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Bugfix adaptive spec augment time masking (#2398)

* bugfix adaptive spec augment

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert freq mask guard

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert freq mask guard

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove static time width clamping

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct typos and issues with notebooks (#2402)

* Fix Primer notebook

Signed-off-by: smajumdar <titu1994@gmail.com>

* Typo

Signed-off-by: smajumdar <titu1994@gmail.com>

* remove accelerator=DDP in tutorial notebooks to avoid errors. (#2403)

Signed-off-by: Hoo Chang Shin <hshin@nvidia.com>

Co-authored-by: Hoo Chang Shin <hshin@nvidia.com>

* [BUGFIX] Megatron in NMT was setting vocab_file to None (#2417)

* make vocab_file configurable for megatron in nmt

Signed-off-by: ericharper <complex451@gmail.com>

* update docs

Signed-off-by: ericharper <complex451@gmail.com>

* update docs

Signed-off-by: ericharper <complex451@gmail.com>

* Link updates in docs and notebooks and typo fix (#2416)

* typo fix for notebooks

Signed-off-by: fayejf <fayejf07@gmail.com>

* tiny typo fix in docs

Signed-off-by: fayejf <fayejf07@gmail.com>

* docs branch->stable

Signed-off-by: fayejf <fayejf07@gmail.com>

* more docs branch -> stable

Signed-off-by: fayejf <fayejf07@gmail.com>

* tutorial links branch -> stable

Signed-off-by: fayejf <fayejf07@gmail.com>

* small fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* add renamed 06

Signed-off-by: fayejf <fayejf07@gmail.com>

* more fixes

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update onnx (#2420)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct version of onnxruntime (#2422)

Signed-off-by: smajumdar <titu1994@gmail.com>

* update deployment instructions (#2430)

Signed-off-by: ericharper <complex451@gmail.com>

* Bumping version to 1.1.0

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* update jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* add upper bounds

Signed-off-by: ericharper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* update requirements

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* update version

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: khcs <khcs@users.noreply.github.com>
Co-authored-by: Hoo Chang Shin <hshin@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-07-02 14:22:44 -07:00
Eric Harper 5fcfa9e7bf
Merge r1.1 bugfixes into main (#2407)
* Update notebook branch and Jenkinsfile for 1.1.0 testing (#2378)

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* [BUGFIX] NMT Multi-node was incorrectly computing num_replicas (#2380)

* fix property when not using model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* fix property when not using model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* add debug statement

Signed-off-by: ericharper <complex451@gmail.com>

* add debug statement

Signed-off-by: ericharper <complex451@gmail.com>

* instantiate with NLPDDPPlugin with num_nodes from trainer config

Signed-off-by: ericharper <complex451@gmail.com>

* Update ASR scripts for tokenizer building and tarred dataset building (#2381)

* Update ASR scripts for tokenizer building and tarred dataset building

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update container

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add STT Zh Citrinet 1024 Gamma 0.25 model

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update notebook (#2391)

Signed-off-by: smajumdar <titu1994@gmail.com>

* ASR Notebooks fix for 1.1.0 (#2395)

* nb fix for spring clean

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove outdated instruction

Signed-off-by: fayejf <fayejf07@gmail.com>

* Mean normalization (#2397)

* norm embeddings

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* move to utils

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Bugfix adaptive spec augment time masking (#2398)

* bugfix adaptive spec augment

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert freq mask guard

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert freq mask guard

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove static time width clamping

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct typos and issues with notebooks (#2402)

* Fix Primer notebook

Signed-off-by: smajumdar <titu1994@gmail.com>

* Typo

Signed-off-by: smajumdar <titu1994@gmail.com>

* remove accelerator=DDP in tutorial notebooks to avoid errors. (#2403)

Signed-off-by: Hoo Chang Shin <hshin@nvidia.com>

Co-authored-by: Hoo Chang Shin <hshin@nvidia.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins branch

Signed-off-by: ericharper <complex451@gmail.com>

* update notebook branch to main

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: khcs <khcs@users.noreply.github.com>
Co-authored-by: Hoo Chang Shin <hshin@nvidia.com>
2021-06-25 12:04:57 -06:00
Eric Harper 5cfff20988
[NMT] Model Parallel Megatron Encoders (#2238)
* add megatron encoder

Signed-off-by: ericharper <complex451@gmail.com>

* added megatron to get_nmt_tokenizer

Signed-off-by: ericharper <complex451@gmail.com>

* add vocab_size and hidden_size to megatron bert

Signed-off-by: ericharper <complex451@gmail.com>

* add megatron encoder module

Signed-off-by: ericharper <complex451@gmail.com>

* fixed horrible typo

Signed-off-by: ericharper <complex451@gmail.com>

* fix typo and add default

Signed-off-by: ericharper <complex451@gmail.com>

* updating nlp overrides for mp nmt

Signed-off-by: ericharper <complex451@gmail.com>

* move some logic back to nlpmodel from overrides

Signed-off-by: ericharper <complex451@gmail.com>

* add checkpoint_file property

Signed-off-by: ericharper <complex451@gmail.com>

* fix property

Signed-off-by: ericharper <complex451@gmail.com>

* num_tokentypes=0

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* find_unused_parameters=True

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* get instead of pop

Signed-off-by: ericharper <complex451@gmail.com>

* remove token type ids from megatron input example

Signed-off-by: ericharper <complex451@gmail.com>

* pop vocab_size

Signed-off-by: ericharper <complex451@gmail.com>

* fix checkpointing for model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* fix bug in non model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* convert cfg.trainer to dict

Signed-off-by: ericharper <complex451@gmail.com>

* make num_tokentypes configurable for nmt

Signed-off-by: ericharper <complex451@gmail.com>

* update checkpoint_file when using named megatron model in nemo

Signed-off-by: ericharper <complex451@gmail.com>

* make vocab_file configurable

Signed-off-by: ericharper <complex451@gmail.com>

* dataclass can't have mutable default

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* unused imports

Signed-off-by: ericharper <complex451@gmail.com>

* revert input example

Signed-off-by: ericharper <complex451@gmail.com>

* check that checkpoint version is not None

Signed-off-by: ericharper <complex451@gmail.com>

* add mp jenkins test

Signed-off-by: ericharper <complex451@gmail.com>

* update docstring

Signed-off-by: ericharper <complex451@gmail.com>

* add docs for pretrained encoders with nemo nmt

Signed-off-by: ericharper <complex451@gmail.com>
2021-06-16 20:32:33 -06:00
Somshubra Majumdar 3e94696e21
Update container version to 21.05 (#2309)
* Update container version

Signed-off-by: smajumdar <titu1994@gmail.com>

* Temporarily change export format of waveglow

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add conda update for numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct order of numba minimum version, remove wrong flag from test

Signed-off-by: smajumdar <titu1994@gmail.com>

* Double test of cuda numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Double test of cuda numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Enable RNNT tests

Signed-off-by: smajumdar <titu1994@gmail.com>
2021-06-14 17:39:45 -06:00
Eric Harper 880455a267
update out_dir to not collide (#2358)
Signed-off-by: ericharper <complex451@gmail.com>
2021-06-14 16:39:01 -06:00
Evelina 617e856965
Audio Norm (#2285)
* add jenkins test, refactoring

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update test

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix new test

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add serial to the default normalizer, add tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* manifest test added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* expose more params, new test cases

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix jenkins, serial clean, exclude range from cardinal

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins dollar sign format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins dollar sign format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* addressed review comments

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix decimal in measure

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* move serial in cardinal

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* clean up

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update for SH zero -> oh

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* change n_tagger default

Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-06-08 15:08:35 -07:00
Vitaly Lavrukhin f0ba4e1289
Added JSON manifest support to transcribe_speech.py (#2304)
* Added JSON manifest support to transcribe_speech.py

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>

* Dropped unused import

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
2021-06-04 08:39:38 -07:00
Oleksii Kuchaiev 4d4f3ebfb8 Merge tag 'v1.0.0' into main
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-06-03 15:49:50 -07:00
Joaquin Anton 011b2d6be3
Fix and enable DALI tests (#2077)
* Fix and enable DALI tests

Signed-off-by: Joaquin Anton <janton@nvidia.com>

* remove unused import

Signed-off-by: Joaquin Anton <janton@nvidia.com>

* Move DALI tests to a separate Jenkins stage

Signed-off-by: Joaquin Anton <janton@nvidia.com>

* Remove DALI tests from the main jenkins ASR stage

Signed-off-by: Joaquin Anton <janton@nvidia.com>

* Comment out MFCC test

Signed-off-by: Joaquin Anton <janton@nvidia.com>

* Working version

Signed-off-by: Joaquin Anton <janton@nvidia.com>
2021-06-01 09:07:47 -07:00
Jason 79d1dea84b
TTS Doc Fix and Remove TTS Test (#2272)
* bug fix and remove test

Signed-off-by: Jason <jasoli@nvidia.com>

* syntax

Signed-off-by: Jason <jasoli@nvidia.com>

* syntax

Signed-off-by: Jason <jasoli@nvidia.com>

* syntax

Signed-off-by: Jason <jasoli@nvidia.com>
2021-05-27 14:46:31 -04:00
Oleksii Kuchaiev 3d92dcf5ec Merge branch 'v1.0.0' into main 2021-05-12 14:14:20 -07:00
Sandeep Subramanian d84e31790e
Add a CI test for doing inference with an NMT model trained with Pre-LN (#2198)
* Change label smoothing prob to reduce chance of test failure

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add Pre-LN inference test to Jenkinsfile

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Separate tests for training and NMT inference

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-05-11 22:38:41 -07:00
Oleksii Kuchaiev d0331fd520 Merge branch 'v1.0.0' into main
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-05-10 10:47:17 -07:00
Evelina 6ad313d4f3
token classification models artifacts update (#2169)
* artifacts update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* artifacts update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix for model restoration

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* typos fix + jenkins dir update

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins branch

Signed-off-by: ericharper <complex451@gmail.com>

* add &&

Signed-off-by: ericharper <complex451@gmail.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins disable

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins disable

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Co-authored-by: ericharper <complex451@gmail.com>
2021-05-07 10:03:33 -06:00