Samuel Kriman
b7a175b7b9
Self-supervised pre-training for speech models (#3139)
* self-supervised training
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* test
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* remove imports
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* fix
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* sort imports
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* fix audio_to_text
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* manifest handle no text
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* loss init
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* style
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* remove tokenizer from config
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* config changes
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* remove hydra import
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* always spec augment
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* fixes
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* copyright
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* fix cosine sim
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* fix cosine sim
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* fix cosine sim
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* changes based on comments
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* changes based on comments
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* configs
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* name fix
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* ci config changes
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* renamed to num_negatives
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* minor changes
Signed-off-by: sam1373 <samuelkriman@gmail.com>
* name changes, type annotations
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
2021-11-10 15:33:11 -08:00
bene-ges
9ab22a40bd
fixed two typos (#3157)
Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>
Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>
2021-11-10 07:41:39 -08:00
Vahid Noroozi
cfcf694e30
Adding parallel transcribe for ASR models - supports multi-gpu/multi-node (#3017)
* added transcribe_speech_parallel.py.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fixed style.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* removed comments.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fixed bug.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fixed bug.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* added comments inside the script.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fixed speed_collate_fn for TTS.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fixed speed_collate_fn for TTS.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fixed speed_collate_fn for TTS.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fixed speed_collate_fn for TTS.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* added return_sample_id.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* added return_sample_id.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* merged dataset configs.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* merged dataset configs.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* dropped sample_ids from train and validation.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* dropped sample_ids from train and validation.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* reverted tts patches.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* reverted tts patches.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fixed the default values in the dataset's config
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fixed the default values in the dataset's config
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* Fixed the bug for optional outputs.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* Fixed the bug for optional outputs.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* addressed some comments.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* disabled dali support for return_sample_id.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* disabled dali support for return_sample_id.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* converted the config to omegaconf.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* converted the config to omegaconf.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* converted the config to omegaconf.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* moved wer/cer calculation to the end.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* moved wer/cer calculation to the end.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* moved wer/cer calculation to the end.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* moved wer/cer calculation to the end.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* moved wer/cer calculation to the end.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* calculates global wer instead of per sample.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* calculates global wer instead of per sample.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* calculates global wer instead of per sample.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
2021-11-10 00:37:19 -08:00
Satpal Singh Rathore
b12ac8ae85
Typo correction in README.rst (#3103)
Signed-off-by: Satpal Singh Rathore <satpalsinghrathore001@gmail.com>
2021-11-09 21:23:38 -08:00
tbartley94
1106ff93c0
WFST_tutorial for ITN development (#3128)
* Pushing WFST_tutorial for open draft. (Still need to review Colab code.)
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Checked that the tutorial code for WFST_Tutorial is functioning properly. Also included some formatting edits.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Responding to editorial comments for WFST_tutorial
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Added images to folder and wrote README for tutorials
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Few more editorial changes to explain permutations in classification.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Updated tutorials documentation page.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Forgot links for README
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* TOC links were dead
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* More dead links to fix.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* removing Colab install and appending a warning instead.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Update WFST_Tutorial.ipynb
Signed-off-by: tbartley94 <tbartley@nvidia.com>
2021-11-09 12:18:19 -08:00
Nithin Rao
dc9ed88f78
Modify speaker input (#3100)
* initial_commit
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* init diarizer
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* vad+speaker
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* vad update
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* speaker done
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* initial working version
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* compare outputs
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* added uem support
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* pyannote improvements
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* updated config and script name
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* style fix
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* update Jenkins file
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* jenkins fix
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* jenkins fix
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* update file path in jenkins
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* update file path in jenkins
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* update file path in jenkins
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* jenkins quote fix
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* update offline speaker diarization notebook
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* initial working asr_with_diarization
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* almost done, revisit scoring part
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* fixed eval in offline diarization with asr
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* update write2manifest to consider only up to max audio duration
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* asr with diarization notebook
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* Fixed ASR_with_diarization tutorial.ipynb and diarization_utils and edited config yaml file
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Fixed VAD parameters in Speaker_Diarization_Inference.ipynb
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Added Jenkins test, doc strings and updated README
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* update jenkins test
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* Doc info in offline_diarization_with_asr
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* Review comments
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* update outdir paths
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Co-authored-by: Taejin Park <tango4j@gmail.com>
2021-11-06 10:55:32 -04:00
bugface
9ec02280d0
A quick fix for issue #3094: index out-of-bounds when truncating long text to max_seq_length (#3131)
2021-11-05 18:50:10 -07:00
Yang Zhang
875f54464a
update english tn ckpt (#3143)
* update english tn ckpt
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* remove unused import
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
2021-11-05 11:24:47 -07:00
Somshubra Majumdar
a6834f8b6c
Add logging to LS script (#3141)
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-11-04 16:03:58 -07:00
Yang Zhang
3fe7308a37
Tn add nn wfst and doc (#3135)
* made tagger exportable
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* added whitelist wfst for nn
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* updated documentation
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* remove experimental
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* updated doc
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* made tagger exportable
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* added whitelist wfst for nn
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* updated documentation
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* remove experimental
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* updated doc
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* preserve punct after nn wfst
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
2021-11-04 11:48:26 -07:00
Eric Harper
aaacc4b089
Merge r1.5.0 bugfixes and doc updates to main (#3133)
* update branch
Signed-off-by: ericharper <complex451@gmail.com>
* Always save last checkpoint on train end even if folder does not exist (#2976)
* add fix for no checkpoint folder when training ends
Signed-off-by: Jason <jasoli@nvidia.com>
* update
Signed-off-by: Jason <jasoli@nvidia.com>
* fix test
Signed-off-by: Jason <jasoli@nvidia.com>
* fixes
Signed-off-by: Jason <jasoli@nvidia.com>
* typo
Signed-off-by: Jason <jasoli@nvidia.com>
* change check
Signed-off-by: Jason <jasoli@nvidia.com>
* [NLP] Add Apex import guard (#3041)
* add apex import guard
Signed-off-by: ericharper <complex451@gmail.com>
* add apex import guard
Signed-off-by: ericharper <complex451@gmail.com>
* add apex import guard
Signed-off-by: ericharper <complex451@gmail.com>
* style
Signed-off-by: ericharper <complex451@gmail.com>
* remove from init add logging to constructor
Signed-off-by: ericharper <complex451@gmail.com>
* remove from init add logging to constructor
Signed-off-by: ericharper <complex451@gmail.com>
* remove import from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert encoder logic from NLPModel
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* style
Signed-off-by: ericharper <complex451@gmail.com>
* Exp manager small refactor (#3067)
* Exp manager small refactor
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* move super() call earlier in the function
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
* Change container (#3087)
Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
* Training of a machine translation model fails if config parameter `trainer.max_epochs` is used instead of `trainer.max_steps`. (#3112)
* fix: replace distributed_backend with accelerator
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add debug script
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Remove debug script
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* update (#3113)
Signed-off-by: Jason <jasoli@nvidia.com>
* Fix: punctuation capitalization inference on short queries (#3111)
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Eric Harper <complex451@gmail.com>
* Multiple ASR Fixes to SPE tokenization (#3119)
* Reduce num workers for transcribe
Signed-off-by: smajumdar <titu1994@gmail.com>
* Fix SPE tokenizer vocabulary construction
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update tokenizer building script
Signed-off-by: smajumdar <titu1994@gmail.com>
* Remove logs
Signed-off-by: smajumdar <titu1994@gmail.com>
* Megatron GPT training in BCP (#3095)
* BCP megatron training
Signed-off-by: madhukar <madhukar@penguin>
* Add quotes
Signed-off-by: madhukar <madhukar@penguin>
* Style fix
Signed-off-by: madhukar <madhukar@penguin>
Co-authored-by: madhukar <madhukar@penguin>
* Upgrade to PTL 1.5.0 (#3127)
* update for ptl 1.5.0
Signed-off-by: ericharper <complex451@gmail.com>
* update trainer config
Signed-off-by: ericharper <complex451@gmail.com>
* limit cuda visible devices to the first two gpus on check for ranks CI test
Signed-off-by: ericharper <complex451@gmail.com>
* remove comments
Signed-off-by: ericharper <complex451@gmail.com>
* make datasets larger for test
Signed-off-by: ericharper <complex451@gmail.com>
* make datasets larger for test
Signed-off-by: ericharper <complex451@gmail.com>
* update compute_max_steps
Signed-off-by: ericharper <complex451@gmail.com>
* update compute_max_steps
Signed-off-by: ericharper <complex451@gmail.com>
* update package info
Signed-off-by: ericharper <complex451@gmail.com>
* remove duplicate code
Signed-off-by: ericharper <complex451@gmail.com>
* remove comment
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Madhukar K <26607911+madhukarkm@users.noreply.github.com>
Co-authored-by: madhukar <madhukar@penguin>
2021-11-04 10:26:58 -06:00
Yang Zhang
663c76a972
Tn clean upsample (#3024)
* init
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* renamed file
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* adding all cleaning scripts
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* skip sentence if error
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* remove I-SAME
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* fix style
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* remove I the first from training
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* remove DM and Da from upsampling
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* remove I -> one/first, also add space around dash for alphanumeric context, remove rare currency from being upsampled
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* remove dalton and DM from being verbalized
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* remove Da and DM sentences completely
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* addressed review feedback, added data folder in examples
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* refactored code, added data utils functions
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* fix lgtm
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* fix lgtm
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* added electronic wfst for english neural TN
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* header and lgtm
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
2021-11-02 10:26:17 -07:00
Oktai Tatanov
d5bd93675b
add export methods for mixer-tts (#3082)
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Co-authored-by: Jason <jasoli@nvidia.com>
2021-11-01 18:56:14 +03:00
PeganovAnton
25c61f2e6f
Attribute batch_dim_index is not working in WER. (#3099)
* Use batch_dim_index attribute
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add debug print
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix batch_dim_index bug in ctc_decoder
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add test for CTC decoder
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Simplify testing
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Simplify testing
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add debug print
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix missing transpose
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Make batch_dim_index fixes in WER BPE
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Make batch_dim_index fixes in WER BPE
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add tests for WER BPE
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add tests for WER BPE
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix batch_dim_index bug in rnnt_wer and add tests
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix batch_dim_index bug in rnnt_wer and add tests
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* fix: add missing return value to mock
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* fix: add missing return value to mock
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add debug prints
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add missing mock method
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
2021-10-31 10:06:58 -07:00
Eric Harper
574b1014fd
Merge r1.5.0 bugfixes and doc updates to main (#3093)
* update branch
Signed-off-by: ericharper <complex451@gmail.com>
* Fix quantization bug in ASR (#3062)
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update reinstall and cherry-pick bignlp commits (#3065)
* add ptl install to reinstall and update jenkins install
Signed-off-by: ericharper <complex451@gmail.com>
* Add a stateless timer to specify max_time per run instead of global max_time across runs (#3056)
* Add a stateless timer to specify max_time per run instead of global max_time across runs
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Style fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* (1) reduce the validation loss within an epoch, (2) convert global-batch-based iteration counts to micro-batch-based (#3055)
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
* Timer class monitors total time (train + validation + testing) to determine when to end training (#3061)
* Check total time in train/validation to exit
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Style fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
* Add PUBLICATIONS.md (#3051)
* Add PUBLICATIONS.md
Signed-off-by: smajumdar <titu1994@gmail.com>
* Add NLP
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update PUBLICATIONS.md
* Update PUBLICATIONS.md
* Fix links
Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
* fix uninstall
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
* fix File Load Error (#3069)
Signed-off-by: fayejf <fayejf07@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
* Update hyperparameter saving (#3058)
Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
* Exp manager small refactor (#3067)
* Exp manager small refactor
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* move super() call earlier in the function
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
* Fix FastPitch Pitch Duration Notebook (#3068)
* bugfix
Signed-off-by: Jason <jasoli@nvidia.com>
* bugfix2
Signed-off-by: Jason <jasoli@nvidia.com>
* better check
Signed-off-by: Jason <jasoli@nvidia.com>
* confusionmatrix (#3085)
Signed-off-by: fayejf <fayejf07@gmail.com>
* typo and fix link (#3086)
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* inf cross-checking across tensor-parallel ranks (#3088)
* inf cross-checking across tensor-parallel ranks
* style
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
* Fix save top k (#3075)
* inject mp_rank for checkpoint paths in NLPDDPPlugin
Signed-off-by: ericharper <complex451@gmail.com>
* == instead of i
Signed-off-by: ericharper <complex451@gmail.com>
* when checking previous run account for mp
Signed-off-by: ericharper <complex451@gmail.com>
* uninject mp ranks when needed
Signed-off-by: ericharper <complex451@gmail.com>
* style
Signed-off-by: ericharper <complex451@gmail.com>
* update branch
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
2021-10-29 20:15:37 -06:00
Oktai Tatanov
648c97f076
add sequence and t0 -> t9 to axes.py (#3090)
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
2021-10-29 08:43:27 -07:00
Oktai Tatanov
9d798cf371
Move vocabs from asr to common (#3084)
* move vocabs to common
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* fix style
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* move vocab import to the function scope
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
2021-10-29 12:06:10 +03:00
Yang Zhang
42b167ee2e
Hg cache (#3080)
* add cache for huggingface
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* change cache location
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
2021-10-28 18:04:45 -06:00
Micha Livne
51253e20be
Fixes bugs in collect_tokenizer_dataset_stats.py (#3060)
* 1. Fixed a bug when reading small files.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added support for YTTM that cannot be pickled.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Removed unused import.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
2021-10-28 16:04:06 -04:00
Abhinav Khattar
ece44ff0c6
transformer decoder name fix (#3066)
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
2021-10-27 21:43:15 -04:00
Eric Harper
1c2c268db1
fix readme (#3070)
Signed-off-by: ericharper <complex451@gmail.com>
2021-10-27 16:23:24 -06:00
Eric Harper
8aac7d9645
Merge r1.5.0 bugfixes and doc updates to main (#3048)
* update branch
Signed-off-by: ericharper <complex451@gmail.com>
* Always save last checkpoint on train end even if folder does not exist (#2976)
* add fix for no checkpoint folder when training ends
Signed-off-by: Jason <jasoli@nvidia.com>
* update
Signed-off-by: Jason <jasoli@nvidia.com>
* fix test
Signed-off-by: Jason <jasoli@nvidia.com>
* fixes
Signed-off-by: Jason <jasoli@nvidia.com>
* typo
Signed-off-by: Jason <jasoli@nvidia.com>
* change check
Signed-off-by: Jason <jasoli@nvidia.com>
* [NLP] Add Apex import guard (#3041)
* add apex import guard
Signed-off-by: ericharper <complex451@gmail.com>
* add apex import guard
Signed-off-by: ericharper <complex451@gmail.com>
* add apex import guard
Signed-off-by: ericharper <complex451@gmail.com>
* style
Signed-off-by: ericharper <complex451@gmail.com>
* remove from init add logging to constructor
Signed-off-by: ericharper <complex451@gmail.com>
* remove from init add logging to constructor
Signed-off-by: ericharper <complex451@gmail.com>
* remove import from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert encoder logic from NLPModel
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* remove megatron bert from init
Signed-off-by: ericharper <complex451@gmail.com>
* style
Signed-off-by: ericharper <complex451@gmail.com>
* update branch
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Jason <jasoli@nvidia.com>
2021-10-27 12:13:29 -06:00
Somshubra Majumdar
f8d8d069e5
Add PUBLICATIONS.md (#3051)
* Add PUBLICATIONS.md
Signed-off-by: smajumdar <titu1994@gmail.com>
* Add NLP
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update PUBLICATIONS.md
* Update PUBLICATIONS.md
* Fix links
Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
2021-10-27 09:48:16 -07:00
PeganovAnton
d22cf7643c
Add new CharTokenizer (#2963)
* Add new CharTokenizer
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com>
2021-10-27 18:14:45 +03:00
Sandeep Subramanian
eb9117dcd7
Timer class monitors total time (train + validation + testing) to determine when to end training (#3061)
* Check total time in train/validation to exit
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Style fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-10-26 17:55:06 -07:00
Dong Hyuk Chang
a1c15e72fc
Update nltk version with a CVE fix (#3054)
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-10-26 17:03:08 -07:00
Sangkug Lym
e3ccd0a90d
(1) reduce the validation loss within an epoch, (2) convert global-batch-based iteration counts to micro-batch-based (#3055)
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-10-26 11:05:27 -07:00
Sandeep Subramanian
0d0de9b367
Add a stateless timer to specify max_time per run instead of global max_time across runs (#3056)
* Add a stateless timer to specify max_time per run instead of global max_time across runs
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Style fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
2021-10-26 09:05:46 -07:00
Micha Livne
e05712fea2
1. Fixed wrong tgt_length for timing (#3050)
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
2021-10-25 18:52:20 -04:00
Oktai Tatanov
2e5e4d7613
MixerTTS, MixerTTSDataset and small updates in tts tokenizers (#2859)
* new vocabs and g2ps for tts
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* fix style
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update tts torch data
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update g2p modules, data and add example for tts vocabs
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* fix style
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update tts dataset
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add tokens field to tts dataset
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update tts dataset
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add TTSDataset and docs for all of them
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* fix paths in yaml
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update test for tts dataset
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add heteronyms-030921 file to scripts folder
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* change requirements_torch_tts.txt
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add tts_data_types.py
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* fix style tts_data_types.py
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update yaml and comments
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add mixer tts dataset
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add example.py to tts torch
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update helpers.py
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add mixer tts model and ds
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update mixer tts dataset
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update tokenizers
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update tts tokenizer
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update tts dataset and mixer tts model
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update tts_dataset.yaml
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add copyright header to mixer_tts file
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update import in tts/torch/data.py
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add fastpitch in mixer tts code
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add raw version of benchmark
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* remove without matching mode
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* remove unnecessary flags in model
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* refactor mixer tts
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* remove unnecessary blocks in mixer tts model
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update tts_tokenizers.py
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add mixer tts x config and run script
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* fix elif and unnecessary file
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add types for mixer tts methods
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Co-authored-by: Jason <jasoli@nvidia.com>
2021-10-25 17:50:48 +03:00
Somshubra Majumdar
2439704860
Enable stateful decoding of RNNT over multiple transcribe calls (#3037)
* Start on stateful external decoding
Signed-off-by: smajumdar <titu1994@gmail.com>
* Prepare connectors
Signed-off-by: smajumdar <titu1994@gmail.com>
* Refactor greedy sample decoding to use Hypothesis
Signed-off-by: smajumdar <titu1994@gmail.com>
* Refactor greedy batch first mode for Hypothesis
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update second case of greedy batch decoding
Signed-off-by: smajumdar <titu1994@gmail.com>
* Start stateful decoding
Signed-off-by: smajumdar <titu1994@gmail.com>
* Add guards for stateful decoding
Signed-off-by: smajumdar <titu1994@gmail.com>
* Fix state management when no states is provided
Signed-off-by: smajumdar <titu1994@gmail.com>
* Create
Signed-off-by: smajumdar <titu1994@gmail.com>
* Correct logging
Signed-off-by: smajumdar <titu1994@gmail.com>
* Begin support for stateful beam decoding
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update streaming utils with method 2
Signed-off-by: smajumdar <titu1994@gmail.com>
* Initiate stateful beam implementation
Signed-off-by: smajumdar <titu1994@gmail.com>
* Reset changes
Signed-off-by: smajumdar <titu1994@gmail.com>
* Fix style
Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Jagadeesh Balam <4916480+jbalam-nv@users.noreply.github.com>
2021-10-24 22:58:10 -07:00
Taejin Park
9405273498
Add new features to ASR with diarization with modified tutorial and README. (#3007)
* Added buffer-based ASR and added Conformer CTC model
Signed-off-by: Taejin Park <tango4j@gmail.com>
* SAD: fixing word-timestamp for both QuartzNet and ConformerCTC
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Added buffer-based ASR and added Conformer CTC model
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Added ConformerCTC support and VAD word-timestamp fix
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Changed all SAD to VAD
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Style fix and the list of available ASR Models
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Modified ASR_with_SpeakerDiarization.ipynb
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Modified ASR_with_SpeakerDiarization.ipynb
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Removed unnecessary comments
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Reflected PR review comments.
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Reflected PR review comments and fixed typos.
Signed-off-by: Taejin Park <tango4j@gmail.com>
* asr_with_diarization.py reformatting issue
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Reformatted the style fix error.
Signed-off-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
2021-10-22 17:44:36 -07:00
Micha Livne
4452162b93
NMT timing and tokenizer stats utils (#3004)
* 1. Fixed logging of log_var_q_z_given_x.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated timing of eval_step.
2. Added caching of encoder values when sampling.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed using cached values in eval_step.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated timing names to include "_timing"
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added support for sliding mean.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added the tokenizer stats script.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Moved duplicated code into a method.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added logging of detailed NMT timing for non-bottleneck and bottleneck models.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added logging of both src and tgt length when measuring timing.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added plotting script.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added usage.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added missing copyright.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added post-processing for timing data to be collected from a list of dicts into a dict of lists.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added warmup batch when measuring timing.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated plot script.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed a typo.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added control over plotting.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added control over font size.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added control over tick font size.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Moved scripts.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
2021-10-22 13:51:04 -04:00
Somshubra Majumdar
1f36f32ee6
Remove STFT checks due to min PT version of 1.10 (#3034)
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-10-21 15:29:21 -07:00
Eric Harper
80fd431066
[BigNLP] Remove fused kernel files (#3033)
* remove fused kernel files
Signed-off-by: ericharper <complex451@gmail.com>
* remove fused kernel files
Signed-off-by: ericharper <complex451@gmail.com>
2021-10-21 12:53:15 -06:00
Somshubra Majumdar
9f99918974
Add Transducer documentation (#3015)
* Add RNNT documentation
Signed-off-by: smajumdar <titu1994@gmail.com>
* Revert unnecessary changes
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update docs for RNNT
Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-10-21 10:40:11 -07:00
Somshubra Majumdar
4e544676f2
Update the webapp for ASR (#3032)
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-10-21 10:30:45 -07:00
Eric Harper
32fa5cfaf3
[BigNLP] Merge Megatron GPT to main (#2975)
* fix gpu init after removing debug print in mpu
Signed-off-by: ericharper <complex451@gmail.com>
* add fused_adam
Signed-off-by: ericharper <complex451@gmail.com>
* check ds is not none before logging len
Signed-off-by: ericharper <complex451@gmail.com>
* set fp16 arg to true and fix enum conflict
Signed-off-by: ericharper <complex451@gmail.com>
* make fp16 arg configurable
Signed-off-by: ericharper <complex451@gmail.com>
* add grad clip from megatron
Signed-off-by: ericharper <complex451@gmail.com>
* Linear warmup with cosine annealing and constant holding (#2846)
* Testing cosine schedule
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Style fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* More fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* update config for constant steps in schedule
Signed-off-by: ericharper <complex451@gmail.com>
* temporarily import enum from megatron
Signed-off-by: ericharper <complex451@gmail.com>
* add grad clip for fp32
Signed-off-by: ericharper <complex451@gmail.com>
* update check for _del_model_without_trainer
Signed-off-by: ericharper <complex451@gmail.com>
* updating restore for model parallel
Signed-off-by: ericharper <complex451@gmail.com>
* add predict script
Signed-off-by: ericharper <complex451@gmail.com>
* update test iters
Signed-off-by: ericharper <complex451@gmail.com>
* add barrier
Signed-off-by: ericharper <complex451@gmail.com>
* return if clip_val is 0 or None
Signed-off-by: ericharper <complex451@gmail.com>
* when using amp clip grads after they are unscaled
Signed-off-by: ericharper <complex451@gmail.com>
* make native amp scaler hyperparams configurable
Signed-off-by: ericharper <complex451@gmail.com>
* (1) nvfuser, (2) amp-casting decoration (#2894)
* (1) nvfuser, (2) amp-casting decoration
Signed-off-by: Sangkug Lym <slym@nvidia.com>
* support bf16
Signed-off-by: Sangkug Lym <slym@nvidia.com>
* update package info
Signed-off-by: ericharper <complex451@gmail.com>
* add set device to constructor
Signed-off-by: ericharper <complex451@gmail.com>
* set_device in constructor
Signed-off-by: ericharper <complex451@gmail.com>
* [BigNLP] Remove megatron-lm dependency. (#2910)
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* add load_fused_kernels
Signed-off-by: ericharper <complex451@gmail.com>
* add load_fused_kernels
Signed-off-by: ericharper <complex451@gmail.com>
* update megatron_init
Signed-off-by: ericharper <complex451@gmail.com>
* add fused kernels
Signed-off-by: ericharper <complex451@gmail.com>
* add fused kernels
Signed-off-by: ericharper <complex451@gmail.com>
* update process batch
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* add megatron clip_grad
Signed-off-by: ericharper <complex451@gmail.com>
* trying to resolve circular import error
Signed-off-by: ericharper <complex451@gmail.com>
* rename file
Signed-off-by: ericharper <complex451@gmail.com>
* remove non-gpt models and datasets from __init__ files
Signed-off-by: ericharper <complex451@gmail.com>
* set device in constructor for gpu init
Signed-off-by: ericharper <complex451@gmail.com>
* set device in constructor for gpu init
Signed-off-by: ericharper <complex451@gmail.com>
* set_device in constructor
Signed-off-by: ericharper <complex451@gmail.com>
* clean config
Signed-off-by: ericharper <complex451@gmail.com>
* update MegatronDataset
Signed-off-by: ericharper <complex451@gmail.com>
* clean up MegatronModule
Signed-off-by: ericharper <complex451@gmail.com>
* clean up MegatronModule
Signed-off-by: ericharper <complex451@gmail.com>
* rename fp16 and bf16 flags to fused_softmax_input_in_fp16/bf16
Signed-off-by: ericharper <complex451@gmail.com>
* rename to fused_fp16
Signed-off-by: ericharper <complex451@gmail.com>
* add fused_fp16 arg to LayerNorm calls
Signed-off-by: ericharper <complex451@gmail.com>
* fix arg name
Signed-off-by: ericharper <complex451@gmail.com>
* fix arg name
Signed-off-by: ericharper <complex451@gmail.com>
* fix import
Signed-off-by: ericharper <complex451@gmail.com>
* update arg
Signed-off-by: ericharper <complex451@gmail.com>
* skip warmup default to True
Signed-off-by: ericharper <complex451@gmail.com>
* skip warmup default to True
Signed-off-by: ericharper <complex451@gmail.com>
* Adding complete method to MegatronGPTModel (#2935)
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
* make ffn_hidden_size mandatory
Signed-off-by: ericharper <complex451@gmail.com>
* Manually migrating timing of step into branch (#2937)
* 1. Manually migrating timing of step into branch.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated file name and content.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated to latest code.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
* remove unused imports
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* check fused_fp16 and fused_bf16 are not both True
Signed-off-by: ericharper <complex451@gmail.com>
* update predict script for model parallel .nemo
Signed-off-by: ericharper <complex451@gmail.com>
* typo
Signed-off-by: ericharper <complex451@gmail.com>
* typo
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
* NVfuser (#2943)
* activation checkpoint recompute
Signed-off-by: Sangkug Lym <slym@nvidia.com>
* selective nvfuser setup
* Megatron gpt bfloat support (#2926)
* Save/restore fix
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Another merge
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Bf16 args in init
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Set precision
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Remove debug stuff
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* add bf16 casting decorator
Signed-off-by: Sangkug Lym <slym@nvidia.com>
* Bfloat layernorm propagation
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* activation checkpoint recompute
Signed-off-by: Sangkug Lym <slym@nvidia.com>
* selective nvfuser setup
* More arg removal
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Remove BERTDataset
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* update to latest apex and patch transformer autocast
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
* don't set jit for bf16
Signed-off-by: ericharper <complex451@gmail.com>
* replace apex.mpu
Signed-off-by: ericharper <complex451@gmail.com>
* fix grad clip
Signed-off-by: ericharper <complex451@gmail.com>
* NVFuser fixes (#2951)
* Fuser fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Remove dummy handler
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Remove PTL plugin based logic for fusion
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* remove duplicated file
Signed-off-by: ericharper <complex451@gmail.com>
* typo (#2960)
Signed-off-by: ericharper <complex451@gmail.com>
* [BigNLP] Script to convert GPT checkpoint to .nemo (#2958)
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* add load_fused_kernels
Signed-off-by: ericharper <complex451@gmail.com>
* add load_fused_kernels
Signed-off-by: ericharper <complex451@gmail.com>
* update megatron_init
Signed-off-by: ericharper <complex451@gmail.com>
* add fused kernels
Signed-off-by: ericharper <complex451@gmail.com>
* add fused kernels
Signed-off-by: ericharper <complex451@gmail.com>
* update process batch
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* add megatron clip_grad
Signed-off-by: ericharper <complex451@gmail.com>
* trying to resolve circular import error
Signed-off-by: ericharper <complex451@gmail.com>
* rename file
Signed-off-by: ericharper <complex451@gmail.com>
* remove non-gpt models and datasets from __init__ files
Signed-off-by: ericharper <complex451@gmail.com>
* set device in constructor for gpu init
Signed-off-by: ericharper <complex451@gmail.com>
* set device in constructor for gpu init
Signed-off-by: ericharper <complex451@gmail.com>
* set_device in constructor
Signed-off-by: ericharper <complex451@gmail.com>
* clean config
Signed-off-by: ericharper <complex451@gmail.com>
* update MegatronDataset
Signed-off-by: ericharper <complex451@gmail.com>
* clean up MegatronModule
Signed-off-by: ericharper <complex451@gmail.com>
* clean up MegatronModule
Signed-off-by: ericharper <complex451@gmail.com>
* rename fp16 and bf16 flags to fused_softmax_input_in_fp16/bf16
Signed-off-by: ericharper <complex451@gmail.com>
* rename to fused_fp16
Signed-off-by: ericharper <complex451@gmail.com>
* add fused_fp16 arg to LayerNorm calls
Signed-off-by: ericharper <complex451@gmail.com>
* fix arg name
Signed-off-by: ericharper <complex451@gmail.com>
* fix arg name
Signed-off-by: ericharper <complex451@gmail.com>
* fix import
Signed-off-by: ericharper <complex451@gmail.com>
* update arg
Signed-off-by: ericharper <complex451@gmail.com>
* skip warmup default to True
Signed-off-by: ericharper <complex451@gmail.com>
* skip warmup default to True
Signed-off-by: ericharper <complex451@gmail.com>
* Adding complete method to MegatronGPTModel (#2935)
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
* make ffn_hidden_size mandatory
Signed-off-by: ericharper <complex451@gmail.com>
* Manually migrating timing of step into branch (#2937)
* 1. Manually migrating timing of step into branch.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated file name and content.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated to latest code.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
* remove unused imports
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* check fused_fp16 and fused_bf16 are not both True
Signed-off-by: ericharper <complex451@gmail.com>
* update predict script for model parallel .nemo
Signed-off-by: ericharper <complex451@gmail.com>
* typo
Signed-off-by: ericharper <complex451@gmail.com>
* add script to convert .ckpt to .nemo
Signed-off-by: ericharper <complex451@gmail.com>
* in progress
Signed-off-by: ericharper <complex451@gmail.com>
* update
Signed-off-by: ericharper <complex451@gmail.com>
* convert mp checkpoints to nemo
Signed-off-by: ericharper <complex451@gmail.com>
* update help
Signed-off-by: ericharper <complex451@gmail.com>
* add safeguard for model parallel save_to
Signed-off-by: ericharper <complex451@gmail.com>
* adjust NLPModel save_to to be safer for model parallel
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
* [BigNLP] Update GPT evaluation to work with tensor model parallel (#2959)
* in progress
Signed-off-by: ericharper <complex451@gmail.com>
* update args
Signed-off-by: ericharper <complex451@gmail.com>
* add request dataset
Signed-off-by: ericharper <complex451@gmail.com>
* tokenize request
Signed-off-by: ericharper <complex451@gmail.com>
* in progress
Signed-off-by: ericharper <complex451@gmail.com>
* able to run
Signed-off-by: ericharper <complex451@gmail.com>
* reduce logits
Signed-off-by: ericharper <complex451@gmail.com>
* capture response
Signed-off-by: ericharper <complex451@gmail.com>
* squeeze and unsqueeze
Signed-off-by: ericharper <complex451@gmail.com>
* handle non model parallel case
Signed-off-by: ericharper <complex451@gmail.com>
* clean imports
Signed-off-by: ericharper <complex451@gmail.com>
* add file
Signed-off-by: ericharper <complex451@gmail.com>
* convert logits to log_probs
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
* rename logits to log_probs
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
* add megatron gpt pretraining
Signed-off-by: ericharper <complex451@gmail.com>
* add megatron gpt pretraining
Signed-off-by: ericharper <complex451@gmail.com>
* add megatron gpt pretraining
Signed-off-by: ericharper <complex451@gmail.com>
* updating to work with latest megatron
Signed-off-by: ericharper <complex451@gmail.com>
* updating to work with latest megatron
Signed-off-by: ericharper <complex451@gmail.com>
* update _del_model
Signed-off-by: ericharper <complex451@gmail.com>
* adding gpt model
Signed-off-by: ericharper <complex451@gmail.com>
* adding gpt model
Signed-off-by: ericharper <complex451@gmail.com>
* adding gpt model
Signed-off-by: ericharper <complex451@gmail.com>
* instantiate GPTmodel
Signed-off-by: ericharper <complex451@gmail.com>
* adding build dataset
Signed-off-by: ericharper <complex451@gmail.com>
* build megatron dataset in .setup
Signed-off-by: ericharper <complex451@gmail.com>
* setup dataloader
Signed-off-by: ericharper <complex451@gmail.com>
* add vocab_file and merge_file to megatron init
Signed-off-by: ericharper <complex451@gmail.com>
* add forward
Signed-off-by: ericharper <complex451@gmail.com>
* add train loss
Signed-off-by: ericharper <complex451@gmail.com>
* add optimizer
Signed-off-by: ericharper <complex451@gmail.com>
* add exp_manager
Signed-off-by: ericharper <complex451@gmail.com>
* multi-gpu is working
Signed-off-by: ericharper <complex451@gmail.com>
* adding val loop
Signed-off-by: ericharper <complex451@gmail.com>
* style
Signed-off-by: ericharper <complex451@gmail.com>
* adding val loop
Signed-off-by: ericharper <complex451@gmail.com>
* fix ranks
Signed-off-by: ericharper <complex451@gmail.com>
* fix model parallel checkpoint saving
Signed-off-by: ericharper <complex451@gmail.com>
* fix _del_model
Signed-off-by: ericharper <complex451@gmail.com>
* added megatron batch sampler
Signed-off-by: ericharper <complex451@gmail.com>
* try to fix num steps
Signed-off-by: ericharper <complex451@gmail.com>
* add wandb to config
Signed-off-by: ericharper <complex451@gmail.com>
* log lr
Signed-off-by: ericharper <complex451@gmail.com>
* add warmup ratio to config
Signed-off-by: ericharper <complex451@gmail.com>
* update configs
Signed-off-by: ericharper <complex451@gmail.com>
* update configs
Signed-off-by: ericharper <complex451@gmail.com>
* add cpu init to args
Signed-off-by: ericharper <complex451@gmail.com>
* update config
Signed-off-by: ericharper <complex451@gmail.com>
* update config
Signed-off-by: ericharper <complex451@gmail.com>
* Initial megatron dataset port
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Fix merge conflicts
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* License fixes and megatron model porting
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Style fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* More fixes to import from nemo rather than megatron
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Fix circular imports
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Style fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Revert config file
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Restructure further to avoid circular imports
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* add Makefile
Signed-off-by: ericharper <complex451@gmail.com>
* Add megatron modules
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* add license
Signed-off-by: ericharper <complex451@gmail.com>
* Port from latest megatron
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* update cfg
Signed-off-by: ericharper <complex451@gmail.com>
* update config
Signed-off-by: ericharper <complex451@gmail.com>
* add _del_model_without_trainer
Signed-off-by: ericharper <complex451@gmail.com>
* add data preprocessing script
Signed-off-by: ericharper <complex451@gmail.com>
* update config
Signed-off-by: ericharper <complex451@gmail.com>
* use apex mpu
Signed-off-by: ericharper <complex451@gmail.com>
* replace print_rank_0 with nemo utils logging
Signed-off-by: ericharper <complex451@gmail.com>
* use apex mpu
Signed-off-by: ericharper <complex451@gmail.com>
* use apex mpu
Signed-off-by: ericharper <complex451@gmail.com>
* add use_cpu_initialization
Signed-off-by: ericharper <complex451@gmail.com>
* fixing autoresume in progress
Signed-off-by: ericharper <complex451@gmail.com>
* properly removing last checkpoint
Signed-off-by: ericharper <complex451@gmail.com>
* log consumed samples
Signed-off-by: ericharper <complex451@gmail.com>
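The "log consumed samples" entry above (and the later "fix consumed_samples when resuming") tracks one bookkeeping quantity used to fast-forward the data sampler on auto-resume. A minimal sketch of the usual Megatron-style formula; the argument names are assumptions, not NeMo's exact API:

```python
def compute_consumed_samples(global_step: int,
                             micro_batch_size: int,
                             data_parallel_size: int,
                             accumulate_grad_batches: int = 1) -> int:
    """Total samples seen so far; logged each step and used to skip ahead
    in the batch sampler when resuming from a checkpoint."""
    return global_step * micro_batch_size * data_parallel_size * accumulate_grad_batches
```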
* fix mp autoresume
Signed-off-by: ericharper <complex451@gmail.com>
* add NLPSaveRestoreConnector
Signed-off-by: ericharper <complex451@gmail.com>
* Megatron GPT training with NeMo tokenizers (#2818 )
* Update files from megatron repo
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Remove non NLP data related files from megatron
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Merge megatron and nemo tokenizers
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Remove get_tokenizer() calls from gpt model
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Update tokenizer yaml config
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* add todo
Signed-off-by: ericharper <complex451@gmail.com>
* update config
Signed-off-by: ericharper <complex451@gmail.com>
* make init_method_std configurable
Signed-off-by: ericharper <complex451@gmail.com>
* make gpu init work by setting random seed earlier
Signed-off-by: ericharper <complex451@gmail.com>
* fix gpu init after removing debug print in mpu
Signed-off-by: ericharper <complex451@gmail.com>
* add fused_adam
Signed-off-by: ericharper <complex451@gmail.com>
* check ds is not None before logging len
Signed-off-by: ericharper <complex451@gmail.com>
* set fp16 arg to true and fix enum conflict
Signed-off-by: ericharper <complex451@gmail.com>
* make fp16 arg configurable
Signed-off-by: ericharper <complex451@gmail.com>
* add grad clip from megatron
Signed-off-by: ericharper <complex451@gmail.com>
* Linear warmup with cosine annealing and constant holding (#2846 )
* Testing cosine schedule
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Style fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* More fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* update config for constant steps in schedule
Signed-off-by: ericharper <complex451@gmail.com>
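The schedule introduced in #2846 combines three phases: linear warmup, cosine annealing, and a constant hold. A self-contained re-derivation for illustration only; the actual NeMo scheduler class and its argument names may differ:

```python
import math

def warmup_cosine_hold_lr(step, max_lr, min_lr, warmup_steps, decay_steps):
    """Linear warmup to max_lr, cosine annealing down to min_lr,
    then hold min_lr constant for the remaining steps."""
    if step < warmup_steps:
        return max_lr * step / max(1, warmup_steps)
    if step < warmup_steps + decay_steps:
        progress = (step - warmup_steps) / max(1, decay_steps)
        return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
    return min_lr  # constant holding phase
```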
* temporarily import enum from megatron
Signed-off-by: ericharper <complex451@gmail.com>
* add grad clip for fp32
Signed-off-by: ericharper <complex451@gmail.com>
* update check for _del_model_without_trainer
Signed-off-by: ericharper <complex451@gmail.com>
* updating restore for model parallel
Signed-off-by: ericharper <complex451@gmail.com>
* add predict script
Signed-off-by: ericharper <complex451@gmail.com>
* update test iters
Signed-off-by: ericharper <complex451@gmail.com>
* add barrier
Signed-off-by: ericharper <complex451@gmail.com>
* return if clip_val is 0 or None
Signed-off-by: ericharper <complex451@gmail.com>
* when using amp clip grads after they are unscaled
Signed-off-by: ericharper <complex451@gmail.com>
* make native amp scaler hyperparams configurable
Signed-off-by: ericharper <complex451@gmail.com>
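The three entries above ("return if clip_val is 0 or None", "when using amp clip grads after they are unscaled", "make native amp scaler hyperparams configurable") follow standard PyTorch native-AMP practice. A sketch of the pattern, not NeMo's exact code:

```python
import torch

# GradScaler hyperparameters exposed via config rather than hard-coded.
scaler = torch.cuda.amp.GradScaler(init_scale=2 ** 32, growth_interval=1000)

def training_step(model, batch, optimizer, clip_val=1.0):
    with torch.cuda.amp.autocast():
        loss = model(batch)
    scaler.scale(loss).backward()
    if clip_val:  # skip clipping when clip_val is 0 or None
        # Gradients must be unscaled before clipping, otherwise the clip
        # threshold would be compared against loss-scaled gradients.
        scaler.unscale_(optimizer)
        torch.nn.utils.clip_grad_norm_(model.parameters(), clip_val)
    scaler.step(optimizer)  # step() will not unscale a second time
    scaler.update()
    optimizer.zero_grad()
    return loss
```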
* (1) nvfuser, (2) amp-casting decoration (#2894 )
* (1) nvfuser, (2) amp-casting decoration
Signed-off-by: Sangkug Lym <slym@nvidia.com>
* support bf16
Signed-off-by: Sangkug Lym <slym@nvidia.com>
* update package info
Signed-off-by: ericharper <complex451@gmail.com>
* add set device to constructor
Signed-off-by: ericharper <complex451@gmail.com>
* set_device in constructor
Signed-off-by: ericharper <complex451@gmail.com>
* [BigNLP] Remove megatron-lm dependency. (#2910 )
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* add load_fused_kernels
Signed-off-by: ericharper <complex451@gmail.com>
* add load_fused_kernels
Signed-off-by: ericharper <complex451@gmail.com>
* update megatron_init
Signed-off-by: ericharper <complex451@gmail.com>
* add fused kernels
Signed-off-by: ericharper <complex451@gmail.com>
* add fused kernels
Signed-off-by: ericharper <complex451@gmail.com>
* update process batch
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* add megatron clip_grad
Signed-off-by: ericharper <complex451@gmail.com>
* trying to resolve circular import error
Signed-off-by: ericharper <complex451@gmail.com>
* rename file
Signed-off-by: ericharper <complex451@gmail.com>
* remove non-gpt models and datasets from __init__ files
Signed-off-by: ericharper <complex451@gmail.com>
* set device in constructor for gpu init
Signed-off-by: ericharper <complex451@gmail.com>
* set device in constructor for gpu init
Signed-off-by: ericharper <complex451@gmail.com>
* set_device in constructor
Signed-off-by: ericharper <complex451@gmail.com>
* clean config
Signed-off-by: ericharper <complex451@gmail.com>
* update MegatronDataset
Signed-off-by: ericharper <complex451@gmail.com>
* clean up MegatronModule
Signed-off-by: ericharper <complex451@gmail.com>
* clean up MegatronModule
Signed-off-by: ericharper <complex451@gmail.com>
* rename fp16 and bf16 flags to fused_softmax_input_in_fp16/bf16
Signed-off-by: ericharper <complex451@gmail.com>
* rename to fused_fp16
Signed-off-by: ericharper <complex451@gmail.com>
* add fused_fp16 arg to LayerNorm calls
Signed-off-by: ericharper <complex451@gmail.com>
* fix arg name
Signed-off-by: ericharper <complex451@gmail.com>
* fix arg name
Signed-off-by: ericharper <complex451@gmail.com>
* fix import
Signed-off-by: ericharper <complex451@gmail.com>
* update arg
Signed-off-by: ericharper <complex451@gmail.com>
* skip warmup default to True
Signed-off-by: ericharper <complex451@gmail.com>
* skip warmup default to True
Signed-off-by: ericharper <complex451@gmail.com>
* Adding complete method to MegatronGPTModel (#2935 )
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
* make ffn_hidden_size mandatory
Signed-off-by: ericharper <complex451@gmail.com>
* Manually migrating timing of step into branch (#2937 )
* 1. Manually migrating timing of step into branch.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated file name and content.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated to latest code.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
* remove unused imports
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* check fused_fp16 and fused_bf16 are not both True
Signed-off-by: ericharper <complex451@gmail.com>
* update predict script for model parallel .nemo
Signed-off-by: ericharper <complex451@gmail.com>
* typo
Signed-off-by: ericharper <complex451@gmail.com>
* typo
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
* NVfuser (#2943 )
* activation checkpoint recompute
Signed-off-by: Sangkug Lym <slym@nvidia.com>
* selective nvfuser setup
* Megatron gpt bfloat support (#2926 )
* Save/restore fix
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Another merge
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Bf16 args in init
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Set precision
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Remove debug stuff
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* add bf16 casting decorator
Signed-off-by: Sangkug Lym <slym@nvidia.com>
* Bfloat layernorm propagation
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* activation checkpoint recompute
Signed-off-by: Sangkug Lym <slym@nvidia.com>
* selective nvfuser setup
* More arg removal
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Remove BERTDataset
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* update to latest apex and patch transformer autocast
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
* don't set jit for bf16
Signed-off-by: ericharper <complex451@gmail.com>
* replace apex.mpu
Signed-off-by: ericharper <complex451@gmail.com>
* fix grad clip
Signed-off-by: ericharper <complex451@gmail.com>
* NVFuser fixes (#2951 )
* Fuser fixes
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Remove dummy handler
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Remove PTL plugin based logic for fusion
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* remove duplicated file
Signed-off-by: ericharper <complex451@gmail.com>
* typo (#2960 )
Signed-off-by: ericharper <complex451@gmail.com>
* [BigNLP] Script to convert GPT checkpoint to .nemo (#2958 )
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* remove args in progress
Signed-off-by: ericharper <complex451@gmail.com>
* add load_fused_kernels
Signed-off-by: ericharper <complex451@gmail.com>
* add load_fused_kernels
Signed-off-by: ericharper <complex451@gmail.com>
* update megatron_init
Signed-off-by: ericharper <complex451@gmail.com>
* add fused kernels
Signed-off-by: ericharper <complex451@gmail.com>
* add fused kernels
Signed-off-by: ericharper <complex451@gmail.com>
* update process batch
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* remove erroneous import
Signed-off-by: ericharper <complex451@gmail.com>
* add megatron clip_grad
Signed-off-by: ericharper <complex451@gmail.com>
* trying to resolve circular import error
Signed-off-by: ericharper <complex451@gmail.com>
* rename file
Signed-off-by: ericharper <complex451@gmail.com>
* remove non-gpt models and datasets from __init__ files
Signed-off-by: ericharper <complex451@gmail.com>
* set device in constructor for gpu init
Signed-off-by: ericharper <complex451@gmail.com>
* set device in constructor for gpu init
Signed-off-by: ericharper <complex451@gmail.com>
* set_device in constructor
Signed-off-by: ericharper <complex451@gmail.com>
* clean config
Signed-off-by: ericharper <complex451@gmail.com>
* update MegatronDataset
Signed-off-by: ericharper <complex451@gmail.com>
* clean up MegatronModule
Signed-off-by: ericharper <complex451@gmail.com>
* clean up MegatronModule
Signed-off-by: ericharper <complex451@gmail.com>
* rename fp16 and bf16 flags to fused_softmax_input_in_fp16/bf16
Signed-off-by: ericharper <complex451@gmail.com>
* rename to fused_fp16
Signed-off-by: ericharper <complex451@gmail.com>
* add fused_fp16 arg to LayerNorm calls
Signed-off-by: ericharper <complex451@gmail.com>
* fix arg name
Signed-off-by: ericharper <complex451@gmail.com>
* fix arg name
Signed-off-by: ericharper <complex451@gmail.com>
* fix import
Signed-off-by: ericharper <complex451@gmail.com>
* update arg
Signed-off-by: ericharper <complex451@gmail.com>
* skip warmup default to True
Signed-off-by: ericharper <complex451@gmail.com>
* skip warmup default to True
Signed-off-by: ericharper <complex451@gmail.com>
* Adding complete method to MegatronGPTModel (#2935 )
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
* make ffn_hidden_size mandatory
Signed-off-by: ericharper <complex451@gmail.com>
* Manually migrating timing of step into branch (#2937 )
* 1. Manually migrating timing of step into branch.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated file name and content.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated to latest code.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
* remove unused imports
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused import
Signed-off-by: ericharper <complex451@gmail.com>
* check fused_fp16 and fused_bf16 are not both True
Signed-off-by: ericharper <complex451@gmail.com>
* update predict script for model parallel .nemo
Signed-off-by: ericharper <complex451@gmail.com>
* typo
Signed-off-by: ericharper <complex451@gmail.com>
* add script to convert .ckpt to .nemo
Signed-off-by: ericharper <complex451@gmail.com>
* in progress
Signed-off-by: ericharper <complex451@gmail.com>
* update
Signed-off-by: ericharper <complex451@gmail.com>
* convert mp checkpoints to nemo
Signed-off-by: ericharper <complex451@gmail.com>
* update help
Signed-off-by: ericharper <complex451@gmail.com>
* add safeguard for model parallel save_to
Signed-off-by: ericharper <complex451@gmail.com>
* adjust NLPModel save_to to be safer for model parallel
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
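The "add safeguard for model parallel save_to" and "adjust NLPModel save_to to be safer for model parallel" entries in #2958 suggest a staging pattern like the one below. This is a hypothetical sketch, not the actual NLPSaveRestoreConnector; it assumes torch.distributed is initialized and that all ranks share a filesystem:

```python
import os
import shutil
import tarfile

import torch
import torch.distributed as dist

def save_model_parallel_nemo(model, save_path: str, mp_rank: int):
    """Each tensor-parallel rank writes its own shard into a staging
    directory; only rank 0 packs the final archive, so a half-written
    file never replaces an existing checkpoint."""
    staging = save_path + ".staging"
    if mp_rank == 0:
        os.makedirs(staging, exist_ok=True)
    dist.barrier()                       # staging dir now visible to all ranks
    shard = os.path.join(staging, f"mp_rank_{mp_rank:02d}.ckpt")
    torch.save(model.state_dict(), shard)
    dist.barrier()                       # wait until every shard is written
    if mp_rank == 0:
        with tarfile.open(save_path, "w") as tar:
            tar.add(staging, arcname=".")
        shutil.rmtree(staging)
    dist.barrier()                       # nobody returns before the archive exists
```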
* [BigNLP] Update GPT evaluation to work with tensor model parallel (#2959 )
* in progress
Signed-off-by: ericharper <complex451@gmail.com>
* update args
Signed-off-by: ericharper <complex451@gmail.com>
* add request dataset
Signed-off-by: ericharper <complex451@gmail.com>
* tokenize request
Signed-off-by: ericharper <complex451@gmail.com>
* in progress
Signed-off-by: ericharper <complex451@gmail.com>
* able to run
Signed-off-by: ericharper <complex451@gmail.com>
* reduce logits
Signed-off-by: ericharper <complex451@gmail.com>
* capture response
Signed-off-by: ericharper <complex451@gmail.com>
* squeeze and unsqueeze
Signed-off-by: ericharper <complex451@gmail.com>
* handle non model parallel case
Signed-off-by: ericharper <complex451@gmail.com>
* clean imports
Signed-off-by: ericharper <complex451@gmail.com>
* add file
Signed-off-by: ericharper <complex451@gmail.com>
* convert logits to log_probs
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
* rename logits to log_probs
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
* style
Signed-off-by: ericharper <complex451@gmail.com>
* fix copyright headers
Signed-off-by: ericharper <complex451@gmail.com>
* fix copyright headers
Signed-off-by: ericharper <complex451@gmail.com>
* remove old TimingCallback
Signed-off-by: ericharper <complex451@gmail.com>
* style
Signed-off-by: ericharper <complex451@gmail.com>
* update jenkins to use latest apex and sandeep's fork
Signed-off-by: ericharper <complex451@gmail.com>
* update jenkins
Signed-off-by: ericharper <complex451@gmail.com>
* update jenkins
Signed-off-by: ericharper <complex451@gmail.com>
* update jenkins
Signed-off-by: ericharper <complex451@gmail.com>
* update jenkins
Signed-off-by: ericharper <complex451@gmail.com>
* try 2109 container
Signed-off-by: ericharper <complex451@gmail.com>
* try cuda container
Signed-off-by: ericharper <complex451@gmail.com>
* use internal container
Signed-off-by: ericharper <complex451@gmail.com>
* update checkpoint tests
Signed-off-by: ericharper <complex451@gmail.com>
* fix scheduler args
Signed-off-by: ericharper <complex451@gmail.com>
* update eval
Signed-off-by: ericharper <complex451@gmail.com>
* style
Signed-off-by: ericharper <complex451@gmail.com>
* update jenkins to use ptl 1.5 rc
Signed-off-by: ericharper <complex451@gmail.com>
* add import guard to jenkins
Signed-off-by: ericharper <complex451@gmail.com>
* add import guard to jenkins
Signed-off-by: ericharper <complex451@gmail.com>
* remove deterministic
Signed-off-by: ericharper <complex451@gmail.com>
* install numba 0.53
Signed-off-by: ericharper <complex451@gmail.com>
* allow for more variance
Signed-off-by: ericharper <complex451@gmail.com>
* update trainer config dataclass
Signed-off-by: ericharper <complex451@gmail.com>
* test_get_optimizer on gpu
Signed-off-by: ericharper <complex451@gmail.com>
* revert comment
Signed-off-by: ericharper <complex451@gmail.com>
* change trainer config default to 32
Signed-off-by: ericharper <complex451@gmail.com>
* [BigNLP] Remove fused kernel code instead use Apex (#2984 )
* remove fused_kernels
Signed-off-by: ericharper <complex451@gmail.com>
* remove fused_kernels
Signed-off-by: ericharper <complex451@gmail.com>
* remove fused layer norm and fused softmax and use apex instead
Signed-off-by: ericharper <complex451@gmail.com>
* update imports
Signed-off-by: ericharper <complex451@gmail.com>
* remove comment
Signed-off-by: ericharper <complex451@gmail.com>
* use apex enums
Signed-off-by: ericharper <complex451@gmail.com>
* use apex enums
Signed-off-by: ericharper <complex451@gmail.com>
* add tab
Signed-off-by: ericharper <complex451@gmail.com>
* Timer with sliding window (#3002 )
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
* revert tab
Signed-off-by: ericharper <complex451@gmail.com>
* check for rank zero
Signed-off-by: ericharper <complex451@gmail.com>
* check for rank zero
Signed-off-by: ericharper <complex451@gmail.com>
* try explicit log dir
Signed-off-by: ericharper <complex451@gmail.com>
* add +
Signed-off-by: ericharper <complex451@gmail.com>
* don't rm
Signed-off-by: ericharper <complex451@gmail.com>
* make dir if it doesn't exist
Signed-off-by: ericharper <complex451@gmail.com>
* create mp nemo file in temp directory
Signed-off-by: ericharper <complex451@gmail.com>
* simplify mp save_to
Signed-off-by: ericharper <complex451@gmail.com>
* handle mp 1 case
Signed-off-by: ericharper <complex451@gmail.com>
* style fix
Signed-off-by: ericharper <complex451@gmail.com>
* remove files
Signed-off-by: ericharper <complex451@gmail.com>
* fix consumed_samples when resuming
Signed-off-by: ericharper <complex451@gmail.com>
* fix reinstall.sh
Signed-off-by: ericharper <complex451@gmail.com>
* update req
Signed-off-by: ericharper <complex451@gmail.com>
* add more detailed log for dataloaders
Signed-off-by: ericharper <complex451@gmail.com>
* check if cuda is available before using fused_adam
Signed-off-by: ericharper <complex451@gmail.com>
* revert comment
Signed-off-by: ericharper <complex451@gmail.com>
* update eval script to use model.freeze
Signed-off-by: ericharper <complex451@gmail.com>
* log train loss averaged over gradient accumulation steps
Signed-off-by: ericharper <complex451@gmail.com>
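A sketch of what "log train loss averaged over gradient accumulation steps" typically means in practice; names are illustrative, not NeMo's exact training_step:

```python
import torch

def train_step_with_accumulation(model, micro_batches, optimizer):
    """Report the loss averaged over all accumulation micro-batches
    instead of only the last one."""
    optimizer.zero_grad()
    losses = []
    for micro_batch in micro_batches:
        loss = model(micro_batch)
        (loss / len(micro_batches)).backward()  # scale so grads average out
        losses.append(loss.detach())
    optimizer.step()
    return torch.stack(losses).mean()           # the value to log
```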
* check copyright earlier
Signed-off-by: ericharper <complex451@gmail.com>
* todo
Signed-off-by: ericharper <complex451@gmail.com>
* override SaveRestoreConnector in NLPModel init
Signed-off-by: ericharper <complex451@gmail.com>
* move to scripts
Signed-off-by: ericharper <complex451@gmail.com>
* remove star import
Signed-off-by: ericharper <complex451@gmail.com>
* remove comments
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused dataset
Signed-off-by: ericharper <complex451@gmail.com>
* removed barrier
Signed-off-by: ericharper <complex451@gmail.com>
* check cfg
Signed-off-by: ericharper <complex451@gmail.com>
* remove logging
Signed-off-by: ericharper <complex451@gmail.com>
* freeze, unfreeze
Signed-off-by: ericharper <complex451@gmail.com>
* return None
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused imports
Signed-off-by: ericharper <complex451@gmail.com>
* add TODO
Signed-off-by: ericharper <complex451@gmail.com>
* typecheck
Signed-off-by: ericharper <complex451@gmail.com>
* typo
Signed-off-by: ericharper <complex451@gmail.com>
* todo
Signed-off-by: ericharper <complex451@gmail.com>
* add common native plugin
Signed-off-by: ericharper <complex451@gmail.com>
* restore with trainer
Signed-off-by: ericharper <complex451@gmail.com>
* style
Signed-off-by: ericharper <complex451@gmail.com>
* deprecate megatron-lm bert
Signed-off-by: ericharper <complex451@gmail.com>
* deprecate megatron-lm bert
Signed-off-by: ericharper <complex451@gmail.com>
* compile helpers on the fly
Signed-off-by: ericharper <complex451@gmail.com>
* remove amp_level
Signed-off-by: ericharper <complex451@gmail.com>
* remove amp_level from configs
Signed-off-by: ericharper <complex451@gmail.com>
* add missing import
Signed-off-by: ericharper <complex451@gmail.com>
* typo
Signed-off-by: ericharper <complex451@gmail.com>
* remove amp_level
Signed-off-by: ericharper <complex451@gmail.com>
* use fast huggingface tokenizers by default
Signed-off-by: ericharper <complex451@gmail.com>
* deal with huggingface tokenizer positional args
Signed-off-by: ericharper <complex451@gmail.com>
* deal with huggingface tokenizer positional args
Signed-off-by: ericharper <complex451@gmail.com>
* deal with huggingface tokenizer positional args
Signed-off-by: ericharper <complex451@gmail.com>
* revert use_fast default to False
Signed-off-by: ericharper <complex451@gmail.com>
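The use_fast back-and-forth above concerns Hugging Face's two tokenizer backends, whose constructors accept positional arguments differently. Minimal usage for reference ("gpt2" is just an example model name):

```python
from transformers import AutoTokenizer

# use_fast toggles the Rust-backed "fast" implementation; the slow and fast
# classes differ in how vocab/merges files are passed positionally, which is
# what the entries above work around.
tokenizer = AutoTokenizer.from_pretrained("gpt2", use_fast=True)
ids = tokenizer("hello world")["input_ids"]
```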
* return super training_epoch_end
Signed-off-by: ericharper <complex451@gmail.com>
* remove optimizer_idx arg from training_step
Signed-off-by: ericharper <complex451@gmail.com>
* remove unused arg from on_train_epoch_end
Signed-off-by: ericharper <complex451@gmail.com>
* add restore_from_path to nemo config
Signed-off-by: ericharper <complex451@gmail.com>
* add comment
Signed-off-by: ericharper <complex451@gmail.com>
* revert
Signed-off-by: ericharper <complex451@gmail.com>
* override connector if not subclassing NLPSaveRestoreConnector for model parallel save
Signed-off-by: ericharper <complex451@gmail.com>
* update test optimizer
Signed-off-by: ericharper <complex451@gmail.com>
* clean up
Signed-off-by: ericharper <complex451@gmail.com>
* clean up
Signed-off-by: ericharper <complex451@gmail.com>
* clean up
Signed-off-by: ericharper <complex451@gmail.com>
* clean up
Signed-off-by: ericharper <complex451@gmail.com>
* make data_prefix mandatory in config
Signed-off-by: ericharper <complex451@gmail.com>
* update installation instructions on readme
Signed-off-by: ericharper <complex451@gmail.com>
* update dockerfile
Signed-off-by: ericharper <complex451@gmail.com>
* add todo
Signed-off-by: ericharper <complex451@gmail.com>
* raise error if trying to use always_save_nemo with model parallel model
Signed-off-by: ericharper <complex451@gmail.com>
* remove comment
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sangkug Lym <slym@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-10-20 21:06:37 -06:00
Yang Zhang
620e8a8986
delete nltk download for TN ( #3028 )
...
* delete nltk
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* style fix
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* remove unused import
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
2021-10-20 14:51:23 -07:00
Jason
db47f5dcd6
minor ( #3027 )
...
Signed-off-by: Jason <jasoli@nvidia.com>
2021-10-20 13:48:44 -04:00
Jason
689f8c59ac
Update Finetuning Notebook ( #3023 )
...
* update note
Signed-off-by: Jason <jasoli@nvidia.com>
* update checkpoint
Signed-off-by: Jason <jasoli@nvidia.com>
* minor updates
Signed-off-by: Jason <jasoli@nvidia.com>
* update
Signed-off-by: Jason <jasoli@nvidia.com>
* update
Signed-off-by: Jason <jasoli@nvidia.com>
* meta
Signed-off-by: Jason <jasoli@nvidia.com>
* update
Signed-off-by: Jason <jasoli@nvidia.com>
* fix
Signed-off-by: Jason <jasoli@nvidia.com>
2021-10-20 13:36:07 -04:00
PeganovAnton
d9b51f0553
Fixes in GlobalAverageLossMetric and PerplexityMetric ( #3016 )
...
* Fix assertions
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Many fixes
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Make minor improvement in reference function
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Improve new tensor creation
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add debug prints
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Remove debug stuff
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-10-20 01:15:55 -07:00
PeganovAnton
8dbd0a0107
Avoid implicit array changing, add progress bar for punctuation and capitalization, help message improvement. ( #3018 )
...
* Fix minor bugs
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add tqdm for punctuation and capitalization
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add progress unit
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
2021-10-19 11:58:20 +03:00
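The "implicit array changing" fixed in #3018 is, going by the title, the classic NumPy view-versus-copy pitfall; a generic illustration, not the PR's actual code:

```python
import numpy as np

a = np.arange(6)
view = a[:3]         # a slice is a view, not a copy
view[0] = 99         # ...so this silently changes `a` as well
assert a[0] == 99
safe = a[:3].copy()  # an explicit copy avoids the implicit change
```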
Somshubra Majumdar
78611b1a27
Improve efficiency of dataloaders for transcribe() ( #3014 )
...
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-10-15 13:40:45 -07:00
Vahid Noroozi
77f30c5047
Change the min value used for masking in Conformer. ( #2997 )
2021-10-14 00:38:57 -07:00
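#2997's title leaves the chosen value unstated; a common way to pick a masking value that stays safe in fp16 is the dtype's own minimum, sketched here for illustration:

```python
import torch

def masked_attention_scores(scores: torch.Tensor, pad_mask: torch.Tensor) -> torch.Tensor:
    """Fill masked positions with the smallest finite value of the scores'
    dtype rather than a hard-coded constant, avoiding fp16 overflow to NaN."""
    min_value = torch.finfo(scores.dtype).min
    return scores.masked_fill(pad_mask, min_value).softmax(dim=-1)
```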
Vahid Noroozi
a07dfa9d78
Adding bucketing for ASR models with tarred datasets ( #2990 )
2021-10-13 21:40:50 -07:00
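#2990 carries no commit details here; as background, duration bucketing batches utterances of similar length to cut padding waste. A toy sketch under that assumption (the .duration attribute and boundaries are illustrative, not NeMo's tarred-dataset implementation):

```python
def bucket_by_duration(samples, bucket_boundaries=(4.0, 8.0, 12.0, 16.0)):
    """Group samples into duration buckets; each bucket is later batched
    separately so sequences in a batch have similar lengths."""
    buckets = [[] for _ in range(len(bucket_boundaries) + 1)]
    for sample in samples:  # each sample assumed to expose .duration in seconds
        idx = sum(sample.duration > b for b in bucket_boundaries)
        buckets[idx].append(sample)
    return buckets
```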
fayejf
0e5a1f84c4
tiny edge case fix for VAD postprocessing ( #2996 )
...
* tiny edge case fix
Signed-off-by: fayejf <fayejf07@gmail.com>
* style fix
Signed-off-by: fayejf <fayejf07@gmail.com>
2021-10-13 18:51:42 -04:00
Jocelyn
9bbbe2e403
Add Paarth's HiFi-GAN and Tacotron fine-tuning code ( #3000 )
...
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
2021-10-13 13:38:50 -07:00
Micha Livne
38e74de2b9
NMT evaluation timing ( #2956 )
...
* 1. Fixed logging of log_var_q_z_given_x.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated timing of eval_step.
2. Added caching of encoder values when sampling.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed using cached values in eval_step.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated timing names to include "_timing"
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added support for sliding mean.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
2021-10-13 09:41:49 -04:00
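The "caching of encoder values when sampling" entry in #2956 describes a standard decoder-loop optimization: encode the source once and reuse that memory at every decoding step. A sketch with illustrative names:

```python
import torch

BOS, EOS = 1, 2  # assumed special-token ids, for illustration only

def greedy_decode_with_cached_encoder(encoder, decoder, src, max_len=64):
    """Run the encoder once and cache its output; each decoding step
    reuses `memory` instead of re-encoding the source sentence."""
    with torch.no_grad():
        memory = encoder(src)              # computed once, cached
        tokens = [BOS]
        for _ in range(max_len):
            logits = decoder(torch.tensor([tokens]), memory)
            next_token = int(logits[0, -1].argmax())
            tokens.append(next_token)
            if next_token == EOS:
                break
    return tokens
```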
Evelina
a0e8018fd7
TN updates ( #2983 )
...
* moses added
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* updates to make eval with moses work
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* clean up
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* clean up
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* fix init
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* review
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
2021-10-11 15:56:25 -07:00