bene-ges
5b603fb80c
typos ( #2989 )
...
Signed-off-by: Alexandra Antonova <aleksandraa@nvidia.com>
Co-authored-by: Alexandra Antonova <aleksandraa@nvidia.com>
2021-10-11 14:48:25 -07:00
Carol Anderson
7de97d71c0
update zero shot intent model ( #2977 )
...
* update zero shot intent model
Signed-off-by: Carol Anderson <carola@nvidia.com>
* remove from_pretrained from TextClassificationModel
Signed-off-by: Carol Anderson <carola@nvidia.com>
2021-10-11 12:45:34 -07:00
Boris Fomitchev
1a75dc5230
Fixing BERT export and ORT check ( #2965 )
...
* Fixing BERT export and ORT check
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
* Fixed test
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
* Addressing code review comments
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
2021-10-08 17:46:08 -07:00
Jason
be7114e2d9
Update README.rst ( #2973 )
...
Signed-off-by: Jason <jasoli@nvidia.com>
2021-10-08 11:33:11 -06:00
Jagadeesh Balam
9c26f5d533
Narrowband augmentation for ASR models ( #2946 )
...
* Added narrowband augmentation on spectrogram, added ogg codec to transcodeperturbation
Signed-off-by: jbalam <jbalam@nvidia.com>
* Changes to AudioToMelSpectrogramProcessor for nb augmentation
Signed-off-by: jbalam <jbalam@nvidia.com>
* Minor clean up
Signed-off-by: jbalam-nv <4916480+jbalam-nv@users.noreply.github.com>
* removed unused arguments
Signed-off-by: jbalam-nv <4916480+jbalam-nv@users.noreply.github.com>
* style fix
Signed-off-by: jbalam <jbalam@nvidia.com>
* Added new arguments to config
Signed-off-by: jbalam-nv <4916480+jbalam-nv@users.noreply.github.com>
* Fixes to config changes causing test failures
Signed-off-by: jbalam <jbalam@nvidia.com>
* changed check for applying attenuation
Signed-off-by: jbalam-nv <4916480+jbalam-nv@users.noreply.github.com>
2021-10-07 15:37:26 -07:00
Fedor
5ec2c5c18e
reading max_sequence_len parameter from config fixed ( #2961 )
...
Signed-off-by: Fedor Streltsov <sfeaal@gmail.com>
2021-10-07 11:30:27 -07:00
Micha Livne
6796faa62e
1. Fixing undeclared variables. ( #2939 )
...
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
2021-10-07 03:45:25 -04:00
Sandeep Subramanian
f7b6f14fa4
Rename neural machine translation to text2sparql ( #2955 )
...
* Rename neural machine translation to text2sparql
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Import fix
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
2021-10-06 17:03:00 -07:00
Eric Harper
91fd9ea970
Merge final doc and bug fixes from r1.4.0 to main ( #2952 )
...
* update branch for jenkinsfile and dockerfile
Signed-off-by: ericharper <complex451@gmail.com>
* Typos (#2884 )
* segmentation tutorial fix
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* data fixes
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* Minor Fixes (#2922 )
* typo
Signed-off-by: Jason <jasoli@nvidia.com>
* remove notebook from docs
Signed-off-by: Jason <jasoli@nvidia.com>
* Adding Conformer-Transducer docs. (#2920 )
* added Conformer-Transducer docs.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* Added contextnet.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fixed the title.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* Fix numba spec augment for cases where batch size > MAX_THREAD_BUFFER (#2924 )
* Fix numba spec augment for cases where batch size > MAX_THREAD_BUFFER
Signed-off-by: smajumdar <titu1994@gmail.com>
* Revert print in test
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update readme for r1.4.0 (#2927 )
* Updated readme for r1.4.0.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* Updated readme for r1.4.0.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* Updated readme for r1.4.0.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* Updated readme for r1.4.0.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* Updated readme for r1.4.0.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* Updated readme for r1.4.0.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* Updated readme for r1.4.0.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* New NMT Models (#2925 )
* New pretrained models
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Update NMT docs
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Eric Harper <complex451@gmail.com>
* update branch
Signed-off-by: ericharper <complex451@gmail.com>
* revert
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
2021-10-06 08:21:54 -06:00
tbartley94
d8924ffb2c
Itn fr ( #2947 )
...
* typos (#2909 )
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Updated docs (#2911 )
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Nmt encoder decoder hidden size fix (#2856 )
* 1. Enabled encoder/decoder with different size in bottleneck architecture.
2. Validating encoder/decoder with the same size in non-bottleneck parent class.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed typo.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added hidden_size to error message.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing defaults.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixing CI tests to have same hidden_size.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated error message.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating Jenkins CI test.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating CI to hidden=48
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing hidden_size when loading pre-trained huggingface model.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing hidden_size in config for pre-trained models.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated missing hidden_size in config.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Testing encoder and decoder objects' hidden_size instead of config to support pre-trained models.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated Jenkinsfile test values.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed Jenkinsfile test values (NMT Megatron Model Parallel Size 2 Encoder)
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating missing arguments for Jenkinsfile test.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* First commit. French ITN grammars for tagger and verbalizer. Test for French inverse_normalize added to tests. inverse_text_normalize updated to allow 'fr' tag. tools/text_processing/deployment/pynini_export.py updated to accept 'fr' tag. All CI tests for grammars passed.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Ran style checker.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Fixed bug causing ordinals to fail sparrowhawk test when verbalizing as roman numbers.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* style change for verbalizer/ordinal.py
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Add DALI dataset unit test (#2904 )
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Delete test.py
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Cleaning up unused import spaces for lgtm check.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* taggers/time.py missed style checker
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* inverse_text_normalization/fr lacked an __init__ file
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Merge r1.4 bugfixes to main (#2918 )
* update package info
Signed-off-by: ericharper <complex451@gmail.com>
* update branch for jenkinsfile and dockerfile
Signed-off-by: ericharper <complex451@gmail.com>
* Adding conformer-transducer models. (#2717 )
* added the models.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* added contextnet models.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* added german and chinese models.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fix the abs_pos of conformer. (#2863 )
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* update to match sde (#2867 )
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* updated german ngc model (#2871 )
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* Lower bound PTL to safe version (#2876 )
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update notebooks with onnxruntime (#2880 )
Signed-off-by: smajumdar <titu1994@gmail.com>
* Upperbound PTL (#2881 )
Signed-off-by: smajumdar <titu1994@gmail.com>
* minor typo and broken link fixes (#2883 )
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* Remove numbers from TTS tutorial names (#2882 )
* Remove numbers from TTS tutorial names
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
* Update documentation links
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
* Typos (#2884 )
* segmentation tutorial fix
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* data fixes
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* updated the messages in eval_beamsearch_ngram.py. (#2889 )
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* style (#2890 )
Signed-off-by: Jason <jasoli@nvidia.com>
* Fix broken link (#2891 )
* fix broken link
Signed-off-by: fayejf <fayejf07@gmail.com>
* more
Signed-off-by: fayejf <fayejf07@gmail.com>
* Update sclite eval for new transcription method (#2893 )
* Update sclite to use updated inference
Signed-off-by: smajumdar <titu1994@gmail.com>
* Remove WER
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update sclite script to use new inference methods
Signed-off-by: smajumdar <titu1994@gmail.com>
* Remove hub 5
Signed-off-by: smajumdar <titu1994@gmail.com>
* Fix TransformerDecoder export - r1.4 (#2875 )
* export fix
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* embedding pos
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* remove bool param
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* changes
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
* Update Finetuning notebook (#2906 )
* update notebook
Signed-off-by: Jason <jasoli@nvidia.com>
* rename
Signed-off-by: Jason <jasoli@nvidia.com>
* rename
Signed-off-by: Jason <jasoli@nvidia.com>
* revert branch to main
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Fix several bugs in punctuation and capitalization inference and make minor improvements (#2905 )
* Add save labels arg to method and remove device setting
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix device bug and reading plain text bug
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Make minor improvements
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Remove excess parameter
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* add fix to not add dot everywhere (#2885 )
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Fixing copyright wording and adding whitelisting for titles.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Fixing copyright headers.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* copyright header change (missed whitelist)
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Edited export_grammars.sh notes to include 'fr'. Made verbalizer/decimal.py rewrite class part of main class instead.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Adjusting copyright headers for tests.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* inverse_text_normalization/fr/__init__ copyright header
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* adding __init__ file to fr/data
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* TN infer (#2929 )
* en_small grammars added
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* infer fix
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* add whitelist arg
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* add input fall back
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* docstring
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Experiment manager step timing (#2936 )
* 1. Enabled encoder/decoder with different size in bottleneck architecture.
2. Validating encoder/decoder with the same size in non-bottleneck parent class.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed typo.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added hidden_size to error message.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing defaults.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixing CI tests to have same hidden_size.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated error message.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating Jenkins CI test.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating CI to hidden=48
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing hidden_size when loading pre-trained huggingface model.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing hidden_size in config for pre-trained models.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated missing hidden_size in config.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Testing encoder and decoder objects' hidden_size instead of config to support pre-trained models.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated Jenkinsfile test values.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed Jenkinsfile test values (NMT Megatron Model Parallel Size 2 Encoder)
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating missing arguments for Jenkinsfile test.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added a generic timer class.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Renamed file.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. exp_manager timing of train/val/test using callbacks is ready.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Step timing hooks are tested. Logging does not record values due to a bug (should be solved with upgraded ptl)
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added workaround hooks to MTEncDecModel.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed logging issue. All NeMo models support timing.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Removed unused timer object.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added missing copyright.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. NamedTimer supports multiple reductions.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Removed leftover file.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating code to latest.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated docstring.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added exp_manager.step_timing_sync_cuda to config to enable cuda sync on start/stop (False by default).
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed variable names.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added exp_manager.step_timing_kwargs nested config for clarity and future extensibility.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed formatting.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added train_backward_timing timing.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Testing for optional none timing kwargs.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Removed 'cents' from minor currency denominations to avoid ambiguity issues with cardinals.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Wrote quick readme explaining orthography variation for French ITN.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Update README.md
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Update README.md
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Update README.md
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Removed pence from minor currencies, added back in.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Joaquin Anton <janton@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
2021-10-05 13:13:03 -07:00
Micha Livne
ec6591e76a
Experiment manager step timing ( #2936 )
...
* 1. Enabled encoder/decoder with different size in bottleneck architecture.
2. Validating encoder/decoder with the same size in non-bottleneck parent class.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed typo.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added hidden_size to error message.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing defaults.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixing CI tests to have same hidden_size.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated error message.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating Jenkins CI test.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating CI to hidden=48
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing hidden_size when loading pre-trained huggingface model.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing hidden_size in config for pre-trained models.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated missing hidden_size in config.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Testing encoder and decoder objects' hidden_size instead of config to support pre-trained models.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated Jenkinsfile test values.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed Jenkinsfile test values (NMT Megatron Model Parallel Size 2 Encoder)
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating missing arguments for Jenkinsfile test.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added a generic timer class.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Renamed file.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. exp_manager timing of train/val/test using callbacks is ready.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Step timing hooks are tested. Logging does not record values due to a bug (should be solved with upgraded ptl)
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added workaround hooks to MTEncDecModel.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed logging issue. All NeMo models support timing.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Removed unused timer object.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added missing copyright.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. NamedTimer supports multiple reductions.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Removed leftover file.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating code to latest.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated docstring.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added exp_manager.step_timing_sync_cuda to config to enable cuda sync on start/stop (False by default).
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed variable names.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added exp_manager.step_timing_kwargs nested config for clarity and future extensibility.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed formatting.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added train_backward_timing timing.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Testing for optional none timing kwargs.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
2021-10-01 17:53:37 -04:00
Evelina
5f5a9a0a1d
TN infer ( #2929 )
...
* en_small grammars added
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* infer fix
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* add whitelist arg
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* add input fall back
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* docstring
Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-30 15:41:04 -07:00
tbartley94
d0c97aab6a
Itn fr ( #2921 )
...
* First commit. French ITN grammars for tagger and verbalizer. Test for French inverse_normalize added to tests. inverse_text_normalize updated to allow 'fr' tag. tools/text_processing/deployment/pynini_export.py updated to accept 'fr' tag. All CI tests for grammars passed.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Ran style checker.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Fixed bug causing ordinals to fail sparrowhawk test when verbalizing as roman numbers.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* style change for verbalizer/ordinal.py
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Delete test.py
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Cleaning up unused import spaces for lgtm check.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* taggers/time.py missed style checker
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* inverse_text_normalization/fr lacked an __init__ file
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Fixing copyright wording and adding whitelisting for titles.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Fixing copyright headers.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* copyright header change (missed whitelist)
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Edited export_grammars.sh notes to include 'fr'. Made verbalizer/decimal.py rewrite class part of main class instead.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* Adjusting copyright headers for tests.
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* inverse_text_normalization/fr/__init__ copyright header
Signed-off-by: tbartley94 <tbartley@nvidia.com>
* adding __init__ file to fr/data
Signed-off-by: tbartley94 <tbartley@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
2021-09-30 14:00:28 -07:00
Yang Zhang
6d7f1a5339
add fix to not add dot everywhere ( #2885 )
...
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
2021-09-29 11:22:24 -07:00
PeganovAnton
e524be390d
Fix several bugs in punctuation and capitalization inference and make minor improvements ( #2905 )
...
* Add save labels arg to method and remove device setting
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix device bug and reading plain text bug
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Make minor improvements
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Remove excess parameter
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
2021-09-29 14:21:56 +03:00
Eric Harper
58bc1d2c6c
Merge r1.4 bugfixes to main ( #2918 )
...
* update package info
Signed-off-by: ericharper <complex451@gmail.com>
* update branch for jenkinsfile and dockerfile
Signed-off-by: ericharper <complex451@gmail.com>
* Adding conformer-transducer models. (#2717 )
* added the models.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* added contextnet models.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* added german and chinese models.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fix the abs_pos of conformer. (#2863 )
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* update to match sde (#2867 )
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* updated german ngc model (#2871 )
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* Lower bound PTL to safe version (#2876 )
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update notebooks with onnxruntime (#2880 )
Signed-off-by: smajumdar <titu1994@gmail.com>
* Upperbound PTL (#2881 )
Signed-off-by: smajumdar <titu1994@gmail.com>
* minor typo and broken link fixes (#2883 )
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* Remove numbers from TTS tutorial names (#2882 )
* Remove numbers from TTS tutorial names
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
* Update documentation links
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
* Typos (#2884 )
* segmentation tutorial fix
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* data fixes
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* updated the messages in eval_beamsearch_ngram.py. (#2889 )
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* style (#2890 )
Signed-off-by: Jason <jasoli@nvidia.com>
* Fix broken link (#2891 )
* fix broken link
Signed-off-by: fayejf <fayejf07@gmail.com>
* more
Signed-off-by: fayejf <fayejf07@gmail.com>
* Update sclite eval for new transcription method (#2893 )
* Update sclite to use updated inference
Signed-off-by: smajumdar <titu1994@gmail.com>
* Remove WER
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update sclite script to use new inference methods
Signed-off-by: smajumdar <titu1994@gmail.com>
* Remove hub 5
Signed-off-by: smajumdar <titu1994@gmail.com>
* Fix TransformerDecoder export - r1.4 (#2875 )
* export fix
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* embedding pos
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* remove bool param
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* changes
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
* Update Finetuning notebook (#2906 )
* update notebook
Signed-off-by: Jason <jasoli@nvidia.com>
* rename
Signed-off-by: Jason <jasoli@nvidia.com>
* rename
Signed-off-by: Jason <jasoli@nvidia.com>
* revert branch to main
Signed-off-by: ericharper <complex451@gmail.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
2021-09-28 20:13:55 -06:00
Joaquin Anton
c88cfc42eb
Add DALI dataset unit test ( #2904 )
...
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
2021-09-28 12:46:25 -07:00
Micha Livne
3d678dbff1
Nmt encoder decoder hidden size fix ( #2856 )
...
* 1. Enabled encoder/decoder with different size in bottleneck architecture.
2. Validating encoder/decoder with the same size in non-bottleneck parent class.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed typo.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added hidden_size to error message.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing defaults.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixing CI tests to have same hidden_size.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated error message.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating Jenkins CI test.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating CI to hidden=48
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing hidden_size when loading pre-trained huggingface model.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing hidden_size in config for pre-trained models.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated missing hidden_size in config.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Testing encoder and decoder objects' hidden_size instead of config to support pre-trained models.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updated Jenkinsfile test values.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed Jenkinsfile test values (NMT Megatron Model Parallel Size 2 Encoder)
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Updating missing arguments for Jenkinsfile test.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
2021-09-28 10:08:24 -06:00
Vitaly Lavrukhin
0c7fbad290
Updated docs ( #2911 )
...
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
2021-09-27 21:13:57 -07:00
Evelina
fd3b4552a0
typos ( #2909 )
...
Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-27 17:31:11 -07:00
Carol Anderson
9d83a1893b
add zero shot intent model ( #2861 )
...
* add zero shot intent model
Signed-off-by: Carol Anderson <carola@nvidia.com>
* update copyright headers for zero shot
Signed-off-by: Carol Anderson <carola@nvidia.com>
* fix typos in zero shot tutorial
Signed-off-by: Carol Anderson <carola@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-09-27 15:41:09 -07:00
Evelina
d08f1dc91d
format update ( #2901 )
...
Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-26 14:38:07 -07:00
Vitaly Lavrukhin
5e51840ed5
SDE Updates ( #2900 )
...
* Removed text keywords from filters in SDE (to support as values)
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
* Added signal metrics to SDE
Added SDE histograms for all numeric attributes
Improved SDE UI
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
* Updated code style
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
* Updated SDE requirements
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
* Updated docs (SDE + minor fixes)
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
* Updated docs
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
2021-09-26 12:26:09 -07:00
Evelina
8cf9aad8ec
update readme with the tools sections ( #2895 )
...
Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-24 21:44:14 -07:00
Micha Livne
3f6aee0433
1. Updated Jenkinsfile hidden_size. ( #2892 )
...
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
2021-09-24 16:10:19 -06:00
Evelina
ed2005eda9
TN/ITN update ( #2854 )
...
* from file added for all modes
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* directions map
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* decoder eval
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* separate eval and inference added
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* handle unk tokens and proper pre-post processing
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* clean up
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* review feedback
Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-22 13:44:24 -07:00
Joaquin Anton
ad75db34c8
Fix DALI log default floor and normalization formula ( #2869 )
...
* Fix DALI log default floor and normalization formula
Signed-off-by: Joaquin Anton <janton@nvidia.com>
* Fix style
Signed-off-by: Joaquin Anton <janton@nvidia.com>
2021-09-22 11:29:25 -07:00
Joaquin Anton
6987b6cfc9
Fix window_stride calculation in DALI pipeline & Fix dither generation ( #2858 )
...
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
2021-09-21 13:17:29 -07:00
PeganovAnton
660e401db5
Feat/punctuation capitalization/long queries signoff ( #2683 )
...
* Move files from long_queries branch
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* remove sys.path modification
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Update tests
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Remove unused imports
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Move all code to punctuate_capitalize.py
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix minor bug
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Improve help message
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Improve help message
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add docstrings and typing
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Add remark about default parameter values
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* refactor
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Refactor
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix typo
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix typo
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix script name
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix script name
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
* Fix code style
Signed-off-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-09-21 16:31:24 +03:00
Jason
08f060a80e
Remove file ( #2855 )
...
Signed-off-by: Jason <jasoli@nvidia.com>
2021-09-20 14:17:45 -04:00
Oktai Tatanov
3cde074436
New TTSDataset, tts tokenizers and g2ps ( #2792 )
...
* new vocabs and g2ps for tts
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* fix style
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update tts torch data
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update g2p modules, data and add example for tts vocabs
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* fix style
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update tts dataset
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add tokens field to tts dataset
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update tts dataset
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add TTSDataset and docs for all of them
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* fix paths in yaml
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update test for tts dataset
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add heteronyms-030921 file to scripts folder
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* change requirements_torch_tts.txt
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* add tts_data_types.py
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* fix style tts_data_types.py
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update yaml and comments
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update cmu dict and tts ds config
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* remove unnecessary argument from tokenizers
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
* update test
Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
2021-09-20 16:20:12 +03:00
Nithin Rao
3f606194f2
Update model names ( #2845 )
...
* updated speaker model names
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
* update tutorial model names
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
2021-09-19 18:58:04 -07:00
Yang Zhang
b1e1494688
Tn punct train ( #2824 )
...
* add punct to tn inference and test
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* change code to train
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* fix error
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* fix error in tagger dataset
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* fix class based evaluation to print error
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* fix input processing
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* add lang to combine processed datasets
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Co-authored-by: ekmb <ebakhturina@nvidia.com>
2021-09-17 13:01:24 -07:00
Evelina
e443d71f28
pkl name fix ( #2843 )
...
Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-17 10:34:34 -07:00
Nithin Rao
128b22d147
import fix ( #2821 )
...
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
2021-09-17 09:39:16 -04:00
Somshubra Majumdar
bb4565c1b3
Update collection of pretrained models for RNNT ( #2837 )
...
* Update collection of pretrained models for RNNT
Signed-off-by: smajumdar <titu1994@gmail.com>
* Remove non-public Conformer MLS Medium
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-09-16 18:50:25 -07:00
Sandeep Subramanian
76a8459b93
New pretrained NMT model links ( #2836 )
...
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
2021-09-16 17:07:38 -07:00
Micha Livne
f4523c57ac
Max pooling encoder ( #2774 )
...
* 1. Added a max pooling encoder.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Removed unused imports.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added logging of log var q(z|x) for MIM and VAE.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added control parameter to use the mean of latent code (instead of sampling) during translation for MIM and VAE.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Added AveragePoolingEncoder (arch == "avg_pool") encoder.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Update documentation in YAML config.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed missing support for returning score during batch translation.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed format.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed projection of latent to decoder hidden.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Consolidated max and average pooling into a single class.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Debugging.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
* 1. Fixed style.
Signed-off-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Micha Livne <mlivne@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-09-16 17:05:11 -07:00
Somshubra Majumdar
5401e0fa27
Fix DALI error encountered with pad_to=0 ( #2827 )
...
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-09-16 14:18:45 -07:00
Evelina
bb39528f4f
tar dataset for TN/ITN ( #2826 )
...
* tar dataset added
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* typo and ci test
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* jenkins format
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* jenkins
Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-16 10:33:56 -07:00
Vahid Noroozi
f96bf26a77
increased the precision of validation metric to be saved into the checkpoint file names. ( #2811 )
...
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Co-authored-by: Jason <jasoli@nvidia.com>
2021-09-16 10:03:00 -04:00
Elena Rastorgueva
aced0db13e
ITN Spanish ( #2489 )
...
* add Spanish ITN for cardinals and decimals (currently displaces English rules)
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Refactor ITN so English and Spanish code is side by side
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Spanish ITN rules for electronic
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Spanish ITN rules for money
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Spanish ITN rules for money
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Apply simple style fixes
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Spanish ITN rules for ordinals
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Fix 'doscientos' typo
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Spanish ITN rules for telephone numbers
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Apply style fixes
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Fix bug (NEMO_CHAR was being modified)
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Spanish ITN rules for time
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Apply style fixes
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Move ITN utils to language-specific folder
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Make separate test script folders for each language
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Make Cardinal class not convert numbers less than 10
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Spanish ITN Date rules
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Rename variables in Time rules
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Apply style fixes
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Spanish ITN WhiteList rules
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Apply style fixes
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Word test cases
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Change Ordinal 'suffix' to 'morphosyntactic_features'
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Spanish to Sparrowhawk test scripts
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Remove unused imports
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Allow decimals to have a punto as well as a coma
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Fix typos
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Spanish ClassifyFst caching
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Fix Money class bug
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add Es Cardinal rules up to one septillion, still ignoring 'y' in cardinals
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Fix Cardinal bug which inserted extra zeros
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Fix decimal rules bug
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add more Ordinal cases and don't convert ordinals less than 10
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add more units to MeasureFst
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Added currencies to Money class
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Make er ending in Ordinals be superscript
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add TimeFst tagger comments
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Update headers
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add missing __init__.py file
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* dco fix for Elena's branch
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* Fix Darg name in docstring
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Update headers
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Update TelephoneFst tagger docstring
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Make ElectronicFst also convert URLs
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Fix cardinal bug which converted e.g. ,uno to ,1
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Add cache_dir to CI tests
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Install numba=0.54
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Install numba=0.54.0
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Install numba==0.53.1
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Fix ru -> es typo in CI tests
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
* Fix typo in CI test
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Yang Zhang <yangzhang@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: ekmb <ebakhturina@nvidia.com>
2021-09-16 01:01:50 -07:00
Somshubra Majumdar
3a08a3ff8f
Update ContextNet RNNT configs ( #2819 )
...
* Fix pretrained model info for zh models
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update ContextNet configs
Signed-off-by: smajumdar <titu1994@gmail.com>
* Update ContextNet configs
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-09-15 15:46:19 -07:00
Somshubra Majumdar
a0dc5b5912
Enforce numba compat ( #2823 )
...
* Enforce numba compat
Signed-off-by: smajumdar <titu1994@gmail.com>
* Remove all RNNT tests temporarily
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-09-15 14:06:52 -07:00
Vahid Noroozi
8f88a56327
Add conformer transducer configs ( #2812 )
...
* increased the precision of validation metric to be saved into the checkpoint file names.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* added the configs for the conformer-transducer models.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* added the configs for the conformer-transducer models.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
* fixed type.
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
2021-09-14 15:47:48 -07:00
Somshubra Majumdar
a33ec491c2
Temporarily disable numba cuda tests from running ( #2820 )
...
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-09-14 13:19:52 -07:00
Yang Zhang
f94608ab4d
Tn fix bugs ( #2815 )
...
* explicitly set weight to choose deterministic rule, important for SH
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* fix whitelist test case
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* added more symbols support for itn electronic
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* adding url to itn
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* prevent case where single cardinal, e.g. 4 without suffix is recognized as time
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* date does not accept standalone month anymore
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* style fix
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* add decimalx to measure
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* cardinal times
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* cardinal times
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* add updated en grammars
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* fix ci
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
* comment out tn with audio tests
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Co-authored-by: ekmb <ebakhturina@nvidia.com>
2021-09-14 10:36:27 -07:00
Vahid Noroozi
8b1c6e7b6d
Added support for HF pretrained models. Fixed the docs. ( #2658 )
2021-09-14 00:27:25 -07:00
Evelina
fb6b3b83b6
non-deterministic norm update ( #2787 )
...
* update script for large files
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* write intermediate result to a file
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* file renamed
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* expose n_jobs arg
Signed-off-by: ekmb <ebakhturina@nvidia.com>
* new grammars
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-09-13 18:50:21 -07:00
Taejin Park
f71ee4f08b
Update ASR_with_SpeakerDiarization.ipynb tutorial ( #2800 )
...
* Initial manuscript for ASR with diarization tutorial
Signed-off-by: Taejin Park <tango4j@gmail.com>
* Updated ASR_with_SpeakerDiarization.ipynb tutorial notebook
Signed-off-by: Taejin Park <tango4j@gmail.com>
* typo and minor fix
Signed-off-by: fayejf <fayejf07@gmail.com>
* Made minor cell order changes.
Signed-off-by: Taejin Park <tango4j@gmail.com>
Co-authored-by: fayejf <fayejf07@gmail.com>
2021-09-13 14:04:44 -07:00