Commit graph

508 commits

Author SHA1 Message Date
tbartley94 1106ff93c0
WFST_tutorial for ITN development (#3128)
* Pushing WFST_tutorial for open draft. (Still need to review Colab code.)

Signed-off-by: tbartley94 <tbartley@nvidia.com>

* Checked that the tutorial code for WFST_Tutorial is functioning properly. Also included some formatting edits.

Signed-off-by: tbartley94 <tbartley@nvidia.com>

* Responding to editorial comments for WFST_tutorial

Signed-off-by: tbartley94 <tbartley@nvidia.com>

* Added images to folder and wrote README for tutorials

Signed-off-by: tbartley94 <tbartley@nvidia.com>

* Few more editorial changes to explain permutations in classification.

Signed-off-by: tbartley94 <tbartley@nvidia.com>

* Updated tutorials documentation page.

Signed-off-by: tbartley94 <tbartley@nvidia.com>

* Forgot links for README

Signed-off-by: tbartley94 <tbartley@nvidia.com>

* TOC links were dead

Signed-off-by: tbartley94 <tbartley@nvidia.com>

* More dead links to fix.

Signed-off-by: tbartley94 <tbartley@nvidia.com>

* removing Colab install and appending a warning instead.

Signed-off-by: tbartley94 <tbartley@nvidia.com>

* Update WFST_Tutorial.ipynb

Signed-off-by: tbartley94 <tbartley@nvidia.com>
2021-11-09 12:18:19 -08:00
Yang Zhang 875f54464a
update english tn ckpt (#3143)
* update english tn ckpt

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* remove unused import

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
2021-11-05 11:24:47 -07:00
Yang Zhang 3fe7308a37
Tn add nn wfst and doc (#3135)
* made tagger exportable

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* added whitelist wfst for nn

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* updated documentation

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* remove experimental

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* updated doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* made tagger exportable

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* added whitelist wfst for nn

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* updated documentation

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* remove experimental

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* updated doc

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* preserve punct after nn wfst

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
2021-11-04 11:48:26 -07:00
Somshubra Majumdar 9f99918974
Add Transducer documentation (#3015)
* Add RNNT documentation

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert unnecessary changes

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update docs for RNNT

Signed-off-by: smajumdar <titu1994@gmail.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-10-21 10:40:11 -07:00
Eric Harper 91fd9ea970
Merge final doc and bug fixes from r1.4.0 to main (#2952)
* update branch for jenkinsfile and dockerfile

Signed-off-by: ericharper <complex451@gmail.com>

* Typos (#2884)

* segmentation tutorial fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* data fixes

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Minor Fixes (#2922)

* typo

Signed-off-by: Jason <jasoli@nvidia.com>

* remove notebook from docs

Signed-off-by: Jason <jasoli@nvidia.com>

* Adding Conformer-Transducer docs. (#2920)

* added Conformer-Transducer docs.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Added contextnet.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed the title.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Fix numba spec augment for cases where batch size > MAX_THREAD_BUFFER (#2924)

* Fix numba spec augment for cases where batch size > MAX_THREAD_BUFFER

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert print in test

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update readme for r1.4.0 (#2927)

* Updated readme for r1.4.0.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Updated readme for r1.4.0.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Updated readme for r1.4.0.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Updated readme for r1.4.0.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Updated readme for r1.4.0.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Updated readme for r1.4.0.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Updated readme for r1.4.0.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* New NMT Models (#2925)

* New pretrained models

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update NMT docs

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Co-authored-by: Eric Harper <complex451@gmail.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* revert

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
2021-10-06 08:21:54 -06:00
Eric Harper 58bc1d2c6c
Merge r1.4 bugfixes to main (#2918)
* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* update branch for jenkinsfile and dockerfile

Signed-off-by: ericharper <complex451@gmail.com>

* Adding conformer-transducer models. (#2717)

* added the models.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* added contextnet models.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* added German and Chinese models.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fix the abs_pos of conformer. (#2863)

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* update to match sde (#2867)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* updated german ngc model (#2871)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Lower bound PTL to safe version (#2876)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update notebooks with onnxruntime (#2880)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Upperbound PTL (#2881)

Signed-off-by: smajumdar <titu1994@gmail.com>

* minor typo and broken link fixes (#2883)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Remove numbers from TTS tutorial names (#2882)

* Remove numbers from TTS tutorial names

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Update documentation links

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* Typos (#2884)

* segmentation tutorial fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* data fixes

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* updated the messages in eval_beamsearch_ngram.py. (#2889)

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* style (#2890)

Signed-off-by: Jason <jasoli@nvidia.com>

* Fix broken link (#2891)

* fix broken link

Signed-off-by: fayejf <fayejf07@gmail.com>

* more

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update sclite eval for new transcription method (#2893)

* Update sclite to use updated inference

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove WER

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update sclite script to use new inference methods

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove hub 5

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix TransformerDecoder export - r1.4 (#2875)

* export fix

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* embedding pos

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* remove bool param

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

* changes

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Update Finetuning notebook (#2906)

* update notebook

Signed-off-by: Jason <jasoli@nvidia.com>

* rename

Signed-off-by: Jason <jasoli@nvidia.com>

* rename

Signed-off-by: Jason <jasoli@nvidia.com>

* revert branch to main

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
2021-09-28 20:13:55 -06:00
Vitaly Lavrukhin 0c7fbad290
Updated docs (#2911)
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
2021-09-27 21:13:57 -07:00
Evelina fd3b4552a0
typos (#2909)
Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-27 17:31:11 -07:00
Evelina d08f1dc91d
format update (#2901)
Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-26 14:38:07 -07:00
Vitaly Lavrukhin 5e51840ed5
SDE Updates (#2900)
* Removed text keywords from filters in SDE (to support as values)

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>

* Added signal metrics to SDE
Added SDE histograms for all numeric attributes
Improved SDE UI

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>

* Updated code style

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>

* Updated SDE requirements

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>

* Updated docs (SDE + minor fixes)

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>

* Updated docs

Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
2021-09-26 12:26:09 -07:00
Evelina 8cf9aad8ec
update readme with the tools sections (#2895)
Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-24 21:44:14 -07:00
PeganovAnton 660e401db5
Feat/punctuation capitalization/long queries signoff (#2683)
* Move files from long_queries branch

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* remove sys.path modification

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Update tests

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Remove unused imports

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Move all code to punctuate_capitalize.py

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix minor bug

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Improve help message

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Improve help message

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add docstrings and typing

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Add remark about default parameter values

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* refactor

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Refactor

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix typo

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix script name

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix script name

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* Fix code style

Signed-off-by: PeganovAnton <peganoff2@mail.ru>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-09-21 16:31:24 +03:00
Nithin Rao 3f606194f2
Update model names (#2845)
* updated speaker model names

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update tutorial model names

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
2021-09-19 18:58:04 -07:00
Evelina bb39528f4f
tar dataset for TN/ITN (#2826)
* tar dataset added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* typo and ci test

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-09-16 10:33:56 -07:00
Vahid Noroozi 8b1c6e7b6d
Added support for HF pretrained models. Fixed the docs. (#2658)
2021-09-14 00:27:25 -07:00
Kiran Scaria ca2425d422
fixed typo (#2796)
Signed-off-by: kiranscaria <kiranscaria@outlook.com>
2021-09-09 00:57:32 -07:00
Nithin Rao 0aa5b4526a
Move speaker folders (#2777)
* initial push

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

change folder

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

readme

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Create README.md

initial diar readme

scp_manifest

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

rebase and move folders

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

updated scp to manifest script

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

small_fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Update README.md

add recognition read me

tutorial update

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

initial push

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

readme

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

scp_manifest

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

rebase and move folders

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

updated scp to manifest script

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

add recognition read me

tutorial update

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

add diarization README

initial push

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

readme

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

scp_manifest

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

rebase and move folders

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

updated scp to manifest script

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

add recognition read me

tutorial update

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

initial push

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

readme

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

scp_manifest

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

rebase and move folders

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

updated scp to manifest script

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

add recognition read me

tutorial update

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Updated README.md 001

Updated README.md and committing for saving purpose

Update README.md

conf changes

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Update README.md 002

Added examples for input and output.

Added diarization_utils.py and asr_with_diarization.py

Signed-off-by: Taejin Park <tango4j@gmail.com>

slight changes diarization

oracle null and style --fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Reflected LGTM comments.

Signed-off-by: Taejin Park <tango4j@gmail.com>

reflected changes

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

remove duplicate seeds

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Reflected PR review and removed unused variables

Signed-off-by: Taejin Park <tango4j@gmail.com>

Update README.md 003

Added a few titles and revised the descriptions.

Update README.md 003

Added a few titles and revised the descriptions.

Signed-off-by: Taejin Park <tango4j@gmail.com>

scripts and tutorial link fixes

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

LGTM fixes

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

Added more docstrings and reused get_DER

Signed-off-by: Taejin Park <tango4j@gmail.com>

style fix

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* update ecapa config

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
2021-09-08 20:58:08 -07:00
Eric Harper 234e496fcf
Merge final bugfix r1.3.0 (#2749)
* update jenkins branch

Signed-off-by: ericharper <complex451@gmail.com>

* update notebooks branch

Signed-off-by: ericharper <complex451@gmail.com>

* Replaced unfold() with split_view() (#2671)

* Replaced unfold() with split_view()

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* fixed typo

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* Fix issues with ASR notebooks (#2698)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Allow non divisible split_size (#2699)

* bugfix

Signed-off-by: Jason <jasoli@nvidia.com>

* bugfix

Signed-off-by: Jason <jasoli@nvidia.com>

* Fix the feat_out param. (#2714)

* broken link fix (#2720)

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* rename (#2721)

Signed-off-by: fayejf <fayejf07@gmail.com>

* apply fix (#2726)

Signed-off-by: Jason <jasoli@nvidia.com>

* [DOCS] Updating adobe and copyright for docs (#2740)

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update

Signed-off-by: ericharper <complex451@gmail.com>

* update notebook branch

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins branch

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins test to use less memory

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins test to use less memory

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
2021-08-31 11:52:03 -06:00
Yang Zhang c24e428564
correcting some typos (#2741)
Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
2021-08-27 18:24:51 -07:00
Eric Harper 2ff89fdf56
Merge 1.3 bugfixes into main (#2715)
* update jenkins branch

Signed-off-by: ericharper <complex451@gmail.com>

* update notebooks branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info

Signed-off-by: ericharper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* update nemo version for Dockerfile

Signed-off-by: ericharper <complex451@gmail.com>

* update notebook branch

Signed-off-by: ericharper <complex451@gmail.com>

* Update colab links to Transducer notebooks (#2654)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix nmt grpc server, concatdataset for raw text files (#2656)

* Fix nmt grpc server and concatdataset for raw text files

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Check if lang direction is provided correctly

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* add missing init (#2662)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix qa inference for single example (#2668)

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* Fix max symbol per step updating for RNNT (#2672)

* Fix max symbol per step updating for RNNT

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fix notebooks

Signed-off-by: smajumdar <titu1994@gmail.com>

* Replaced unfold() with split_view() (#2671)

* Replaced unfold() with split_view()

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* fixed typo

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* Correct voice app demo (#2682)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Import guard (#2692)

* add asr and pynini import guard

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove asrmodel type

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove asrmodel type

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fixing branch (#2695)

Signed-off-by: Ghasem Pasandi <gpasandi@nvidia.com>

Co-authored-by: Ghasem Pasandi <gpasandi@nvidia.com>

* fix for emojis (#2675)

* fix for emojis

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove redundant line

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* raise error

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* use app_state

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* Fix issues with ASR notebooks (#2698)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Allow non divisible split_size (#2699)

* bugfix

Signed-off-by: Jason <jasoli@nvidia.com>

* bugfix

Signed-off-by: Jason <jasoli@nvidia.com>

* TN fix for corner cases (#2689)

* serial added, weights to common defaults, decimal bug fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* one failing

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* all tests pass

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove redundant file

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix telephone, add test cases

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* money fix

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* clean format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix edge case of greedy decoding for greedy_batch mode (#2701)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove time macro (#2703)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Minor FastPitch Fixes (#2697)

* fixes

Signed-off-by: Jason <jasoli@nvidia.com>

* update CI

Signed-off-by: Jason <jasoli@nvidia.com>

* refix

Signed-off-by: Jason <jasoli@nvidia.com>

* Fix ddp error. (#2678)

To avoid "MisconfigurationException: Selected distributed backend ddp is not compatible with an interactive environment." error.

Co-authored-by: ekmb <ebakhturina@nvidia.com>

* update jenkins

Signed-off-by: ericharper <complex451@gmail.com>

* update notebooks

Signed-off-by: ericharper <complex451@gmail.com>

* add split_view back

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Ghasem <35242805+pasandi20@users.noreply.github.com>
Co-authored-by: Ghasem Pasandi <gpasandi@nvidia.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: khcs <khcs@users.noreply.github.com>
Co-authored-by: ekmb <ebakhturina@nvidia.com>
2021-08-24 16:21:59 -06:00
Aleksey Grinchuk (Oleksii Hrinchuk) 94126c4b65
Language model training docs (#2644)
* fixed branch in IR tutorial

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* ddp translate GPU allocation fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* map_location instead of set_device

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* lm training docs initial commit

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* updated docs

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* final fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
2021-08-14 15:18:10 +02:00
fayejf 7f01d2a51a
VAD postprocessing - binarization, filtering (#2636)
* Add more parameters for binarization

Signed-off-by: fayejf <fayejf07@gmail.com>

* more

Signed-off-by: fayejf <fayejf07@gmail.com>

* optimize and update diarizer

Signed-off-by: fayejf <fayejf07@gmail.com>

* updates

Signed-off-by: fayejf <fayejf07@gmail.com>

* binarization and order of filtering for plot

Signed-off-by: fayejf <fayejf07@gmail.com>

* updates

Signed-off-by: fayejf <fayejf07@gmail.com>

* add dur offset for plotting

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix and clean

Signed-off-by: fayejf <fayejf07@gmail.com>

* style fix and updates

Signed-off-by: fayejf <fayejf07@gmail.com>

* rename

Signed-off-by: fayejf <fayejf07@gmail.com>

* small updates

Signed-off-by: fayejf <fayejf07@gmail.com>

* typo

Signed-off-by: fayejf <fayejf07@gmail.com>

* fix lgtm

Signed-off-by: fayejf <fayejf07@gmail.com>

* add back support threshold

Signed-off-by: fayejf <fayejf07@gmail.com>

* update yaml to be of ch109

Signed-off-by: fayejf <fayejf07@gmail.com>

* style fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* update Speaker_Diarization_Inference.ipynb

Signed-off-by: fayejf <fayejf07@gmail.com>

* updates Online_Offline_Microphone_VAD_Demo.ipynb

Signed-off-by: fayejf <fayejf07@gmail.com>

* update config in sd rst

Signed-off-by: fayejf <fayejf07@gmail.com>

* updated from feedback

Signed-off-by: fayejf <fayejf07@gmail.com>

* updates

Signed-off-by: fayejf <fayejf07@gmail.com>

* update yaml about dev

Signed-off-by: fayejf <fayejf07@gmail.com>

* add missing description in docstring

Signed-off-by: fayejf <fayejf07@gmail.com>

* typo and dir name

Signed-off-by: fayejf <fayejf07@gmail.com>

* style fix

Signed-off-by: fayejf <fayejf07@gmail.com>
2021-08-12 21:04:03 -07:00
Tuan Manh Lai 4abe5d5f6d
Add back tagger data augmentation + Fixes for analyze_errors.py (#2637)
* Fixes to analyze_errors.py
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add back tagger data augmentation
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add duplex neural tn to README
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Fixed typos
Signed-off-by: Tuan Lai <tuanl@nvidia.com>
2021-08-11 10:16:49 -07:00
Somshubra Majumdar c9d04851e8
Rename ASR tutorials (#2633)
* Add README.md

Signed-off-by: smajumdar <titu1994@gmail.com>

* Refactor 01_ASR_with_NeMo.ipynb

Signed-off-by: smajumdar <titu1994@gmail.com>

* Refactor 02_Online_ASR_Microphone_Demo.ipynb and 03_Speech_Commands.ipynb

Signed-off-by: smajumdar <titu1994@gmail.com>

* Refactor 04_Online_Offline_Speech_Commands_Demo.ipynb

Signed-off-by: smajumdar <titu1994@gmail.com>

* Refactor 05_Online_Noise_Augmentation.ipynb

Signed-off-by: smajumdar <titu1994@gmail.com>

* Refactor 06_Voice_Activity_Detection.ipynb

Signed-off-by: smajumdar <titu1994@gmail.com>

* Refactor 07_Online_Offline_Microphone_VAD_Demo.ipynb

Signed-off-by: smajumdar <titu1994@gmail.com>

* Refactor 08_ASR_with_Subword_Tokenization.ipynb

Signed-off-by: smajumdar <titu1994@gmail.com>

* Refactor 10_ASR_CTC_Language_Finetuning.ipynb

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add transducer image

Signed-off-by: smajumdar <titu1994@gmail.com>
2021-08-10 12:58:26 -07:00
Tuan Manh Lai 7e6197d33d
Fixed an error related to the task indicator of the tagger during inference. (#2627)
* A minor fix to _infer() of DuplexTaggerModel
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Remove tagger data augmentation
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Minor fixes to cache path
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Style fix
Signed-off-by: Tuan Lai <tuanl@nvidia.com>
2021-08-09 14:32:11 -07:00
Eric Harper 0349415662
Merge bugfixes and doc updates from 1.2.0 to main (#2533)
* update jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* update BRANCH

Signed-off-by: ericharper <complex451@gmail.com>

* update package_info.py

Signed-off-by: ericharper <complex451@gmail.com>

* Update Dockerfile numba install for 21.06 (#2515)

* update Dockerfile

Signed-off-by: ericharper <complex451@gmail.com>

* update Dockerfile

Signed-off-by: ericharper <complex451@gmail.com>

* upper bound ptl 1.4 (#2517)

Signed-off-by: ericharper <complex451@gmail.com>

* Typo correction in asr streaming tutorial (#2520)

* Corrected typos

Signed-off-by: jbalam <jbalam@nvidia.com>

* datalayer->data layer

Signed-off-by: jbalam <jbalam@nvidia.com>

* Jarvis to Riva changes for doc 1.2.0 (#2521)

* Change Jarvis to Riva in export.rst (#2529)

Signed-off-by: Herb Kelly <hkelly@nvidia.com>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update version

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Jagadeesh Balam <4916480+jbalam-nv@users.noreply.github.com>
Co-authored-by: hkelly33 <58792115+hkelly33@users.noreply.github.com>
2021-07-22 09:31:38 -06:00
Tuan Manh Lai b472670afa
Extending the neural TN/ITN models for other languages (#2497)
* extending the neural TN/ITN model to handle RU

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Support German
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Catch AttributeError instead of BaseException
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Style fix
Signed-off-by: Tuan Lai <tuanl@nvidia.com>
2021-07-21 09:22:53 -07:00
Jason 846b150082
Update TTS Docs to recommend fastpitch and hifigan (#2498)
* update docs

Signed-off-by: Jason <jasoli@nvidia.com>

* update

Signed-off-by: Jason <jasoli@nvidia.com>
2021-07-19 16:30:35 -07:00
vadam5 e3f6867dd2
Entity linking documentation (#2357)
* Update tutorials.rst

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Update tutorials.rst

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Update models.rst

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Add files via upload

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Create entity_linking.rst

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Update README.rst

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Update entity_linking.rst

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Update nlp_all.bib

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Update entity_linking.rst

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Update entity_linking.rst

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* fixed base typos and doc link

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
2021-07-19 16:10:19 -07:00
Jagadeesh Balam f215cdb65b
New ASR tutorial for streaming ASR (#2485)
* Added urls to mini librispeech data

Signed-off-by: jbalam <jbalam@nvidia.com>

* Added tutorial for streaming ASR decoder

Signed-off-by: jbalam <jbalam@nvidia.com>

* Added tutorial info to docs

Signed-off-by: jbalam <jbalam@nvidia.com>

* Made changes suggested in the review

Signed-off-by: jbalam <jbalam@nvidia.com>

* Made changes suggested in the review

Signed-off-by: jbalam <jbalam@nvidia.com>
2021-07-15 00:20:45 -07:00
Yang Zhang ed085459c9
refactor text processing ONly code to allow other languages (#2477)
* refactor text processing ONly code to allow other languages

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* refactored test folder structure to divide between languages

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* updated docs

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* add missing file

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix lgtm

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

Co-authored-by: ekmb <ebakhturina@nvidia.com>
2021-07-14 08:17:22 -07:00
Soumye Singhal caf115dde0
Indic Tokenizer boilerplate (#2345)
* Indic Tokenizer boilerplate

* nmt docs fix

* python style fix

Signed-off-by: Soumye Singhal <singhalsoumye@gmail.com>

* typo

Signed-off-by: Soumye Singhal <singhalsoumye@gmail.com>

* Style fixes after merge conflict

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
2021-07-08 13:00:01 -06:00
Tuan Manh Lai 9c1b887172
Transformer-based Text Normalization Models (#2415)
* Add notebook with recommendations for 8 kHz speech (#2326)

* Added a notebook with best practices for telephony speech

* Added datasets details

* Added training recommendations

* Emptied out cells with results

* Added tutorial to docs

Signed-off-by: jbalam <jbalam@nvidia.com>

* Addressed review comments

Signed-off-by: jbalam <jbalam@nvidia.com>

* Added a line to note original sampling rate of an4

Signed-off-by: jbalam <jbalam@nvidia.com>

* Made changes suggested in review

Signed-off-by: jbalam <jbalam@nvidia.com>
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add FastEmit support for RNNT Losses (#2374)

* Temp commit

Signed-off-by: smajumdar <titu1994@gmail.com>

* Initial code for fastemit forward pass

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct return reg value

Signed-off-by: smajumdar <titu1994@gmail.com>

* Initial cpu impl

Signed-off-by: smajumdar <titu1994@gmail.com>

* Try gpu impl

Signed-off-by: smajumdar <titu1994@gmail.com>

* Try gpu impl

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct few impl

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update fastemit scaling

Signed-off-by: smajumdar <titu1994@gmail.com>

* Cleanup fastemit

Signed-off-by: smajumdar <titu1994@gmail.com>

* Finalize FastEmit regularization PR

Signed-off-by: smajumdar <titu1994@gmail.com>

* Refactor code to support fastemit regularization

Signed-off-by: smajumdar <titu1994@gmail.com>

Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Implement inference functions of TN models

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Minor Fix

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* fix bugs in hifigan code (#2392)

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Update setup.py (#2394)

Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* update checkpointing (#2396)

Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* byt5 unicode implementation (#2365)

* Audio Norm (#2285)

* add jenkins test, refactoring

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update test

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix new test

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add serial to the default normalizer, add tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* manifest test added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* expose more params, new test cases

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix jenkins, serial clean, exclude range from cardinal

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins dollar sign format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins dollar sign format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* addressed review comments

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix decimal in measure

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* move serial in cardinal

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* clean up

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update for SH zero -> oh

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* change n_tagger default

Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* bumping version to 1.0.1

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Add check for numba regardless of device

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* upper bound for webdataset

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Correct Dockerfile

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* update readmes

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* update README (#2332)

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* ddp translate GPU allocation fix (#2312)

* fixed branch in IR tutorial

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* ddp translate GPU allocation fix

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* map_location instead of set_device

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Shallow fusion (#2315)

* fixed branch in IR tutorial

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* shallow fusion init commit

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

* debug info removed

Signed-off-by: AlexGrinch <grinchuk.alexey@gmail.com>

Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* [BUGFIX] Add upper bound to hydra for 1.0.x (#2337)

* upper bound hydra

Signed-off-by: ericharper <complex451@gmail.com>

* upper bound hydra

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* update version number

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* update package version

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* sparrowhawk tests + punctuation post processing for pynini TN (#2320)

* add jenkins test, refactoring

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update test

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix new test

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add serial to the default normalizer, add tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* manifest test added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* expose more params, new test cases

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix jenkins, serial clean, exclude range from cardinal

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins dollar sign format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins dollar sign format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* addressed review comments

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix decimal in measure

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* move serial in cardinal

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* sh tests init

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* sparrowhawk container tests support added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add post process to normalize.py, update tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove duplication

Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Update notebooks to 1.0.2 release (#2338)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Update ranges for omegaconf and hydra (#2336)

* Update ranges

Signed-off-by: smajumdar <titu1994@gmail.com>

* Updates for Hydra and OmegaConf updates

Signed-off-by: smajumdar <titu1994@gmail.com>

* Style fixes

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct tests and revert patch for model utils

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct docstring

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert unnecessary change

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert unnecessary change

Signed-off-by: smajumdar <titu1994@gmail.com>

* Guard scheduler for None

Signed-off-by: smajumdar <titu1994@gmail.com>

* default to 0.0 if bpe_dropout is None

Signed-off-by: ericharper <complex451@gmail.com>

* Correctly log class that was restored

Signed-off-by: smajumdar <titu1994@gmail.com>

* Root patch *bpe_dropout

Signed-off-by: smajumdar <titu1994@gmail.com>

Co-authored-by: ericharper <complex451@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Update FastPitch Export (#2355)

Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* byt5 unicode implementation, first cut

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* add bytelevel tokenizer

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* update out_dir to not collide (#2358)

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Update container version to 21.05 (#2309)

* Update container version

Signed-off-by: smajumdar <titu1994@gmail.com>

* Temporarily change export format of waveglow

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add conda update for numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct order of numba minimum version, remove wrong flag from test

Signed-off-by: smajumdar <titu1994@gmail.com>

* Double test of cuda numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Double test of cuda numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Enable RNNT tests

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Text Normalization Update (#2356)

* upper cased date support

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update whitelist, change roman weights

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* docstrings, space fix, init file

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* lgtm

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fraction with measure class

Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* address comment

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Add ASR CTC tutorial on fine-tuning on another language (#2346)

* Add ASR CTC Language finetuning notebook

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add to documentation

Signed-off-by: smajumdar <titu1994@gmail.com>

* Improve documentation

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct name of the dataset

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Correct colab link to notebook (#2366)

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* sgdqa update data directories for testing (#2323)

* sgdqa update data directories for testing

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix syntax

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* check if data dir exists

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* fix

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>

* adding pretrained model

Signed-off-by: Yang Zhang <yangzhang@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Added documentation for export() (#2330)

* Added export document

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Addressed review comments

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Update Citrinet model card info (#2369)

* Update model card info

Signed-off-by: smajumdar <titu1994@gmail.com>

* Cleanup Docs

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* [NMT] Model Parallel Megatron Encoders (#2238)

* add megatron encoder

Signed-off-by: ericharper <complex451@gmail.com>

* added megatron to get_nmt_tokenizer

Signed-off-by: ericharper <complex451@gmail.com>

* add vocab_size and hidden_size to megatron bert

Signed-off-by: ericharper <complex451@gmail.com>

* add megatron encoder module

Signed-off-by: ericharper <complex451@gmail.com>

* fixed horrible typo

Signed-off-by: ericharper <complex451@gmail.com>

* fix typo and add default

Signed-off-by: ericharper <complex451@gmail.com>

* updating nlp overrides for mp nmt

Signed-off-by: ericharper <complex451@gmail.com>

* move some logic back to nlpmodel from overrides

Signed-off-by: ericharper <complex451@gmail.com>

* add checkpoint_file property

Signed-off-by: ericharper <complex451@gmail.com>

* fix property

Signed-off-by: ericharper <complex451@gmail.com>

* num_tokentypes=0

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* find_unused_parameters=True

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* get instead of pop

Signed-off-by: ericharper <complex451@gmail.com>

* remove token type ids from megatron input example

Signed-off-by: ericharper <complex451@gmail.com>

* pop vocab_size

Signed-off-by: ericharper <complex451@gmail.com>

* fix checkpointing for model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* fix bug in non model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* convert cfg.trainer to dict

Signed-off-by: ericharper <complex451@gmail.com>

* make num_tokentypes configurable for nmt

Signed-off-by: ericharper <complex451@gmail.com>

* update checkpoint_file when using named megatron model in nemo

Signed-off-by: ericharper <complex451@gmail.com>

* make vocab_file configurable

Signed-off-by: ericharper <complex451@gmail.com>

* dataclass can't have mutable default

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* unused imports

Signed-off-by: ericharper <complex451@gmail.com>

* revert input example

Signed-off-by: ericharper <complex451@gmail.com>

* check that checkpoint version is not None

Signed-off-by: ericharper <complex451@gmail.com>

* add mp jenkins test

Signed-off-by: ericharper <complex451@gmail.com>

* update docstring

Signed-off-by: ericharper <complex451@gmail.com>

* add docs for pretrained encoders with nemo nmt

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Add notebook with recommendations for 8 kHz speech (#2326)

* Added a notebook with best practices for telephony speech

* Added datasets details

* Added training recommendations

* Emptied out cells with results

* Added tutorial to docs

Signed-off-by: jbalam <jbalam@nvidia.com>

* Addressed review comments

Signed-off-by: jbalam <jbalam@nvidia.com>

* Added a line to note original sampling rate of an4

Signed-off-by: jbalam <jbalam@nvidia.com>

* Made changes suggested in review

Signed-off-by: jbalam <jbalam@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Add FastEmit support for RNNT Losses (#2374)

* Temp commit

Signed-off-by: smajumdar <titu1994@gmail.com>

* Initial code for fastemit forward pass

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct return reg value

Signed-off-by: smajumdar <titu1994@gmail.com>

* Initial cpu impl

Signed-off-by: smajumdar <titu1994@gmail.com>

* Try gpu impl

Signed-off-by: smajumdar <titu1994@gmail.com>

* Try gpu impl

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct few impl

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update fastemit scaling

Signed-off-by: smajumdar <titu1994@gmail.com>

* Cleanup fastemit

Signed-off-by: smajumdar <titu1994@gmail.com>

* Finalize FastEmit regularization PR

Signed-off-by: smajumdar <titu1994@gmail.com>

* Refactor code to support fastemit regularization

Signed-off-by: smajumdar <titu1994@gmail.com>

Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* byt5 unicode implementation, first cut

Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* add bytelevel tokenizer

Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* update styling

Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* avoid circular import

Signed-off-by: Mike Chrzanowski <mchrzanowski@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* fix bugs in hifigan code (#2392)

Signed-off-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Update setup.py (#2394)

Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Update bytelevel_tokenizer.py

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* Update bytelevel_tokenizer.py

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* typo

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* missed one

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* bug fixes

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* style fix

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* bytelevelprocessor is now generic.

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* style fix

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* update checkpointing (#2396)

Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* woops, didn't merge jenkinsfile the right way

* add newline

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* undo changes to enja processor

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* processor selection decision fix

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

* newline fix

Signed-off-by: mchrzanowski <mchrzanowski@nvidia.com>

Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: mchrzanowski <mchrzanowski@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: Jagadeesh Balam <4916480+jbalam-nv@users.noreply.github.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Co-authored-by: root <root@dgx0026.nsv.rno1.nvmetal.net>
Co-authored-by: root <root@dgx0079.nsv.rno1.nvmetal.net>
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Minor Fix

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Minor Fixes

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add TextNormalizationTestDataset and testing/evaluation code

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add TextNormalizationTaggerDataset and training code for tagger

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Restore from local nemo ckpts

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add TextNormalizationDecoderDataset

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add interactive mode for neural_text_normalization_test.py

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add options to do training or not for tagger/decoder

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Renamed

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Implemented setup dataloader for decoder

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Implemented training and validation for decoder

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Data augmentation for decoder training

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Config change

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* add blossom-ci.yml (#2401)

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Merge r1.1 bugfixes into main (#2407)

* Update notebook branch and Jenkinsfile for 1.1.0 testing (#2378)

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* [BUGFIX] NMT Multi-node was incorrectly computing num_replicas (#2380)

* fix property when not using model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* fix property when not using model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* add debug statement

Signed-off-by: ericharper <complex451@gmail.com>

* add debug statement

Signed-off-by: ericharper <complex451@gmail.com>

* instantiate with NLPDDPPlugin with num_nodes from trainer config

Signed-off-by: ericharper <complex451@gmail.com>

* Update ASR scripts for tokenizer building and tarred dataset building (#2381)

* Update ASR scripts for tokenizer building and tarred dataset building

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update container

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add STT Zh Citrinet 1024 Gamma 0.25 model

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update notebook (#2391)

Signed-off-by: smajumdar <titu1994@gmail.com>

* ASR Notebooks fix for 1.1.0 (#2395)

* nb fix for spring clean

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove outdated instruction

Signed-off-by: fayejf <fayejf07@gmail.com>

* Mean normalization (#2397)

* norm embeddings

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* move to utils

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Bugfix adaptive spec augment time masking (#2398)

* bugfix adaptive spec augment

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert freq mask guard

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert freq mask guard

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove static time width clamping

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct typos and issues with notebooks (#2402)

* Fix Primer notebook

Signed-off-by: smajumdar <titu1994@gmail.com>

* Typo

Signed-off-by: smajumdar <titu1994@gmail.com>

* remove accelerator=DDP in tutorial notebooks to avoid errors. (#2403)

Signed-off-by: Hoo Chang Shin <hshin@nvidia.com>

Co-authored-by: Hoo Chang Shin <hshin@nvidia.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkins branch

Signed-off-by: ericharper <complex451@gmail.com>

* update notebook branch to main

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: khcs <khcs@users.noreply.github.com>
Co-authored-by: Hoo Chang Shin <hshin@nvidia.com>
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Remove unused imports

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add initial doc for text_normalization

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Fixed imports warnings

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Minor Fix

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Renamed

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Allowed duplex modes

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Minor Fix

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add docs for duplex_text_normalization_train and duplex_text_normalization_test

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* docstrings for model codes + minor fix

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add more comments and doc strings

Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add doc for datasets + Use time.perf_counter()
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add code for preprocessing Google TN data
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add more docs and comments + Minor Fixes
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add more licenses + Fixed comments + Minors
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Moved evaluation logic to DuplexTextNormalizationModel
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add logging errors
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Updated validation code of tagger + Minors
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Also write tag preds to log file
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add data augmentation for tagger dataset
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Added experimental decorators
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Updated docs
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Updated duplex_tn_config.yaml
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Compute token precision of tagger using NeMo metrics
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Fixed saving issue when using ddp accelerator
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Refactoring
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Add option to keep punctuations in TextNormalizationTestDataset
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Changes to input preprocessing + decoder's postprocessing
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Fixed styles + Add references
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

* Renamed examples/nlp/duplex_text_normalization/utils.py to helpers.py
Signed-off-by: Tuan Lai <tuanl@nvidia.com>

Co-authored-by: Jagadeesh Balam <4916480+jbalam-nv@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Oktai Tatanov <oktai.tatanov@gmail.com>
Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Mike Chrzanowski <mike.chrzanowski0@gmail.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com>
Co-authored-by: Aleksey Grinchuk (Oleksii Hrinchuk) <grinchuk.alexey@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: mchrzanowski <mchrzanowski@nvidia.com>
Co-authored-by: Yang Zhang <yzhang123@users.noreply.github.com>
Co-authored-by: Boris Fomitchev <borisfom@users.noreply.github.com>
Co-authored-by: root <root@dgx0026.nsv.rno1.nvmetal.net>
Co-authored-by: root <root@dgx0079.nsv.rno1.nvmetal.net>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: khcs <khcs@users.noreply.github.com>
Co-authored-by: Hoo Chang Shin <hshin@nvidia.com>
2021-07-07 22:26:37 -07:00
Eric Harper c5dbf4508a
Merge r1.1 bugfixes to main. Update dep versions. (#2437)
* Update notebook branch and Jenkinsfile for 1.1.0 testing (#2378)

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* [BUGFIX] NMT Multi-node was incorrectly computing num_replicas (#2380)

* fix property when not using model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* fix property when not using model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* add debug statement

Signed-off-by: ericharper <complex451@gmail.com>

* add debug statement

Signed-off-by: ericharper <complex451@gmail.com>

* instantiate with NLPDDPPlugin with num_nodes from trainer config

Signed-off-by: ericharper <complex451@gmail.com>

* Update ASR scripts for tokenizer building and tarred dataset building (#2381)

* Update ASR scripts for tokenizer building and tarred dataset building

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update container

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add STT Zh Citrinet 1024 Gamma 0.25 model

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update notebook (#2391)

Signed-off-by: smajumdar <titu1994@gmail.com>

* ASR Notebooks fix for 1.1.0 (#2395)

* nb fix for spring clean

Signed-off-by: fayejf <fayejf07@gmail.com>

* remove outdated instruction

Signed-off-by: fayejf <fayejf07@gmail.com>

* Mean normalization (#2397)

* norm embeddings

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* move to utils

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Bugfix adaptive spec augment time masking (#2398)

* bugfix adaptive spec augment

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert freq mask guard

Signed-off-by: smajumdar <titu1994@gmail.com>

* Revert freq mask guard

Signed-off-by: smajumdar <titu1994@gmail.com>

* Remove static time width clamping

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct typos and issues with notebooks (#2402)

* Fix Primer notebook

Signed-off-by: smajumdar <titu1994@gmail.com>

* Typo

Signed-off-by: smajumdar <titu1994@gmail.com>

* remove accelerator=DDP in tutorial notebooks to avoid errors. (#2403)

Signed-off-by: Hoo Chang Shin <hshin@nvidia.com>

Co-authored-by: Hoo Chang Shin <hshin@nvidia.com>

* [BUGFIX] Megatron in NMT was setting vocab_file to None (#2417)

* make vocab_file configurable for megatron in nmt

Signed-off-by: ericharper <complex451@gmail.com>

* update docs

Signed-off-by: ericharper <complex451@gmail.com>

* update docs

Signed-off-by: ericharper <complex451@gmail.com>

* Link updates in docs and notebooks and typo fix (#2416)

* typo fix for notebooks

Signed-off-by: fayejf <fayejf07@gmail.com>

* tiny typo fix in docs

Signed-off-by: fayejf <fayejf07@gmail.com>

* docs branch->stable

Signed-off-by: fayejf <fayejf07@gmail.com>

* more docs branch -> stable

Signed-off-by: fayejf <fayejf07@gmail.com>

* tutorial links branch -> stable

Signed-off-by: fayejf <fayejf07@gmail.com>

* small fix

Signed-off-by: fayejf <fayejf07@gmail.com>

* add renamed 06

Signed-off-by: fayejf <fayejf07@gmail.com>

* more fixes

Signed-off-by: fayejf <fayejf07@gmail.com>

* Update onnx (#2420)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct version of onnxruntime (#2422)

Signed-off-by: smajumdar <titu1994@gmail.com>

* update deployment instructions (#2430)

Signed-off-by: ericharper <complex451@gmail.com>

* Bumping version to 1.1.0

Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>

* update jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* add upper bounds

Signed-off-by: ericharper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* update requirements

Signed-off-by: ericharper <complex451@gmail.com>

* update jenkinsfile

Signed-off-by: ericharper <complex451@gmail.com>

* update version

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: khcs <khcs@users.noreply.github.com>
Co-authored-by: Hoo Chang Shin <hshin@nvidia.com>
Co-authored-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-07-02 14:22:44 -07:00
Jagadeesh Balam e070e04ea2
Add notebook with recommendations for 8 kHz speech (#2326)
* Added a notebook with best practices for telephony speech

* Added datasets details

* Added training recommendations

* Emptied out cells with results

* Added tutorial to docs

Signed-off-by: jbalam <jbalam@nvidia.com>

* Addressed review comments

Signed-off-by: jbalam <jbalam@nvidia.com>

* Added a line to note original sampling rate of an4

Signed-off-by: jbalam <jbalam@nvidia.com>

* Made changes suggested in review

Signed-off-by: jbalam <jbalam@nvidia.com>
2021-06-17 20:00:41 -07:00
Eric Harper 5cfff20988
[NMT] Model Parallel Megatron Encoders (#2238)
* add megatron encoder

Signed-off-by: ericharper <complex451@gmail.com>

* added megatron to get_nmt_tokenizer

Signed-off-by: ericharper <complex451@gmail.com>

* add vocab_size and hidden_size to megatron bert

Signed-off-by: ericharper <complex451@gmail.com>

* add megatron encoder module

Signed-off-by: ericharper <complex451@gmail.com>

* fixed horrible typo

Signed-off-by: ericharper <complex451@gmail.com>

* fix typo and add default

Signed-off-by: ericharper <complex451@gmail.com>

* updating nlp overrides for mp nmt

Signed-off-by: ericharper <complex451@gmail.com>

* move some logic back to nlpmodel from overrides

Signed-off-by: ericharper <complex451@gmail.com>

* add checkpoint_file property

Signed-off-by: ericharper <complex451@gmail.com>

* fix property

Signed-off-by: ericharper <complex451@gmail.com>

* num_tokentypes=0

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* find_unused_parameters=True

Signed-off-by: ericharper <complex451@gmail.com>

* typo

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* get instead of pop

Signed-off-by: ericharper <complex451@gmail.com>

* remove token type ids from megatron input example

Signed-off-by: ericharper <complex451@gmail.com>

* pop vocab_size

Signed-off-by: ericharper <complex451@gmail.com>

* fix checkpointing for model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* fix bug in non model parallel

Signed-off-by: ericharper <complex451@gmail.com>

* convert cfg.trainer to dict

Signed-off-by: ericharper <complex451@gmail.com>

* make num_tokentypes configurable for nmt

Signed-off-by: ericharper <complex451@gmail.com>

* update checkpoint_file when using named megatron model in nemo

Signed-off-by: ericharper <complex451@gmail.com>

* make vocab_file configurable

Signed-off-by: ericharper <complex451@gmail.com>

* dataclass can't have mutable default

Signed-off-by: ericharper <complex451@gmail.com>

* style

Signed-off-by: ericharper <complex451@gmail.com>

* unused imports

Signed-off-by: ericharper <complex451@gmail.com>

* revert input example

Signed-off-by: ericharper <complex451@gmail.com>

* check that checkpoint version is not None

Signed-off-by: ericharper <complex451@gmail.com>

* add mp jenkins test

Signed-off-by: ericharper <complex451@gmail.com>

* update docstring

Signed-off-by: ericharper <complex451@gmail.com>

* add docs for pretrained encoders with nemo nmt

Signed-off-by: ericharper <complex451@gmail.com>
2021-06-16 20:32:33 -06:00
Somshubra Majumdar 2ce876259b
Update Citrinet model card info (#2369)
* Update model card info

Signed-off-by: smajumdar <titu1994@gmail.com>

* Cleanup Docs

Signed-off-by: smajumdar <titu1994@gmail.com>
2021-06-16 19:04:04 -06:00
Boris Fomitchev 0aa45241ac
Added documentation for export() (#2330)
* Added export document

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Addressed review comments

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Co-authored-by: Eric Harper <complex451@gmail.com>
2021-06-16 15:49:34 -06:00
Somshubra Majumdar cfe2548d4d
Correct colab link to notebook (#2366)
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-06-15 15:30:01 -07:00
Somshubra Majumdar c036cec3ac
Add ASR CTC tutorial on fine-tuning on another language (#2346)
* Add ASR CTC Language finetuning notebook

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add to documentation

Signed-off-by: smajumdar <titu1994@gmail.com>

* Improve documentation

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct name of the dataset

Signed-off-by: smajumdar <titu1994@gmail.com>
2021-06-15 14:02:41 -07:00
Somshubra Majumdar 3e94696e21
Update container version to 21.05 (#2309)
* Update container version

Signed-off-by: smajumdar <titu1994@gmail.com>

* Temporarily change export format of waveglow

Signed-off-by: smajumdar <titu1994@gmail.com>

* Add conda update for numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update numba compat via global flag for strictness level `--relax_numba_compat`, remove pytorchlightning.metrics, refactor out numba utils to core, update tests

Signed-off-by: smajumdar <titu1994@gmail.com>

* Correct order of numba minimum version, remove wrong flag from test

Signed-off-by: smajumdar <titu1994@gmail.com>

* Double test of cuda numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Double test of cuda numba

Signed-off-by: smajumdar <titu1994@gmail.com>

* Enable RNNT tests

Signed-off-by: smajumdar <titu1994@gmail.com>
2021-06-14 17:39:45 -06:00
Oleksii Kuchaiev df05895bcf Merge branch 'r1.0.2' into main 2021-06-10 23:45:45 -07:00
Somshubra Majumdar a4b9a60bc3
Update notebooks to 1.0.2 release (#2338)
Signed-off-by: smajumdar <titu1994@gmail.com>
2021-06-10 23:45:11 -07:00
Evelina dda599642d
sparrowhawk tests + punctuation post processing for pynini TN (#2320)
* add jenkins test, refactoring

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update test

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix new test

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add serial to the default normalizer, add tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* manifest test added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* expose more params, new test cases

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix jenkins, serial clean, exclude range from cardinal

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins dollar sign format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* jenkins dollar sign format

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* addressed review comments

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix decimal in measure

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* move serial in cardinal

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* sh tests init

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* sparrowhawk container tests support added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add post process to normalize.py, update tests

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* remove duplication

Signed-off-by: ekmb <ebakhturina@nvidia.com>
2021-06-10 20:58:23 -07:00
Oleksii Kuchaiev d41307c641 Merge branch 'r1.0.2' into main
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-06-10 18:47:16 -07:00
Oleksii Kuchaiev 245bd49efb update version number
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-06-10 16:54:10 -07:00
Oleksii Kuchaiev 5839aee402 Merge branch 'r1.0.1' into main
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-06-08 22:44:37 -07:00
Oleksii Kuchaiev 2763c67a0d update readmes
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-06-08 22:36:06 -07:00
Oleksii Kuchaiev 4d4f3ebfb8 Merge tag 'v1.0.0' into main
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-06-03 15:49:50 -07:00
Oleksii Kuchaiev 00375818c3 update readme
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
2021-06-03 15:42:45 -07:00