NeMo Text Processing Tutorials

The NeMo Text Processing module supports both Text Normalization (TN) and Inverse Text Normalization (ITN) to aid upstream and downstream text processing. The included tutorials are intended to help you quickly become familiar with the module's interface and to guide you in creating and deploying your own grammars for individual text processing needs.

If you wish to learn more about how to use NeMo for Text Normalization tasks (e.g., converting symbolic strings to verbal form, such as 15 -> "fifteen"), please see the Text Normalization tutorial.

If you wish to learn more about Inverse Text Normalization (the inverse task: converting verbalized strings to symbolic written form, as may be needed when processing ASR output), consult the Inverse Text Normalization tutorial.
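
To make the two directions concrete, here is a minimal sketch using a hand-written lookup table. The table and the function names `toy_normalize` and `toy_inverse_normalize` are hypothetical and exist only for illustration; they are not part of NeMo's interface, which builds WFST grammars rather than dictionaries.

```python
# Toy lookup table for illustration only; this is not NeMo's API.
WRITTEN_TO_SPOKEN = {"2": "two", "7": "seven", "15": "fifteen"}

# ITN is the inverse mapping, so the table is simply flipped.
SPOKEN_TO_WRITTEN = {spoken: written for written, spoken in WRITTEN_TO_SPOKEN.items()}


def toy_normalize(text: str) -> str:
    """TN direction: replace known written tokens with their spoken form."""
    return " ".join(WRITTEN_TO_SPOKEN.get(tok, tok) for tok in text.split())


def toy_inverse_normalize(text: str) -> str:
    """ITN direction: replace known spoken tokens with their written form."""
    return " ".join(SPOKEN_TO_WRITTEN.get(tok, tok) for tok in text.split())


print(toy_normalize("room 15 seats 7"))                   # room fifteen seats seven
print(toy_inverse_normalize("room fifteen seats seven"))  # room 15 seats 7
```

A flat table like this cannot handle multi-word numbers, dates, or context-dependent readings, which is part of why grammars are used in practice.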

If you are curious about constructing grammars tailored to specific languages and use cases, you may be interested in working through the WFST Tutorial, which walks through NeMo's normalization process in detail.

Because NeMo Text Processing uses Weighted Finite State Transducer (WFST) graphs to construct its grammars, a working knowledge of Finite State Automata (FSA) and/or regular languages is suggested. We also recommend becoming familiar with the pynini library, which serves as the backend for graph construction, and with Sparrowhawk, which NeMo uses for grammar deployment.
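
For readers new to WFSTs, the following pure-Python sketch may help build intuition before turning to pynini. It is a toy under stated assumptions: the states, arcs, weights, and the `transduce` helper are all invented for this example and do not reflect NeMo's grammars or pynini's API. Each arc reads one input token, emits an output string (possibly empty), and adds a weight; among all paths that end in a final state, the one with the lowest total weight wins.

```python
from typing import Dict, List, Optional, Tuple

# An arc is (input_token, output_string, weight, next_state).
Arc = Tuple[str, str, float, int]

START = 0
FINAL = {0}  # state 1 is not final: "twenty" must be resolved first

# Hypothetical toy transducer covering only "twenty" and "three".
ARCS: Dict[int, List[Arc]] = {
    0: [
        ("twenty", "", 0.0, 1),    # emit nothing yet; wait for a unit
        ("twenty", "20", 1.0, 0),  # stand-alone tens, at a higher cost
        ("three", "3", 0.0, 0),
    ],
    1: [
        ("three", "23", 0.0, 0),   # tens plus unit emitted together
    ],
}


def transduce(tokens: List[str]) -> Optional[str]:
    """Return the output of the lowest-weight accepting path, or None."""
    # Each hypothesis is (total_weight, current_state, outputs_so_far).
    hyps = [(0.0, START, [])]
    for tok in tokens:
        next_hyps = []
        for weight, state, outputs in hyps:
            for inp, out, arc_weight, dst in ARCS.get(state, []):
                if inp == tok:
                    emitted = outputs + ([out] if out else [])
                    next_hyps.append((weight + arc_weight, dst, emitted))
        hyps = next_hyps
    accepting = [(w, outs) for w, s, outs in hyps if s in FINAL]
    if not accepting:
        return None  # no path accepts this input
    _, best_outputs = min(accepting, key=lambda h: h[0])
    return " ".join(best_outputs)


print(transduce(["twenty", "three"]))  # 23
print(transduce(["twenty"]))           # 20
```

For "twenty three" two paths accept the input: one emits "23" at weight 0.0 and the other emits "20 3" at weight 1.0, so the weights resolve the ambiguity in favor of "23". This sketch enumerates every path, which is fine for a toy but is not how a WFST library searches.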