NeMo Text Processing Tutorials
The NeMo Text Processing module supports both Text Normalization (TN) and Inverse Text Normalization (ITN) to aid upstream and downstream text processing. The included tutorials are intended to help you quickly become familiar with the module's interface and to guide you in creating and deploying your own grammars for individual text processing needs.
If you wish to learn more about how to use NeMo for Text Normalization tasks (e.g. conversion
of symbolic strings to verbal form, such as 15 -> "fifteen"), please see the Text Normalization
tutorial.
If you wish to learn more about Inverse Text Normalization - the inverse task of converting
from verbalized strings to symbolic written form, as may be encountered in downstream ASR -
consult the Inverse Text Normalization tutorial.
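To make the two directions concrete, here is a minimal sketch in plain Python. It is not NeMo's API: the lexicon and the function names `toy_normalize` and `toy_inverse_normalize` are hypothetical, and a dictionary lookup stands in for the grammars that the tutorials actually cover.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only);
# real TN/ITN grammars handle far more than a fixed word list.
NUMBER_WORDS = {"15": "fifteen", "2": "two", "40": "forty"}
WORD_NUMBERS = {spoken: written for written, spoken in NUMBER_WORDS.items()}


def toy_normalize(text: str) -> str:
    """TN direction: replace known digit strings with their spoken form."""
    return re.sub(r"\d+", lambda m: NUMBER_WORDS.get(m.group(), m.group()), text)


def toy_inverse_normalize(text: str) -> str:
    """ITN direction: replace known number words with their written form."""
    return " ".join(WORD_NUMBERS.get(token, token) for token in text.split())


print(toy_normalize("I have 15 apples"))
print(toy_inverse_normalize("I have fifteen apples"))
```

Digit strings or words missing from the lexicon pass through unchanged, which is the main limitation a grammar-based approach is meant to overcome.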
If you are curious about constructing grammars tailored to specific languages and use cases,
you may be interested in working through the WFST Tutorial, which goes through NeMo's
Normalization process in detail.
As NeMo Text Processing utilizes Weighted Finite State Transducer (WFST) graphs to construct its
grammars, a working knowledge of Finite State Automata (FSA) and/or regular languages is suggested.
Further, we recommend becoming functionally familiar with the pynini library - which functions
as the backend for graph construction - and Sparrowhawk - which NeMo utilizes for grammar deployment.
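For readers new to WFSTs, the sketch below illustrates the core idea in plain Python: arcs map an input symbol to an output string at some cost, and the cheapest accepting path decides the output. The `ToyWFST` class, its methods, and the weights are all hypothetical and chosen for illustration; this does not use pynini and is not how NeMo builds its graphs.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

# An arc is (input_symbol, output_string, weight, next_state).
Arc = Tuple[str, str, float, int]


class ToyWFST:
    """Hypothetical token-level weighted transducer (lower weight is better)."""

    def __init__(self, start: int, finals: Set[int]):
        self.start = start
        self.finals = finals
        self.arcs: Dict[int, List[Arc]] = {}

    def add_arc(self, src: int, isym: str, osym: str, weight: float, dst: int) -> None:
        self.arcs.setdefault(src, []).append((isym, osym, weight, dst))

    def best_output(self, tokens: List[str]) -> Optional[Tuple[float, str]]:
        """Return (cost, output) of the cheapest accepting path, or None.

        Uses Dijkstra's algorithm over (state, position) pairs, so it
        assumes all arc weights are non-negative.
        """
        heap = [(0.0, self.start, 0, ())]
        visited = set()
        while heap:
            cost, state, pos, out = heapq.heappop(heap)
            if (state, pos) in visited:
                continue
            visited.add((state, pos))
            if pos == len(tokens):
                if state in self.finals:
                    return cost, " ".join(out)
                continue
            for isym, osym, weight, dst in self.arcs.get(state, []):
                if isym == tokens[pos]:
                    heapq.heappush(heap, (cost + weight, dst, pos + 1, out + (osym,)))
        return None


# One state that is both start and final; every arc loops back to it.
fst = ToyWFST(start=0, finals={0})
fst.add_arc(0, "15", "fifteen", 1.0, 0)    # verbalize the number (cheaper)
fst.add_arc(0, "15", "15", 2.0, 0)         # pass the number through (costlier)
fst.add_arc(0, "apples", "apples", 0.0, 0)

result = fst.best_output(["15", "apples"])
print(result)
```

Because the verbalizing arc is cheaper than the pass-through arc, the lowest-cost path rewrites 15 as "fifteen". Input containing a token with no matching arc is rejected and yields `None`.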