Merge r1.5.0 bugfixes and doc updates to main (#3133)
* update branch
  Signed-off-by: ericharper <complex451@gmail.com>

* Always save last checkpoint on train end even if folder does not exist (#2976)
  * add fix for no checkpoint folder when training ends
  * update; fix test; fixes; typo; change check
  Signed-off-by: Jason <jasoli@nvidia.com>

* [NLP] Add Apex import guard (#3041)
  * add apex import guard; style
  * remove megatron bert encoder logic from NLPModel; remove megatron bert imports from init, add logging to constructor
  Signed-off-by: ericharper <complex451@gmail.com>

* Exp manager small refactor (#3067)
  * move super() call earlier in the function
  Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
  Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* Change container (#3087)
  Signed-off-by: smajumdar <titu1994@gmail.com>
  Co-authored-by: Eric Harper <complex451@gmail.com>

* Fix machine translation training failing when config parameter `trainer.max_epochs` is used instead of `trainer.max_steps` (#3112)
  * fix: replace distributed_backend with accelerator
  * add debug script; remove debug script
  Signed-off-by: PeganovAnton <peganoff2@mail.ru>

* update (#3113)
  Signed-off-by: Jason <jasoli@nvidia.com>

* Fix: punctuation capitalization inference on short queries (#3111)
  Signed-off-by: PeganovAnton <peganoff2@mail.ru>
  Co-authored-by: Eric Harper <complex451@gmail.com>

* Multiple ASR fixes to SPE tokenization (#3119)
  * reduce num workers for transcribe
  * fix SPE tokenizer vocabulary construction
  * update tokenizer building script
  * remove logs
  Signed-off-by: smajumdar <titu1994@gmail.com>

* Megatron GPT training in BCP (#3095)
  * BCP megatron training; add quotes; style fix
  Signed-off-by: madhukar <madhukar@penguin>

* Upgrade to PTL 1.5.0 (#3127)
  * update for ptl 1.5.0; update trainer config
  * limit CUDA visible devices to the first two GPUs in the check-for-ranks CI test
  * make datasets larger for test; update compute_max_steps
  * update package info; remove duplicate code; remove comments
  Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Jason <jasoli@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: PeganovAnton <peganoff2@mail.ru>
Co-authored-by: Madhukar K <26607911+madhukarkm@users.noreply.github.com>
Co-authored-by: madhukar <madhukar@penguin>
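Note: the Apex import guard added in #3041 follows the usual optional-dependency pattern. A minimal sketch of that pattern (illustrative names, not the exact NeMo code):

```python
# A minimal sketch of an import guard, assuming only that `apex` may be absent.
try:
    import apex  # noqa: F401

    HAVE_APEX = True
except (ImportError, ModuleNotFoundError):
    HAVE_APEX = False


class SomeMegatronEncoder:  # hypothetical class, for illustration only
    def __init__(self):
        if not HAVE_APEX:
            # Fail (or log) at construction time rather than at import time,
            # so the rest of the collection stays importable without Apex.
            raise ImportError("Apex is required to use this encoder.")
```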
parent 663c76a972
commit aaacc4b089
Jenkinsfile (14 changed lines)
@@ -1,8 +1,8 @@
 pipeline {
   agent {
     docker {
-      image 'gitlab-master.nvidia.com/dl/dgx/pytorch:21.10-py3-devel'
-      args '--device=/dev/nvidia0 --gpus all --user 0:128 -v /home/TestData:/home/TestData -v $HOME/.cache/torch:/root/.cache/torch -v $HOME/.cache/huggingface/transformers:/root/.cache/huggingface/transformers --shm-size=8g'
+      image 'nvcr.io/nvidia/pytorch:21.10-py3'
+      args '--device=/dev/nvidia0 --gpus all --user 0:128 -v /home/TestData:/home/TestData -v $HOME/.cache/torch:/root/.cache/torch --shm-size=8g'
     }
   }
   options {
@@ -53,20 +53,12 @@ pipeline {
       }
     }
 
     stage('NeMo Installation') {
       steps {
         sh './reinstall.sh release'
       }
     }
 
-    // Revert once import guards are added by PTL or version comparing is fixed
-    stage('PTL Import Guards') {
-      steps{
-        sh 'sed -i "s/from pytorch_lightning.callbacks.quantization import QuantizationAwareTraining/try:\\n\\tfrom pytorch_lightning.callbacks.quantization import QuantizationAwareTraining\\nexcept:\\n\\tpass/g" /opt/conda/lib/python3.8/site-packages/pytorch_lightning/callbacks/__init__.py'
-      }
-    }
-
     stage('PyTorch Lightning version') {
       steps {
         sh 'python -c "import pytorch_lightning; print(pytorch_lightning.__version__)"'
@@ -75,7 +67,7 @@ pipeline {
 
     stage('PyTorch Lightning DDP Checks') {
       steps {
-        sh 'python "tests/core_ptl/check_for_ranks.py"'
+        sh 'CUDA_VISIBLE_DEVICES="0,1" python "tests/core_ptl/check_for_ranks.py"'
       }
     }
@@ -17,6 +17,7 @@ from pathlib import Path
 from omegaconf.omegaconf import OmegaConf
 from pytorch_lightning import Trainer
 from pytorch_lightning.callbacks.timer import Timer
+from pytorch_lightning.plugins.environments.torchelastic_environment import TorchElasticEnvironment
 from pytorch_lightning.trainer.connectors.checkpoint_connector import CheckpointConnector
 
 from nemo.collections.nlp.models.language_modeling.megatron_gpt_model import MegatronGPTModel
@@ -37,24 +38,23 @@ def main(cfg) -> None:
     logging.info("\n\n************** Experiment configuration ***********")
     logging.info(f'\n{OmegaConf.to_yaml(cfg)}')
 
+    plugins = [NLPDDPPlugin(num_nodes=cfg.trainer.num_nodes)]
     if cfg.trainer.precision == 16:
-        trainer = Trainer(
-            plugins=[
-                NLPDDPPlugin(num_nodes=cfg.trainer.num_nodes),
-                NLPNativeMixedPrecisionPlugin(
-                    init_scale=cfg.model.get('native_amp_init_scale', 2 ** 32),
-                    growth_interval=cfg.model.get('native_amp_growth_interval', 1000),
-                ),
-            ],
-            **cfg.trainer,
+        plugins.append(
+            NLPNativeMixedPrecisionPlugin(
+                init_scale=cfg.model.get('native_amp_init_scale', 2 ** 32),
+                growth_interval=cfg.model.get('native_amp_growth_interval', 1000),
+            )
         )
     elif cfg.trainer.precision == 'bf16':
-        trainer = Trainer(
-            plugins=[NLPDDPPlugin(num_nodes=cfg.trainer.num_nodes), NLPNativeBfloat16PrecisionPlugin(),],
-            **cfg.trainer,
-        )
+        plugins.append(NLPNativeBfloat16PrecisionPlugin())
     else:
-        trainer = Trainer(plugins=[NLPDDPPlugin(num_nodes=cfg.trainer.num_nodes), NLPPrecisionPlugin()], **cfg.trainer)
+        plugins.append(NLPPrecisionPlugin())
+
+    if cfg.get('cluster_type', None) == 'BCP':
+        plugins.append(TorchElasticEnvironment())
+
+    trainer = Trainer(plugins=plugins, **cfg.trainer)
 
     exp_manager(trainer, cfg.exp_manager)
 
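The refactor above replaces three near-duplicate `Trainer(...)` constructions with one: plugins are accumulated in a list and the `Trainer` is built once at the end, which also makes it easy to append `TorchElasticEnvironment` for BCP clusters. A minimal sketch of the pattern using only PyTorch Lightning classes (the `on_bcp` flag stands in for `cfg.get('cluster_type', None) == 'BCP'`):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.plugins import DDPPlugin
from pytorch_lightning.plugins.environments.torchelastic_environment import TorchElasticEnvironment

on_bcp = False  # assumption: stands in for cfg.get('cluster_type', None) == 'BCP'

# Accumulate plugins conditionally, then construct the Trainer exactly once.
plugins = [DDPPlugin()]
if on_bcp:
    plugins.append(TorchElasticEnvironment())

trainer = Trainer(plugins=plugins, max_steps=10)
```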
@@ -270,12 +270,13 @@ class EncDecCTCModelBPE(EncDecCTCModel, ASRBPEMixin):
         Returns:
             A pytorch DataLoader for the given audio file(s).
         """
+        batch_size = min(config['batch_size'], len(config['paths2audio_files']))
         dl_config = {
             'manifest_filepath': os.path.join(config['temp_dir'], 'manifest.json'),
             'sample_rate': self.preprocessor._sample_rate,
-            'batch_size': min(config['batch_size'], len(config['paths2audio_files'])),
+            'batch_size': batch_size,
             'shuffle': False,
-            'num_workers': os.cpu_count() - 1,
+            'num_workers': min(batch_size, os.cpu_count() - 1),
             'pin_memory': True,
             'use_start_end_token': self.cfg.validation_ds.get('use_start_end_token', False),
         }
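This hunk and the three that follow apply the same two-part fix to every `transcribe` dataloader: hoist the effective batch size into a local, and cap `num_workers` at that batch size so a small transcription job no longer spawns `os.cpu_count() - 1` mostly idle workers. A worked example with assumed values:

```python
import os

# Assume a 2-file transcription job on a machine where os.cpu_count() == 64.
config = {'batch_size': 4, 'paths2audio_files': ['a.wav', 'b.wav']}

batch_size = min(config['batch_size'], len(config['paths2audio_files']))  # -> 2
num_workers = min(batch_size, os.cpu_count() - 1)                         # -> 2 (was 63)
```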
@@ -650,14 +650,15 @@ class EncDecCTCModel(ASRModel, ExportableEncDecModel, ASRModuleMixin):
         Returns:
             A pytorch DataLoader for the given audio file(s).
         """
+        batch_size = min(config['batch_size'], len(config['paths2audio_files']))
         dl_config = {
             'manifest_filepath': os.path.join(config['temp_dir'], 'manifest.json'),
             'sample_rate': self.preprocessor._sample_rate,
             'labels': self.decoder.vocabulary,
-            'batch_size': min(config['batch_size'], len(config['paths2audio_files'])),
+            'batch_size': batch_size,
             'trim_silence': False,
             'shuffle': False,
-            'num_workers': os.cpu_count() - 1,
+            'num_workers': min(batch_size, os.cpu_count() - 1),
             'pin_memory': True,
         }
@@ -349,12 +349,13 @@ class EncDecRNNTBPEModel(EncDecRNNTModel, ASRBPEMixin):
         Returns:
             A pytorch DataLoader for the given audio file(s).
         """
+        batch_size = min(config['batch_size'], len(config['paths2audio_files']))
         dl_config = {
             'manifest_filepath': os.path.join(config['temp_dir'], 'manifest.json'),
             'sample_rate': self.preprocessor._sample_rate,
-            'batch_size': min(config['batch_size'], len(config['paths2audio_files'])),
+            'batch_size': batch_size,
             'shuffle': False,
-            'num_workers': os.cpu_count() - 1,
+            'num_workers': min(batch_size, os.cpu_count() - 1),
             'pin_memory': True,
             'use_start_end_token': self.cfg.validation_ds.get('use_start_end_token', False),
         }
@@ -809,14 +809,15 @@ class EncDecRNNTModel(ASRModel, ASRModuleMixin, ExportableEncDecJointModel):
         Returns:
             A pytorch DataLoader for the given audio file(s).
         """
+        batch_size = min(config['batch_size'], len(config['paths2audio_files']))
         dl_config = {
             'manifest_filepath': os.path.join(config['temp_dir'], 'manifest.json'),
             'sample_rate': self.preprocessor._sample_rate,
             'labels': self.joint.vocabulary,
-            'batch_size': min(config['batch_size'], len(config['paths2audio_files'])),
+            'batch_size': batch_size,
             'trim_silence': False,
             'shuffle': False,
-            'num_workers': os.cpu_count() - 1,
+            'num_workers': min(batch_size, os.cpu_count() - 1),
             'pin_memory': True,
         }
@@ -76,13 +76,12 @@ class ASRBPEMixin(ABC):
 
         if 'special_tokens' in self.tokenizer_cfg:
             special_tokens = self.tokenizer_cfg['special_tokens']
         else:
             special_tokens = None
 
-        # Update special tokens
-        self.tokenizer = tokenizers.SentencePieceTokenizer(
-            model_path=model_path, special_tokens=special_tokens, legacy=True
-        )
+        if special_tokens is not None:
+            raise ValueError("`special_tokens` are no longer supported for SentencePiece based tokenizers.")
+
+        self.tokenizer = tokenizers.SentencePieceTokenizer(model_path=model_path)
 
         if 'vocab_path' in self.tokenizer_cfg:
             vocab_path = self.tokenizer_cfg.get('vocab_path')
@@ -102,11 +101,11 @@ class ASRBPEMixin(ABC):
             # fallback case for older checkpoints that did not preserve the tokenizer.vocab
             self.spe_vocab_path = None
 
-        vocabulary = {'<unk>': 0}
-        with open(vocab_path) as f:
-            for i, piece in enumerate(f):
-                piece = piece.replace('\n', '')
-                vocabulary[piece] = i + 1
+        vocabulary = {}
+        for i in range(self.tokenizer.vocab_size):
+            piece = self.tokenizer.ids_to_tokens([i])
+            piece = piece[0]
+            vocabulary[piece] = i + 1
 
         # wrapper method to get vocabulary conveniently
         def get_vocab():
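Rather than reading a `tokenizer.vocab` file that older checkpoints may not have preserved, the vocabulary is now rebuilt directly from the SentencePiece model itself. A standalone sketch of the same idea using the `sentencepiece` package (the model path is assumed; NeMo's wrapper exposes `ids_to_tokens`, which plays the role of `id_to_piece` here):

```python
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file='tokenizer.model')  # assumed path

vocabulary = {}
for i in range(sp.get_piece_size()):
    piece = sp.id_to_piece(i)
    vocabulary[piece] = i + 1  # ids shifted by one, as in the hunk above
```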
@@ -529,11 +529,18 @@ def get_features_infer(
         st.append(subtokens)
         stm.append(subtokens_mask)
     _check_max_seq_length_and_margin_and_step(max_seq_length, margin, step)
-    max_seq_length = min(max_seq_length, max(sent_lengths) + 2)
+    if max_seq_length > max(sent_lengths) + 2:
+        max_seq_length = max(sent_lengths) + 2
+        # If `max_seq_length` is greater than the maximum length of the input query, then parameters
+        # ``margin`` and ``step`` will not be used.
+        step = 1
+        # Maximum number of word subtokens in segment. The first and the last tokens in segment are CLS and EOS
+        length = max_seq_length - 2
+    else:
+        # Maximum number of word subtokens in segment. The first and the last tokens in segment are CLS and EOS
+        length = max_seq_length - 2
+        step = min(length - margin * 2, step)
     logging.info(f'Max length: {max_seq_length}')
-    # Maximum number of word subtokens in segment. The first and the last tokens in segment are CLS and EOS
-    length = max_seq_length - 2
-    step = min(length - margin * 2, step)
     get_stats(sent_lengths)
     all_input_ids, all_segment_ids, all_subtokens_mask, all_input_mask, all_input_mask = [], [], [], [], []
     all_quantities_of_preceding_words, all_query_ids, all_is_first, all_is_last = [], [], [], []
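The new branch clamps `max_seq_length` to the longest query (plus CLS and EOS) and neutralizes `margin`/`step` when every query already fits in one segment, which is what broke short-query inference before #3111. A worked example with illustrative values:

```python
sent_lengths = [4, 7, 10]                   # subtokens per query (illustrative)
max_seq_length, margin, step = 512, 16, 8

if max_seq_length > max(sent_lengths) + 2:
    max_seq_length = max(sent_lengths) + 2  # -> 12 (10 subtokens + CLS + EOS)
    step = 1                                # margin and step are irrelevant here
    length = max_seq_length - 2             # -> 10
else:
    length = max_seq_length - 2
    step = min(length - margin * 2, step)
```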
@@ -54,7 +54,7 @@ class NLPDDPPlugin(DDPPlugin):
     """ DDP plugin for Pytorch Lightning. Needed to customize DDP for model parallel models.
     """
 
-    distributed_backend = "ddp"
+    accelerator = "ddp"
 
     def __init__(
         self,
@@ -460,15 +460,15 @@ class ModelPT(LightningModule, Model):
             optim_config['sched']['t_max_epochs'] = self._trainer.max_epochs
             optim_config['sched']['t_accumulate_grad_batches'] = self._trainer.accumulate_grad_batches
             optim_config['sched']['t_limit_train_batches'] = self._trainer.limit_train_batches
-            if self._trainer.distributed_backend is None:
+            if self._trainer.accelerator is None:
                 optim_config['sched']['t_num_workers'] = self._trainer.num_gpus or 1
-            elif self._trainer.distributed_backend == "ddp_cpu":
+            elif self._trainer.accelerator == "ddp_cpu":
                 optim_config['sched']['t_num_workers'] = self._trainer.num_processes * self._trainer.num_nodes
-            elif self._trainer.distributed_backend == "ddp":
+            elif self._trainer.accelerator == "ddp":
                 optim_config['sched']['t_num_workers'] = self._trainer.num_gpus * self._trainer.num_nodes
             else:
                 logging.warning(
-                    f"The lightning trainer received accelerator: {self._trainer.distributed_backend}. We "
+                    f"The lightning trainer received accelerator: {self._trainer.accelerator}. We "
                     "recommend to use 'ddp' instead."
                 )
                 optim_config['sched']['t_num_workers'] = self._trainer.num_gpus * self._trainer.num_nodes
@@ -93,6 +93,9 @@ class TrainerConfig:
     reload_dataloaders_every_n_epochs: int = 0
     ipus: Optional[int] = None
+    devices: Any = None
+    strategy: Any = None
+    enable_checkpointing: bool = True
+    enable_model_summary: bool = True
 
 
 # Register the trainer config.
@@ -786,8 +786,6 @@ def compute_max_steps(
     elif steps_per_epoch != float('inf'):
         # limit_train_batches is a percentage of batches per epoch
         steps_per_epoch = int(steps_per_epoch * limit_train_batches)
-        if accumulate_grad_batches == 1:
-            steps_per_epoch = max(steps_per_epoch, 1)
 
     return math.ceil(steps_per_epoch / accumulate_grad_batches) * max_epochs
 
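With the floor removed, the computation reduces to a single expression. A worked example with assumed numbers:

```python
import math

# Assume 100 batches/epoch, limit_train_batches=0.5,
# accumulate_grad_batches=2, max_epochs=3.
steps_per_epoch = int(100 * 0.5)                # -> 50
max_steps = math.ceil(steps_per_epoch / 2) * 3  # -> 75
```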
@@ -16,7 +16,7 @@
 MAJOR = 1
 MINOR = 5
 PATCH = 0
-PRE_RELEASE = 'b1'
+PRE_RELEASE = ''
 
 # Use the following formatting: (major, minor, patch, pre-release)
 VERSION = (MAJOR, MINOR, PATCH, PRE_RELEASE)
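Clearing `PRE_RELEASE` turns the assembled version string into a plain `1.5.0`. A sketch of the usual tuple-to-string assembly (an assumption about how the rest of the file combines these values, not a quote of it):

```python
MAJOR, MINOR, PATCH, PRE_RELEASE = 1, 5, 0, ''

__version__ = '.'.join(map(str, (MAJOR, MINOR, PATCH))) + PRE_RELEASE  # -> '1.5.0'
```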
@@ -732,10 +732,6 @@ class NeMoModelCheckpoint(ModelCheckpoint):
             self.best_model_path = best_k_models[0]
             self.best_model_score = self.best_k_models[self.best_model_path]
 
-        # # uninject mp_rank from paths
-        # self.kth_best_model_path = self._uninject_mp_rank(self.kth_best_model_path)
-        # self.best_model_path = self._uninject_mp_rank(self.best_model_path)
-
     @staticmethod
     def _uninject_mp_rank(filepath):
         dirname = os.path.dirname(os.path.dirname(filepath))
@@ -10,9 +10,6 @@ echo 'Uninstalling stuff'
 ${PIP} uninstall -y nemo_toolkit
 ${PIP} uninstall -y sacrebleu
 
-# TODO: revert when 1.5.0 is out
-${PIP} uninstall -y pytorch-lightning
-
 # Kept for legacy purposes
 ${PIP} uninstall -y nemo_asr
 ${PIP} uninstall -y nemo_nlp
@@ -22,9 +19,6 @@ ${PIP} uninstall -y nemo_cv
 
 ${PIP} install -U setuptools
 
-# TODO: revert when 1.5.0 is out
-${PIP} install pytorch-lightning==1.5.0rc0
-
 echo 'Installing nemo and nemo_text_processing'
 if [[ "$INSTALL_OPTION" == "dev" ]]; then
     ${PIP} install --editable ".[all]"
@@ -1,4 +1,4 @@
-pytorch-lightning>1.4.9
+pytorch-lightning>=1.5.0
 torchmetrics>=0.4.1rc0
 transformers>=4.0.1
 webdataset>=0.1.48,<=0.1.62
@@ -73,9 +73,14 @@
 # --spe_max_sentencepiece_length: Limits the maximum length that any SentencePiece subword can be.
 #       Using this will change the subword tokens generated.
 #
+# --spe_pad: Adds <pad> as special token.
+#
+# --spe_bos: Adds <s> as Beginning-of-Sentence special token.
+#
+# --spe_eos: Adds </s> as End-of-Sentence special token.
 #
 # --log: Whether the script should display log messages
 
 import argparse
 import json
 import logging
@@ -205,8 +210,10 @@ def __process_data(
     Returns:
     """
     if tokenizer_type == 'spe':
+
+        # Prepare directory of tokenizer
         if spe_max_sentencepiece_length > 0:
-            tokenizer_dir = os.path.join(dst_folder, 'tokenizer_{}_{}_v{}_max{}').format(
+            tokenizer_dir = os.path.join(dst_folder, 'tokenizer_{}_{}_v{}_max_{}').format(
                 tokenizer_type, spe_type, vocab_size, spe_max_sentencepiece_length
             )
         else:
@@ -214,6 +221,13 @@ def __process_data(
                 tokenizer_type, spe_type, vocab_size
             )
 
+        if spe_pad:
+            tokenizer_dir = f'{tokenizer_dir}_pad'
+        if spe_bos:
+            tokenizer_dir = f'{tokenizer_dir}_bos'
+        if spe_eos:
+            tokenizer_dir = f'{tokenizer_dir}_eos'
+
         if not os.path.exists(tokenizer_dir):
             os.makedirs(tokenizer_dir)
 
@@ -221,6 +235,7 @@ def __process_data(
             logging.warning("Model file already exists, overriding old model file !")
             os.remove(os.path.join(tokenizer_dir, 'tokenizer.model'))
 
+    # Build tokenizer
     tokenizer_path, vocab_path = create_spt_model(
         data_file=text_path,
         vocab_size=vocab_size,
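Taken together, the naming hunks compose the tokenizer directory from the tokenizer type, vocabulary size, optional max subword length, and the new special-token flags. A worked example with assumed arguments:

```python
import os

# Assumed argument values, for illustration only.
dst_folder, tokenizer_type, spe_type, vocab_size = 'out', 'spe', 'unigram', 128
spe_max_sentencepiece_length, spe_pad, spe_bos, spe_eos = 4, True, False, True

tokenizer_dir = os.path.join(dst_folder, 'tokenizer_{}_{}_v{}_max_{}').format(
    tokenizer_type, spe_type, vocab_size, spe_max_sentencepiece_length
)
if spe_pad:
    tokenizer_dir = f'{tokenizer_dir}_pad'
if spe_bos:
    tokenizer_dir = f'{tokenizer_dir}_bos'
if spe_eos:
    tokenizer_dir = f'{tokenizer_dir}_eos'
# -> 'out/tokenizer_spe_unigram_v128_max_4_pad_eos'
```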
@@ -144,6 +144,7 @@ class TestEncDecCTCModel:
         assert new_model.vocab_path.endswith('_vocab.txt')
         assert new_model.spe_vocab_path.endswith('_tokenizer.vocab')
 
         assert new_model.tokenizer.tokenizer.vocab_size == 128
+        assert len(new_model.tokenizer.tokenizer.get_vocab()) == 128
 
     @pytest.mark.unit
@@ -39,14 +39,13 @@ class TempModel(torch.nn.Module):
 
 class OptCounter(torch.optim.SGD):
     def __init__(self, *args, **kwargs):
-        self.count = 0
         super().__init__(*args, **kwargs)
+        for group in self.param_groups:
+            group.setdefault('count', 0)
 
     def step(self, closure=None):
-        try:
-            self.count += 1
-        except AttributeError:
-            self.count = 1
+        for group in self.param_groups:
+            group['count'] += 1
         super().step(closure)
 
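The counter moves from an instance attribute into `param_groups`, presumably so it survives whatever wrapping or state handling the PTL 1.5 trainer applies to optimizers. A standalone, runnable sketch of the same pattern:

```python
import torch

class CountingSGD(torch.optim.SGD):
    """Keeps a step counter inside param_groups instead of on the instance."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        for group in self.param_groups:
            group.setdefault('count', 0)

    def step(self, closure=None):
        for group in self.param_groups:
            group['count'] += 1
        super().step(closure)

p = torch.nn.Parameter(torch.zeros(1))
opt = CountingSGD([p], lr=0.1)
p.grad = torch.ones(1)
opt.step()
assert opt.param_groups[0]['count'] == 1
```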
|
@@ -88,7 +87,8 @@ class ExampleModel(pl.LightningModule):
 class Callback(pl.callbacks.Callback):
     @pl.utilities.distributed.rank_zero_only
     def on_train_end(self, trainer, module):
-        if trainer.global_step != module.my_opt.count or trainer.global_step != module.max_steps:
+        count = module.my_opt.param_groups[0]['count']
+        if trainer.global_step != count or trainer.global_step != module.max_steps:
             logging.debug(f"max_epochs: {trainer.max_epochs}")
             logging.debug(f"accumulate_grad_batches: {trainer.accumulate_grad_batches}")
             logging.debug(f"limit_train_batches: {trainer.limit_train_batches}")
@@ -98,12 +98,8 @@ class Callback(pl.callbacks.Callback):
             logging.debug(f"drop_last: {module.drop_last}")
             logging.debug(f"{len(trainer.train_dataloader)}")
             logging.debug(f"{trainer.num_training_batches }")
-            assert (
-                trainer.global_step == module.my_opt.count
-            ), f"{trainer.global_step} != {module.my_opt.count} != {module.max_steps}"
-            assert (
-                trainer.global_step == module.max_steps
-            ), f"{trainer.global_step} != {module.my_opt.count} != {module.max_steps}"
+            assert trainer.global_step == count, f"{trainer.global_step} != {count} != {module.max_steps}"
+            assert trainer.global_step == module.max_steps, f"{trainer.global_step} != {count} != {module.max_steps}"
 
 
 class TestOptimizersSchedulers:
@@ -54,10 +54,10 @@
     "3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select \"GPU\" for hardware accelerator)\n",
     "4. Run this cell to set up dependencies.\n",
     "\"\"\"\n",
+    "BRANCH = 'main'\n",
     "# # If you're using Google Colab and not running locally, uncomment and run this cell.\n",
     "# !apt-get install sox libsndfile1 ffmpeg\n",
     "# !pip install wget unidecode\n",
-    "# BRANCH = 'main'\n",
     "# !python -m pip install git+https://github.com/NeMo/NeMo.git@$BRANCH#egg=nemo_toolkit[tts]"
    ]
   },
@@ -54,10 +54,10 @@
     "3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select \"GPU\" for hardware accelerator)\n",
     "4. Run this cell to set up dependencies.\n",
     "\"\"\"\n",
+    "BRANCH = 'main'\n",
     "# # If you're using Colab and not running locally, uncomment and run this cell.\n",
     "# !apt-get install sox libsndfile1 ffmpeg\n",
     "# !pip install wget unidecode\n",
-    "# BRANCH = 'main'\n",
     "# !python -m pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[tts]"
    ]
   },
@@ -154,8 +154,8 @@
    "source": [
     "# NeMo's training scripts are stored inside the examples/ folder. Let's grab the tacotron2.py file\n",
     "# as well as the tacotron2.yaml file\n",
-    "!wget https://raw.githubusercontent.com/NVIDIA/NeMo/v1.0.2/examples/tts/tacotron2.py\n",
-    "!mkdir conf && cd conf && wget https://raw.githubusercontent.com/NVIDIA/NeMo/v1.0.2/examples/tts/conf/tacotron2.yaml && cd .."
+    "!wget https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/examples/tts/tacotron2.py\n",
+    "!mkdir conf && cd conf && wget https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/examples/tts/conf/tacotron2.yaml && cd .."
    ]
   },
   {
@@ -306,15 +306,6 @@
     "python tacotron2.py train_dataset=YOUR_TRAIN.json validation_datasets=YOUR_VAL.json trainer.gpus=-1\n",
     "```"
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "id": "2KctbQ61MmHy"
-   },
-   "outputs": [],
-   "source": []
-  }
  ],
  "metadata": {