diff --git a/PyTorch/SpeechRecognition/Jasper/.dockerignore b/PyTorch/SpeechRecognition/Jasper/.dockerignore index 41519be2..a620be2e 100755 --- a/PyTorch/SpeechRecognition/Jasper/.dockerignore +++ b/PyTorch/SpeechRecognition/Jasper/.dockerignore @@ -1,3 +1,4 @@ +*.pt results/ *__pycache__ checkpoints/ @@ -5,5 +6,3 @@ checkpoints/ datasets/ external/tensorrt-inference-server/ checkpoints/ -triton/model_repo -triton/deploy diff --git a/PyTorch/SpeechRecognition/Jasper/.gitmodules b/PyTorch/SpeechRecognition/Jasper/.gitmodules deleted file mode 100644 index 772eb3e5..00000000 --- a/PyTorch/SpeechRecognition/Jasper/.gitmodules +++ /dev/null @@ -1,4 +0,0 @@ -[submodule "external/triton-inference-server"] - path = external/triton-inference-server - url = https://github.com/NVIDIA/triton-inference-server - branch = r19.12 diff --git a/PyTorch/SpeechRecognition/Jasper/Dockerfile b/PyTorch/SpeechRecognition/Jasper/Dockerfile index 18948e19..06bbde50 100755 --- a/PyTorch/SpeechRecognition/Jasper/Dockerfile +++ b/PyTorch/SpeechRecognition/Jasper/Dockerfile @@ -12,11 +12,10 @@ # See the License for the specific language governing permissions and # limitations under the License. -ARG FROM_IMAGE_NAME=nvcr.io/nvidia/pytorch:20.06-py3 +ARG FROM_IMAGE_NAME=nvcr.io/nvidia/pytorch:20.10-py3 FROM ${FROM_IMAGE_NAME} - -RUN apt-get update && apt-get install -y libsndfile1 && apt-get install -y sox && rm -rf /var/lib/apt/lists/* +RUN apt update && apt install -y libsndfile1 && apt install -y sox && rm -rf /var/lib/apt/lists/* WORKDIR /workspace/jasper @@ -24,5 +23,7 @@ WORKDIR /workspace/jasper COPY requirements.txt . RUN pip install --disable-pip-version-check -U -r requirements.txt +RUN pip install --force-reinstall --extra-index-url https://developer.download.nvidia.com/compute/redist/nightly nvidia-dali-nightly-cuda110==0.28.0.dev20201026 + # Copy rest of files COPY . . 
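The Dockerfile pins a specific DALI nightly build on top of the 20.10 base image. A quick sanity check after building is to query pip inside the image; the `jasper` image tag below is an assumption, substitute whatever tag `scripts/docker/build.sh` produces:
```bash
# Sketch: confirm the pinned DALI nightly is present in the built image
# ("jasper" is an assumed image tag).
docker run --rm jasper pip show nvidia-dali-nightly-cuda110
```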
diff --git a/PyTorch/SpeechRecognition/Jasper/README.md b/PyTorch/SpeechRecognition/Jasper/README.md index 9b2d0810..8979a3d0 100644 --- a/PyTorch/SpeechRecognition/Jasper/README.md +++ b/PyTorch/SpeechRecognition/Jasper/README.md @@ -24,7 +24,6 @@ This repository provides scripts to train the Jasper model to achieve near state * [Training process](#training-process) * [Inference process](#inference-process) * [Evaluation process](#evaluation-process) - * [Deploying Jasper using TensorRT](#deploying-jasper-using-tensorrt) * [Deploying Jasper using Triton Inference Server](#deploying-jasper-using-triton-inference) - [Performance](#performance) * [Benchmarking](#benchmarking) @@ -32,16 +31,16 @@ This repository provides scripts to train the Jasper model to achieve near state * [Inference performance benchmark](#inference-performance-benchmark) * [Results](#results) * [Training accuracy results](#training-accuracy-results) - * [Training accuracy: NVIDIA DGX A100 (8x A100 40GB)](#training-accuracy-nvidia-dgx-a100-8x-a100-40gb) + * [Training accuracy: NVIDIA DGX A100 (8x A100 80GB)](#training-accuracy-nvidia-dgx-a100-8x-a100-80gb) * [Training accuracy: NVIDIA DGX-1 (8x V100 32GB)](#training-accuracy-nvidia-dgx-1-8x-v100-32gb) * [Training stability test](#training-stability-test) * [Training performance results](#training-performance-results) - * [Training performance: NVIDIA DGX A100 (8x A100 40GB)](#training-performance-nvidia-dgx-a100-8x-a100-40gb) + * [Training performance: NVIDIA DGX A100 (8x A100 80GB)](#training-performance-nvidia-dgx-a100-8x-a100-80gb) * [Training performance: NVIDIA DGX-1 (8x V100 16GB)](#training-performance-nvidia-dgx-1-8x-v100-16gb) * [Training performance: NVIDIA DGX-1 (8x V100 32GB)](#training-performance-nvidia-dgx-1-8x-v100-32gb) * [Training performance: NVIDIA DGX-2 (16x V100 32GB)](#training-performance-nvidia-dgx-2-16x-v100-32gb) * [Inference performance results](#inference-performance-results) - * [Inference performance: NVIDIA DGX A100 (1x A100 40GB)](#inference-performance-nvidia-dgx-a100-gpu-1x-a100-40gb) + * [Inference performance: NVIDIA DGX A100 (1x A100 80GB)](#inference-performance-nvidia-dgx-a100-gpu-1x-a100-80gb) * [Inference performance: NVIDIA DGX-1 (1x V100 16GB)](#inference-performance-nvidia-dgx-1-1x-v100-16gb) * [Inference performance: NVIDIA DGX-1 (1x V100 32GB)](#inference-performance-nvidia-dgx-1-1x-v100-32gb) * [Inference performance: NVIDIA DGX-2 (1x V100 32GB)](#inference-performance-nvidia-dgx-2-1x-v100-32gb) @@ -217,10 +216,10 @@ Uses both acoustic model and language model to output the transcript of an input The following section lists the requirements in order to start training and evaluating the Jasper model. ### Requirements -This repository contains a `Dockerfile` which extends the PyTorch 20.06-py3 NGC container and encapsulates some dependencies. Aside from these dependencies, ensure you have the following components: +This repository contains a `Dockerfile` which extends the PyTorch 20.10-py3 NGC container and encapsulates some dependencies. 
Aside from these dependencies, ensure you have the following components: * [NVIDIA Docker](https://github.com/NVIDIA/nvidia-docker) -* [PyTorch 20.06-py3 NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:pytorch) +* [PyTorch 20.10-py3 NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:pytorch) - Supported GPUs: - [NVIDIA Volta architecture](https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/) - [NVIDIA Turing architecture](https://www.nvidia.com/en-us/geforce/turing/) @@ -260,10 +259,10 @@ bash scripts/docker/build.sh 3. Start an interactive session in the NGC container to run data download/training/inference ```bash -bash scripts/docker/launch.sh +bash scripts/docker/launch.sh ``` Within the container, the contents of this repository will be copied to the `/workspace/jasper` directory. The `/datasets`, `/checkpoints`, `/results` directories are mounted as volumes -and mapped to the corresponding directories ``, ``, `` on the host. +and mapped to the corresponding directories ``, ``, `` on the host. 4. Download and preprocess the dataset. @@ -282,40 +281,49 @@ Inside the container, download and extract the datasets into the required format bash scripts/download_librispeech.sh ``` Once the data download is complete, the following folders should exist: - -* `/datasets/LibriSpeech/` - * `train-clean-100/` - * `train-clean-360/` - * `train-other-500/` - * `dev-clean/` - * `dev-other/` - * `test-clean/` - * `test-other/` +```bash +datasets/LibriSpeech/ +├── dev-clean +├── dev-other +├── test-clean +├── test-other +├── train-clean-100 +├── train-clean-360 +└── train-other-500 +``` Since `/datasets/` is mounted to `` on the host (see Step 3), once the dataset is downloaded it will be accessible from outside of the container at `/LibriSpeech`. -Next, convert the data into WAV files and add speed perturbation with 0.9 and 1.1 to the training files: +Next, convert the data into WAV files: ```bash bash scripts/preprocess_librispeech.sh ``` Once the data is converted, the following additional files and folders should exist: -* `datasets/LibriSpeech/` - * `librispeech-train-clean-100-wav.json` - * `librispeech-train-clean-360-wav.json` - * `librispeech-train-other-500-wav.json` - * `librispeech-dev-clean-wav.json` - * `librispeech-dev-other-wav.json` - * `librispeech-test-clean-wav.json` - * `librispeech-test-other-wav.json` - * `train-clean-100-wav/` containsWAV files with original speed, 0.9 and 1.1 - * `train-clean-360-wav/` contains WAV files with original speed, 0.9 and 1.1 - * `train-other-500-wav/` contains WAV files with original speed, 0.9 and 1.1 - * `dev-clean-wav/` - * `dev-other-wav/` - * `test-clean-wav/` - * `test-other-wav/` +```bash +datasets/LibriSpeech/ +├── dev-clean-wav +├── dev-other-wav +├── librispeech-train-clean-100-wav.json +├── librispeech-train-clean-360-wav.json +├── librispeech-train-other-500-wav.json +├── librispeech-dev-clean-wav.json +├── librispeech-dev-other-wav.json +├── librispeech-test-clean-wav.json +├── librispeech-test-other-wav.json +├── test-clean-wav +├── test-other-wav +├── train-clean-100-wav +├── train-clean-360-wav +└── train-other-500-wav +``` +The DALI data pre-processing pipeline, which is enabled by default, performs speed perturbation on-line during training. +Without DALI, on-line speed perturbation might slow down the training. +If you wish to disable DALI, speed perturbation can be computed off-line with: +```bash +SPEEDS="0.9 1.1" bash scripts/preprocess_librispeech.sh +``` 5. Start training. 
@@ -323,22 +331,22 @@ Inside the container, use the following script to start training. Make sure the downloaded and preprocessed dataset is located at `/LibriSpeech` on the host (see Step 3), which corresponds to `/datasets/LibriSpeech` inside the container. ```bash -bash scripts/train.sh [OPTIONS] +[OPTION1=value1 OPTION2=value2 ...] bash scripts/train.sh ``` By default automatic precision is disabled, batch size is 64 over two gradient accumulation steps, and the recipe is run on a total of 8 GPUs. The hyperparameters are tuned for a GPU with at least 32GB of memory and will require adjustment for 16GB GPUs (e.g., by lowering batch size and using more gradient accumulation steps). -More details on available [OPTIONS] can be found in [Parameters](#parameters) and [Training process](#training-process). +Options are passed as environment variables. More details on available options can be found in [Parameters](#parameters) and [Training process](#training-process). 6. Start validation/evaluation. Inside the container, use the following script to run evaluation. Make sure the downloaded and preprocessed dataset is located at `/LibriSpeech` on the host (see Step 3), which corresponds to `/datasets/LibriSpeech` inside the container. ```bash -bash scripts/evaluation.sh [OPTIONS] +[OPTION1=value1 OPTION2=value2 ...] bash scripts/evaluation.sh [OPTIONS] ``` By default, this will use full precision, a batch size of 64 and run on a single GPU. -More details on available [OPTIONS] can be found in [Parameters](#parameters) and [Evaluation process](#evaluation-process). +Options are passed as environment variables. More details on available options can be found in [Parameters](#parameters) and [Evaluation process](#evaluation-process). 7. Start inference/predictions. @@ -348,11 +356,11 @@ Inside the container, use the following script to run inference. A pretrained model checkpoint can be downloaded from [NGC model repository](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16). ```bash -bash scripts/inference.sh [OPTIONS] +[OPTION1=value1 OPTION2=value2 ...] bash scripts/inference.sh ``` By default this will use single precision, a batch size of 64 and run on a single GPU. -More details on available [OPTIONS] can be found in [Parameters](#parameters) and [Inference process](#inference-process). +Options are passed as environment variables. More details on available options can be found in [Parameters](#parameters) and [Inference process](#inference-process). 
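For illustration, options documented in the [Parameters](#parameters) section can be combined on one command line; a minimal sketch (the checkpoint path is a placeholder for a checkpoint stored under the mounted `/checkpoints` volume):
```bash
# Sketch: mixed-precision training on 8 GPUs; AMP permits batch size 64 with a single
# gradient accumulation step, as noted in the Training process section.
AMP=true NUM_GPUS=8 BATCH_SIZE=64 GRAD_ACCUMULATION_STEPS=1 bash scripts/train.sh

# Sketch: greedy-decoding inference on test-other (the .pt path is a placeholder).
CHECKPOINT=/checkpoints/jasper_fp16.pt DATASET=test-other AMP=true bash scripts/inference.sh
```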
## Advanced @@ -362,31 +370,27 @@ The following sections provide greater details of the dataset, running training ### Scripts and sample code In the `root` directory, the most important files are: -* `train.py` - Serves as entry point for training -* `inference.py` - Serves as entry point for inference and evaluation -* `model.py` - Contains the model architecture -* `dataset.py` - Contains the data loader and related functionality -* `optimizer.py` - Contains the optimizer -* `inference_benchmark.py` - Serves as inference benchmarking script that measures the latency of pre-processing and the acoustic model -* `requirements.py` - Contains the required dependencies that are installed when building the Docker container -* `Dockerfile` - Container with the basic set of dependencies to run Jasper - -The `scripts/` folder encapsulates all the one-click scripts required for running various supported functionalities, such as: -* `train.sh` - Runs training using the `train.py` script -* `inference.sh` - Runs inference using the `inference.py` script -* `evaluation.sh` - Runs evaluation using the `inference.py` script -* `download_librispeech.sh` - Downloads LibriSpeech dataset -* `preprocess_librispeech.sh` - Preprocess LibriSpeech raw data files to be ready for training and inference -* `inference_benchmark.sh` - Runs the inference benchmark using the `inference_benchmark.py` script -* `train_benchmark.sh` - Runs the training performance benchmark using the `train.py` script -* `docker/` - Contains the scripts for building and launching the container - - -Other folders included in the `root` directory are: -* `notebooks/` - Jupyter notebooks and example audio files -* `configs/ - model configurations -* `utils/` - data downloading and common routines -* `parts/` - data pre-processing +``` +jasper +├── common # data pre-processing, logging, etc. +├── configs # model configurations +├── Dockerfile # container with the basic set of dependencies to run Jasper +├── inference.py # entry point for inference +├── jasper # model-specific code +├── notebooks # jupyter notebooks and example audio files +├── scripts # one-click scripts required for running various supported functionalities +│   ├── docker # contains the scripts for building and launching the container +│   ├── download_librispeech.sh # downloads LibriSpeech dataset +│   ├── evaluation.sh # runs evaluation using the `inference.py` script +│   ├── inference_benchmark.sh # runs the inference benchmark using the `inference_benchmark.py` script +│   ├── inference.sh # runs inference using the `inference.py` script +│   ├── preprocess_librispeech.sh # preprocess LibriSpeech raw data files for training and inference +│   ├── train_benchmark.sh # runs the training performance benchmark using the `train.py` script +│   └── train.sh # runs training using the `train.py` script +├── train.py # entry point for training +├── triton # example of inference using Triton Inference Server +└── utils # data downloading and common routines +``` ### Parameters @@ -394,77 +398,94 @@ Parameters could be set as env variables, or passed as positional arguments. The complete list of available parameters for `scripts/train.sh` script contains: ```bash - DATA_DIR: directory of dataset. (default: '/datasets/LibriSpeech') - MODEL_CONFIG: relative path to model configuration. (default: 'configs/jasper10x5dr_sp_offline_specaugment.toml') - RESULT_DIR: directory for results, logs, and created checkpoints. 
(default: '/results') - CHECKPOINT: model checkpoint to continue training from. Model checkpoint is a dictionary object that contains apart from the model weights the optimizer state as well as the epoch number. If CHECKPOINT is set, training starts from scratch. (default: "") - CREATE_LOGFILE: boolean that indicates whether to create a training log that will be stored in `$RESULT_DIR`. (default: true) - CUDNN_BENCHMARK: boolean that indicates whether to enable cudnn benchmark mode for using more optimized kernels. (default: true) - NUM_GPUS: number of GPUs to use. (default: 8) - AMP: if set to `true`, enables automatic mixed precision (default: false) - EPOCHS: number of training epochs. (default: 400) - SEED: seed for random number generator and used for ensuring reproducibility. (default: 6) - BATCH_SIZE: data batch size. (default: 64) - LEARNING_RATE: Initial learning rate. (default: 0.015) - GRADIENT_ACCUMULATION_STEPS: number of gradient accumulation steps until optimizer updates weights. (default: 2) +DATA_DIR: directory of dataset. (default: '/datasets/LibriSpeech') +MODEL_CONFIG: relative path to model configuration. (default: 'configs/jasper10x5dr_speedp-online_speca.yaml') +OUTPUT_DIR: directory for results, logs, and created checkpoints. (default: '/results') +CHECKPOINT: a specific model checkpoint to continue training from. To resume training from the last checkpoint, see the RESUME option. +RESUME: resume training from the last checkpoint found in OUTPUT_DIR, or from scratch if there are no checkpoints (default: true) +CUDNN_BENCHMARK: boolean that indicates whether to enable cudnn benchmark mode for using more optimized kernels. (default: true) +NUM_GPUS: number of GPUs to use. (default: 8) +AMP: if set to `true`, enables automatic mixed precision (default: false) +BATCH_SIZE: effective data batch size. The real batch size per GPU might be lower if gradient accumulation is enabled. (default: 64) +GRAD_ACCUMULATION_STEPS: number of gradient accumulation steps until optimizer updates weights. (default: 2) +LEARNING_RATE: initial learning rate. (default: 0.01) +MIN_LEARNING_RATE: minimum learning rate enforced regardless of the LR schedule (default: 1e-5) +LR_POLICY: how to decay LR (default: exponential) +LR_EXP_GAMMA: decay factor for the exponential LR schedule (default: 0.981) +EMA: decay factor for exponential averages of checkpoints (default: 0.999) +SEED: seed for random number generator and used for ensuring reproducibility. (default: 0) +EPOCHS: number of training epochs. (default: 440) +WARMUP_EPOCHS: number of initial epochs of linearly increasing LR. (default: 2) +HOLD_EPOCHS: number of epochs to hold maximum LR after warmup. (default: 140) +SAVE_FREQUENCY: number of epochs between saving the model to disk. (default: 10) +EPOCHS_THIS_JOB: run training for this number of epochs. Unlike EPOCHS, does not affect the LR schedule. (default: 0) +DALI_DEVICE: device to run the DALI pipeline on for calculation of filterbanks. Valid choices: cpu, gpu, none. (default: gpu) +PAD_TO_MAX_DURATION: pad all sequences with zeros to maximum length. (default: false) +EVAL_FREQUENCY: number of steps between evaluations on the validation set. (default: 544) +PREDICTION_FREQUENCY: the number of steps between writing a sample prediction to stdout. (default: 544) +TRAIN_MANIFESTS: lists of .json training set files +VAL_MANIFESTS: lists of .json validation set files + ``` The complete list of available parameters for `scripts/inference.sh` script contains: ```bash -DATA_DIR: directory of dataset.
(default: '/datasets/LibriSpeech') -DATASET: name of dataset to use. (default: 'dev-clean') -MODEL_CONFIG: model configuration. (default: 'configs/jasper10x5dr_sp_offline_specaugment.toml') -RESULT_DIR: directory for results and logs. (default: '/results') -CHECKPOINT: model checkpoint path. (required) -CREATE_LOGFILE: boolean that indicates whether to create a log file that will be stored in `$RESULT_DIR`. (default: true) -CUDNN_BENCHMARK: boolean that indicates whether to enable cudnn benchmark mode for using more optimized kernels. (default: false) -AMP: if set to `true`, enables FP16 inference with AMP (default: false) -NUM_STEPS: number of inference steps. If -1 runs inference on entire dataset. (default: -1) -SEED: seed for random number generator and useful for ensuring reproducibility. (default: 6) -BATCH_SIZE: data batch size.(default: 64) -LOGITS_FILE: destination path for serialized model output with binary protocol. If 'none' does not save model output. (default: 'none') -PREDICTION_FILE: destination path for saving predictions. If 'none' does not save predictions. (default: '${RESULT_DIR}/${DATASET}.predictions) +DATA_DIR: directory of dataset. (default: '/datasets/LibriSpeech') +MODEL_CONFIG: model configuration. (default: 'configs/jasper10x5dr_speedp-online_speca.yaml') +OUTPUT_DIR: directory for results and logs. (default: '/results') +CHECKPOINT: model checkpoint path. (required) +DATASET: name of the LibriSpeech subset to use. (default: 'dev-clean') +LOG_FILE: path to the DLLogger .json logfile. (default: '') +CUDNN_BENCHMARK: enable cudnn benchmark mode for using more optimized kernels. (default: false) +MAX_DURATION: filter out recordings shorter than MAX_DURATION seconds. (default: "") +PAD_TO_MAX_DURATION: pad all sequences with zeros to maximum length. (default: false) +NUM_GPUS: number of GPUs to use. Note that with > 1 GPUs, WER results might be inaccurate due to the batching policy. (default: 1) +NUM_STEPS: number of batches to evaluate, looping over the dataset if necessary. (default: 0) +NUM_WARMUP_STEPS: number of initial steps before measuring performance. (default: 0) +AMP: enable FP16 inference with AMP. (default: false) +BATCH_SIZE: data batch size. (default: 64) +EMA: attempt to load exponentially averaged weights from a checkpoint. (default: true) +SEED: seed for random number generator and used for ensuring reproducibility. (default: 0) +DALI_DEVICE: device to run the DALI pipeline on for calculation of filterbanks. Valid choices: cpu, gpu, none. (default: gpu) +CPU: run inference on CPU. (default: false) +LOGITS_FILE: dump logit matrices to a file. (default: "") +PREDICTION_FILE: save predictions to a file. (default: "${OUTPUT_DIR}/${DATASET}.predictions") ``` -The complete list of available parameters for `scripts/evaluation.sh` script contains: +The complete list of available parameters for `scripts/evaluation.sh` is the same as for `scripts/inference.sh`, except for a few changed defaults: ```bash -DATA_DIR: directory of dataset.(default: '/datasets/LibriSpeech') -DATASET: name of dataset to use.(default: 'dev-clean') -MODEL_CONFIG: model configuration.(default: 'configs/jasper10x5dr_sp_offline_specaugment.toml') -RESULT_DIR: directory for results and logs. (default: '/results') -CHECKPOINT: model checkpoint path. (required) -CREATE_LOGFILE: boolean that indicates whether to create a log file that will be stored in `$RESULT_DIR`. (default: true) -CUDNN_BENCHMARK: boolean that indicates whether to enable cudnn benchmark mde for using more optimized kernels. 
(default: false) -NUM_GPUS: number of GPUs to run evaluation on (default: 1) -AMP: if set to `true`, enables FP16 with AMP (default: false) -NUM_STEPS: number of inference steps per GPU. If -1 runs inference on entire dataset (default: -1) -SEED: seed for random number generator and useful for ensuring reproducibility. (default: 0) -BATCH_SIZE: data batch size.(default: 64) +PREDICTION_FILE: (default: "") +DATASET: (default: "test-other") ``` -The `scripts/inference_benchmark.sh` script pads all input to the same length and computes the mean, 90%, 95%, 99% percentile of latency for the specified number of inference steps. Latency is measured in millisecond per batch. The `scripts/inference_benchmark.sh` -measures latency for a single GPU and extends `scripts/inference.sh` by : +The `scripts/inference_benchmark.sh` script pads all input to a fixed duration and computes the mean, 90%, 95%, and 99% percentiles of latency for the specified number of inference steps. Latency is measured in milliseconds per batch. The `scripts/inference_benchmark.sh` measures latency for a single GPU and loops over a number of batch sizes and durations. It extends `scripts/inference.sh`, and changes the following defaults: ```bash - MAX_DURATION: filters out input audio data that exceeds a maximum number of seconds. This ensures that when all filtered audio samples are padded to maximum length that length will stay under this specified threshold (default: 36) +BATCH_SIZE_SEQ: batch sizes to measure on. (default: "1 2 4 8 16") +MAX_DURATION_SEQ: input durations (in seconds) to measure on. (default: "2 7 16.7") +CUDNN_BENCHMARK: (default: true) +PAD_TO_MAX_DURATION: (default: true) +NUM_WARMUP_STEPS: (default: 10) +NUM_STEPS: (default: 500) +DALI_DEVICE: (default: cpu) ``` The `scripts/train_benchmark.sh` script pads all input to the same length according to the input argument `MAX_DURATION` and measures average training latency and throughput performance. Latency is measured in seconds per batch, throughput in sequences per second. -The complete list of available parameters for `scripts/train_benchmark.sh` script contains: +Training performance is measured with on-line speed perturbation and cuDNN benchmark mode enabled. +The script `scripts/train_benchmark.sh` loops over a number of batch sizes and GPU counts. +It extends `scripts/train.sh`, and the complete list of available parameters for `scripts/train_benchmark.sh` script contains: ```bash -DATA_DIR: directory of dataset.(default: '/datasets/LibriSpeech') -MODEL_CONFIG: model configuration. (default: 'configs/jasper10x5dr_sp_offline_specaugment.toml') -RESULT_DIR: directory for results and logs. (default: '/results') -CREATE_LOGFILE: boolean that indicates whether to create a log file that will be stored in `$RESULT_DIR`. (default: true) -CUDNN_BENCHMARK: boolean that indicates whether to enable cudnn benchmark mode for using more optimized kernels. (default: true) -NUM_GPUS: number of GPUs to use. (default: 8) -AMP: if set to `true`, enables automatic mixed precision with AMP (default: false) -NUM_STEPS: number of training iterations. If -1 runs full training for 400 epochs. (default: -1) -MAX_DURATION: filters out input audio data that exceed a maximum number of seconds. This ensures that when all filtered audio samples are padded to maximum length that length will stay under this specified threshold (default: 16.7) -SEED: seed for random number generator and useful for ensuring reproducibility. 
(default: 0) -BATCH_SIZE: data batch size.(default: 32) -LEARNING_RATE: Initial learning rate. (default: 0.015) -GRADIENT_ACCUMULATION_STEPS: number of gradient accumulation steps until optimizer updates weights. (default: 1) -PRINT_FREQUENCY: number of iterations after which training progress is printed. (default: 1) +BATCH_SIZE_SEQ: batch sizes to measure on. (default: "1 2 4 8 16") +NUM_GPUS_SEQ: number of GPUs to run the training on. (default: "1 4 8") +MODEL_CONFIG: (default: "configs/jasper10x5dr_speedp-online_train-benchmark.yaml") +TRAIN_MANIFESTS: (default: "$DATA_DIR/librispeech-train-clean-100-wav.json") +RESUME: (default: false) +EPOCHS_THIS_JOB: (default: 2) +EPOCHS: (default: 100000) +SAVE_FREQUENCY: (default: 100000) +EVAL_FREQUENCY: (default: 100000) +GRAD_ACCUMULATION_STEPS: (default: 1) +PAD_TO_MAX_DURATION: (default: true) +EMA: (default: 0) ``` ### Command-line options @@ -478,11 +499,10 @@ python inference.py --help ### Getting the data The Jasper model was trained on LibriSpeech dataset. We use the concatenation of `train-clean-100`, `train-clean-360` and `train-other-500` for training and `dev-clean` for validation. -This repository contains the `scripts/download_librispeech.sh` and `scripts/preprocess_librispeech.sh` scripts which will automatically download and preprocess the training, test and development datasets. By default, data will be downloaded to the `/datasets/LibriSpeech` directory, a minimum of 500GB free space is required for download and preprocessing, the final preprocessed dataset is 320GB. - +This repository contains the `scripts/download_librispeech.sh` and `scripts/preprocess_librispeech.sh` scripts which will automatically download and preprocess the training, test and development datasets. By default, data will be downloaded to the `/datasets/LibriSpeech` directory. A minimum of 250GB of free space is required for download and preprocessing; the final preprocessed dataset is approximately 100GB. With offline speed perturbation, the dataset will be about 3x larger. #### Dataset guidelines -The `scripts/preprocess_librispeech.sh` script converts the input audio files to WAV format with a sample rate of 16kHz, target transcripts are stripped from whitespace characters, then lower-cased. For `train-clean-100`, `train-clean-360` and `train-other-500` it also creates speed perturbed versions with rates of 0.9 and 1.1 for data augmentation. +The `scripts/preprocess_librispeech.sh` script converts the input audio files to WAV format with a sample rate of 16kHz; target transcripts are stripped of whitespace characters, then lower-cased. For `train-clean-100`, `train-clean-360` and `train-other-500`, it can optionally create speed-perturbed versions with rates of 0.9 and 1.1 for data augmentation. In the current version, those augmentations are applied on-line with the DALI pipeline without any impact on training time. After preprocessing, the script creates JSON files with output file paths, sample rate, target transcript and other metadata. These JSON files are used by the training script to identify training and validation datasets.
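Since the training scripts select their data through these manifests, training can be restricted to a subset of LibriSpeech by overriding `TRAIN_MANIFESTS` and `VAL_MANIFESTS` (see [Parameters](#parameters)); a sketch, assuming the scripts accept space-separated lists of manifest paths:
```bash
# Sketch: train on the clean subsets only and validate on dev-clean
# (manifest file names come from the preprocessing step above).
export DATA_DIR=/datasets/LibriSpeech
TRAIN_MANIFESTS="$DATA_DIR/librispeech-train-clean-100-wav.json $DATA_DIR/librispeech-train-clean-360-wav.json" \
VAL_MANIFESTS="$DATA_DIR/librispeech-dev-clean-wav.json" \
bash scripts/train.sh
```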
@@ -500,21 +520,23 @@ Apart from the default arguments as listed in the [Parameters](#parameters) sect * Trains on the concatenation of all 3 LibriSpeech training datasets and evaluates on the LibriSpeech dev-clean dataset * Maintains an exponential moving average of parameters for evaluation * Has cudnn benchmark enabled -* Runs for 400 epochs -* Uses an initial learning rate of 0.015 and polynomial (quadratic) learning rate decay +* Runs for 440 epochs +* Uses an initial learning rate of 0.01 and an exponential learning rate decay * Saves a checkpoint every 10 epochs -* Runs evaluation on the development dataset every 100 iterations and at the end of training -* Prints out training progress every 25 iterations -* Creates a log file with training progress -* Uses offline speed perturbed data +* Automatically removes old checkpoints and preserves milestone checkpoints +* Runs evaluation on the development dataset every 544 iterations and at the end of training +* Maintains a separate checkpoint with the lowest WER on development set +* Prints out training progress every iteration to stdout +* Creates a DLLogger logfile and a Tensorboard log +* Calculates speed perturbation on-line during training * Uses SpecAugment in data pre-processing * Filters out audio samples longer than 16.7 seconds -* Pads each sequence in a batch to the same length (smallest multiple of 16 that is at least the length of the longest sequence in the batch) +* Pads each batch so its length would be divisible by 16 * Uses masked convolutions and dense residuals as described in the paper * Uses weight decay of 0.001 * Uses [Novograd](https://arxiv.org/pdf/1905.11286.pdf) as optimizer with betas=(0.95, 0) -Enabling AMP permits batch size 64 with one gradient accumulation step. Such setup will match the greedy WER [Results](#results) of the Jasper paper on a DGX-1 with 32GB V100 GPUs. +Enabling AMP permits batch size 64 with one gradient accumulation step. In the current setup it will improve upon the greedy WER [Results](#results) of the Jasper paper on a DGX-1 with 32GB V100 GPUs. ### Inference process Inference is performed using the `inference.py` script along with parameters defined in `scripts/inference.sh`. @@ -525,7 +547,7 @@ Apart from the default arguments as listed in the [Parameters](#parameters) sect * Uses a batch size of 64 * Runs for 1 epoch and prints out the final word error rate * Creates a log file with progress and results which will be stored in the results folder -* Pads each sequence in a batch to the same length (smallest multiple of 16 that is at least the length of the longest sequence in the batch +* Pads each batch so its length would be divisible by 16 * Does not use data augmentation * Does greedy decoding and saves the transcription in the results folder * Has the option to save the model output tensors for more complex decoding, for example, beam search @@ -533,24 +555,14 @@ Apart from the default arguments as listed in the [Parameters](#parameters) sect ### Evaluation process Evaluation is performed using the `inference.py` script along with parameters defined in `scripts/evaluation.sh`. -The `scripts/evaluation.sh` script runs a job on a single GPU, taking a pre-trained Jasper model checkpoint and running it on the specified dataset. 
-Apart from the default arguments as listed in the [Parameters](#parameters) section, by default the evaluation script: +The setup is similar to `scripts/inference.sh`, with two differences: -* Uses a batch size of 64 -* Evaluates the LibriSpeech dev-clean dataset -* Runs for 1 epoch and prints out the final word error rate -* Creates a log file with progress and results which is saved in the results folder -* Pads each sequence in a batch to the same length (smallest multiple of 16 that is at least the length of the longest sequence in the batch) -* Does not use data augmentation -* Has cudnn benchmark disabled - -### Deploying Jasper using TensorRT -NVIDIA TensorRT is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high-throughput for deep learning inference applications. Jasper’s architecture, which is of deep convolutional nature, is designed to facilitate fast GPU inference. After optimizing the compute-intensive acoustic model with NVIDIA TensorRT, inference throughput increased by up to 1.8x over native PyTorch. -More information on how to perform inference using TensorRT and speed up comparison between TensorRT and native PyTorch can be found in the subfolder [./trt/README.md](trt/README.md) +* Evaluates the LibriSpeech test-other dataset +* Model outputs are not saved ### Deploying Jasper using Triton Inference Server The NVIDIA Triton Inference Server provides a datacenter and cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or gRPC endpoint, allowing remote clients to request inferencing for any number of GPU or CPU models being managed by the server. -More information on how to perform inference using TensorRT Inference Server with different model backends can be found in the subfolder [./trtis/README.md](trtis/README.md) +More information on how to perform inference using Triton Inference Server with different model backends can be found in the subfolder [./triton/README.md](triton/README.md) ## Performance @@ -559,162 +571,121 @@ More information on how to perform inference using TensorRT Inference Server wit The following section shows how to run benchmarks measuring the model performance in training and inference modes. #### Training performance benchmark -To benchmark the training performance on a specific batch size and audio length, for `NUM_STEPS` run: +To benchmark the training performance in a specific setting on the `train-clean-100` subset of LibriSpeech, run: ```bash -export NUM_STEPS= -export MAX_DURATION= -export BATCH_SIZE= -bash scripts/train_benchmark.sh +BATCH_SIZE_SEQ= NUM_GPUS_SEQ= bash scripts/train_benchmark.sh ``` -By default, this script runs 400 epochs on the configuration `configs/jasper10x5dr_sp_offline_specaugment.toml` -using batch size 32 on a single node with 8x GPUs with at least 32GB of memory. -By default, `NUM_STEPS=-1` means training is run for 400 EPOCHS. If `$NUM_STEPS > 0` is specified, training is only run for a user-defined number of iterations. Audio samples longer than `MAX_DURATION` are filtered out, the remaining ones are padded to this duration such that all batches have the same length. 
At the end of training the script saves the model checkpoint to the results folder, runs evaluation on LibriSpeech dev-clean dataset, and prints out information such as average training latency performance in seconds, average trng throughput in sequences per second, final training loss, final training WER, evaluation loss and evaluation WER. - +By default, this script runs 2 epochs on the configuration `configs/jasper10x5dr_speedp-online_train-benchmark.yaml`, +which applies gentle speed perturbation that does not change the length of the output, enabling immediate stabilization of training step times in the cuDNN benchmark mode. The script runs benchmarks with batch size 32 on 1, 4, and 8 GPUs, and requires an 8x 32GB GPU machine. #### Inference performance benchmark To benchmark the inference performance on a specific batch size and audio length, run: ```bash -bash scripts/inference_benchmark.sh +BATCH_SIZE_SEQ= MAX_DURATION_SEQ= bash scripts/inference_benchmark.sh ``` -By default, the script runs on a single GPU and evaluates on the entire dataset using the model configuration `configs/jasper10x5dr_sp_offline_specaugment.toml` and batch size 32. -By default, `MAX_DURATION` is set to 36 seconds, which covers the maximum audio length. All audio samples are padded to this length. The script prints out `MAX_DURATION`, `BATCH_SIZE` and latency performance in milliseconds per batch. +By default, the script runs on a single GPU and evaluates on the dataset limited to utterances shorter than MAX_DURATION. It uses the model configuration `configs/jasper10x5dr_speedp-online_speca.yaml`. -Adjustments can be made with env variables, e.g., -```bash -export SEED=42 -export BATCH_SIZE=1 -bash scripts/inference_benchmark.sh -``` ### Results The following sections provide details on how we achieved our performance and accuracy in training and inference. All results are trained on 960 hours of LibriSpeech with a maximum audio length of 16.7s. The training is evaluated -on LibriSpeech dev-clean, dev-other, test-clean, test-other. -The results for Jasper Large's word error rate from the original paper after greedy decoding are shown below: - -| **Number of GPUs** | **dev-clean WER** | **dev-other WER**| **test-clean WER**| **test-other WER** -|--- |--- |--- |--- |--- | -|8 | 3.64| 11.89| 3.86 | 11.95 - +on LibriSpeech dev-clean, dev-other, test-clean, test-other. Checkpoints for evaluation are chosen based on their +word error rate on dev-clean. #### Training accuracy results ##### Training accuracy: NVIDIA DGX A100 (8x A100 80GB) -Our results were obtained by running the `scripts/train.sh` training script in the 20.06-py3 NGC container on NVIDIA DGX A100 (8x A100 40GB) GPUs. +Our results were obtained by running the `scripts/train.sh` training script in the PyTorch 20.10-py3 NGC container with NVIDIA DGX A100 with (8x A100 80GB) GPUs. +The following table reports the word error rate (WER) of the acoustic model with greedy decoding on all LibriSpeech dev and test datasets for mixed precision training. 
-| **Number of GPUs** | **Batch size per GPU** | **Precision** | **dev-clean WER** | **dev-other WER** | **test-clean WER** | **test-other WER** | **Time to train** | **Time to train speedup (TF32 to mixed precision)** | -|-----|-----|-------|-------|-------|------|-------|-------|-----| -| 8 | 64 | mixed | 3.53 | 11.11 | 3.75 | 11.07 | 60 h | 1.9 | -| 8 | 64 | TF32 | 3.55 | 11.30 | 3.81 | 11.17 | 115 h | - | - -For each precision, we show the best of 8 runs chosen based on dev-clean WER. For TF32, two gradient accumulation steps have been used. +| Number of GPUs | Batch size per GPU | Precision | dev-clean WER | dev-other WER | test-clean WER | test-other WER | Time to train | +|-----|-----|-------|-------|-------|------|-------|------| +| 8 | 64 | mixed | 3.20 | 9.78 | 3.41 | 9.71 | 70 h | ##### Training accuracy: NVIDIA DGX-1 (8x V100 32GB) -Our results were obtained by running the `scripts/train.sh` training script in the PyTorch 20.06-py3 NGC container with NVIDIA DGX-1 with (8x V100 32GB) GPUs. -The following tables report the word error rate(WER) of the acoustic model with greedy decoding on all LibriSpeech dev and test datasets for mixed precision training. +Our results were obtained by running the `scripts/train.sh` training script in the PyTorch 20.10-py3 NGC container with NVIDIA DGX-1 with (8x V100 32GB) GPUs. +The following table reports the word error rate (WER) of the acoustic model with greedy decoding on all LibriSpeech dev and test datasets for mixed precision training. -| **Number of GPUs** | **Batch size per GPU** | **Precision** | **dev-clean WER** | **dev-other WER** | **test-clean WER** | **test-other WER** | **Time to train** | **Time to train speedup (FP32 to mixed precision)** | -|-----|-----|-------|-------|-------|------|-------|-------|-----| -| 8 | 64 | mixed | 3.49 | 11.22 | 3.74 | 10.94 | 105 h | 3.1 | -| 8 | 64 | FP32 | 3.65 | 11.47 | 3.86 | 11.30 | 330 h | - | +| Number of GPUs | Batch size per GPU | Precision | dev-clean WER | dev-other WER | test-clean WER | test-other WER | Time to train | +|-----|-----|-------|-------|-------|------|-------|-------| +| 8 | 64 | mixed | 3.26 | 10.00 | 3.54 | 9.80 | 130 h | We show the best of 5 runs (mixed precision) and 2 runs (FP32) chosen based on dev-clean WER. For FP32, two gradient accumulation steps have been used. ##### Training stability test The following table compares greedy decoding word error rates across 8 different training runs with different seeds for mixed precision training. 
-| **DGX A100, FP16, 8x GPU** | **Seed #1** | **Seed #2** | **Seed #3** | **Seed #4** | **Seed #5** | **Seed #6** | **Seed #7** | **Seed #8** | **Mean** | **Std** | -|-----------:|------:|------:|------:|------:|------:|------:|------:|------:|------:|-----:| -| dev-clean | 3.69 | 3.71 | 3.64 | 3.53 | 3.71 | 3.66 | 3.77 | 3.70 | 3.68 | 0.07 | -| dev-other | 11.39 | 11.65 | 11.46 | 11.11 | 11.23 | 11.18 | 11.43 | 11.60 | 11.38 | 0.19 | -| test-clean | 3.97 | 3.96 | 3.81 | 3.75 | 3.90 | 3.82 | 3.93 | 3.82 | 3.87 | 0.08 | -| test-other | 11.27 | 11.34 | 11.40 | 11.07 | 11.24 | 11.29 | 11.58 | 11.58 | 11.35 | 0.17 | +| DGX A100 80GB, FP16, 8x GPU | Seed #1 | Seed #2 | Seed #3 | Seed #4 | Seed #5 | Seed #6 | Seed #7 | Seed #8 | Mean | Std | +|-----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|-------:|------:| +| dev-clean | 3.46 | 3.55 | 3.45 | 3.44 | 3.25 | 3.34 | 3.20 | 3.40 | 3.39 | 0.11 | +| dev-other | 10.30 | 10.77 | 10.36 | 10.26 | 9.99 | 10.18 | 9.78 | 10.32 | 10.25 | 0.27 | +| test-clean | 3.84 | 3.81 | 3.66 | 3.64 | 3.58 | 3.55 | 3.41 | 3.73 | 3.65 | 0.13 | +| test-other | 10.61 | 10.52 | 10.49 | 10.47 | 9.89 | 10.09 | 9.71 | 10.26 | 10.26 | 0.31 | -| **DGX A100, TF32, 8x GPU** | **Seed #1** | **Seed #2** | **Seed #3** | **Seed #4** | **Seed #5** | **Seed #6** | **Seed #7** | **Seed #8** | **Mean** | **Std** | -|-----------:|------:|------:|------:|------:|------:|------:|------:|------:|------:|-----:| -| dev-clean | 3.56 | 3.60 | 3.60 | 3.55 | 3.65 | 3.57 | 3.89 | 3.67 | 3.64 | 0.11 | -| dev-other | 11.27 | 11.41 | 11.65 | 11.30 | 11.51 | 11.11 | 12.18 | 11.50 | 11.49 | 0.32 | -| test-clean | 3.80 | 3.79 | 3.88 | 3.81 | 3.94 | 3.82 | 4.13 | 3.85 | 3.88 | 0.11 | -| test-other | 11.40 | 11.26 | 11.47 | 11.17 | 11.36 | 11.16 | 12.15 | 11.46 | 11.43 | 0.32 | -| **DGX-1 32GB, FP16, 8x GPU** | **Seed #1** | **Seed #2** | **Seed #3** | **Seed #4** | **Seed #5** | **Mean** | **Std** | -|-----------:|------:|------:|------:|------:|------:|------:|-----:| -| dev-clean | 3.69 | 3.75 | 3.63 | 3.86 | 3.49 | 3.68 | 0.14 | -| dev-other | 11.35 | 11.63 | 11.60 | 11.68 | 11.22 | 11.50 | 0.20 | -| test-clean | 3.90 | 3.84 | 3.94 | 3.96 | 3.74 | 3.88 | 0.09 | -| test-other | 11.17 | 11.45 | 11.31 | 11.60 | 10.94 | 11.29 | 0.26 | +| DGX-1 32GB, FP16, 8x GPU | Seed #1 | Seed #2 | Seed #3 | Seed #4 | Seed #5 | Seed #6 | Seed #7 | Seed #8 | Mean | Std | +|-----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|-------:|------:| +| dev-clean | 3.31 | 3.31 | 3.26 | 3.44 | 3.40 | 3.35 | 3.36 | 3.28 | 3.34 | 0.06 | +| dev-other | 10.02 | 10.01 | 10.00 | 10.06 | 10.05 | 10.03 | 10.10 | 10.04 | 10.04 | 0.03 | +| test-clean | 3.49 | 3.50 | 3.54 | 3.61 | 3.57 | 3.58 | 3.48 | 3.51 | 3.54 | 0.04 | +| test-other | 10.11 | 10.14 | 9.80 | 10.09 | 10.17 | 9.99 | 9.86 | 10.00 | 10.02 | 0.13 | #### Training performance results -Our results were obtained by running the `scripts/train.sh` training script in the PyTorch 20.06-py3 NGC container. Performance (in sequences per second) is the steady-state throughput. +Our results were obtained by running the `scripts/train.sh` training script in the PyTorch 20.10-py3 NGC container. Performance (in sequences per second) is the steady-state throughput. 
-##### Training performance: NVIDIA DGX A100 (8x A100 40GB) -| **GPUs** | **Batch size / GPU** | **Throughput - TF32** | **Throughput - mixed precision** | **Throughput speedup (TF32 to mixed precision)** | **Weak scaling - TF32** | **Weak scaling - mixed precision** | -|--:|---:|------:|-------:|-----:|-----:|-----:| -| 1 | 32 | 36.09 | 69.33 | 1.92 | 1.00 | 1.00 | -| 4 | 32 | 143.05 | 264.91 | 1.85 | 3.96 | 3.82 | -| 8 | 32 | 285.25 | 524.33 | 1.84 | 7.90 | 7.56 | - -| **GPUs** | **Batch size / GPU** | **Throughput - TF32** | **Throughput - mixed precision** | **Throughput speedup (TF32 to mixed precision)** | **Weak scaling - TF32** | **Weak scaling - mixed precision** | -|--:|---:|------:|-------:|-----:|-----:|-----:| -| 1 | 64 | - | 77.79 | - | - | 1.00 | -| 4 | 64 | - | 304.32 | - | - | 3.91 | -| 8 | 64 | - | 602.88 | - | - | 7.75 | +##### Training performance: NVIDIA DGX A100 (8x A100 80GB) +| Batch size / GPU | GPUs | Throughput - TF32 | Throughput - mixed precision | Throughput speedup (TF32 to mixed precision) | Weak scaling - TF32 | Weak scaling - mixed precision | +|----:|----:|-------:|-------:|-----:|-----:|-----:| +| 32 | 1 | 42.18 | 64.32 | 1.52 | 1.00 | 1.00 | +| 32 | 4 | 157.49 | 239.23 | 1.52 | 3.73 | 3.72 | +| 32 | 8 | 310.10 | 470.09 | 1.52 | 7.35 | 7.31 | +| 64 | 1 | 49.64 | 75.59 | 1.52 | 1.00 | 1.00 | +| 64 | 4 | 192.66 | 289.16 | 1.50 | 3.88 | 3.83 | +| 64 | 8 | 371.41 | 547.91 | 1.48 | 7.48 | 7.25 | Note: Mixed precision permits higher batch sizes during training. We report the maximum batch sizes (as powers of 2), which are allowed without gradient accumulation. To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above. ##### Training performance: NVIDIA DGX-1 (8x V100 16GB) -| **GPUs** | **Batch size / GPU** | **Throughput - FP32** | **Throughput - mixed precision** | **Throughput speedup (FP32 to mixed precision)** | **Weak scaling - FP32** | **Weak scaling - mixed precision** | -|--:|---:|------:|-------:|-----:|-----:|-----:| -| 1 | 16 | 11.12 | 28.87 | 2.60 | 1.00 | 1.00 | -| 4 | 16 | 42.39 | 109.40 | 2.58 | 3.81 | 3.79 | -| 8 | 16 | 84.45 | 194.30 | 2.30 | 7.59 | 6.73 | - -| **GPUs** | **Batch size / GPU** | **Throughput - FP32** | **Throughput - mixed precision** | **Throughput speedup (FP32 to mixed precision)** | **Weak scaling - FP32** | **Weak scaling - mixed precision** | -|--:|---:|------:|-------:|-----:|-----:|-----:| -| 1 | 32 | - | 37.57 | - | - | 1.00 | -| 4 | 32 | - | 134.80 | - | - | 3.59 | -| 8 | 32 | - | 276.14 | - | - | 7.35 | +| Batch size / GPU | GPUs | Throughput - FP32 | Throughput - mixed precision | Throughput speedup (FP32 to mixed precision) | Weak scaling - FP32 | Weak scaling - mixed precision | +|----:|----:|------:|-------:|-----:|-----:|-----:| +| 16 | 1 | 10.71 | 27.87 | 2.60 | 1.00 | 1.00 | +| 16 | 4 | 40.28 | 99.80 | 2.48 | 3.76 | 3.58 | +| 16 | 8 | 78.23 | 193.89 | 2.48 | 7.30 | 6.96 | Note: Mixed precision permits higher batch sizes during training. We report the maximum batch sizes (as powers of 2), which are allowed without gradient accumulation. To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above. 
##### Training performance: NVIDIA DGX-1 (8x V100 32GB) -| **GPUs** | **Batch size / GPU** | **Throughput - FP32** | **Throughput - mixed precision** | **Throughput speedup (FP32 to mixed precision)** | **Weak scaling - FP32** | **Weak scaling - mixed precision** | -|--:|---:|------:|-------:|-----:|-----:|-----:| -| 1 | 32 | 13.15 | 35.63 | 2.71 | 1.00 | 1.00 | -| 4 | 32 | 51.21 | 134.01 | 2.62 | 3.90 | 3.76 | -| 8 | 32 | 99.88 | 247.97 | 2.48 | 7.60 | 6.96 | - -| **GPUs** | **Batch size / GPU** | **Throughput - FP32** | **Throughput - mixed precision** | **Throughput speedup (FP32 to mixed precision)** | **Weak scaling - FP32** | **Weak scaling - mixed precision** | -|--:|---:|------:|-------:|-----:|-----:|-----:| -| 1 | 64 | - | 41.74 | - | - | 1.00 | -| 4 | 64 | - | 158.44 | - | - | 3.80 | -| 8 | 64 | - | 312.22 | - | - | 7.48 | +| Batch size / GPU | GPUs | Throughput - FP32 | Throughput - mixed precision | Throughput speedup (FP32 to mixed precision) | Weak scaling - FP32 | Weak scaling - mixed precision | +|----:|----:|------:|-------:|-----:|-----:|-----:| +| 32 | 1 | 12.22 | 34.08 | 2.79 | 1.00 | 1.00 | +| 32 | 4 | 46.97 | 128.39 | 2.73 | 3.84 | 3.77 | +| 32 | 8 | 92.44 | 249.00 | 2.69 | 7.57 | 7.31 | +| 64 | 1 | N/A | 39.30 | N/A | N/A | 1.00 | +| 64 | 4 | N/A | 150.18 | N/A | N/A | 3.82 | +| 64 | 8 | N/A | 282.68 | N/A | N/A | 7.19 | Note: Mixed precision permits higher batch sizes during training. We report the maximum batch sizes (as powers of 2), which are allowed without gradient accumulation. To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above. ##### Training performance: NVIDIA DGX-2 (16x V100 32GB) -| **GPUs** | **Batch size / GPU** | **Throughput - FP32** | **Throughput - mixed precision** | **Throughput speedup (FP32 to mixed precision)** | **Weak scaling - FP32** | **Weak scaling - mixed precision** | -|---:|---:|-------:|-------:|-----:|------:|------:| -| 1 | 32 | 14.13 | 41.05 | 2.90 | 1.00 | 1.00 | -| 4 | 32 | 54.32 | 156.47 | 2.88 | 3.84 | 3.81 | -| 8 | 32 | 110.26 | 307.13 | 2.79 | 7.80 | 7.48 | -| 16 | 32 | 218.14 | 561.85 | 2.58 | 15.44 | 13.69 | - -| **GPUs** | **Batch size / GPU** | **Throughput - FP32** | **Throughput - mixed precision** | **Throughput speedup (FP32 to mixed precision)** | **Weak scaling - FP32** | **Weak scaling - mixed precision** | -|---:|---:|-------:|-------:|-----:|------:|------:| -| 1 | 64 | - | 46.41 | - | - | 1.00 | -| 4 | 64 | - | 147.90 | - | - | 3.19 | -| 8 | 64 | - | 359.15 | - | - | 7.74 | -| 16 | 64 | - | 703.13 | - | - | 15.15 | +| Batch size / GPU | GPUs | Throughput - FP32 | Throughput - mixed precision | Throughput speedup (FP32 to mixed precision) | Weak scaling - FP32 | Weak scaling - mixed precision | +|----:|----:|-------:|-------:|-----:|------:|------:| +| 32 | 1 | 13.46 | 38.94 | 2.89 | 1.00 | 1.00 | +| 32 | 4 | 51.38 | 143.44 | 2.79 | 3.82 | 3.68 | +| 32 | 8 | 100.54 | 280.48 | 2.79 | 7.47 | 7.20 | +| 32 | 16 | 188.14 | 515.90 | 2.74 | 13.98 | 13.25 | +| 64 | 1 | N/A | 43.86 | N/A | N/A | 1.00 | +| 64 | 4 | N/A | 165.27 | N/A | N/A | 3.77 | +| 64 | 8 | N/A | 318.10 | N/A | N/A | 7.25 | +| 64 | 16 | N/A | 567.47 | N/A | N/A | 12.94 | Note: Mixed precision permits higher batch sizes during training. We report the maximum batch sizes (as powers of 2), which are allowed without gradient accumulation. 
@@ -722,121 +693,130 @@ To achieve these same results, follow the [Quick Start Guide](#quick-start-guide #### Inference performance results -Our results were obtained by running the `scripts/inference_benchmark.sh` script in the PyTorch 20.06-py3 NGC container on NVIDIA DGX A100, DGX-1, DGX-2 and T4 on a single GPU. Performance numbers (latency in milliseconds per batch) were averaged over 1000 iterations. +Our results were obtained by running the `scripts/inference_benchmark.sh` script in the PyTorch 20.10-py3 NGC container on NVIDIA DGX A100, DGX-1, DGX-2 and T4 on a single GPU. Performance numbers (latency in milliseconds per batch) were averaged over 500 iterations. -##### Inference performance: NVIDIA DGX A100 (1x A100 40GB) -| | |FP16 Latency (ms) Percentiles | | | | TF32 Latency (ms) Percentiles | | | | FP16/TF32 speed up | -|---:|-------------:|------:|------:|------:|------:|-------:|-------:|-------:|-------:|-----:| -| BS | Duration (s) | 90% | 95% | 99% | Avg | 90% | 95% | 99% | Avg | Avg | -| 1 | 2 | 36.31 | 36.85 | 43.18 | 35.96 | 41.16 | 41.63 | 47.90 | 40.89 | 1.14 | -| 2 | 2 | 37.56 | 43.32 | 45.23 | 37.11 | 42.53 | 47.79 | 49.62 | 42.07 | 1.13 | -| 4 | 2 | 43.10 | 44.85 | 47.22 | 41.43 | 47.88 | 49.75 | 51.55 | 43.25 | 1.04 | -| 8 | 2 | 44.02 | 44.30 | 45.21 | 39.51 | 50.14 | 50.47 | 51.50 | 45.63 | 1.16 | -| 16 | 2 | 48.04 | 48.38 | 49.12 | 42.76 | 70.90 | 71.22 | 72.50 | 60.78 | 1.42 | -| 1 | 7 | 37.74 | 37.88 | 38.92 | 37.02 | 41.53 | 42.17 | 44.75 | 40.79 | 1.10 | -| 2 | 7 | 40.91 | 41.11 | 42.35 | 40.02 | 46.44 | 46.80 | 49.67 | 45.67 | 1.14 | -| 4 | 7 | 43.94 | 44.32 | 46.71 | 43.00 | 54.39 | 54.80 | 56.63 | 53.53 | 1.24 | -| 8 | 7 | 50.01 | 50.19 | 52.92 | 48.62 | 68.55 | 69.25 | 72.28 | 67.61 | 1.39 | -| 16 | 7 | 60.38 | 60.76 | 62.44 | 57.92 | 93.17 | 94.15 | 98.84 | 92.21 | 1.59 | -| 1 | 16.7 | 41.39 | 41.75 | 43.62 | 40.73 | 45.79 | 46.10 | 47.76 | 45.21 | 1.11 | -| 2 | 16.7 | 46.43 | 46.76 | 47.72 | 45.81 | 52.53 | 53.13 | 55.60 | 51.71 | 1.13 | -| 4 | 16.7 | 50.88 | 51.68 | 54.74 | 50.11 | 66.29 | 66.96 | 70.45 | 65.00 | 1.30 | -| 8 | 16.7 | 62.09 | 62.76 | 65.08 | 61.40 | 94.16 | 94.67 | 97.46 | 93.00 | 1.51 | -| 16 | 16.7 | 75.22 | 76.86 | 80.76 | 73.99 | 139.51 | 140.88 | 144.10 | 137.94 | 1.86 | +##### Inference performance: NVIDIA DGX A100 (1x A100 80GB) +| | | FP16 Latency (ms) Percentiles | | | | TF32 Latency (ms) Percentiles | | | | FP16/TF32 speed up | +|-----:|---------------:|------:|------:|------:|------:|------:|------:|-------:|------:|------:| +| BS | Duration (s) | 90% | 95% | 99% | Avg | 90% | 95% | 99% | Avg | Avg | +| 1 | 2.0 | 32.40 | 32.50 | 32.82 | 32.30 | 33.30 | 33.64 | 34.65 | 33.25 | 1.03 | +| 2 | 2.0 | 32.90 | 33.51 | 34.35 | 32.69 | 34.48 | 34.65 | 35.66 | 34.27 | 1.05 | +| 4 | 2.0 | 32.85 | 33.01 | 33.89 | 32.60 | 34.09 | 34.46 | 35.22 | 34.00 | 1.04 | +| 8 | 2.0 | 35.51 | 35.89 | 37.10 | 35.33 | 34.86 | 35.36 | 36.08 | 34.45 | 0.98 | +| 16 | 2.0 | 36.00 | 36.57 | 37.40 | 35.77 | 43.83 | 44.12 | 44.77 | 43.39 | 1.21 | +| 1 | 7.0 | 33.50 | 33.99 | 34.91 | 33.03 | 33.83 | 34.25 | 34.95 | 33.70 | 1.02 | +| 2 | 7.0 | 34.43 | 34.89 | 35.72 | 34.22 | 34.41 | 34.73 | 35.69 | 34.28 | 1.00 | +| 4 | 7.0 | 34.30 | 34.59 | 35.43 | 34.07 | 37.95 | 38.18 | 38.87 | 37.55 | 1.10 | +| 8 | 7.0 | 35.98 | 36.28 | 37.11 | 35.28 | 44.64 | 44.79 | 45.37 | 44.29 | 1.26 | +| 16 | 7.0 | 39.86 | 40.08 | 41.16 | 39.33 | 55.17 | 55.46 | 57.24 | 54.56 | 1.39 | +| 1 | 16.7 | 35.20 | 35.80 | 38.71 | 34.36 | 35.36 | 35.76 | 36.55 | 34.64 | 1.01 | +| 2 | 16.7 | 
35.40 | 35.81 | 36.50 | 34.76 | 36.34 | 36.53 | 37.40 | 35.87 | 1.03 | +| 4 | 16.7 | 36.01 | 36.38 | 37.37 | 35.57 | 44.69 | 45.09 | 45.88 | 43.92 | 1.23 | +| 8 | 16.7 | 41.48 | 41.78 | 44.22 | 40.69 | 58.57 | 58.74 | 59.62 | 58.11 | 1.43 | +| 16 | 16.7 | 61.37 | 61.93 | 66.32 | 60.92 | 97.33 | 97.71 | 100.04 | 96.56 | 1.59 | + +To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above. ##### Inference performance: NVIDIA DGX-1 (1x V100 16GB) -| | |FP16 Latency (ms) Percentiles | | | | FP32 Latency (ms) Percentiles | | | | FP16/FP32 speed up | -|---:|-------------:|------:|------:|------:|------:|-------:|-------:|-------:|-------:|-----:| -| BS | Duration (s) | 90% | 95% | 99% | Avg | 90% | 95% | 99% | Avg | Avg | -| 1 | 2 | 52.26 | 59.93 | 66.62 | 50.34 | 70.90 | 76.47 | 79.84 | 68.61 | 1.36 | -| 2 | 2 | 62.04 | 67.68 | 70.91 | 58.65 | 75.72 | 80.15 | 83.50 | 71.33 | 1.22 | -| 4 | 2 | 75.12 | 77.12 | 82.80 | 66.55 | 80.88 | 82.60 | 86.63 | 73.65 | 1.11 | -| 8 | 2 | 71.62 | 72.99 | 81.10 | 66.39 | 99.57 | 101.43 | 107.16 | 92.34 | 1.39 | -| 16 | 2 | 78.51 | 80.33 | 87.31 | 72.91 | 104.79 | 107.22 | 114.21 | 96.18 | 1.32 | -| 1 | 7 | 52.67 | 54.40 | 64.27 | 50.47 | 73.86 | 75.61 | 84.93 | 72.08 | 1.43 | -| 2 | 7 | 60.49 | 62.41 | 72.87 | 58.45 | 93.07 | 94.51 | 102.40 | 91.55 | 1.57 | -| 4 | 7 | 70.55 | 72.95 | 82.59 | 68.43 | 131.48 | 137.60 | 149.06 | 129.23 | 1.89 | -| 8 | 7 | 83.91 | 85.28 | 93.08 | 76.40 | 152.49 | 157.92 | 166.80 | 150.49 | 1.97 | -| 16 | 7 | 100.21 | 103.12 | 109.00 | 96.31 | 178.45 | 181.46 | 187.20 | 174.33 | 1.81 | -| 1 | 16.7 | 56.84 | 60.05 | 66.54 | 54.69 | 109.55 | 111.19 | 120.40 | 102.25 | 1.87 | -| 2 | 16.7 | 69.39 | 70.97 | 75.34 | 67.39 | 149.93 | 150.79 | 154.06 | 147.45 | 2.19 | -| 4 | 16.7 | 87.48 | 93.96 | 102.73 | 85.09 | 211.78 | 219.66 | 232.99 | 208.38 | 2.45 | -| 8 | 16.7 | 106.91 | 111.92 | 116.55 | 104.13 | 246.92 | 250.94 | 268.44 | 243.34 | 2.34 | -| 16 | 16.7 | 149.08 | 153.86 | 166.17 | 146.28 | 292.84 | 298.02 | 313.04 | 288.54 | 1.97 | +| | | FP16 Latency (ms) Percentiles | | | | FP32 Latency (ms) Percentiles | | | | FP16/FP32 speed up | +|-----:|---------------:|-------:|-------:|-------:|-------:|-------:|-------:|-------:|-------:|------:| +| BS | Duration (s) | 90% | 95% | 99% | Avg | 90% | 95% | 99% | Avg | Avg | +| 1 | 2.0 | 45.42 | 45.62 | 49.54 | 45.02 | 48.83 | 48.99 | 51.66 | 48.44 | 1.08 | +| 2 | 2.0 | 50.31 | 50.53 | 53.66 | 49.10 | 49.87 | 50.04 | 52.99 | 49.41 | 1.01 | +| 4 | 2.0 | 49.17 | 49.48 | 52.13 | 48.73 | 52.92 | 53.21 | 55.28 | 52.31 | 1.07 | +| 8 | 2.0 | 51.20 | 51.40 | 52.32 | 49.01 | 73.02 | 73.30 | 75.00 | 71.99 | 1.47 | +| 16 | 2.0 | 51.75 | 52.24 | 56.36 | 51.27 | 83.99 | 84.57 | 86.69 | 83.24 | 1.62 | +| 1 | 7.0 | 48.13 | 48.53 | 50.95 | 46.78 | 48.52 | 48.75 | 50.89 | 48.01 | 1.03 | +| 2 | 7.0 | 49.52 | 50.10 | 52.35 | 48.00 | 65.27 | 65.41 | 66.59 | 64.79 | 1.35 | +| 4 | 7.0 | 51.75 | 52.01 | 54.39 | 50.38 | 93.75 | 94.77 | 97.04 | 92.27 | 1.83 | +| 8 | 7.0 | 54.80 | 56.27 | 66.23 | 52.95 | 130.65 | 131.09 | 132.91 | 129.82 | 2.45 | +| 16 | 7.0 | 73.02 | 73.42 | 75.83 | 71.96 | 157.53 | 158.20 | 160.73 | 155.51 | 2.16 | +| 1 | 16.7 | 48.10 | 48.52 | 52.71 | 47.20 | 73.34 | 73.56 | 74.19 | 72.69 | 1.54 | +| 2 | 16.7 | 64.21 | 64.52 | 65.56 | 56.06 | 129.48 | 129.97 | 131.78 | 126.36 | 2.25 | +| 4 | 16.7 | 60.38 | 61.03 | 63.18 | 58.87 | 183.33 | 183.85 | 185.53 | 181.90 | 3.09 | +| 8 | 16.7 | 85.88 | 86.34 | 87.70 | 84.46 | 227.42 | 228.21 | 229.63 | 225.71 | 2.67 | +| 
16 | 16.7 | 135.62 | 136.40 | 137.69 | 131.58 | 276.90 | 277.59 | 281.16 | 275.08 | 2.09 | To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above. ##### Inference performance: NVIDIA DGX-1 (1x V100 32GB) -| | |FP16 Latency (ms) Percentiles | | | | FP32 Latency (ms) Percentiles | | | | FP16/FP32 speed up | -|---:|-------------:|------:|------:|------:|------:|-------:|-------:|-------:|-------:|-----:| -| BS | Duration (s) | 90% | 95% | 99% | Avg | 90% | 95% | 99% | Avg | Avg | -| 1 | 2 | 64.60 | 67.34 | 79.87 | 60.73 | 84.69 | 86.78 | 96.02 | 79.32 | 1.31 | -| 2 | 2 | 71.52 | 73.32 | 82.00 | 63.93 | 85.33 | 87.65 | 96.34 | 78.09 | 1.22 | -| 4 | 2 | 80.38 | 84.62 | 93.09 | 74.95 | 90.29 | 97.59 | 100.61 | 84.44 | 1.13 | -| 8 | 2 | 83.43 | 85.51 | 91.17 | 74.09 | 107.28 | 111.89 | 115.19 | 98.76 | 1.33 | -| 16 | 2 | 90.01 | 90.81 | 96.48 | 79.85 | 115.39 | 116.95 | 123.71 | 103.26 | 1.29 | -| 1 | 7 | 53.74 | 54.09 | 56.67 | 53.07 | 86.07 | 86.55 | 91.59 | 78.79 | 1.48 | -| 2 | 7 | 63.34 | 63.67 | 66.08 | 62.62 | 96.25 | 96.82 | 99.72 | 95.44 | 1.52 | -| 4 | 7 | 80.35 | 80.86 | 83.80 | 73.41 | 132.19 | 132.94 | 135.59 | 131.46 | 1.79 | -| 8 | 7 | 77.68 | 78.11 | 86.71 | 75.72 | 156.30 | 157.72 | 165.55 | 154.87 | 2.05 | -| 16 | 7 | 103.52 | 106.66 | 111.93 | 98.15 | 180.71 | 182.82 | 191.12 | 178.61 | 1.82 | -| 1 | 16.7 | 57.58 | 57.79 | 59.75 | 56.58 | 104.51 | 104.87 | 108.01 | 104.04 | 1.84 | -| 2 | 16.7 | 69.19 | 69.58 | 71.49 | 68.58 | 151.25 | 152.07 | 155.21 | 149.30 | 2.18 | -| 4 | 16.7 | 87.17 | 88.53 | 97.41 | 86.56 | 211.28 | 212.41 | 214.97 | 208.54 | 2.41 | -| 8 | 16.7 | 116.25 | 116.90 | 120.14 | 109.21 | 247.63 | 248.93 | 254.77 | 245.19 | 2.25 | -| 16 | 16.7 | 151.99 | 154.79 | 163.36 | 149.80 | 293.99 | 296.05 | 303.04 | 291.00 | 1.94 | +| | | FP16 Latency (ms) Percentiles | | | | FP32 Latency (ms) Percentiles | | | | FP16/FP32 speed up | +|-----:|---------------:|-------:|-------:|-------:|-------:|-------:|-------:|-------:|-------:|------:| +| BS | Duration (s) | 90% | 95% | 99% | Avg | 90% | 95% | 99% | Avg | Avg | +| 1 | 2.0 | 52.74 | 53.01 | 54.40 | 51.47 | 55.97 | 56.22 | 57.93 | 54.93 | 1.07 | +| 2 | 2.0 | 51.77 | 52.15 | 54.69 | 50.98 | 56.58 | 56.87 | 58.88 | 55.35 | 1.09 | +| 4 | 2.0 | 51.41 | 51.76 | 53.47 | 50.55 | 61.56 | 61.87 | 63.81 | 60.74 | 1.20 | +| 8 | 2.0 | 51.83 | 52.15 | 54.08 | 50.85 | 80.20 | 80.69 | 81.67 | 77.69 | 1.53 | +| 16 | 2.0 | 70.48 | 70.96 | 72.11 | 62.98 | 93.00 | 93.44 | 94.17 | 89.05 | 1.41 | +| 1 | 7.0 | 49.77 | 50.21 | 51.88 | 48.73 | 52.74 | 52.99 | 54.54 | 51.67 | 1.06 | +| 2 | 7.0 | 51.12 | 51.47 | 52.84 | 49.98 | 65.33 | 65.63 | 67.07 | 64.64 | 1.29 | +| 4 | 7.0 | 53.13 | 53.56 | 55.68 | 52.15 | 93.54 | 93.85 | 94.72 | 92.76 | 1.78 | +| 8 | 7.0 | 57.67 | 58.07 | 59.89 | 56.41 | 133.93 | 134.18 | 134.88 | 133.15 | 2.36 | +| 16 | 7.0 | 76.09 | 76.48 | 79.13 | 75.27 | 162.35 | 162.77 | 164.63 | 161.30 | 2.14 | +| 1 | 16.7 | 54.78 | 55.29 | 56.83 | 52.51 | 75.37 | 76.27 | 78.05 | 74.32 | 1.42 | +| 2 | 16.7 | 56.80 | 57.20 | 59.01 | 55.49 | 130.60 | 131.36 | 132.93 | 128.55 | 2.32 | +| 4 | 16.7 | 64.19 | 64.84 | 66.47 | 62.87 | 188.09 | 188.76 | 190.07 | 185.76 | 2.95 | +| 8 | 16.7 | 87.46 | 87.86 | 89.99 | 86.47 | 232.33 | 232.89 | 234.43 | 230.44 | 2.67 | +| 16 | 16.7 | 136.02 | 136.52 | 139.44 | 134.78 | 283.87 | 284.59 | 286.70 | 282.01 | 2.09 | To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above. 
##### Inference performance: NVIDIA DGX-2 (1x V100 32GB) -| | |FP16 Latency (ms) Percentiles | | | | FP32 Latency (ms) Percentiles | | | | FP16/FP32 speed up | -|---:|-------------:|------:|------:|------:|------:|-------:|-------:|-------:|-------:|-----:| -| BS | Duration (s) | 90% | 95% | 99% | Avg | 90% | 95% | 99% | Avg | Avg | -| 1 | 2 | 47.25 | 48.24 | 50.28 | 41.53 | 67.03 | 68.15 | 70.17 | 61.82 | 1.49 | -| 2 | 2 | 54.11 | 55.20 | 60.44 | 48.82 | 69.11 | 70.38 | 75.93 | 64.45 | 1.32 | -| 4 | 2 | 63.82 | 67.64 | 71.58 | 61.47 | 71.51 | 74.55 | 79.31 | 67.85 | 1.10 | -| 8 | 2 | 64.78 | 65.86 | 67.68 | 59.07 | 90.84 | 91.99 | 94.10 | 84.28 | 1.43 | -| 16 | 2 | 70.59 | 71.49 | 73.58 | 63.85 | 96.92 | 97.58 | 99.98 | 87.73 | 1.37 | -| 1 | 7 | 42.35 | 42.55 | 43.50 | 41.08 | 63.87 | 64.02 | 64.73 | 62.54 | 1.52 | -| 2 | 7 | 47.82 | 48.04 | 49.43 | 46.79 | 81.17 | 81.43 | 82.28 | 80.02 | 1.71 | -| 4 | 7 | 58.27 | 58.54 | 59.69 | 56.96 | 116.00 | 116.46 | 118.79 | 114.82 | 2.02 | -| 8 | 7 | 62.88 | 63.62 | 67.16 | 61.47 | 143.90 | 144.34 | 147.36 | 139.54 | 2.27 | -| 16 | 7 | 88.04 | 88.57 | 90.96 | 82.84 | 163.04 | 164.04 | 167.30 | 161.36 | 1.95 | -| 1 | 16.7 | 44.54 | 44.86 | 45.86 | 43.53 | 88.10 | 88.41 | 89.37 | 87.21 | 2.00 | -| 2 | 16.7 | 55.21 | 55.55 | 56.92 | 54.33 | 134.99 | 135.69 | 137.87 | 132.97 | 2.45 | -| 4 | 16.7 | 72.93 | 73.58 | 74.95 | 72.02 | 193.50 | 194.21 | 196.04 | 191.24 | 2.66 | -| 8 | 16.7 | 96.94 | 97.66 | 99.58 | 92.73 | 227.70 | 228.74 | 231.59 | 225.35 | 2.43 | -| 16 | 16.7 | 138.25 | 139.75 | 143.71 | 133.82 | 273.69 | 274.53 | 279.50 | 269.13 | 2.01 | +| | | FP16 Latency (ms) Percentiles | | | | FP32 Latency (ms) Percentiles | | | | FP16/FP32 speed up | +|-----:|---------------:|-------:|-------:|-------:|-------:|-------:|-------:|-------:|-------:|------:| +| BS | Duration (s) | 90% | 95% | 99% | Avg | 90% | 95% | 99% | Avg | Avg | +| 1 | 2.0 | 35.88 | 36.12 | 39.80 | 35.20 | 42.95 | 43.67 | 46.65 | 42.23 | 1.20 | +| 2 | 2.0 | 36.36 | 36.57 | 40.97 | 35.60 | 41.83 | 42.21 | 45.60 | 40.97 | 1.15 | +| 4 | 2.0 | 36.69 | 36.89 | 41.25 | 36.05 | 48.35 | 48.52 | 52.35 | 47.80 | 1.33 | +| 8 | 2.0 | 37.49 | 37.70 | 41.37 | 36.88 | 65.41 | 65.64 | 66.50 | 64.96 | 1.76 | +| 16 | 2.0 | 41.35 | 41.79 | 45.58 | 40.91 | 77.22 | 77.51 | 79.48 | 76.54 | 1.87 | +| 1 | 7.0 | 36.07 | 36.55 | 40.31 | 35.62 | 39.52 | 39.84 | 43.07 | 38.93 | 1.09 | +| 2 | 7.0 | 37.42 | 37.66 | 41.36 | 36.79 | 55.94 | 56.19 | 58.33 | 55.60 | 1.51 | +| 4 | 7.0 | 38.51 | 38.95 | 42.55 | 37.98 | 86.62 | 87.08 | 87.50 | 86.20 | 2.27 | +| 8 | 7.0 | 42.82 | 43.00 | 47.11 | 42.55 | 122.05 | 122.29 | 122.70 | 121.59 | 2.86 | +| 16 | 7.0 | 67.74 | 67.92 | 69.05 | 65.69 | 149.92 | 150.16 | 151.03 | 149.49 | 2.28 | +| 1 | 16.7 | 39.28 | 39.78 | 43.34 | 38.35 | 66.73 | 67.16 | 69.80 | 66.01 | 1.72 | +| 2 | 16.7 | 43.05 | 43.42 | 47.18 | 42.43 | 120.04 | 121.12 | 123.32 | 118.14 | 2.78 | +| 4 | 16.7 | 52.18 | 52.49 | 56.11 | 51.63 | 176.09 | 176.51 | 178.70 | 174.60 | 3.38 | +| 8 | 16.7 | 78.55 | 78.79 | 81.66 | 78.04 | 216.19 | 216.68 | 217.63 | 214.48 | 2.75 | +| 16 | 16.7 | 125.57 | 125.92 | 128.78 | 124.33 | 264.11 | 264.49 | 266.14 | 262.80 | 2.11 | To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above. 
##### Inference performance: NVIDIA T4 -| | |FP16 Latency (ms) Percentiles | | | | FP32 Latency (ms) Percentiles | | | | FP16/FP32 speed up | -|---:|-------------:|------:|------:|------:|------:|-------:|-------:|-------:|-------:|-----:| -| BS | Duration (s) | 90% | 95% | 99% | Avg | 90% | 95% | 99% | Avg | Avg | -| 1 | 2 | 64.13 | 65.25 | 76.11 | 59.08 | 94.69 | 98.23 | 109.86 | 89.00 | 1.51 | -| 2 | 2 | 67.59 | 70.77 | 84.06 | 57.47 | 103.88 | 105.37 | 114.59 | 93.30 | 1.62 | -| 4 | 2 | 75.19 | 81.05 | 87.01 | 65.79 | 120.73 | 128.29 | 146.83 | 112.96 | 1.72 | -| 8 | 2 | 74.15 | 77.69 | 84.96 | 62.77 | 161.97 | 163.46 | 170.25 | 153.07 | 2.44 | -| 16 | 2 | 100.62 | 105.08 | 113.00 | 82.06 | 216.18 | 217.92 | 222.46 | 188.57 | 2.30 | -| 1 | 7 | 77.88 | 79.61 | 81.90 | 70.22 | 110.37 | 113.93 | 121.39 | 107.17 | 1.53 | -| 2 | 7 | 81.09 | 83.94 | 87.28 | 78.06 | 148.30 | 151.21 | 158.55 | 141.26 | 1.81 | -| 4 | 7 | 99.85 | 100.83 | 104.24 | 96.81 | 229.94 | 232.34 | 238.11 | 225.43 | 2.33 | -| 8 | 7 | 147.38 | 150.37 | 153.66 | 142.64 | 394.26 | 396.35 | 398.89 | 390.77 | 2.74 | -| 16 | 7 | 280.32 | 281.37 | 282.74 | 278.01 | 484.20 | 485.74 | 499.89 | 482.67 | 1.74 | -| 1 | 16.7 | 76.97 | 79.78 | 81.61 | 75.55 | 171.45 | 176.90 | 179.18 | 167.95 | 2.22 | -| 2 | 16.7 | 96.48 | 99.42 | 101.21 | 92.74 | 276.12 | 278.67 | 282.06 | 270.05 | 2.91 | -| 4 | 16.7 | 129.63 | 131.67 | 134.42 | 124.55 | 522.23 | 524.79 | 527.32 | 509.75 | 4.09 | -| 8 | 16.7 | 209.64 | 211.36 | 214.66 | 204.83 | 706.84 | 709.21 | 715.57 | 697.97 | 3.41 | -| 16 | 16.7 | 342.23 | 344.62 | 350.84 | 337.42 | 848.02 | 849.83 | 858.22 | 834.38 | 2.47 | +| | | FP16 Latency (ms) Percentiles | | | | FP32 Latency (ms) Percentiles | | | | FP16/FP32 speed up | +|-----:|---------------:|-------:|-------:|-------:|-------:|-------:|-------:|-------:|-------:|------:| +| BS | Duration (s) | 90% | 95% | 99% | Avg | 90% | 95% | 99% | Avg | Avg | +| 1 | 2.0 | 43.62 | 46.95 | 50.46 | 37.23 | 51.31 | 52.37 | 56.21 | 49.77 | 1.34 | +| 2 | 2.0 | 49.09 | 50.46 | 53.11 | 40.61 | 81.85 | 82.22 | 83.94 | 80.81 | 1.99 | +| 4 | 2.0 | 47.71 | 51.14 | 55.09 | 41.29 | 112.56 | 115.13 | 118.56 | 111.60 | 2.70 | +| 8 | 2.0 | 51.37 | 53.11 | 55.48 | 45.94 | 198.95 | 199.48 | 200.28 | 197.22 | 4.29 | +| 16 | 2.0 | 63.59 | 64.30 | 66.90 | 61.77 | 221.75 | 222.07 | 223.22 | 220.09 | 3.56 | +| 1 | 7.0 | 47.49 | 48.66 | 53.36 | 40.76 | 73.63 | 74.41 | 77.65 | 72.41 | 1.78 | +| 2 | 7.0 | 48.63 | 50.01 | 58.35 | 43.44 | 114.66 | 115.28 | 117.63 | 112.41 | 2.59 | +| 4 | 7.0 | 52.19 | 52.85 | 54.22 | 49.94 | 200.38 | 201.29 | 202.97 | 197.21 | 3.95 | +| 8 | 7.0 | 84.90 | 85.56 | 87.52 | 83.41 | 404.00 | 404.72 | 405.70 | 400.25 | 4.80 | +| 16 | 7.0 | 157.12 | 157.58 | 159.19 | 155.01 | 490.93 | 492.09 | 493.44 | 486.45 | 3.14 | +| 1 | 16.7 | 50.57 | 51.57 | 57.58 | 46.27 | 150.39 | 151.84 | 153.54 | 147.31 | 3.18 | +| 2 | 16.7 | 63.64 | 64.55 | 66.31 | 61.98 | 256.54 | 258.16 | 262.71 | 250.34 | 4.04 | +| 4 | 16.7 | 140.44 | 141.06 | 142.00 | 138.14 | 519.59 | 521.41 | 523.86 | 512.74 | 3.71 | +| 8 | 16.7 | 267.03 | 268.06 | 270.01 | 263.15 | 727.33 | 728.61 | 731.36 | 722.62 | 2.75 | +| 16 | 16.7 | 362.40 | 364.02 | 367.80 | 358.75 | 867.92 | 869.19 | 871.46 | 860.37 | 2.40 | To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above. 
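The percentile and speed-up columns in the inference tables above follow directly from the raw per-iteration latency measurements. The snippet below is an illustrative sketch only (it is not part of the repository's benchmark scripts) and uses placeholder measurements; only the formulas are intended to match the tables: the 90/95/99% columns are latency percentiles, and the "FP16/FP32 speed up" column is the ratio of the average latencies.

```python
import numpy as np

def latency_stats(latencies_ms):
    """Percentile and average latency, as reported in the tables above."""
    lat = np.asarray(latencies_ms, dtype=np.float64)
    return {
        "90%": float(np.percentile(lat, 90)),
        "95%": float(np.percentile(lat, 95)),
        "99%": float(np.percentile(lat, 99)),
        "avg": float(lat.mean()),
    }

# Placeholder measurements; in practice these come from repeated runs of the
# inference benchmark at a fixed batch size and utterance duration.
fp16_ms = [357.1, 358.2, 359.0, 360.4, 362.4]
fp32_ms = [858.9, 859.7, 860.5, 861.2, 867.9]

fp16 = latency_stats(fp16_ms)
fp32 = latency_stats(fp32_ms)

# The "FP16/FP32 speed up" column is the ratio of average latencies,
# e.g. 860.37 / 358.75 ~= 2.40 for BS=16 and 16.7 s utterances on T4.
speedup = fp32["avg"] / fp16["avg"]
print(fp16, fp32, round(speedup, 2))
```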
## Release notes ### Changelog +February 2021 +* Added DALI data-processing pipeline for on-the-fly data processing and augmentation on CPU or GPU +* Revised training recipe: ~10% relative improvement in Word Error Rate (WER) +* Updated Triton scripts for compatibility with Triton V2 API, updated Triton inference results +* Refactored codebase +* Updated performance results for the PyTorch 20.10-py3 NGC container + June 2020 -- Updated performance tables to include A100 results +* Updated performance tables to include A100 results December 2019 * Inference support for TRT 6 with dynamic shapes diff --git a/PyTorch/SpeechRecognition/Jasper/parts/__init__.py b/PyTorch/SpeechRecognition/Jasper/common/__init__.py similarity index 100% rename from PyTorch/SpeechRecognition/Jasper/parts/__init__.py rename to PyTorch/SpeechRecognition/Jasper/common/__init__.py diff --git a/PyTorch/SpeechRecognition/Jasper/parts/segment.py b/PyTorch/SpeechRecognition/Jasper/common/audio.py similarity index 65% rename from PyTorch/SpeechRecognition/Jasper/parts/segment.py rename to PyTorch/SpeechRecognition/Jasper/common/audio.py index b0698394..916394f5 100644 --- a/PyTorch/SpeechRecognition/Jasper/parts/segment.py +++ b/PyTorch/SpeechRecognition/Jasper/common/audio.py @@ -12,13 +12,28 @@ # See the License for the specific language governing permissions and # limitations under the License. -import numpy as np -import librosa +import random import soundfile as sf +import librosa +import torch +import numpy as np + +import sox + + +def audio_from_file(file_path, offset=0, duration=0, trim=False, target_sr=16000): + audio = AudioSegment(file_path, target_sr=target_sr, int_values=False, + offset=offset, duration=duration, trim=trim) + + samples = torch.tensor(audio.samples, dtype=torch.float).cuda() + num_samples = torch.tensor(samples.shape[0]).int().cuda() + return (samples.unsqueeze(0), num_samples.unsqueeze(0)) + class AudioSegment(object): """Monaural audio segment abstraction. + :param samples: Audio samples [num_samples x num_channels]. :type samples: ndarray.float32 :param sample_rate: Audio sample rate. @@ -26,11 +41,30 @@ class AudioSegment(object): :raises TypeError: If the sample data type is not float or int. """ - def __init__(self, samples, sample_rate, target_sr=None, trim=False, - trim_db=60): + def __init__(self, filename, target_sr=None, int_values=False, offset=0, + duration=0, trim=False, trim_db=60): """Create audio segment from samples. + Samples are convert float32 internally, with int scaled to [-1, 1]. + Load a file supported by librosa and return as an AudioSegment. + :param filename: path of file to load + :param target_sr: the desired sample rate + :param int_values: if true, load samples as 32-bit integers + :param offset: offset in seconds when loading audio + :param duration: duration in seconds when loading audio + :return: numpy array of samples """ + with sf.SoundFile(filename, 'r') as f: + dtype = 'int32' if int_values else 'float32' + sample_rate = f.samplerate + if offset > 0: + f.seek(int(offset * sample_rate)) + if duration > 0: + samples = f.read(int(duration * sample_rate), dtype=dtype) + else: + samples = f.read(dtype=dtype) + samples = samples.transpose() + samples = self._convert_samples_to_float32(samples) if target_sr is not None and target_sr != sample_rate: samples = librosa.core.resample(samples, sample_rate, target_sr) @@ -67,6 +101,7 @@ class AudioSegment(object): @staticmethod def _convert_samples_to_float32(samples): """Convert sample type to float32. 
+ Audio sample type is usually integer or float-point. Integers will be scaled to [-1, 1] in float32. """ @@ -80,30 +115,6 @@ class AudioSegment(object): raise TypeError("Unsupported sample type: %s." % samples.dtype) return float32_samples - @classmethod - def from_file(cls, filename, target_sr=None, int_values=False, offset=0, - duration=0, trim=False): - """ - Load a file supported by librosa and return as an AudioSegment. - :param filename: path of file to load - :param target_sr: the desired sample rate - :param int_values: if true, load samples as 32-bit integers - :param offset: offset in seconds when loading audio - :param duration: duration in seconds when loading audio - :return: numpy array of samples - """ - with sf.SoundFile(filename, 'r') as f: - dtype = 'int32' if int_values else 'float32' - sample_rate = f.samplerate - if offset > 0: - f.seek(int(offset * sample_rate)) - if duration > 0: - samples = f.read(int(duration * sample_rate), dtype=dtype) - else: - samples = f.read(dtype=dtype) - samples = samples.transpose() - return cls(samples, sample_rate, target_sr=target_sr, trim=trim) - @property def samples(self): return self._samples.copy() @@ -129,9 +140,11 @@ class AudioSegment(object): self._samples *= 10. ** (gain / 20.) def pad(self, pad_size, symmetric=False): - """Add zero padding to the sample. The pad size is given in number of samples. - If symmetric=True, `pad_size` will be added to both sides. If false, `pad_size` - zeros will be added only to the end. + """Add zero padding to the sample. + + The pad size is given in number of samples. If symmetric=True, + `pad_size` will be added to both sides. If false, `pad_size` zeros + will be added only to the end. """ self._samples = np.pad(self._samples, (pad_size if symmetric else 0, pad_size), @@ -139,6 +152,7 @@ class AudioSegment(object): def subsegment(self, start_time=None, end_time=None): """Cut the AudioSegment between given boundaries. + Note that this is an in-place transformation. :param start_time: Beginning of subsegment in seconds. 
:type start_time: float @@ -168,3 +182,66 @@ class AudioSegment(object): start_sample = int(round(start_time * self._sample_rate)) end_sample = int(round(end_time * self._sample_rate)) self._samples = self._samples[start_sample:end_sample] + + +class Perturbation: + def __init__(self, p=0.1, rng=None): + self.p = p + self._rng = random.Random() if rng is None else rng + + def maybe_apply(self, segment, sample_rate=None): + if self._rng.random() < self.p: + self(segment, sample_rate) + + +class SpeedPerturbation(Perturbation): + def __init__(self, min_rate=0.85, max_rate=1.15, discrete=False, p=0.1, rng=None): + super(SpeedPerturbation, self).__init__(p, rng) + assert 0 < min_rate < max_rate + self.min_rate = min_rate + self.max_rate = max_rate + self.discrete = discrete + + def __call__(self, data, sample_rate): + if self.discrete: + rate = np.random.choice([self.min_rate, None, self.max_rate]) + else: + rate = self._rng.uniform(self.min_rate, self.max_rate) + + if rate is not None: + data._samples = sox.Transformer().speed(factor=rate).build_array( + input_array=data._samples, sample_rate_in=sample_rate) + + +class GainPerturbation(Perturbation): + def __init__(self, min_gain_dbfs=-10, max_gain_dbfs=10, p=0.1, rng=None): + super(GainPerturbation, self).__init__(p, rng) + self._rng = random.Random() if rng is None else rng + self._min_gain_dbfs = min_gain_dbfs + self._max_gain_dbfs = max_gain_dbfs + + def __call__(self, data, sample_rate=None): + del sample_rate + gain = self._rng.uniform(self._min_gain_dbfs, self._max_gain_dbfs) + data._samples = data._samples * (10. ** (gain / 20.)) + + +class ShiftPerturbation(Perturbation): + def __init__(self, min_shift_ms=-5.0, max_shift_ms=5.0, p=0.1, rng=None): + super(ShiftPerturbation, self).__init__(p, rng) + self._min_shift_ms = min_shift_ms + self._max_shift_ms = max_shift_ms + + def __call__(self, data, sample_rate): + shift_ms = self._rng.uniform(self._min_shift_ms, self._max_shift_ms) + if abs(shift_ms) / 1000 > data.duration: + # TODO: do something smarter than just ignore this condition + return + shift_samples = int(shift_ms * data.sample_rate // 1000) + # print("DEBUG: shift:", shift_samples) + if shift_samples < 0: + data._samples[-shift_samples:] = data._samples[:shift_samples] + data._samples[:-shift_samples] = 0 + elif shift_samples > 0: + data._samples[:-shift_samples] = data._samples[shift_samples:] + data._samples[-shift_samples:] = 0 diff --git a/PyTorch/SpeechRecognition/Jasper/common/dali/__init__.py b/PyTorch/SpeechRecognition/Jasper/common/dali/__init__.py new file mode 100644 index 00000000..ff800034 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/common/dali/__init__.py @@ -0,0 +1,13 @@ +# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
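The perturbation classes added to `common/audio.py` above are applied per sample during CPU-side data loading. The following is a minimal usage sketch, not part of the patch, assuming a hypothetical 16 kHz mono file named `sample.wav` that `soundfile` can read; it mirrors how `AudioDataset.__getitem__` in `common/dataset.py` (added later in this change) applies them, with each perturbation firing independently with probability `p`.

```python
# Minimal usage sketch (illustrative, not part of the patch).
from common.audio import (AudioSegment, GainPerturbation,
                          ShiftPerturbation, SpeedPerturbation)

sample_rate = 16000
# "sample.wav" is a hypothetical input file; offset/duration default to the whole file.
segment = AudioSegment("sample.wav", target_sr=sample_rate, trim=False)

perturbations = [
    SpeedPerturbation(min_rate=0.85, max_rate=1.15, p=0.1),        # sox-based resampling
    GainPerturbation(min_gain_dbfs=-10, max_gain_dbfs=10, p=0.1),  # random gain in dBFS
    ShiftPerturbation(min_shift_ms=-5.0, max_shift_ms=5.0, p=0.1), # small time shift
]

# Each perturbation modifies the segment in place with probability p.
for p in perturbations:
    p.maybe_apply(segment, sample_rate)

samples = segment.samples  # float32 numpy array, ready for feature extraction
```

Per the changelog above, the same on-the-fly augmentation is alternatively available on the GPU through the DALI pipeline introduced in the files that follow.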
diff --git a/PyTorch/SpeechRecognition/Jasper/common/dali/data_loader.py b/PyTorch/SpeechRecognition/Jasper/common/dali/data_loader.py new file mode 100644 index 00000000..fed9b192 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/common/dali/data_loader.py @@ -0,0 +1,158 @@ +# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import math +import numpy as np +import torch.distributed as dist +from .iterator import DaliJasperIterator, SyntheticDataIterator +from .pipeline import DaliPipeline +from common.helpers import print_once + + +def _parse_json(json_path: str, start_label=0, predicate=lambda json: True): + """ + Parses json file to the format required by DALI + Args: + json_path: path to json file + start_label: the label, starting from which DALI will assign consecutive int numbers to every transcript + predicate: function, that accepts a sample descriptor (i.e. json dictionary) as an argument. + If the predicate for a given sample returns True, it will be included in the dataset. + + Returns: + output_files: dictionary, that maps file name to label assigned by DALI + transcripts: dictionary, that maps label assigned by DALI to the transcript + """ + import json + global cnt + with open(json_path) as f: + librispeech_json = json.load(f) + output_files = {} + transcripts = {} + curr_label = start_label + for original_sample in librispeech_json: + if not predicate(original_sample): + continue + transcripts[curr_label] = original_sample['transcript'] + output_files[original_sample['files'][-1]['fname']] = curr_label + curr_label += 1 + return output_files, transcripts + + +def _dict_to_file(dict: dict, filename: str): + with open(filename, "w") as f: + for key, value in dict.items(): + f.write("{} {}\n".format(key, value)) + + +class DaliDataLoader: + """ + DataLoader is the main entry point to the data preprocessing pipeline. + To use, create an object and then just iterate over `data_iterator`. + DataLoader will do the rest for you. + Example: + data_layer = DataLoader(DaliTrainPipeline, path, json, bs, ngpu) + data_it = data_layer.data_iterator + for data in data_it: + print(data) # Here's your preprocessed data + + Args: + device_type: Which device to use for preprocessing. 
Choose: "cpu", "gpu" + pipeline_type: Choose: "train", "val", "synth" + """ + + def __init__(self, gpu_id, dataset_path: str, config_data: dict, config_features: dict, json_names: list, + symbols: list, batch_size: int, pipeline_type: str, grad_accumulation_steps: int = 1, + synth_iters_per_epoch: int = 544, device_type: str = "gpu"): + import torch + self.batch_size = batch_size + self.grad_accumulation_steps = grad_accumulation_steps + self.drop_last = (pipeline_type == 'train') + self.device_type = device_type + pipeline_type = self._parse_pipeline_type(pipeline_type) + if pipeline_type == "synth": + self._dali_data_iterator = self._init_synth_iterator(self.batch_size, config_features['nfilt'], + iters_per_epoch=synth_iters_per_epoch, + ngpus=torch.distributed.get_world_size()) + else: + self._dali_data_iterator = self._init_iterator(gpu_id=gpu_id, dataset_path=dataset_path, + config_data=config_data, + config_features=config_features, + json_names=json_names, symbols=symbols, + train_pipeline=pipeline_type == "train") + + def _init_iterator(self, gpu_id, dataset_path, config_data, config_features, json_names: list, symbols: list, + train_pipeline: bool): + """ + Returns data iterator. Data underneath this operator is preprocessed within Dali + """ + + def hash_list_of_strings(li): + return str(abs(hash(''.join(li)))) + + output_files, transcripts = {}, {} + max_duration = config_data['max_duration'] + for jname in json_names: + of, tr = _parse_json(jname if jname[0] == '/' else os.path.join(dataset_path, jname), len(output_files), + predicate=lambda json: json['original_duration'] <= max_duration) + output_files.update(of) + transcripts.update(tr) + file_list_path = os.path.join("/tmp", "jasper_dali.file_list." + hash_list_of_strings(json_names)) + _dict_to_file(output_files, file_list_path) + self.dataset_size = len(output_files) + print_once(f"Dataset read by DALI. Number of samples: {self.dataset_size}") + + pipeline = DaliPipeline.from_config(config_data=config_data, config_features=config_features, device_id=gpu_id, + file_root=dataset_path, file_list=file_list_path, + device_type=self.device_type, batch_size=self.batch_size, + train_pipeline=train_pipeline) + + return DaliJasperIterator([pipeline], transcripts=transcripts, symbols=symbols, batch_size=self.batch_size, + shard_size=self._shard_size(), train_iterator=train_pipeline) + + def _init_synth_iterator(self, batch_size, nfeatures, iters_per_epoch, ngpus): + self.dataset_size = ngpus * iters_per_epoch * batch_size + return SyntheticDataIterator(batch_size, nfeatures, regenerate=True) + + @staticmethod + def _parse_pipeline_type(pipeline_type): + pipe = pipeline_type.lower() + assert pipe in ("train", "val", "synth"), 'Invalid pipeline type (choices: "train", "val", "synth").' + return pipe + + def _shard_size(self): + """ + Total number of samples handled by a single GPU in a single epoch. + """ + world_size = dist.get_world_size() if dist.is_initialized() else 1 + if self.drop_last: + divisor = world_size * self.batch_size * self.grad_accumulation_steps + return self.dataset_size // divisor * divisor // world_size + else: + return int(math.ceil(self.dataset_size / world_size)) + + def __len__(self): + """ + Number of batches handled by each GPU. 
+ """ + if self.drop_last: + assert self._shard_size() % self.batch_size == 0, f'{self._shard_size()} {self.batch_size}' + + return int(math.ceil(self._shard_size() / self.batch_size)) + + def data_iterator(self): + return self._dali_data_iterator + + def __iter__(self): + return self._dali_data_iterator diff --git a/PyTorch/SpeechRecognition/Jasper/common/dali/iterator.py b/PyTorch/SpeechRecognition/Jasper/common/dali/iterator.py new file mode 100644 index 00000000..d93b2958 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/common/dali/iterator.py @@ -0,0 +1,162 @@ +# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import torch +import torch.distributed as dist +import numpy as np +from common.helpers import print_once +from common.text import _clean_text, punctuation_map + + +def normalize_string(s, symbols, punct_map): + """ + Normalizes string. + Example: + 'call me at 8:00 pm!' -> 'call me at eight zero pm' + """ + labels = set(symbols) + try: + text = _clean_text(s, ["english_cleaners"], punct_map).strip() + return ''.join([tok for tok in text if all(t in labels for t in tok)]) + except Exception as e: + print_once("WARNING: Normalizing failed: {s} {e}") + + +class DaliJasperIterator(object): + """ + Returns batches of data for Jasper training: + preprocessed_signal, preprocessed_signal_length, transcript, transcript_length + + This iterator is not meant to be the entry point to Dali processing pipeline. + Use DataLoader instead. + """ + + def __init__(self, dali_pipelines, transcripts, symbols, batch_size, shard_size, train_iterator: bool): + self.transcripts = transcripts + self.symbols = symbols + self.batch_size = batch_size + from nvidia.dali.plugin.pytorch import DALIGenericIterator + from nvidia.dali.plugin.base_iterator import LastBatchPolicy + + # in train pipeline shard_size is set to divisable by batch_size, so PARTIAL policy is safe + self.dali_it = DALIGenericIterator( + dali_pipelines, ["audio", "label", "audio_shape"], size=shard_size, + dynamic_shape=True, auto_reset=True, last_batch_padded=True, + last_batch_policy=LastBatchPolicy.PARTIAL) + + @staticmethod + def _str2list(s: str): + """ + Returns list of floats, that represents given string. + '0.' denotes separator + '1.' denotes 'a' + '27.' denotes "'" + Assumes, that the string is lower case. + """ + list = [] + for c in s: + if c == "'": + list.append(27.) + else: + list.append(max(0., ord(c) - 96.)) + return list + + @staticmethod + def _pad_lists(lists: list, pad_val=0): + """ + Pads lists, so that all have the same size. 
+ Returns list with actual sizes of corresponding input lists + """ + max_length = 0 + sizes = [] + for li in lists: + sizes.append(len(li)) + max_length = max_length if len(li) < max_length else len(li) + for li in lists: + li += [pad_val] * (max_length - len(li)) + return sizes + + def _gen_transcripts(self, labels, normalize_transcripts: bool = True): + """ + Generate transcripts in format expected by NN + """ + lists = [ + self._str2list(normalize_string(self.transcripts[lab.item()], self.symbols, punctuation_map(self.symbols))) + for lab in labels + ] if normalize_transcripts else [self._str2list(self.transcripts[lab.item()]) for lab in labels] + sizes = self._pad_lists(lists) + return torch.tensor(lists).cuda(), torch.tensor(sizes, dtype=torch.int32).cuda() + + def __next__(self): + data = self.dali_it.__next__() + transcripts, transcripts_lengths = self._gen_transcripts(data[0]["label"]) + return data[0]["audio"], data[0]["audio_shape"][:, 1], transcripts, transcripts_lengths + + def next(self): + return self.__next__() + + def __iter__(self): + return self + + +# TODO: refactor +class SyntheticDataIterator(object): + def __init__(self, batch_size, nfeatures, feat_min=-5., feat_max=0., txt_min=0., txt_max=23., feat_lens_max=1760, + txt_lens_max=231, regenerate=False): + """ + Args: + batch_size + nfeatures: number of features for melfbanks + feat_min: minimum value in `feat` tensor, used for randomization + feat_max: maximum value in `feat` tensor, used for randomization + txt_min: minimum value in `txt` tensor, used for randomization + txt_max: maximum value in `txt` tensor, used for randomization + regenerate: If True, regenerate random tensors for every iterator step. + If False, generate them only at start. + """ + self.batch_size = batch_size + self.nfeatures = nfeatures + self.feat_min = feat_min + self.feat_max = feat_max + self.feat_lens_max = feat_lens_max + self.txt_min = txt_min + self.txt_max = txt_max + self.txt_lens_max = txt_lens_max + self.regenerate = regenerate + + if not self.regenerate: + self.feat, self.feat_lens, self.txt, self.txt_lens = self._generate_sample() + + def _generate_sample(self): + feat = (self.feat_max - self.feat_min) * np.random.random_sample( + (self.batch_size, self.nfeatures, self.feat_lens_max)) + self.feat_min + feat_lens = np.random.randint(0, int(self.feat_lens_max) - 1, size=self.batch_size) + txt = (self.txt_max - self.txt_min) * np.random.random_sample( + (self.batch_size, self.txt_lens_max)) + self.txt_min + txt_lens = np.random.randint(0, int(self.txt_lens_max) - 1, size=self.batch_size) + return torch.Tensor(feat).cuda(), \ + torch.Tensor(feat_lens).cuda(), \ + torch.Tensor(txt).cuda(), \ + torch.Tensor(txt_lens).cuda() + + def __next__(self): + if self.regenerate: + return self._generate_sample() + return self.feat, self.feat_lens, self.txt, self.txt_lens + + def next(self): + return self.__next__() + + def __iter__(self): + return self diff --git a/PyTorch/SpeechRecognition/Jasper/common/dali/pipeline.py b/PyTorch/SpeechRecognition/Jasper/common/dali/pipeline.py new file mode 100644 index 00000000..2150052d --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/common/dali/pipeline.py @@ -0,0 +1,397 @@ +# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import nvidia.dali +import nvidia.dali.ops as ops +import nvidia.dali.types as types +import multiprocessing +import numpy as np +import torch +import math +import itertools + + +class DaliPipeline(nvidia.dali.pipeline.Pipeline): + def __init__(self, *, + train_pipeline: bool, # True if train pipeline, False if validation pipeline + device_id, + num_threads, + batch_size, + file_root: str, + file_list: str, + sample_rate, + discrete_resample_range: bool, + resample_range: list, + window_size, + window_stride, + nfeatures, + nfft, + frame_splicing_factor, + dither_coeff, + silence_threshold, + preemph_coeff, + pad_align, + max_duration, + mask_time_num_regions, + mask_time_min, + mask_time_max, + mask_freq_num_regions, + mask_freq_min, + mask_freq_max, + mask_both_num_regions, + mask_both_min_time, + mask_both_max_time, + mask_both_min_freq, + mask_both_max_freq, + preprocessing_device="gpu"): + super().__init__(batch_size, num_threads, device_id) + + self._dali_init_log(locals()) + + if torch.distributed.is_initialized(): + shard_id = torch.distributed.get_rank() + n_shards = torch.distributed.get_world_size() + else: + shard_id = 0 + n_shards = 1 + + self.preprocessing_device = preprocessing_device.lower() + assert self.preprocessing_device == "cpu" or self.preprocessing_device == "gpu", \ + "Incorrect preprocessing device. Please choose either 'cpu' or 'gpu'" + self.frame_splicing_factor = frame_splicing_factor + assert frame_splicing_factor == 1, "DALI doesn't support frame splicing operation" + + self.resample_range = resample_range + self.discrete_resample_range = discrete_resample_range + + self.train = train_pipeline + self.sample_rate = sample_rate + self.dither_coeff = dither_coeff + self.nfeatures = nfeatures + self.max_duration = max_duration + self.mask_params = { + 'time_num_regions': mask_time_num_regions, + 'time_min': mask_time_min, + 'time_max': mask_time_max, + 'freq_num_regions': mask_freq_num_regions, + 'freq_min': mask_freq_min, + 'freq_max': mask_freq_max, + 'both_num_regions': mask_both_num_regions, + 'both_min_time': mask_both_min_time, + 'both_max_time': mask_both_max_time, + 'both_min_freq': mask_both_min_freq, + 'both_max_freq': mask_both_max_freq, + } + self.do_remove_silence = True if silence_threshold is not None else False + + self.read = ops.FileReader(device="cpu", file_root=file_root, file_list=file_list, shard_id=shard_id, + num_shards=n_shards, shuffle_after_epoch=train_pipeline) + + # TODO change ExternalSource to Uniform for new DALI release + if discrete_resample_range and resample_range is not None: + self.speed_perturbation_coeffs = ops.ExternalSource(device="cpu", cycle=True, + source=self._discrete_resample_coeffs_generator) + elif resample_range is not None: + self.speed_perturbation_coeffs = ops.Uniform(device="cpu", range=resample_range) + else: + self.speed_perturbation_coeffs = None + + self.decode = ops.AudioDecoder(device="cpu", sample_rate=self.sample_rate if resample_range is None else None, + dtype=types.FLOAT, downmix=True) + + self.normal_distribution = ops.NormalDistribution(device=preprocessing_device) + + self.preemph = 
ops.PreemphasisFilter(device=preprocessing_device, preemph_coeff=preemph_coeff) + + self.spectrogram = ops.Spectrogram(device=preprocessing_device, nfft=nfft, + window_length=window_size * sample_rate, + window_step=window_stride * sample_rate) + + self.mel_fbank = ops.MelFilterBank(device=preprocessing_device, sample_rate=sample_rate, nfilter=self.nfeatures, + normalize=True) + + self.log_features = ops.ToDecibels(device=preprocessing_device, multiplier=np.log(10), reference=1.0, + cutoff_db=math.log(1e-20)) + + self.get_shape = ops.Shapes(device=preprocessing_device) + + self.normalize = ops.Normalize(device=preprocessing_device, axes=[1]) + + self.pad = ops.Pad(device=preprocessing_device, axes=[1], fill_value=0, align=pad_align) + + # Silence trimming + self.get_nonsilent_region = ops.NonsilentRegion(device="cpu", cutoff_db=silence_threshold) + self.trim_silence = ops.Slice(device="cpu", normalized_anchor=False, normalized_shape=False, axes=[0]) + self.to_float = ops.Cast(device="cpu", dtype=types.FLOAT) + + # Spectrogram masking + self.spectrogram_cutouts = ops.ExternalSource(source=self._cutouts_generator, num_outputs=2, cycle=True) + self.mask_spectrogram = ops.Erase(device=preprocessing_device, axes=[0, 1], fill_value=0, + normalized_anchor=True) + + @classmethod + def from_config(cls, train_pipeline: bool, device_id, batch_size, file_root: str, file_list: str, config_data: dict, + config_features: dict, device_type: str = "gpu", do_resampling: bool = True, + num_cpu_threads=multiprocessing.cpu_count()): + + max_duration = config_data['max_duration'] + sample_rate = config_data['sample_rate'] + silence_threshold = -60 if config_data['trim_silence'] else None + + # TODO Take into account resampling probablity + # TODO config_features['speed_perturbation']['p'] + + if do_resampling and config_data['speed_perturbation'] is not None: + resample_range = [config_data['speed_perturbation']['min_rate'], + config_data['speed_perturbation']['max_rate']] + discrete_resample_range = config_data['speed_perturbation']['discrete'] + else: + resample_range = None + discrete_resample_range = False + + window_size = config_features['window_size'] + window_stride = config_features['window_stride'] + nfeatures = config_features['n_filt'] + nfft = config_features['n_fft'] + frame_splicing_factor = config_features['frame_splicing'] + dither_coeff = config_features['dither'] + pad_align = config_features['pad_align'] + pad_to_max_duration = config_features['pad_to_max_duration'] + assert not pad_to_max_duration, "Padding to max duration currently not supported in DALI" + preemph_coeff = .97 + + config_spec = config_features['spec_augment'] + if config_spec is not None: + mask_time_num_regions = config_spec['time_masks'] + mask_time_min = config_spec['min_time'] + mask_time_max = config_spec['max_time'] + mask_freq_num_regions = config_spec['freq_masks'] + mask_freq_min = config_spec['min_freq'] + mask_freq_max = config_spec['max_freq'] + else: + mask_time_num_regions = 0 + mask_time_min = 0 + mask_time_max = 0 + mask_freq_num_regions = 0 + mask_freq_min = 0 + mask_freq_max = 0 + + config_cutout = config_features['cutout_augment'] + if config_cutout is not None: + mask_both_num_regions = config_cutout['masks'] + mask_both_min_time = config_cutout['min_time'] + mask_both_max_time = config_cutout['max_time'] + mask_both_min_freq = config_cutout['min_freq'] + mask_both_max_freq = config_cutout['max_freq'] + else: + mask_both_num_regions = 0 + mask_both_min_time = 0 + mask_both_max_time = 0 + 
mask_both_min_freq = 0 + mask_both_max_freq = 0 + + return cls(train_pipeline=train_pipeline, + device_id=device_id, + preprocessing_device=device_type, + num_threads=num_cpu_threads, + batch_size=batch_size, + file_root=file_root, + file_list=file_list, + sample_rate=sample_rate, + discrete_resample_range=discrete_resample_range, + resample_range=resample_range, + window_size=window_size, + window_stride=window_stride, + nfeatures=nfeatures, + nfft=nfft, + frame_splicing_factor=frame_splicing_factor, + dither_coeff=dither_coeff, + silence_threshold=silence_threshold, + preemph_coeff=preemph_coeff, + pad_align=pad_align, + max_duration=max_duration, + mask_time_num_regions=mask_time_num_regions, + mask_time_min=mask_time_min, + mask_time_max=mask_time_max, + mask_freq_num_regions=mask_freq_num_regions, + mask_freq_min=mask_freq_min, + mask_freq_max=mask_freq_max, + mask_both_num_regions=mask_both_num_regions, + mask_both_min_time=mask_both_min_time, + mask_both_max_time=mask_both_max_time, + mask_both_min_freq=mask_both_min_freq, + mask_both_max_freq=mask_both_max_freq) + + @staticmethod + def _dali_init_log(args: dict): + if (not torch.distributed.is_initialized() or ( + torch.distributed.is_initialized() and torch.distributed.get_rank() == 0)): # print once + max_len = max([len(ii) for ii in args.keys()]) + fmt_string = '\t%' + str(max_len) + 's : %s' + print('Initializing DALI with parameters:') + for keyPair in sorted(args.items()): + print(fmt_string % keyPair) + + @staticmethod + def _div_ceil(dividend, divisor): + return (dividend + (divisor - 1)) // divisor + + def _get_audio_len(self, inp): + return self.get_shape(inp) if self.frame_splicing_factor == 1 else \ + self._div_ceil(self.get_shape(inp), self.frame_splicing_factor) + + def _remove_silence(self, inp): + begin, length = self.get_nonsilent_region(inp) + out = self.trim_silence(inp, self.to_float(begin), self.to_float(length)) + return out + + def _do_spectrogram_masking(self): + return self.mask_params['time_num_regions'] > 0 or self.mask_params['freq_num_regions'] > 0 or \ + self.mask_params['both_num_regions'] > 0 + + @staticmethod + def _interleave_lists(*lists): + """ + [*, **, ***], [1, 2, 3], [a, b, c] -> [*, 1, a, **, 2, b, ***, 3, c] + Returns: + iterator over interleaved list + """ + assert all((len(lists[0]) == len(test_l) for test_l in lists)), "All lists have to have the same length" + return itertools.chain(*zip(*lists)) + + def _generate_cutouts(self): + """ + Returns: + Generates anchors and shapes of the cutout regions. + Single call generates one batch of data. + The output shall be passed to DALI's Erase operator + anchors = [f0 t0 f1 t1 ...] + shapes = [f0w t0h f1w t1h ...] + """ + MAX_TIME_DIMENSION = 20 * 16000 + freq_anchors = np.random.random(self.mask_params['freq_num_regions']) + time_anchors = np.random.random(self.mask_params['time_num_regions']) + both_anchors_freq = np.random.random(self.mask_params['both_num_regions']) + both_anchors_time = np.random.random(self.mask_params['both_num_regions']) + anchors = [] + for anch in freq_anchors: + anchors.extend([anch, 0]) + for anch in time_anchors: + anchors.extend([0, anch]) + for t, f in zip(both_anchors_time, both_anchors_freq): + anchors.extend([f, t]) + + shapes = [] + shapes.extend( + self._interleave_lists( + np.random.randint(self.mask_params['freq_min'], self.mask_params['freq_max'] + 1, + self.mask_params['freq_num_regions']), + # XXX: Here, a time dimension of the spectrogram shall be passed. 
+ # However, in DALI ArgumentInput can't come from GPU. + # So we leave the job for Erase (masking operator) to get it together. + [int(MAX_TIME_DIMENSION)] * self.mask_params['freq_num_regions'] + ) + ) + shapes.extend( + self._interleave_lists( + [self.nfeatures] * self.mask_params['time_num_regions'], + np.random.randint(self.mask_params['time_min'], self.mask_params['time_max'] + 1, + self.mask_params['time_num_regions']) + ) + ) + shapes.extend( + self._interleave_lists( + np.random.randint(self.mask_params['both_min_freq'], self.mask_params['both_max_freq'] + 1, + self.mask_params['both_num_regions']), + np.random.randint(self.mask_params['both_min_time'], self.mask_params['both_max_time'] + 1, + self.mask_params['both_num_regions']) + ) + ) + return anchors, shapes + + def _discrete_resample_coeffs_generator(self): + """ + Generate resample coeffs from discrete set + """ + yield np.random.choice([self.resample_range[0], 1.0, self.resample_range[1]], + size=self.batch_size).astype('float32') + + def _cutouts_generator(self): + """ + Generator, that wraps cutouts creation in order to randomize inputs + and allow passing them to DALI's ExternalSource operator + """ + + def tuples2list(tuples: list): + """ + [(a, b), (c, d)] -> [[a, c], [b, d]] + """ + return map(list, zip(*tuples)) + + [anchors, shapes] = tuples2list([self._generate_cutouts() for _ in range(self.batch_size)]) + yield np.array(anchors, dtype=np.float32), np.array(shapes, dtype=np.float32) + + def define_graph(self): + audio, label = self.read() + if not self.train or self.speed_perturbation_coeffs is None: + audio, sr = self.decode(audio) + else: + resample_coeffs = self.speed_perturbation_coeffs() * self.sample_rate + audio, sr = self.decode(audio, sample_rate=resample_coeffs) + + if self.do_remove_silence: + audio = self._remove_silence(audio) + + # Max duration drop is performed at DataLayer stage + + if self.preprocessing_device == "gpu": + audio = audio.gpu() + + if self.dither_coeff != 0.: + audio = audio + self.normal_distribution(audio) * self.dither_coeff + + audio = self.preemph(audio) + + audio = self.spectrogram(audio) + audio = self.mel_fbank(audio) + audio = self.log_features(audio) + + audio_len = self._get_audio_len(audio) + + audio = self.normalize(audio) + audio = self.pad(audio) + + if self.train and self._do_spectrogram_masking(): + anchors, shapes = self.spectrogram_cutouts() + audio = self.mask_spectrogram(audio, anchor=anchors, shape=shapes) + + # When modifying DALI pipeline returns, make sure you update `output_map` in DALIGenericIterator invocation + return audio.gpu(), label.gpu(), audio_len.gpu() + + +class DaliTritonPipeline(DaliPipeline): + def __init__(self, **kwargs): + super().__init__(**kwargs) + assert not kwargs['train_pipeline'], "Pipeline for Triton shall be a validation pipeline" + if torch.distributed.is_initialized(): + raise RuntimeError( + "You're creating Triton pipeline, using multi-process mode. 
Please use single-process mode.") + self.read = ops.ExternalSource(name="DALI_INPUT_0", no_copy=True, device="cpu") + + +def serialize_dali_triton_pipeline(output_path: str, config_data: dict, config_features: dict): + pipe = DaliTritonPipeline.from_config(train_pipeline=False, device_id=-1, batch_size=-1, file_root=None, + file_list=None, config_data=config_data, config_features=config_features, + do_resampling=False, num_cpu_threads=-1) + pipe.serialize(filename=output_path) diff --git a/PyTorch/SpeechRecognition/Jasper/common/dataset.py b/PyTorch/SpeechRecognition/Jasper/common/dataset.py new file mode 100644 index 00000000..bb00d33e --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/common/dataset.py @@ -0,0 +1,234 @@ +# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import json +from pathlib import Path + +import numpy as np + +import torch +from torch.utils.data import Dataset, DataLoader +from torch.utils.data.distributed import DistributedSampler + +from .audio import (audio_from_file, AudioSegment, GainPerturbation, + ShiftPerturbation, SpeedPerturbation) +from .text import _clean_text, punctuation_map + + +def normalize_string(s, labels, punct_map): + """Normalizes string. + + Example: + 'call me at 8:00 pm!' -> 'call me at eight zero pm' + """ + labels = set(labels) + try: + text = _clean_text(s, ["english_cleaners"], punct_map).strip() + return ''.join([tok for tok in text if all(t in labels for t in tok)]) + except: + print(f"WARNING: Normalizing failed: {s}") + return None + + +class FilelistDataset(Dataset): + def __init__(self, filelist_fpath): + self.samples = [line.strip() for line in open(filelist_fpath, 'r')] + + def __len__(self): + return len(self.samples) + + def __getitem__(self, index): + audio, audio_len = audio_from_file(self.samples[index]) + return (audio.squeeze(0), audio_len, torch.LongTensor([0]), + torch.LongTensor([0])) + + +class SingleAudioDataset(FilelistDataset): + def __init__(self, audio_fpath): + self.samples = [audio_fpath] + + +class AudioDataset(Dataset): + def __init__(self, data_dir, manifest_fpaths, labels, + sample_rate=16000, min_duration=0.1, max_duration=float("inf"), + pad_to_max_duration=False, max_utts=0, normalize_transcripts=True, + sort_by_duration=False, trim_silence=False, + speed_perturbation=None, gain_perturbation=None, + shift_perturbation=None, ignore_offline_speed_perturbation=False): + """Loads audio, transcript and durations listed in a .json file. + + Args: + data_dir: absolute path to dataset folder + manifest_filepath: relative path from dataset folder + to manifest json as described above. Can be coma-separated paths. 
+ labels (str): all possible output symbols + min_duration (int): skip audio shorter than threshold + max_duration (int): skip audio longer than threshold + pad_to_max_duration (bool): pad all sequences to max_duration + max_utts (int): limit number of utterances + normalize_transcripts (bool): normalize transcript text + sort_by_duration (bool): sort sequences by increasing duration + trim_silence (bool): trim leading and trailing silence from audio + ignore_offline_speed_perturbation (bool): use precomputed speed perturbation + + Returns: + tuple of Tensors + """ + self.data_dir = data_dir + self.labels = labels + self.labels_map = dict([(labels[i], i) for i in range(len(labels))]) + self.punctuation_map = punctuation_map(labels) + self.blank_index = len(labels) + + self.pad_to_max_duration = pad_to_max_duration + + self.sort_by_duration = sort_by_duration + self.max_utts = max_utts + self.normalize_transcripts = normalize_transcripts + self.ignore_offline_speed_perturbation = ignore_offline_speed_perturbation + + self.min_duration = min_duration + self.max_duration = max_duration + self.trim_silence = trim_silence + self.sample_rate = sample_rate + + perturbations = [] + if speed_perturbation is not None: + perturbations.append(SpeedPerturbation(**speed_perturbation)) + if gain_perturbation is not None: + perturbations.append(GainPerturbation(**gain_perturbation)) + if shift_perturbation is not None: + perturbations.append(ShiftPerturbation(**shift_perturbation)) + self.perturbations = perturbations + + self.max_duration = max_duration + + self.samples = [] + self.duration = 0.0 + self.duration_filtered = 0.0 + + for fpath in manifest_fpaths: + self._load_json_manifest(fpath) + + if sort_by_duration: + self.samples = sorted(self.samples, key=lambda s: s['duration']) + + def __getitem__(self, index): + s = self.samples[index] + rn_indx = np.random.randint(len(s['audio_filepath'])) + duration = s['audio_duration'][rn_indx] if 'audio_duration' in s else 0 + offset = s.get('offset', 0) + + segment = AudioSegment( + s['audio_filepath'][rn_indx], target_sr=self.sample_rate, + offset=offset, duration=duration, trim=self.trim_silence) + + for p in self.perturbations: + p.maybe_apply(segment, self.sample_rate) + + segment = torch.FloatTensor(segment.samples) + + return (segment, + torch.tensor(segment.shape[0]).int(), + torch.tensor(s["transcript"]), + torch.tensor(len(s["transcript"])).int()) + + def __len__(self): + return len(self.samples) + + def _load_json_manifest(self, fpath): + for s in json.load(open(fpath, "r", encoding="utf-8")): + + if self.pad_to_max_duration and not self.ignore_offline_speed_perturbation: + # require all perturbed samples to be < self.max_duration + s_max_duration = max(f['duration'] for f in s['files']) + else: + # otherwise we allow perturbances to be > self.max_duration + s_max_duration = s['original_duration'] + + s['duration'] = s.pop('original_duration') + if not (self.min_duration <= s_max_duration <= self.max_duration): + self.duration_filtered += s['duration'] + continue + + # Prune and normalize according to transcript + tr = (s.get('transcript', None) or + self.load_transcript(s['text_filepath'])) + + if not isinstance(tr, str): + print(f'WARNING: Skipped sample (transcript not a str): {tr}.') + self.duration_filtered += s['duration'] + continue + + if self.normalize_transcripts: + tr = normalize_string(tr, self.labels, self.punctuation_map) + + s["transcript"] = self.to_vocab_inds(tr) + + files = s.pop('files') + if 
self.ignore_offline_speed_perturbation: + files = [f for f in files if f['speed'] == 1.0] + + s['audio_duration'] = [f['duration'] for f in files] + s['audio_filepath'] = [str(Path(self.data_dir, f['fname'])) + for f in files] + self.samples.append(s) + self.duration += s['duration'] + + if self.max_utts > 0 and len(self.samples) >= self.max_utts: + print(f'Reached max_utts={self.max_utts}. Finished parsing {fpath}.') + break + + def load_transcript(self, transcript_path): + with open(transcript_path, 'r', encoding="utf-8") as transcript_file: + transcript = transcript_file.read().replace('\n', '') + return transcript + + def to_vocab_inds(self, transcript): + chars = [self.labels_map.get(x, self.blank_index) for x in list(transcript)] + transcript = list(filter(lambda x: x != self.blank_index, chars)) + return transcript + + +def collate_fn(batch): + bs = len(batch) + max_len = lambda l, idx: max(el[idx].size(0) for el in l) + audio = torch.zeros(bs, max_len(batch, 0)) + audio_lens = torch.zeros(bs, dtype=torch.int32) + transcript = torch.zeros(bs, max_len(batch, 2)) + transcript_lens = torch.zeros(bs, dtype=torch.int32) + + for i, sample in enumerate(batch): + audio[i].narrow(0, 0, sample[0].size(0)).copy_(sample[0]) + audio_lens[i] = sample[1] + transcript[i].narrow(0, 0, sample[2].size(0)).copy_(sample[2]) + transcript_lens[i] = sample[3] + return audio, audio_lens, transcript, transcript_lens + + +def get_data_loader(dataset, batch_size, multi_gpu=True, shuffle=True, + drop_last=True, num_workers=4): + + kw = {'dataset': dataset, 'collate_fn': collate_fn, + 'num_workers': num_workers, 'pin_memory': True} + + if multi_gpu: + loader_shuffle = False + sampler = DistributedSampler(dataset, shuffle=shuffle) + else: + loader_shuffle = shuffle + sampler = None + + return DataLoader(batch_size=batch_size, drop_last=drop_last, + sampler=sampler, shuffle=loader_shuffle, **kw) diff --git a/PyTorch/SpeechRecognition/Jasper/common/features.py b/PyTorch/SpeechRecognition/Jasper/common/features.py new file mode 100644 index 00000000..b2ef126c --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/common/features.py @@ -0,0 +1,293 @@ +import math +import random + +import librosa +import torch +import torch.nn as nn + +from apex import amp + + +class BaseFeatures(nn.Module): + """Base class for GPU accelerated audio preprocessing.""" + __constants__ = ["pad_align", "pad_to_max_duration", "max_len"] + + def __init__(self, pad_align, pad_to_max_duration, max_duration, + sample_rate, window_size, window_stride, spec_augment=None, + cutout_augment=None): + super(BaseFeatures, self).__init__() + + self.pad_align = pad_align + self.pad_to_max_duration = pad_to_max_duration + self.win_length = int(sample_rate * window_size) # frame size + self.hop_length = int(sample_rate * window_stride) + + # Calculate maximum sequence length (# frames) + if pad_to_max_duration: + self.max_len = 1 + math.ceil( + (max_duration * sample_rate - self.win_length) / self.hop_length + ) + + if spec_augment is not None: + self.spec_augment = SpecAugment(**spec_augment) + else: + self.spec_augment = None + + if cutout_augment is not None: + self.cutout_augment = CutoutAugment(**cutout_augment) + else: + self.cutout_augment = None + + @torch.no_grad() + def calculate_features(self, audio, audio_lens): + return audio, audio_lens + + def __call__(self, audio, audio_lens, optim_level=0): + dtype = audio.dtype + audio = audio.float() + if optim_level == 1: + with amp.disable_casts(): + feat, feat_lens = self.calculate_features(audio, 
audio_lens) + else: + feat, feat_lens = self.calculate_features(audio, audio_lens) + + feat = self.apply_padding(feat) + + if self.cutout_augment is not None: + feat = self.cutout_augment(feat) + + if self.spec_augment is not None: + feat = self.spec_augment(feat) + + feat = feat.to(dtype) + return feat, feat_lens + + def apply_padding(self, x): + if self.pad_to_max_duration: + x_size = max(x.size(-1), self.max_len) + else: + x_size = x.size(-1) + + if self.pad_align > 0: + pad_amt = x_size % self.pad_align + else: + pad_amt = 0 + + padded_len = x_size + (self.pad_align - pad_amt if pad_amt > 0 else 0) + return nn.functional.pad(x, (0, padded_len - x.size(-1))) + + +class SpecAugment(nn.Module): + """Spec augment. refer to https://arxiv.org/abs/1904.08779 + """ + def __init__(self, freq_masks=0, min_freq=0, max_freq=10, time_masks=0, + min_time=0, max_time=10): + super(SpecAugment, self).__init__() + assert 0 <= min_freq <= max_freq + assert 0 <= min_time <= max_time + + self.freq_masks = freq_masks + self.min_freq = min_freq + self.max_freq = max_freq + + self.time_masks = time_masks + self.min_time = min_time + self.max_time = max_time + + @torch.no_grad() + def forward(self, x): + sh = x.shape + mask = torch.zeros(x.shape, dtype=torch.bool, device=x.device) + + for idx in range(sh[0]): + for _ in range(self.freq_masks): + w = torch.randint(self.min_freq, self.max_freq + 1, size=(1,)).item() + f0 = torch.randint(0, max(1, sh[1] - w), size=(1,)) + mask[idx, f0:f0+w] = 1 + + for _ in range(self.time_masks): + w = torch.randint(self.min_time, self.max_time + 1, size=(1,)).item() + t0 = torch.randint(0, max(1, sh[2] - w), size=(1,)) + mask[idx, :, t0:t0+w] = 1 + + return x.masked_fill(mask, 0) + + +class CutoutAugment(nn.Module): + """Cutout. refer to https://arxiv.org/pdf/1708.04552.pdf + """ + def __init__(self, masks=0, min_freq=20, max_freq=20, min_time=5, max_time=5): + super(CutoutAugment, self).__init__() + assert 0 <= min_freq <= max_freq + assert 0 <= min_time <= max_time + + self.masks = masks + self.min_freq = min_freq + self.max_freq = max_freq + self.min_time = min_time + self.max_time = max_time + + @torch.no_grad() + def forward(self, x): + sh = x.shape + mask = torch.zeros(x.shape, dtype=torch.bool, device=x.device) + + for idx in range(sh[0]): + for i in range(self.masks): + + w = torch.randint(self.min_freq, self.max_freq + 1, size=(1,)).item() + h = torch.randint(self.min_time, self.max_time + 1, size=(1,)).item() + + f0 = int(random.uniform(0, sh[1] - w)) + t0 = int(random.uniform(0, sh[2] - h)) + + mask[idx, f0:f0+w, t0:t0+h] = 1 + + return x.masked_fill(mask, 0) + + +@torch.jit.script +def normalize_batch(x, seq_len, normalize_type: str): +# print ("normalize_batch: x, seq_len, shapes: ", x.shape, seq_len, seq_len.shape) + if normalize_type == "per_feature": + x_mean = torch.zeros((seq_len.shape[0], x.shape[1]), dtype=x.dtype, + device=x.device) + x_std = torch.zeros((seq_len.shape[0], x.shape[1]), dtype=x.dtype, + device=x.device) + for i in range(x.shape[0]): + x_mean[i, :] = x[i, :, :seq_len[i]].mean(dim=1) + x_std[i, :] = x[i, :, :seq_len[i]].std(dim=1) + # make sure x_std is not zero + x_std += 1e-5 + return (x - x_mean.unsqueeze(2)) / x_std.unsqueeze(2) + + elif normalize_type == "all_features": + x_mean = torch.zeros(seq_len.shape, dtype=x.dtype, device=x.device) + x_std = torch.zeros(seq_len.shape, dtype=x.dtype, device=x.device) + for i in range(x.shape[0]): + x_mean[i] = x[i, :, :int(seq_len[i])].mean() + x_std[i] = x[i, :, :int(seq_len[i])].std() + # make 
sure x_std is not zero + x_std += 1e-5 + return (x - x_mean.view(-1, 1, 1)) / x_std.view(-1, 1, 1) + else: + return x + + +@torch.jit.script +def splice_frames(x, frame_splicing: int): + """ Stacks frames together across feature dim + + input is batch_size, feature_dim, num_frames + output is batch_size, feature_dim*frame_splicing, num_frames + + """ + seq = [x] + # TORCHSCRIPT: JIT doesnt like range(start, stop) + for n in range(frame_splicing - 1): + seq.append(torch.cat([x[:, :, :n + 1], x[:, :, n + 1:]], dim=2)) + return torch.cat(seq, dim=1) + + +class FilterbankFeatures(BaseFeatures): + # For JIT, https://pytorch.org/docs/stable/jit.html#python-defined-constants + __constants__ = ["dither", "preemph", "n_fft", "hop_length", "win_length", + "log", "frame_splicing", "normalize"] + # torchscript: "center" removed due to a bug + + def __init__(self, spec_augment=None, cutout_augment=None, + sample_rate=8000, window_size=0.02, window_stride=0.01, + window="hamming", normalize="per_feature", n_fft=None, + preemph=0.97, n_filt=64, lowfreq=0, highfreq=None, log=True, + dither=1e-5, pad_align=8, pad_to_max_duration=False, + max_duration=float('inf'), frame_splicing=1): + super(FilterbankFeatures, self).__init__( + pad_align=pad_align, pad_to_max_duration=pad_to_max_duration, + max_duration=max_duration, sample_rate=sample_rate, + window_size=window_size, window_stride=window_stride, + spec_augment=spec_augment, cutout_augment=cutout_augment) + + torch_windows = { + 'hann': torch.hann_window, + 'hamming': torch.hamming_window, + 'blackman': torch.blackman_window, + 'bartlett': torch.bartlett_window, + 'none': None, + } + + self.n_fft = n_fft or 2 ** math.ceil(math.log2(self.win_length)) + + self.normalize = normalize + self.log = log + #TORCHSCRIPT: Check whether or not we need this + self.dither = dither + self.frame_splicing = frame_splicing + self.n_filt = n_filt + self.preemph = preemph + highfreq = highfreq or sample_rate / 2 + window_fn = torch_windows.get(window, None) + window_tensor = window_fn(self.win_length, + periodic=False) if window_fn else None + filterbanks = torch.tensor( + librosa.filters.mel(sample_rate, self.n_fft, n_mels=n_filt, + fmin=lowfreq, fmax=highfreq), + dtype=torch.float).unsqueeze(0) + # torchscript + self.register_buffer("fb", filterbanks) + self.register_buffer("window", window_tensor) + + def get_seq_len(self, seq_len): + return torch.ceil(seq_len.to(dtype=torch.float) / self.hop_length).to( + dtype=torch.int) + + # do stft + # TORCHSCRIPT: center removed due to bug + def stft(self, x): + return torch.stft(x, n_fft=self.n_fft, hop_length=self.hop_length, + win_length=self.win_length, + window=self.window.to(dtype=torch.float)) + + @torch.no_grad() + def calculate_features(self, x, seq_len): + dtype = x.dtype + + seq_len = self.get_seq_len(seq_len) + + # dither + if self.dither > 0: + x += self.dither * torch.randn_like(x) + + # do preemphasis + if self.preemph is not None: + x = torch.cat( + (x[:, 0].unsqueeze(1), x[:, 1:] - self.preemph * x[:, :-1]), dim=1) + x = self.stft(x) + + # get power spectrum + x = x.pow(2).sum(-1) + + # dot with filterbank energies + x = torch.matmul(self.fb.to(x.dtype), x) + + # log features if required + if self.log: + x = torch.log(x + 1e-20) + + # frame splicing if required + if self.frame_splicing > 1: + raise ValueError('Frame splicing not supported') + + # normalize if required + x = normalize_batch(x, seq_len, normalize_type=self.normalize) + + # mask to zero any values beyond seq_len in batch, + # pad to multiple of 
`pad_align` (for efficiency) + max_len = x.size(-1) + mask = torch.arange(max_len, dtype=seq_len.dtype, device=x.device) + mask = mask.expand(x.size(0), max_len) >= seq_len.unsqueeze(1) + x = x.masked_fill(mask.unsqueeze(1), 0) + + # TORCHSCRIPT: Is this del important? It breaks scripting + # del mask + + return x.to(dtype), seq_len diff --git a/PyTorch/SpeechRecognition/Jasper/common/helpers.py b/PyTorch/SpeechRecognition/Jasper/common/helpers.py new file mode 100644 index 00000000..742f1592 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/common/helpers.py @@ -0,0 +1,300 @@ +# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import glob +import os +import re +from collections import OrderedDict + +from apex import amp + +import torch +import torch.distributed as dist + +from .metrics import word_error_rate + + +def print_once(msg): + if not dist.is_initialized() or dist.get_rank() == 0: + print(msg) + + +def add_ctc_blank(symbols): + return symbols + [''] + + +def ctc_decoder_predictions_tensor(tensor, labels): + """ + Takes output of greedy ctc decoder and performs ctc decoding algorithm to + remove duplicates and special symbol. Returns prediction + Args: + tensor: model output tensor + label: A list of labels + Returns: + prediction + """ + blank_id = len(labels) - 1 + hypotheses = [] + labels_map = {i: labels[i] for i in range(len(labels))} + prediction_cpu_tensor = tensor.long().cpu() + # iterate over batch + for ind in range(prediction_cpu_tensor.shape[0]): + prediction = prediction_cpu_tensor[ind].numpy().tolist() + # CTC decoding procedure + decoded_prediction = [] + previous = len(labels) - 1 # id of a blank symbol + for p in prediction: + if (p != previous or previous == blank_id) and p != blank_id: + decoded_prediction.append(p) + previous = p + hypothesis = ''.join([labels_map[c] for c in decoded_prediction]) + hypotheses.append(hypothesis) + return hypotheses + + +def greedy_wer(preds, tgt, tgt_lens, labels): + """ + Takes output of greedy ctc decoder and performs ctc decoding algorithm to + remove duplicates and special symbol. 
Prints wer and prediction examples to screen + Args: + tensors: A list of 3 tensors (predictions, targets, target_lengths) + labels: A list of labels + + Returns: + word error rate + """ + with torch.no_grad(): + references = gather_transcripts([tgt], [tgt_lens], labels) + hypotheses = ctc_decoder_predictions_tensor(preds, labels) + + wer, _, _ = word_error_rate(hypotheses, references) + return wer, hypotheses[0], references[0] + + +def gather_losses(losses_list): + return [torch.mean(torch.stack(losses_list))] + + +def gather_predictions(predictions_list, labels): + results = [] + for prediction in predictions_list: + results += ctc_decoder_predictions_tensor(prediction, labels=labels) + return results + + +def gather_transcripts(transcript_list, transcript_len_list, labels): + results = [] + labels_map = {i: labels[i] for i in range(len(labels))} + # iterate over workers + for txt, lens in zip(transcript_list, transcript_len_list): + for t, l in zip(txt.long().cpu(), lens.long().cpu()): + t = list(t.numpy()) + results.append(''.join([labels_map[c] for c in t[:l]])) + return results + + +def process_evaluation_batch(tensors, global_vars, labels): + """ + Processes results of an iteration and saves it in global_vars + Args: + tensors: dictionary with results of an evaluation iteration, e.g. loss, predictions, transcript, and output + global_vars: dictionary where processes results of iteration are saved + labels: A list of labels + """ + for kv, v in tensors.items(): + if kv.startswith('loss'): + global_vars['EvalLoss'] += gather_losses(v) + elif kv.startswith('predictions'): + global_vars['preds'] += gather_predictions(v, labels) + elif kv.startswith('transcript_length'): + transcript_len_list = v + elif kv.startswith('transcript'): + transcript_list = v + elif kv.startswith('output'): + global_vars['logits'] += v + + global_vars['txts'] += gather_transcripts( + transcript_list, transcript_len_list, labels) + + +def process_evaluation_epoch(aggregates, tag=None): + """ + Processes results from each worker at the end of evaluation and combine to final result + Args: + aggregates: dictionary containing information of entire evaluation + Return: + wer: final word error rate + loss: final loss + """ + if 'losses' in aggregates: + eloss = torch.mean(torch.stack(aggregates['losses'])).item() + else: + eloss = None + hypotheses = aggregates['preds'] + references = aggregates['txts'] + + wer, scores, num_words = word_error_rate(hypotheses, references) + multi_gpu = dist.is_initialized() + if multi_gpu: + if eloss is not None: + eloss /= dist.get_world_size() + eloss_tensor = torch.tensor(eloss).cuda() + dist.all_reduce(eloss_tensor) + eloss = eloss_tensor.item() + + scores_tensor = torch.tensor(scores).cuda() + dist.all_reduce(scores_tensor) + scores = scores_tensor.item() + num_words_tensor = torch.tensor(num_words).cuda() + dist.all_reduce(num_words_tensor) + num_words = num_words_tensor.item() + wer = scores * 1.0 / num_words + return wer, eloss + + +def num_weights(module): + return sum(p.numel() for p in module.parameters() if p.requires_grad) + + +def convert_v1_state_dict(state_dict): + rules = [ + ('^jasper_encoder.encoder.', 'encoder.layers.'), + ('^jasper_decoder.decoder_layers.', 'decoder.layers.'), + ] + ret = {} + for k, v in state_dict.items(): + if k.startswith('acoustic_model.'): + continue + if k.startswith('audio_preprocessor.'): + continue + for pattern, to in rules: + k = re.sub(pattern, to, k) + ret[k] = v + + return ret + + +class Checkpointer(object): + + def __init__(self, 
save_dir, model_name, keep_milestones=[100,200,300], + use_amp=False): + self.save_dir = save_dir + self.keep_milestones = keep_milestones + self.use_amp = use_amp + self.model_name = model_name + + tracked = [ + (int(re.search('epoch(\d+)_', f).group(1)), f) + for f in glob.glob(f'{save_dir}/{self.model_name}_epoch*_checkpoint.pt')] + tracked = sorted(tracked, key=lambda t: t[0]) + self.tracked = OrderedDict(tracked) + + def save(self, model, ema_model, optimizer, epoch, step, best_wer, + is_best=False): + """Saves model checkpoint for inference/resuming training. + + Args: + model: the model, optionally wrapped by DistributedDataParallel + ema_model: model with averaged weights, can be None + optimizer: optimizer + epoch (int): epoch during which the model is saved + step (int): number of steps since beginning of training + best_wer (float): lowest recorded WER on the dev set + is_best (bool, optional): set name of checkpoint to 'best' + and overwrite the previous one + """ + rank = 0 + if dist.is_initialized(): + dist.barrier() + rank = dist.get_rank() + + if rank != 0: + return + + # Checkpoint already saved + if not is_best and epoch in self.tracked: + return + + unwrap_ddp = lambda model: getattr(model, 'module', model) + state = { + 'epoch': epoch, + 'step': step, + 'best_wer': best_wer, + 'state_dict': unwrap_ddp(model).state_dict(), + 'ema_state_dict': unwrap_ddp(ema_model).state_dict() if ema_model is not None else None, + 'optimizer': optimizer.state_dict(), + 'amp': amp.state_dict() if self.use_amp else None, + } + + if is_best: + fpath = os.path.join( + self.save_dir, f"{self.model_name}_best_checkpoint.pt") + else: + fpath = os.path.join( + self.save_dir, f"{self.model_name}_epoch{epoch}_checkpoint.pt") + + print_once(f"Saving {fpath}...") + torch.save(state, fpath) + + if not is_best: + # Remove old checkpoints; keep milestones and the last two + self.tracked[epoch] = fpath + for epoch in set(list(self.tracked)[:-2]) - set(self.keep_milestones): + try: + os.remove(self.tracked[epoch]) + except: + pass + del self.tracked[epoch] + + def last_checkpoint(self): + tracked = list(self.tracked.values()) + + if len(tracked) >= 1: + try: + torch.load(tracked[-1], map_location='cpu') + return tracked[-1] + except: + print_once(f'Last checkpoint {tracked[-1]} appears corrupted.') + + elif len(tracked) >= 2: + return tracked[-2] + else: + return None + + def load(self, fpath, model, ema_model, optimizer, meta): + + print_once(f'Loading model from {fpath}') + checkpoint = torch.load(fpath, map_location="cpu") + + unwrap_ddp = lambda model: getattr(model, 'module', model) + state_dict = convert_v1_state_dict(checkpoint['state_dict']) + unwrap_ddp(model).load_state_dict(state_dict, strict=True) + + if ema_model is not None: + if checkpoint.get('ema_state_dict') is not None: + key = 'ema_state_dict' + else: + key = 'state_dict' + print_once('WARNING: EMA weights not found in the checkpoint.') + print_once('WARNING: Initializing EMA model with regular params.') + state_dict = convert_v1_state_dict(checkpoint[key]) + unwrap_ddp(ema_model).load_state_dict(state_dict, strict=True) + + optimizer.load_state_dict(checkpoint['optimizer']) + + if self.use_amp: + amp.load_state_dict(checkpoint['amp']) + + meta['start_epoch'] = checkpoint.get('epoch') + meta['best_wer'] = checkpoint.get('best_wer', meta['best_wer']) diff --git a/PyTorch/SpeechRecognition/Jasper/metrics.py b/PyTorch/SpeechRecognition/Jasper/common/metrics.py similarity index 64% rename from PyTorch/SpeechRecognition/Jasper/metrics.py 
rename to PyTorch/SpeechRecognition/Jasper/common/metrics.py index c50e3b5f..4ae47a4c 100644 --- a/PyTorch/SpeechRecognition/Jasper/metrics.py +++ b/PyTorch/SpeechRecognition/Jasper/common/metrics.py @@ -12,12 +12,10 @@ # See the License for the specific language governing permissions and # limitations under the License. -from typing import List +def __levenshtein(a, b): + """Calculates the Levenshtein distance between two sequences.""" -def __levenshtein(a: List, b: List) -> int: - """Calculates the Levenshtein distance between a and b. - """ n, m = len(a), len(b) if n > m: # Make sure n <= m, to use O(min(n,m)) space @@ -37,28 +35,18 @@ def __levenshtein(a: List, b: List) -> int: return current[n] -def word_error_rate(hypotheses: List[str], references: List[str]) -> float: - """ - Computes Average Word Error rate between two texts represented as - corresponding lists of string. Hypotheses and references must have same length. +def word_error_rate(hypotheses, references): + """Computes average Word Error Rate (WER) between two text lists.""" - Args: - hypotheses: list of hypotheses - references: list of references - - Returns: - (float) average word error rate - """ scores = 0 words = 0 - len_diff = len(references) - len(hypotheses) + len_diff = len(references) - len(hypotheses) if len_diff > 0: - raise ValueError("In word error rate calculation, hypotheses and reference" - " lists must have the same number of elements. But I got:" - "{0} and {1} correspondingly".format(len(hypotheses), len(references))) + raise ValueError("Unequal number of hypotheses and references: " + "{0} and {1}".format(len(hypotheses), len(references))) elif len_diff < 0: hypotheses = hypotheses[:len_diff] - + for h, r in zip(hypotheses, references): h_list = h.split() r_list = r.split() diff --git a/PyTorch/SpeechRecognition/Jasper/optimizers.py b/PyTorch/SpeechRecognition/Jasper/common/optimizers.py similarity index 87% rename from PyTorch/SpeechRecognition/Jasper/optimizers.py rename to PyTorch/SpeechRecognition/Jasper/common/optimizers.py index a89adcd1..81759191 100644 --- a/PyTorch/SpeechRecognition/Jasper/optimizers.py +++ b/PyTorch/SpeechRecognition/Jasper/common/optimizers.py @@ -16,6 +16,51 @@ import torch from torch.optim import Optimizer import math + +def lr_policy(step, epoch, initial_lr, optimizer, steps_per_epoch, warmup_epochs, + hold_epochs, num_epochs=None, policy='linear', min_lr=1e-5, + exp_gamma=None): + """ + Learning rate decay with linear warmup and an optional hold phase. + Args: + initial_lr: base learning rate + step, epoch: current optimization step and epoch + steps_per_epoch, warmup_epochs, hold_epochs: schedule geometry + policy: 'legacy' (quadratic decay) or 'exponential' (exp_gamma per epoch) + """ + warmup_steps = warmup_epochs * steps_per_epoch + hold_steps = hold_epochs * steps_per_epoch + + if policy == 'legacy': + assert num_epochs is not None + tot_steps = num_epochs * steps_per_epoch + + if step < warmup_steps: + a = (step + 1) / (warmup_steps + 1) + elif step < warmup_steps + hold_steps: + a = 1.0 + else: + a = (((tot_steps - step) + / (tot_steps - warmup_steps - hold_steps)) ** 2) + + elif policy == 'exponential': + assert exp_gamma is not None + + if step < warmup_steps: + a = (step + 1) / (warmup_steps + 1) + elif step < warmup_steps + hold_steps: + a = 1.0 + else: + a = exp_gamma ** (epoch - warmup_epochs - hold_epochs) + + else: + raise ValueError + + new_lr = max(a * initial_lr, min_lr) + for param_group in optimizer.param_groups: + param_group['lr'] = new_lr + + class AdamW(Optimizer): """Implements AdamW algorithm.
@@ -114,6 +159,7 @@ class AdamW(Optimizer): p.data.add_(torch.mul(p.data, group['weight_decay']).addcdiv_(1, exp_avg, denom), alpha=-step_size) return loss + class Novograd(Optimizer): """ diff --git a/PyTorch/SpeechRecognition/Jasper/common/tb_dllogger.py b/PyTorch/SpeechRecognition/Jasper/common/tb_dllogger.py new file mode 100644 index 00000000..ecc6ec86 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/common/tb_dllogger.py @@ -0,0 +1,159 @@ +import atexit +import glob +import os +import re +import numpy as np + +import torch +from torch.utils.tensorboard import SummaryWriter + +import dllogger +from dllogger import StdOutBackend, JSONStreamBackend, Verbosity + + +tb_loggers = {} + + +class TBLogger: + """ + xyz_dummies: stretch the screen with empty plots so the legend would + always fit for other plots + """ + def __init__(self, enabled, log_dir, name, interval=1, dummies=True): + self.enabled = enabled + self.interval = interval + self.cache = {} + if self.enabled: + self.summary_writer = SummaryWriter( + log_dir=os.path.join(log_dir, name), + flush_secs=120, max_queue=200) + atexit.register(self.summary_writer.close) + if dummies: + for key in ('aaa', 'zzz'): + self.summary_writer.add_scalar(key, 0.0, 1) + + def log(self, step, data): + for k, v in data.items(): + self.log_value(step, k, v.item() if type(v) is torch.Tensor else v) + + def log_value(self, step, key, val, stat='mean'): + if self.enabled: + if key not in self.cache: + self.cache[key] = [] + self.cache[key].append(val) + if len(self.cache[key]) == self.interval: + agg_val = getattr(np, stat)(self.cache[key]) + self.summary_writer.add_scalar(key, agg_val, step) + del self.cache[key] + + def log_grads(self, step, model): + if self.enabled: + norms = [p.grad.norm().item() for p in model.parameters() + if p.grad is not None] + for stat in ('max', 'min', 'mean'): + self.log_value(step, f'grad_{stat}', getattr(np, stat)(norms), + stat=stat) + + +def unique_log_fpath(log_fpath): + + if not os.path.isfile(log_fpath): + return log_fpath + + # Avoid overwriting old logs + saved = sorted([int(re.search('\.(\d+)', f).group(1)) + for f in glob.glob(f'{log_fpath}.*')]) + + log_num = (saved[-1] if saved else 0) + 1 + return f'{log_fpath}.{log_num}' + + +def stdout_step_format(step): + if isinstance(step, str): + return step + fields = [] + if len(step) > 0: + fields.append("epoch {:>4}".format(step[0])) + if len(step) > 1: + fields.append("iter {:>4}".format(step[1])) + if len(step) > 2: + fields[-1] += "/{}".format(step[2]) + return " | ".join(fields) + + +def stdout_metric_format(metric, metadata, value): + name = metadata.get("name", metric + " : ") + unit = metadata.get("unit", None) + format = f'{{{metadata.get("format", "")}}}' + fields = [name, format.format(value) if value is not None else value, unit] + fields = [f for f in fields if f is not None] + return "| " + " ".join(fields) + + +def init_log(args): + enabled = (args.local_rank == 0) + if enabled: + fpath = args.log_file or os.path.join(args.output_dir, 'nvlog.json') + backends = [JSONStreamBackend(Verbosity.DEFAULT, + unique_log_fpath(fpath)), + StdOutBackend(Verbosity.VERBOSE, + step_format=stdout_step_format, + metric_format=stdout_metric_format)] + else: + backends = [] + + dllogger.init(backends=backends) + dllogger.metadata("train_lrate", {"name": "lrate", "format": ":>3.2e"}) + + for id_, pref in [('train', ''), ('train_avg', 'avg train '), + ('dev', ' avg dev '), ('dev_ema', ' EMA dev ')]: + + dllogger.metadata(f"{id_}_loss", + {"name": f"{pref}loss", 
"format": ":>7.2f"}) + + dllogger.metadata(f"{id_}_wer", + {"name": f"{pref}wer", "format": ":>6.2f"}) + + dllogger.metadata(f"{id_}_throughput", + {"name": f"{pref}utts/s", "format": ":>5.0f"}) + + dllogger.metadata(f"{id_}_took", + {"name": "took", "unit": "s", "format": ":>5.2f"}) + + tb_subsets = ['train', 'dev', 'dev_ema'] if args.ema else ['train', 'dev'] + global tb_loggers + tb_loggers = {s: TBLogger(enabled, args.output_dir, name=s) + for s in tb_subsets} + + log_parameters(vars(args), tb_subset='train') + + +def log(step, tb_total_steps=None, subset='train', data={}): + + if tb_total_steps is not None: + tb_loggers[subset].log(tb_total_steps, data) + + if subset != '': + data = {f'{subset}_{key}': v for key,v in data.items()} + dllogger.log(step, data=data) + + +def log_grads_tb(tb_total_steps, grads, tb_subset='train'): + tb_loggers[tb_subset].log_grads(tb_total_steps, grads) + + +def log_parameters(data, verbosity=0, tb_subset=None): + for k,v in data.items(): + dllogger.log(step="PARAMETER", data={k:v}, verbosity=verbosity) + + if tb_subset is not None and tb_loggers[tb_subset].enabled: + tb_data = {k:v for k,v in data.items() + if type(v) in (str, bool, int, float)} + tb_loggers[tb_subset].summary_writer.add_hparams(tb_data, {}) + + +def flush_log(): + dllogger.flush() + for tbl in tb_loggers.values(): + if tbl.enabled: + tbl.summary_writer.flush() diff --git a/PyTorch/SpeechRecognition/Jasper/parts/text/LICENSE b/PyTorch/SpeechRecognition/Jasper/common/text/LICENSE similarity index 100% rename from PyTorch/SpeechRecognition/Jasper/parts/text/LICENSE rename to PyTorch/SpeechRecognition/Jasper/common/text/LICENSE diff --git a/PyTorch/SpeechRecognition/Jasper/common/text/__init__.py b/PyTorch/SpeechRecognition/Jasper/common/text/__init__.py new file mode 100644 index 00000000..49018238 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/common/text/__init__.py @@ -0,0 +1,32 @@ +# Copyright (c) 2017 Keith Ito +""" from https://github.com/keithito/tacotron """ +import re +import string +from . 
import cleaners + +def _clean_text(text, cleaner_names, *args): + for name in cleaner_names: + cleaner = getattr(cleaners, name) + if not cleaner: + raise Exception('Unknown cleaner: %s' % name) + text = cleaner(text, *args) + return text + + +def punctuation_map(labels): + # Punctuation to remove + punctuation = string.punctuation + punctuation = punctuation.replace("+", "") + punctuation = punctuation.replace("&", "") + # TODO We might also want to consider: + # @ -> at + # # -> number, pound, hashtag + # ~ -> tilde + # _ -> underscore + # % -> percent + # If a punctuation symbol is inside our vocab, we do not remove from text + for l in labels: + punctuation = punctuation.replace(l, "") + # Turn all punctuation to whitespace + table = str.maketrans(punctuation, " " * len(punctuation)) + return table diff --git a/PyTorch/SpeechRecognition/Jasper/parts/text/cleaners.py b/PyTorch/SpeechRecognition/Jasper/common/text/cleaners.py similarity index 100% rename from PyTorch/SpeechRecognition/Jasper/parts/text/cleaners.py rename to PyTorch/SpeechRecognition/Jasper/common/text/cleaners.py diff --git a/PyTorch/SpeechRecognition/Jasper/parts/text/numbers.py b/PyTorch/SpeechRecognition/Jasper/common/text/numbers.py similarity index 100% rename from PyTorch/SpeechRecognition/Jasper/parts/text/numbers.py rename to PyTorch/SpeechRecognition/Jasper/common/text/numbers.py diff --git a/PyTorch/SpeechRecognition/Jasper/parts/text/symbols.py b/PyTorch/SpeechRecognition/Jasper/common/text/symbols.py similarity index 100% rename from PyTorch/SpeechRecognition/Jasper/parts/text/symbols.py rename to PyTorch/SpeechRecognition/Jasper/common/text/symbols.py diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5_sp_offline.toml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5_sp_offline.toml deleted file mode 100644 index db37ca77..00000000 --- a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5_sp_offline.toml +++ /dev/null @@ -1,194 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
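A quick usage sketch of the `punctuation_map` helper added to `common/text/__init__.py` above; the label list mirrors the character vocabulary used throughout the configs, and the sample sentence is purely illustrative:

```python
from common.text import punctuation_map

# Character vocabulary used by the Jasper configs (space, a-z, apostrophe).
labels = [" "] + list("abcdefghijklmnopqrstuvwxyz") + ["'"]

table = punctuation_map(labels)  # str.maketrans table: punctuation -> whitespace
print("hello, world! it's me".translate(table))
# -> "hello  world  it's me"  (the apostrophe survives because it is in the vocab)
```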
- - - -model = "Jasper" - -[input] -normalize = "per_feature" -sample_rate = 16000 -window_size = 0.02 -window_stride = 0.01 -window = "hann" -features = 64 -n_fft = 512 -frame_splicing = 1 -dither = 0.00001 -feat_type = "logfbank" -normalize_transcripts = true -trim_silence = true -pad_to = 16 -max_duration = 16.7 -speed_perturbation = true - - -cutout_rect_regions = 0 -cutout_rect_time = 60 -cutout_rect_freq = 25 - - -cutout_x_regions = 0 -cutout_y_regions = 0 -cutout_x_width = 6 -cutout_y_width = 6 - -[input_eval] -normalize = "per_feature" -sample_rate = 16000 -window_size = 0.02 -window_stride = 0.01 -window = "hann" -features = 64 -n_fft = 512 -frame_splicing = 1 -dither = 0.00001 -feat_type = "logfbank" -normalize_transcripts = true -trim_silence = true -pad_to = 16 - - -[encoder] -activation = "relu" -convmask = true - -[[jasper]] -filters = 256 -repeat = 1 -kernel = [11] -stride = [2] -dilation = [1] -dropout = 0.2 -residual = false - -[[jasper]] -filters = 256 -repeat = 5 -kernel = [11] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true - - -[[jasper]] -filters = 256 -repeat = 5 -kernel = [11] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true - - -[[jasper]] -filters = 384 -repeat = 5 -kernel = [13] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true - - -[[jasper]] -filters = 384 -repeat = 5 -kernel = [13] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true - -[[jasper]] -filters = 512 -repeat = 5 -kernel = [17] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true - - -[[jasper]] -filters = 512 -repeat = 5 -kernel = [17] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true - - -[[jasper]] -filters = 640 -repeat = 5 -kernel = [21] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true - - -[[jasper]] -filters = 640 -repeat = 5 -kernel = [21] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true - - -[[jasper]] -filters = 768 -repeat = 5 -kernel = [25] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true - - -[[jasper]] -filters = 768 -repeat = 5 -kernel = [25] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true - - -[[jasper]] -filters = 896 -repeat = 1 -kernel = [29] -stride = [1] -dilation = [2] -dropout = 0.4 -residual = false - -[[jasper]] -filters = 1024 -repeat = 1 -kernel = [1] -stride = [1] -dilation = [1] -dropout = 0.4 -residual = false - -[labels] -labels = [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"] diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr.toml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr.toml deleted file mode 100644 index 088cc426..00000000 --- a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr.toml +++ /dev/null @@ -1,203 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
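The greedy CTC decoding implemented by `ctc_decoder_predictions_tensor` in `common/helpers.py` above collapses repeated symbols and drops the blank. A toy walk-through with a hypothetical three-character vocabulary:

```python
import torch
from common.helpers import add_ctc_blank, ctc_decoder_predictions_tensor

labels = add_ctc_blank(["a", "b", " "])   # blank symbol is appended last (index 3)

# Greedy argmax output for one utterance: a a <blank> b b <blank> <blank> ' ' a
preds = torch.tensor([[0, 0, 3, 1, 1, 3, 3, 2, 0]])

print(ctc_decoder_predictions_tensor(preds, labels))  # ['ab a']
```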
- -model = "Jasper" - -[input] -normalize = "per_feature" -sample_rate = 16000 -window_size = 0.02 -window_stride = 0.01 -window = "hann" -features = 64 -n_fft = 512 -frame_splicing = 1 -dither = 0.00001 -feat_type = "logfbank" -normalize_transcripts = true -trim_silence = true -pad_to = 16 -max_duration = 16.7 -speed_perturbation = false - - -cutout_rect_regions = 0 -cutout_rect_time = 60 -cutout_rect_freq = 25 - -cutout_x_regions = 0 -cutout_y_regions = 0 -cutout_x_width = 6 -cutout_y_width = 6 - - -[input_eval] -normalize = "per_feature" -sample_rate = 16000 -window_size = 0.02 -window_stride = 0.01 -window = "hann" -features = 64 -n_fft = 512 -frame_splicing = 1 -dither = 0.00001 -feat_type = "logfbank" -normalize_transcripts = true -trim_silence = true -pad_to = 16 - - -[encoder] -activation = "relu" -convmask = true - -[[jasper]] -filters = 256 -repeat = 1 -kernel = [11] -stride = [2] -dilation = [1] -dropout = 0.2 -residual = false - -[[jasper]] -filters = 256 -repeat = 5 -kernel = [11] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 256 -repeat = 5 -kernel = [11] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 384 -repeat = 5 -kernel = [13] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 384 -repeat = 5 -kernel = [13] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 512 -repeat = 5 -kernel = [17] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 512 -repeat = 5 -kernel = [17] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 640 -repeat = 5 -kernel = [21] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 640 -repeat = 5 -kernel = [21] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 768 -repeat = 5 -kernel = [25] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 768 -repeat = 5 -kernel = [25] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 896 -repeat = 1 -kernel = [29] -stride = [1] -dilation = [2] -dropout = 0.4 -residual = false - -[[jasper]] -filters = 1024 -repeat = 1 -kernel = [1] -stride = [1] -dilation = [1] -dropout = 0.4 -residual = false - -[labels] -labels = [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"] diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_nomask.toml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_nomask.toml deleted file mode 100644 index d532543c..00000000 --- a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_nomask.toml +++ /dev/null @@ -1,203 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
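`word_error_rate` in `common/metrics.py` above returns the aggregate WER together with the raw edit-distance and word counts, which the evaluation helpers then reduce across workers. A minimal check with made-up transcripts:

```python
from common.metrics import word_error_rate

wer, scores, num_words = word_error_rate(
    hypotheses=["the cat sat"],
    references=["the cat sat down"])

print(wer, scores, num_words)  # 0.25 1 4 (one deletion over four reference words)
```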
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -model = "Jasper" - -[input] -normalize = "per_feature" -sample_rate = 16000 -window_size = 0.02 -window_stride = 0.01 -window = "hann" -features = 64 -n_fft = 512 -frame_splicing = 1 -dither = 0.00001 -feat_type = "logfbank" -normalize_transcripts = true -trim_silence = true -pad_to = 16 -max_duration = 16.7 -speed_perturbation = false - - -cutout_rect_regions = 0 -cutout_rect_time = 60 -cutout_rect_freq = 25 - -cutout_x_regions = 0 -cutout_y_regions = 0 -cutout_x_width = 6 -cutout_y_width = 6 - - -[input_eval] -normalize = "per_feature" -sample_rate = 16000 -window_size = 0.02 -window_stride = 0.01 -window = "hann" -features = 64 -n_fft = 512 -frame_splicing = 1 -dither = 0.00001 -feat_type = "logfbank" -normalize_transcripts = true -trim_silence = true -pad_to = 16 - - -[encoder] -activation = "relu" -convmask = false - -[[jasper]] -filters = 256 -repeat = 1 -kernel = [11] -stride = [2] -dilation = [1] -dropout = 0.2 -residual = false - -[[jasper]] -filters = 256 -repeat = 5 -kernel = [11] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 256 -repeat = 5 -kernel = [11] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 384 -repeat = 5 -kernel = [13] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 384 -repeat = 5 -kernel = [13] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 512 -repeat = 5 -kernel = [17] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 512 -repeat = 5 -kernel = [17] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 640 -repeat = 5 -kernel = [21] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 640 -repeat = 5 -kernel = [21] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 768 -repeat = 5 -kernel = [25] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 768 -repeat = 5 -kernel = [25] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 896 -repeat = 1 -kernel = [29] -stride = [1] -dilation = [2] -dropout = 0.4 -residual = false - -[[jasper]] -filters = 1024 -repeat = 1 -kernel = [1] -stride = [1] -dilation = [1] -dropout = 0.4 -residual = false - -[labels] -labels = [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"] diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_sp_offline.toml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_sp_offline.toml deleted file mode 100644 index bade525c..00000000 --- a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_sp_offline.toml +++ /dev/null @@ -1,204 +0,0 @@ -# Copyright (c) 2019, NVIDIA 
CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -model = "Jasper" - -[input] -normalize = "per_feature" -sample_rate = 16000 -window_size = 0.02 -window_stride = 0.01 -window = "hann" -features = 64 -n_fft = 512 -frame_splicing = 1 -dither = 0.00001 -feat_type = "logfbank" -normalize_transcripts = true -trim_silence = true -pad_to = 16 -max_duration = 16.7 -speed_perturbation = true - - -cutout_rect_regions = 0 -cutout_rect_time = 60 -cutout_rect_freq = 25 - - -cutout_x_regions = 0 -cutout_y_regions = 0 -cutout_x_width = 6 -cutout_y_width = 6 - - -[input_eval] -normalize = "per_feature" -sample_rate = 16000 -window_size = 0.02 -window_stride = 0.01 -window = "hann" -features = 64 -n_fft = 512 -frame_splicing = 1 -dither = 0.00001 -feat_type = "logfbank" -normalize_transcripts = true -trim_silence = true -pad_to = 16 - - -[encoder] -activation = "relu" -convmask = true - -[[jasper]] -filters = 256 -repeat = 1 -kernel = [11] -stride = [2] -dilation = [1] -dropout = 0.2 -residual = false - -[[jasper]] -filters = 256 -repeat = 5 -kernel = [11] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 256 -repeat = 5 -kernel = [11] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 384 -repeat = 5 -kernel = [13] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 384 -repeat = 5 -kernel = [13] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 512 -repeat = 5 -kernel = [17] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 512 -repeat = 5 -kernel = [17] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 640 -repeat = 5 -kernel = [21] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 640 -repeat = 5 -kernel = [21] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 768 -repeat = 5 -kernel = [25] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 768 -repeat = 5 -kernel = [25] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 896 -repeat = 1 -kernel = [29] -stride = [1] -dilation = [2] -dropout = 0.4 -residual = false - -[[jasper]] -filters = 1024 -repeat = 1 -kernel = [1] -stride = [1] -dilation = [1] -dropout = 0.4 -residual = false - -[labels] -labels = [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"] diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_sp_offline_specaugment.toml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_sp_offline_specaugment.toml 
deleted file mode 100644 index d01dc51c..00000000 --- a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_sp_offline_specaugment.toml +++ /dev/null @@ -1,204 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -model = "Jasper" - -[input] -normalize = "per_feature" -sample_rate = 16000 -window_size = 0.02 -window_stride = 0.01 -window = "hann" -features = 64 -n_fft = 512 -frame_splicing = 1 -dither = 0.00001 -feat_type = "logfbank" -normalize_transcripts = true -trim_silence = true -pad_to = 16 -max_duration = 16.7 -speed_perturbation = true - - -cutout_rect_regions = 0 -cutout_rect_time = 60 -cutout_rect_freq = 25 - - -cutout_x_regions = 2 -cutout_y_regions = 2 -cutout_x_width = 6 -cutout_y_width = 6 - - -[input_eval] -normalize = "per_feature" -sample_rate = 16000 -window_size = 0.02 -window_stride = 0.01 -window = "hann" -features = 64 -n_fft = 512 -frame_splicing = 1 -dither = 0.00001 -feat_type = "logfbank" -normalize_transcripts = true -trim_silence = true -pad_to = 16 - - -[encoder] -activation = "relu" -convmask = true - -[[jasper]] -filters = 256 -repeat = 1 -kernel = [11] -stride = [2] -dilation = [1] -dropout = 0.2 -residual = false - -[[jasper]] -filters = 256 -repeat = 5 -kernel = [11] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 256 -repeat = 5 -kernel = [11] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 384 -repeat = 5 -kernel = [13] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 384 -repeat = 5 -kernel = [13] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 512 -repeat = 5 -kernel = [17] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 512 -repeat = 5 -kernel = [17] -stride = [1] -dilation = [1] -dropout = 0.2 -residual = true -residual_dense = true - - -[[jasper]] -filters = 640 -repeat = 5 -kernel = [21] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 640 -repeat = 5 -kernel = [21] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 768 -repeat = 5 -kernel = [25] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 768 -repeat = 5 -kernel = [25] -stride = [1] -dilation = [1] -dropout = 0.3 -residual = true -residual_dense = true - - -[[jasper]] -filters = 896 -repeat = 1 -kernel = [29] -stride = [1] -dilation = [2] -dropout = 0.4 -residual = false - -[[jasper]] -filters = 1024 -repeat = 1 -kernel = [1] -stride = [1] -dilation = [1] -dropout = 0.4 -residual = false - -[labels] -labels = [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", 
"x", "y", "z", "'"] diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speca.yaml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speca.yaml new file mode 100644 index 00000000..b0c0d5b9 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speca.yaml @@ -0,0 +1,139 @@ +# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +name: "Jasper" +labels: [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", + "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"] + +input_val: + audio_dataset: &val_dataset + sample_rate: &sample_rate 16000 + trim_silence: true + normalize_transcripts: true + + filterbank_features: &val_features + normalize: per_feature + sample_rate: *sample_rate + window_size: 0.02 + window_stride: 0.01 + window: hann + n_filt: &n_filt 64 + n_fft: 512 + frame_splicing: &frame_splicing 1 + dither: 0.00001 + pad_align: 16 + +# For training we keep samples < 16.7s and apply augmentation +input_train: + audio_dataset: + <<: *val_dataset + max_duration: 16.7 + ignore_offline_speed_perturbation: true + + filterbank_features: + <<: *val_features + max_duration: 16.7 + + spec_augment: + freq_masks: 2 + max_freq: 20 + time_masks: 2 + max_time: 75 + +jasper: + encoder: + init: xavier_uniform + in_feats: *n_filt + frame_splicing: *frame_splicing + activation: relu + use_conv_masks: true + blocks: + - &Conv1 + filters: 256 + repeat: 1 + kernel_size: [11] + stride: [2] + dilation: [1] + dropout: 0.2 + residual: false + - &B1 + filters: 256 + repeat: 5 + kernel_size: [11] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B1 + - &B2 + filters: 384 + repeat: 5 + kernel_size: [13] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B2 + - &B3 + filters: 512 + repeat: 5 + kernel_size: [17] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B3 + - &B4 + filters: 640 + repeat: 5 + kernel_size: [21] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B4 + - &B5 + filters: 768 + repeat: 5 + kernel_size: [25] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B5 + - &Conv2 + filters: 896 + repeat: 1 + kernel_size: [29] + stride: [1] + dilation: [2] + dropout: 0.4 + residual: false + - &Conv3 + filters: &enc_feats 1024 + repeat: 1 + kernel_size: [1] + stride: [1] + dilation: [1] + dropout: 0.4 + residual: false + + decoder: + in_feats: *enc_feats + init: xavier_uniform diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-offline.yaml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-offline.yaml new file mode 100644 index 00000000..89c135ea --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-offline.yaml @@ -0,0 +1,139 @@ +# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +name: "Jasper" +labels: [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", + "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"] + +input_val: + audio_dataset: &val_dataset + sample_rate: &sample_rate 16000 + trim_silence: true + normalize_transcripts: true + + filterbank_features: &val_features + normalize: per_feature + sample_rate: *sample_rate + window_size: 0.02 + window_stride: 0.01 + window: hann + n_filt: &n_filt 64 + n_fft: 512 + frame_splicing: &frame_splicing 1 + dither: 0.00001 + pad_align: 16 + +# For training we keep samples < 16.7s and apply augmentation +input_train: + audio_dataset: + <<: *val_dataset + max_duration: 16.7 + ignore_offline_speed_perturbation: false + + filterbank_features: + <<: *val_features + max_duration: 16.7 + + spec_augment: + freq_masks: 0 + max_freq: 20 + time_masks: 0 + max_time: 75 + +jasper: + encoder: + init: xavier_uniform + in_feats: *n_filt + frame_splicing: *frame_splicing + activation: relu + use_conv_masks: true + blocks: + - &Conv1 + filters: 256 + repeat: 1 + kernel_size: [11] + stride: [2] + dilation: [1] + dropout: 0.2 + residual: false + - &B1 + filters: 256 + repeat: 5 + kernel_size: [11] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B1 + - &B2 + filters: 384 + repeat: 5 + kernel_size: [13] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B2 + - &B3 + filters: 512 + repeat: 5 + kernel_size: [17] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B3 + - &B4 + filters: 640 + repeat: 5 + kernel_size: [21] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B4 + - &B5 + filters: 768 + repeat: 5 + kernel_size: [25] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B5 + - &Conv2 + filters: 896 + repeat: 1 + kernel_size: [29] + stride: [1] + dilation: [2] + dropout: 0.4 + residual: false + - &Conv3 + filters: &enc_feats 1024 + repeat: 1 + kernel_size: [1] + stride: [1] + dilation: [1] + dropout: 0.4 + residual: false + + decoder: + in_feats: *enc_feats + init: xavier_uniform diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-offline_speca.yaml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-offline_speca.yaml new file mode 100644 index 00000000..2c7e4581 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-offline_speca.yaml @@ -0,0 +1,139 @@ +# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
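A hedged sketch of running the GPU featurizer from `common/features.py` above on random audio; the constructor arguments mirror the `filterbank_features` sections of these configs, and the snippet assumes the PyTorch NGC container this repository targets (librosa and apex installed):

```python
import torch
from common.features import FilterbankFeatures

featurizer = FilterbankFeatures(
    sample_rate=16000, window_size=0.02, window_stride=0.01, window='hann',
    n_filt=64, n_fft=512, dither=1e-5, pad_align=16)

audio = torch.randn(2, 48000)                      # a padded batch of two waveforms
audio_lens = torch.tensor([48000, 32000], dtype=torch.int32)

feats, feat_lens = featurizer(audio, audio_lens)   # log-mel features + valid frame counts
print(feats.shape, feat_lens)                      # (2, 64, T), T padded to a multiple of 16
```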
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +name: "Jasper" +labels: [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", + "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"] + +input_val: + audio_dataset: &val_dataset + sample_rate: &sample_rate 16000 + trim_silence: true + normalize_transcripts: true + + filterbank_features: &val_features + normalize: per_feature + sample_rate: *sample_rate + window_size: 0.02 + window_stride: 0.01 + window: hann + n_filt: &n_filt 64 + n_fft: 512 + frame_splicing: &frame_splicing 1 + dither: 0.00001 + pad_align: 16 + +# For training we keep samples < 16.7s and apply augmentation +input_train: + audio_dataset: + <<: *val_dataset + max_duration: 16.7 + ignore_offline_speed_perturbation: false + + filterbank_features: + <<: *val_features + max_duration: 16.7 + + spec_augment: + freq_masks: 2 + max_freq: 20 + time_masks: 2 + max_time: 75 + +jasper: + encoder: + init: xavier_uniform + in_feats: *n_filt + frame_splicing: *frame_splicing + activation: relu + use_conv_masks: true + blocks: + - &Conv1 + filters: 256 + repeat: 1 + kernel_size: [11] + stride: [2] + dilation: [1] + dropout: 0.2 + residual: false + - &B1 + filters: 256 + repeat: 5 + kernel_size: [11] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B1 + - &B2 + filters: 384 + repeat: 5 + kernel_size: [13] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B2 + - &B3 + filters: 512 + repeat: 5 + kernel_size: [17] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B3 + - &B4 + filters: 640 + repeat: 5 + kernel_size: [21] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B4 + - &B5 + filters: 768 + repeat: 5 + kernel_size: [25] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B5 + - &Conv2 + filters: 896 + repeat: 1 + kernel_size: [29] + stride: [1] + dilation: [2] + dropout: 0.4 + residual: false + - &Conv3 + filters: &enc_feats 1024 + repeat: 1 + kernel_size: [1] + stride: [1] + dilation: [1] + dropout: 0.4 + residual: false + + decoder: + in_feats: *enc_feats + init: xavier_uniform diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-offline_speca_nomask.yaml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-offline_speca_nomask.yaml new file mode 100644 index 00000000..61619428 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-offline_speca_nomask.yaml @@ -0,0 +1,139 @@ +# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
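`common/tb_dllogger.py` above wraps TensorBoard and DLLogger; the `TBLogger` class aggregates values over `interval` calls before writing a scalar. A small standalone sketch (the log directory is arbitrary):

```python
from common.tb_dllogger import TBLogger

tb = TBLogger(enabled=True, log_dir='results/tb', name='train', interval=2)

for step in range(1, 5):
    tb.log(step, {'loss': 3.0 / step, 'lrate': 1e-3})
    tb.log_value(step, 'grad_max', 0.5 * step, stat='max')
# Scalars are flushed to results/tb/train on every `interval`-th call per key.
```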
+# See the License for the specific language governing permissions and +# limitations under the License. + +name: "Jasper" +labels: [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", + "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"] + +input_val: + audio_dataset: &val_dataset + sample_rate: &sample_rate 16000 + trim_silence: true + normalize_transcripts: true + + filterbank_features: &val_features + normalize: per_feature + sample_rate: *sample_rate + window_size: 0.02 + window_stride: 0.01 + window: hann + n_filt: &n_filt 64 + n_fft: 512 + frame_splicing: &frame_splicing 1 + dither: 0.00001 + pad_align: 16 + +# For training we keep samples < 16.7s and apply augmentation +input_train: + audio_dataset: + <<: *val_dataset + max_duration: 16.7 + ignore_offline_speed_perturbation: false + + filterbank_features: + <<: *val_features + max_duration: 16.7 + + spec_augment: + freq_masks: 2 + max_freq: 20 + time_masks: 2 + max_time: 75 + +jasper: + encoder: + init: xavier_uniform + in_feats: *n_filt + frame_splicing: *frame_splicing + activation: relu + use_conv_masks: false + blocks: + - &Conv1 + filters: 256 + repeat: 1 + kernel_size: [11] + stride: [2] + dilation: [1] + dropout: 0.2 + residual: false + - &B1 + filters: 256 + repeat: 5 + kernel_size: [11] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B1 + - &B2 + filters: 384 + repeat: 5 + kernel_size: [13] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B2 + - &B3 + filters: 512 + repeat: 5 + kernel_size: [17] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B3 + - &B4 + filters: 640 + repeat: 5 + kernel_size: [21] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B4 + - &B5 + filters: 768 + repeat: 5 + kernel_size: [25] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B5 + - &Conv2 + filters: 896 + repeat: 1 + kernel_size: [29] + stride: [1] + dilation: [2] + dropout: 0.4 + residual: false + - &Conv3 + filters: &enc_feats 1024 + repeat: 1 + kernel_size: [1] + stride: [1] + dilation: [1] + dropout: 0.4 + residual: false + + decoder: + in_feats: *enc_feats + init: xavier_uniform diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-online-discrete.yaml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-online-discrete.yaml new file mode 100644 index 00000000..c0c59e19 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-online-discrete.yaml @@ -0,0 +1,144 @@ +# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
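The `speedp-online-*` configs that follow add a `speed_perturbation` section and set `ignore_offline_speed_perturbation: true`, i.e. resampling happens on the fly instead of relying on pre-generated sox copies. The actual implementation lives in the data pipeline and is not part of this hunk; the sketch below is only a conceptual reading of the `discrete: true` setting, assuming rates are drawn from {min_rate, 1.0, max_rate}:

```python
import random
import librosa
import numpy as np

def discrete_speed_perturb(audio: np.ndarray, sample_rate: int,
                           min_rate: float = 0.9, max_rate: float = 1.1) -> np.ndarray:
    """Conceptual sketch only: make the clip play `rate` times faster (pitch shifts too)."""
    rate = random.choice([min_rate, 1.0, max_rate])
    if rate == 1.0:
        return audio
    # Resampling to sample_rate / rate and playing back at sample_rate
    # shortens (rate > 1) or stretches (rate < 1) the utterance.
    return librosa.resample(audio, orig_sr=sample_rate, target_sr=int(sample_rate / rate))
```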
+ +name: "Jasper" +labels: [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", + "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"] + +input_val: + audio_dataset: &val_dataset + sample_rate: &sample_rate 16000 + trim_silence: true + normalize_transcripts: true + + filterbank_features: &val_features + normalize: per_feature + sample_rate: *sample_rate + window_size: 0.02 + window_stride: 0.01 + window: hann + n_filt: &n_filt 64 + n_fft: 512 + frame_splicing: &frame_splicing 1 + dither: 0.00001 + pad_align: 16 + +# For training we keep samples < 16.7s and apply augmentation +input_train: + audio_dataset: + <<: *val_dataset + max_duration: 16.7 + ignore_offline_speed_perturbation: true + + speed_perturbation: + discrete: true + min_rate: 0.9 + max_rate: 1.1 + + filterbank_features: + <<: *val_features + max_duration: 16.7 + + spec_augment: + freq_masks: 0 + max_freq: 20 + time_masks: 0 + max_time: 75 + +jasper: + encoder: + init: xavier_uniform + in_feats: *n_filt + frame_splicing: *frame_splicing + activation: relu + use_conv_masks: true + blocks: + - &Conv1 + filters: 256 + repeat: 1 + kernel_size: [11] + stride: [2] + dilation: [1] + dropout: 0.2 + residual: false + - &B1 + filters: 256 + repeat: 5 + kernel_size: [11] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B1 + - &B2 + filters: 384 + repeat: 5 + kernel_size: [13] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B2 + - &B3 + filters: 512 + repeat: 5 + kernel_size: [17] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B3 + - &B4 + filters: 640 + repeat: 5 + kernel_size: [21] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B4 + - &B5 + filters: 768 + repeat: 5 + kernel_size: [25] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B5 + - &Conv2 + filters: 896 + repeat: 1 + kernel_size: [29] + stride: [1] + dilation: [2] + dropout: 0.4 + residual: false + - &Conv3 + filters: &enc_feats 1024 + repeat: 1 + kernel_size: [1] + stride: [1] + dilation: [1] + dropout: 0.4 + residual: false + + decoder: + in_feats: *enc_feats + init: xavier_uniform diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-online-discrete_speca.yaml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-online-discrete_speca.yaml new file mode 100644 index 00000000..d2491b30 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-online-discrete_speca.yaml @@ -0,0 +1,144 @@ +# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
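The `spec_augment` values in the `*_speca` configs (two frequency masks of up to 20 bins, two time masks of up to 75 frames) map directly onto the `SpecAugment` module from `common/features.py`. A standalone sketch on random features:

```python
import torch
from common.features import SpecAugment

specaug = SpecAugment(freq_masks=2, max_freq=20, time_masks=2, max_time=75)

feats = torch.randn(4, 64, 500)              # (batch, n_filt, frames) log-mel features
masked = specaug(feats)                      # zeroes out random frequency/time stripes
print((masked == 0).float().mean().item())   # rough fraction of masked cells
```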
+ +name: "Jasper" +labels: [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", + "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"] + +input_val: + audio_dataset: &val_dataset + sample_rate: &sample_rate 16000 + trim_silence: true + normalize_transcripts: true + + filterbank_features: &val_features + normalize: per_feature + sample_rate: *sample_rate + window_size: 0.02 + window_stride: 0.01 + window: hann + n_filt: &n_filt 64 + n_fft: 512 + frame_splicing: &frame_splicing 1 + dither: 0.00001 + pad_align: 16 + +# For training we keep samples < 16.7s and apply augmentation +input_train: + audio_dataset: + <<: *val_dataset + max_duration: 16.7 + ignore_offline_speed_perturbation: true + + speed_perturbation: + discrete: true + min_rate: 0.9 + max_rate: 1.1 + + filterbank_features: + <<: *val_features + max_duration: 16.7 + + spec_augment: + freq_masks: 2 + max_freq: 20 + time_masks: 2 + max_time: 75 + +jasper: + encoder: + init: xavier_uniform + in_feats: *n_filt + frame_splicing: *frame_splicing + activation: relu + use_conv_masks: true + blocks: + - &Conv1 + filters: 256 + repeat: 1 + kernel_size: [11] + stride: [2] + dilation: [1] + dropout: 0.2 + residual: false + - &B1 + filters: 256 + repeat: 5 + kernel_size: [11] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B1 + - &B2 + filters: 384 + repeat: 5 + kernel_size: [13] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B2 + - &B3 + filters: 512 + repeat: 5 + kernel_size: [17] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B3 + - &B4 + filters: 640 + repeat: 5 + kernel_size: [21] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B4 + - &B5 + filters: 768 + repeat: 5 + kernel_size: [25] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B5 + - &Conv2 + filters: 896 + repeat: 1 + kernel_size: [29] + stride: [1] + dilation: [2] + dropout: 0.4 + residual: false + - &Conv3 + filters: &enc_feats 1024 + repeat: 1 + kernel_size: [1] + stride: [1] + dilation: [1] + dropout: 0.4 + residual: false + + decoder: + in_feats: *enc_feats + init: xavier_uniform diff --git a/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-online_speca.yaml b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-online_speca.yaml new file mode 100644 index 00000000..a165af7f --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/configs/jasper10x5dr_speedp-online_speca.yaml @@ -0,0 +1,144 @@ +# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
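# The speed_perturbation sections in these configs select between a discrete
# and a continuous resampling policy (the actual SpeedPerturbation class lives
# in common/audio.py). The helper below is only a guess at the semantics
# implied by the field names, included to make the two modes concrete:
# discrete perturbation picks one of a few fixed rates, continuous
# perturbation draws a rate uniformly from [min_rate, max_rate].

import random

def sample_speed_rate(discrete, min_rate, max_rate):
    if discrete:
        # classic 3-way speed perturbation: slowed, unchanged, sped up
        return random.choice([min_rate, 1.0, max_rate])
    return random.uniform(min_rate, max_rate)

print(sample_speed_rate(True, 0.9, 1.1))    # discrete policy of the *-discrete configs
print(sample_speed_rate(False, 0.85, 1.15)) # continuous policy of the *-online_speca config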
+ +name: "Jasper" +labels: [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", + "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "'"] + +input_val: + audio_dataset: &val_dataset + sample_rate: &sample_rate 16000 + trim_silence: true + normalize_transcripts: true + + filterbank_features: &val_features + normalize: per_feature + sample_rate: *sample_rate + window_size: 0.02 + window_stride: 0.01 + window: hann + n_filt: &n_filt 64 + n_fft: 512 + frame_splicing: &frame_splicing 1 + dither: 0.00001 + pad_align: 16 + +# For training we keep samples < 16.7s and apply augmentation +input_train: + audio_dataset: + <<: *val_dataset + max_duration: 16.7 + ignore_offline_speed_perturbation: true + + speed_perturbation: + discrete: false + min_rate: 0.85 + max_rate: 1.15 + + filterbank_features: + <<: *val_features + max_duration: 16.7 + + spec_augment: + freq_masks: 2 + max_freq: 20 + time_masks: 2 + max_time: 75 + +jasper: + encoder: + init: xavier_uniform + in_feats: *n_filt + frame_splicing: *frame_splicing + activation: relu + use_conv_masks: true + blocks: + - &Conv1 + filters: 256 + repeat: 1 + kernel_size: [11] + stride: [2] + dilation: [1] + dropout: 0.2 + residual: false + - &B1 + filters: 256 + repeat: 5 + kernel_size: [11] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B1 + - &B2 + filters: 384 + repeat: 5 + kernel_size: [13] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B2 + - &B3 + filters: 512 + repeat: 5 + kernel_size: [17] + stride: [1] + dilation: [1] + dropout: 0.2 + residual: true + residual_dense: true + - *B3 + - &B4 + filters: 640 + repeat: 5 + kernel_size: [21] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B4 + - &B5 + filters: 768 + repeat: 5 + kernel_size: [25] + stride: [1] + dilation: [1] + dropout: 0.3 + residual: true + residual_dense: true + - *B5 + - &Conv2 + filters: 896 + repeat: 1 + kernel_size: [29] + stride: [1] + dilation: [2] + dropout: 0.4 + residual: false + - &Conv3 + filters: &enc_feats 1024 + repeat: 1 + kernel_size: [1] + stride: [1] + dilation: [1] + dropout: 0.4 + residual: false + + decoder: + in_feats: *enc_feats + init: xavier_uniform diff --git a/PyTorch/SpeechRecognition/Jasper/dataset.py b/PyTorch/SpeechRecognition/Jasper/dataset.py deleted file mode 100644 index ad88d2f0..00000000 --- a/PyTorch/SpeechRecognition/Jasper/dataset.py +++ /dev/null @@ -1,266 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
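# The training configs added above share their validation-time audio_dataset
# and filterbank_features sections through YAML anchors and differ mainly in
# speed_perturbation and spec_augment. A minimal sketch of reading one of them
# follows; the dump-and-reload step mirrors jasper/config.py (added later in
# this patch), which uses it to turn anchored sections into independent copies.

import yaml

with open('configs/jasper10x5dr_speedp-online-discrete.yaml') as f:
    cfg = yaml.safe_load(f)

yaml.Dumper.ignore_aliases = lambda *args: True
cfg = yaml.safe_load(yaml.dump(cfg))    # deep-copy sections created by anchors

print(cfg['input_train']['audio_dataset']['max_duration'])   # 16.7
print(cfg['input_train']['spec_augment'])                     # mask settings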
- -""" -This file contains classes and functions related to data loading -""" -import torch -import numpy as np -import math -from torch.utils.data import Dataset, Sampler -import torch.distributed as dist -from parts.manifest import Manifest -from parts.features import WaveformFeaturizer - -class DistributedBucketBatchSampler(Sampler): - def __init__(self, dataset, batch_size, num_replicas=None, rank=None): - """Distributed sampler that buckets samples with similar length to minimize padding, - similar concept as pytorch BucketBatchSampler https://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.samplers.html#torchnlp.samplers.BucketBatchSampler - - Args: - dataset: Dataset used for sampling. - batch_size: data batch size - num_replicas (optional): Number of processes participating in - distributed training. - rank (optional): Rank of the current process within num_replicas. - """ - if num_replicas is None: - if not dist.is_available(): - raise RuntimeError("Requires distributed package to be available") - num_replicas = dist.get_world_size() - if rank is None: - if not dist.is_available(): - raise RuntimeError("Requires distributed package to be available") - rank = dist.get_rank() - self.dataset = dataset - self.dataset_size = len(dataset) - self.num_replicas = num_replicas - self.rank = rank - self.epoch = 0 - self.batch_size = batch_size - self.tile_size = batch_size * self.num_replicas - self.num_buckets = 6 - self.bucket_size = self.round_up_to(math.ceil(self.dataset_size / self.num_buckets), self.tile_size) - self.index_count = self.round_up_to(self.dataset_size, self.tile_size) - self.num_samples = self.index_count // self.num_replicas - - def round_up_to(self, x, mod): - return (x + mod - 1) // mod * mod - - def __iter__(self): - g = torch.Generator() - g.manual_seed(self.epoch) - indices = np.arange(self.index_count) % self.dataset_size - for bucket in range(self.num_buckets): - bucket_start = self.bucket_size * bucket - bucket_end = min(bucket_start + self.bucket_size, self.index_count) - indices[bucket_start:bucket_end] = indices[bucket_start:bucket_end][torch.randperm(bucket_end - bucket_start, generator=g)] - - tile_indices = torch.randperm(self.index_count // self.tile_size, generator=g) - for tile_index in tile_indices: - start_index = self.tile_size * tile_index + self.batch_size * self.rank - end_index = start_index + self.batch_size - yield indices[start_index:end_index] - - def __len__(self): - return self.num_samples - - def set_epoch(self, epoch): - self.epoch = epoch - -class data_prefetcher(): - def __init__(self, loader): - self.loader = iter(loader) - self.stream = torch.cuda.Stream() - self.preload() - - def preload(self): - try: - self.next_input = next(self.loader) - except StopIteration: - self.next_input = None - return - with torch.cuda.stream(self.stream): - self.next_input = [ x.cuda(non_blocking=True) for x in self.next_input] - - def __next__(self): - torch.cuda.current_stream().wait_stream(self.stream) - input = self.next_input - self.preload() - return input - def next(self): - return self.__next__() - def __iter__(self): - return self - -def seq_collate_fn(batch): - """batches samples and returns as tensors - Args: - batch : list of samples - Returns - batches of tensors - """ - batch_size = len(batch) - def _find_max_len(lst, ind): - max_len = -1 - for item in lst: - if item[ind].size(0) > max_len: - max_len = item[ind].size(0) - return max_len - max_audio_len = _find_max_len(batch, 0) - max_transcript_len = _find_max_len(batch, 2) - - 
batched_audio_signal = torch.zeros(batch_size, max_audio_len) - batched_transcript = torch.zeros(batch_size, max_transcript_len) - audio_lengths = [] - transcript_lengths = [] - for ind, sample in enumerate(batch): - batched_audio_signal[ind].narrow(0, 0, sample[0].size(0)).copy_(sample[0]) - audio_lengths.append(sample[1]) - batched_transcript[ind].narrow(0, 0, sample[2].size(0)).copy_(sample[2]) - transcript_lengths.append(sample[3]) - return batched_audio_signal, torch.stack(audio_lengths), batched_transcript, \ - torch.stack(transcript_lengths) - -class AudioToTextDataLayer: - """Data layer with data loader - """ - def __init__(self, **kwargs): - self._device = torch.device("cuda") - - featurizer_config = kwargs['featurizer_config'] - pad_to_max = kwargs.get('pad_to_max', False) - perturb_config = kwargs.get('perturb_config', None) - manifest_filepath = kwargs['manifest_filepath'] - dataset_dir = kwargs['dataset_dir'] - labels = kwargs['labels'] - batch_size = kwargs['batch_size'] - drop_last = kwargs.get('drop_last', False) - shuffle = kwargs.get('shuffle', True) - min_duration = featurizer_config.get('min_duration', 0.1) - max_duration = featurizer_config.get('max_duration', None) - normalize_transcripts = kwargs.get('normalize_transcripts', True) - trim_silence = kwargs.get('trim_silence', False) - multi_gpu = kwargs.get('multi_gpu', False) - sampler_type = kwargs.get('sampler', 'default') - speed_perturbation = featurizer_config.get('speed_perturbation', False) - sort_by_duration=sampler_type == 'bucket' - self._featurizer = WaveformFeaturizer.from_config(featurizer_config, perturbation_configs=perturb_config) - self._dataset = AudioDataset( - dataset_dir=dataset_dir, - manifest_filepath=manifest_filepath, - labels=labels, blank_index=len(labels), - sort_by_duration=sort_by_duration, - pad_to_max=pad_to_max, - featurizer=self._featurizer, max_duration=max_duration, - min_duration=min_duration, normalize=normalize_transcripts, - trim=trim_silence, speed_perturbation=speed_perturbation) - - print('sort_by_duration', sort_by_duration) - - if not multi_gpu: - self.sampler = None - self._dataloader = torch.utils.data.DataLoader( - dataset=self._dataset, - batch_size=batch_size, - collate_fn=lambda b: seq_collate_fn(b), - drop_last=drop_last, - shuffle=shuffle if self.sampler is None else False, - num_workers=4, - pin_memory=True, - sampler=self.sampler - ) - elif sampler_type == 'bucket': - self.sampler = DistributedBucketBatchSampler(self._dataset, batch_size=batch_size) - print("DDBucketSampler") - self._dataloader = torch.utils.data.DataLoader( - dataset=self._dataset, - collate_fn=lambda b: seq_collate_fn(b), - num_workers=4, - pin_memory=True, - batch_sampler=self.sampler - ) - elif sampler_type == 'default': - self.sampler = torch.utils.data.distributed.DistributedSampler(self._dataset) - print("DDSampler") - self._dataloader = torch.utils.data.DataLoader( - dataset=self._dataset, - batch_size=batch_size, - collate_fn=lambda b: seq_collate_fn(b), - drop_last=drop_last, - shuffle=shuffle if self.sampler is None else False, - num_workers=4, - pin_memory=True, - sampler=self.sampler - ) - else: - raise RuntimeError("Sampler {} not supported".format(sampler_type)) - - def __len__(self): - return len(self._dataset) - - @property - def data_iterator(self): - return self._dataloader - -class AudioDataset(Dataset): - def __init__(self, dataset_dir, manifest_filepath, labels, featurizer, max_duration=None, pad_to_max=False, - min_duration=None, blank_index=0, max_utts=0, normalize=True, 
sort_by_duration=False, - trim=False, speed_perturbation=False): - """Dataset that loads tensors via a json file containing paths to audio files, transcripts, and durations - (in seconds). Each entry is a different audio sample. - Args: - dataset_dir: absolute path to dataset folder - manifest_filepath: relative path from dataset folder to manifest json as described above. Can be coma-separated paths. - labels: String containing all the possible characters to map to - featurizer: Initialized featurizer class that converts paths of audio to feature tensors - max_duration: If audio exceeds this length, do not include in dataset - min_duration: If audio is less than this length, do not include in dataset - pad_to_max: if specified input sequences into dnn model will be padded to max_duration - blank_index: blank index for ctc loss / decoder - max_utts: Limit number of utterances - normalize: whether to normalize transcript text - sort_by_duration: whether or not to sort sequences by increasing duration - trim: if specified trims leading and trailing silence from an audio signal. - speed_perturbation: specify if using data contains speed perburbation - """ - m_paths = manifest_filepath.split(',') - self.manifest = Manifest(dataset_dir, m_paths, labels, blank_index, pad_to_max=pad_to_max, - max_duration=max_duration, - sort_by_duration=sort_by_duration, - min_duration=min_duration, max_utts=max_utts, - normalize=normalize, speed_perturbation=speed_perturbation) - self.featurizer = featurizer - self.blank_index = blank_index - self.trim = trim - print( - "Dataset loaded with {0:.2f} hours. Filtered {1:.2f} hours.".format( - self.manifest.duration / 3600, - self.manifest.filtered_duration / 3600)) - - def __getitem__(self, index): - sample = self.manifest[index] - rn_indx = np.random.randint(len(sample['audio_filepath'])) - duration = sample['audio_duration'][rn_indx] if 'audio_duration' in sample else 0 - offset = sample['offset'] if 'offset' in sample else 0 - features = self.featurizer.process(sample['audio_filepath'][rn_indx], - offset=offset, duration=duration, - trim=self.trim) - - return features, torch.tensor(features.shape[0]).int(), \ - torch.tensor(sample["transcript"]), torch.tensor( - len(sample["transcript"])).int() - - def __len__(self): - return len(self.manifest) diff --git a/PyTorch/SpeechRecognition/Jasper/external/Dockerfile.client.patched b/PyTorch/SpeechRecognition/Jasper/external/Dockerfile.client.patched deleted file mode 100644 index 0ac60c5c..00000000 --- a/PyTorch/SpeechRecognition/Jasper/external/Dockerfile.client.patched +++ /dev/null @@ -1,95 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Redistribution and use in source and binary forms, with or without -# modification, are permitted provided that the following conditions -# are met: -# * Redistributions of source code must retain the above copyright -# notice, this list of conditions and the following disclaimer. -# * Redistributions in binary form must reproduce the above copyright -# notice, this list of conditions and the following disclaimer in the -# documentation and/or other materials provided with the distribution. -# * Neither the name of NVIDIA CORPORATION nor the names of its -# contributors may be used to endorse or promote products derived -# from this software without specific prior written permission. 
-# -# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY -# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR -# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR -# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, -# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, -# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR -# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY -# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - -# Default setting is building on nvidia/cuda:10.1-devel-ubuntu18.04 -ARG BASE_IMAGE=nvidia/cuda:10.1-devel-ubuntu18.04 - -FROM ${BASE_IMAGE} - -# Default to use Python3. Allowed values are "2" and "3". -ARG PYVER=3 - -# Ensure apt-get won't prompt for selecting options -ENV DEBIAN_FRONTEND=noninteractive -ENV PYVER=$PYVER - -RUN PYSFX=`[ "$PYVER" != "2" ] && echo "$PYVER" || echo ""` && \ - apt-get update && \ - apt-get install -y --no-install-recommends \ - software-properties-common \ - autoconf \ - automake \ - build-essential \ - cmake \ - curl \ - git \ - libopencv-dev \ - libopencv-core-dev \ - libssl-dev \ - libtool \ - pkg-config \ - python${PYSFX} \ - python${PYSFX}-pip \ - python${PYSFX}-dev && \ - pip${PYSFX} install --upgrade setuptools wheel - -RUN PYSFX=`[ "$PYVER" != "2" ] && echo "$PYVER" || echo ""` && \ - pip${PYSFX} install --upgrade grpcio-tools - -# Build expects "python" executable (not python3). -RUN rm -f /usr/bin/python && \ - ln -s /usr/bin/python$PYVER /usr/bin/python - -# Build the client library and examples -WORKDIR /workspace -COPY VERSION . -COPY build build -COPY src/clients src/clients -COPY src/core src/core - -RUN cd build && \ - cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX:PATH=/workspace/install && \ - make -j16 trtis-clients -RUN cd install && \ - export VERSION=`cat /workspace/VERSION` && \ - tar zcf /workspace/v$VERSION.clients.tar.gz * - -# For CI testing need to install a test script. -COPY qa/L0_client_tar/test.sh /tmp/test.sh - -# Install an image needed by the quickstart and other documentation. -COPY qa/images/mug.jpg images/mug.jpg - -# Install the dependencies needed to run the client examples. These -# are not needed for building but including them allows this image to -# be used to run the client examples. The special upgrade and handling -# of pip is needed to get numpy to install correctly with python2 on -# ubuntu 16.04. 
-RUN python -m pip install --user --upgrade pip && \ - python -m pip install --upgrade install/python/tensorrtserver-*.whl numpy pillow - -ENV PATH //workspace/install/bin:${PATH} -ENV LD_LIBRARY_PATH /workspace/install/lib:${LD_LIBRARY_PATH} diff --git a/PyTorch/SpeechRecognition/Jasper/external/triton-inference-server b/PyTorch/SpeechRecognition/Jasper/external/triton-inference-server deleted file mode 160000 index a1f3860b..00000000 --- a/PyTorch/SpeechRecognition/Jasper/external/triton-inference-server +++ /dev/null @@ -1 +0,0 @@ -Subproject commit a1f3860ba65c0fd8f2be3adfcab2673efd039348 diff --git a/PyTorch/SpeechRecognition/Jasper/helpers.py b/PyTorch/SpeechRecognition/Jasper/helpers.py deleted file mode 100644 index 23606453..00000000 --- a/PyTorch/SpeechRecognition/Jasper/helpers.py +++ /dev/null @@ -1,207 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import torch -import torch.distributed as dist -from apex.parallel import DistributedDataParallel as DDP -from enum import Enum -from metrics import word_error_rate - - -def print_once(msg): - if (not torch.distributed.is_initialized() or (torch.distributed.is_initialized() and torch.distributed.get_rank() == 0)): - print(msg) - -def add_ctc_labels(labels): - if not isinstance(labels, list): - raise ValueError("labels must be a list of symbols") - labels.append("") - return labels - -def __ctc_decoder_predictions_tensor(tensor, labels): - """ - Takes output of greedy ctc decoder and performs ctc decoding algorithm to - remove duplicates and special symbol. Returns prediction - Args: - tensor: model output tensor - label: A list of labels - Returns: - prediction - """ - blank_id = len(labels) - 1 - hypotheses = [] - labels_map = dict([(i, labels[i]) for i in range(len(labels))]) - prediction_cpu_tensor = tensor.long().cpu() - # iterate over batch - for ind in range(prediction_cpu_tensor.shape[0]): - prediction = prediction_cpu_tensor[ind].numpy().tolist() - # CTC decoding procedure - decoded_prediction = [] - previous = len(labels) - 1 # id of a blank symbol - for p in prediction: - if (p != previous or previous == blank_id) and p != blank_id: - decoded_prediction.append(p) - previous = p - hypothesis = ''.join([labels_map[c] for c in decoded_prediction]) - hypotheses.append(hypothesis) - return hypotheses - - -def monitor_asr_train_progress(tensors: list, labels: list): - """ - Takes output of greedy ctc decoder and performs ctc decoding algorithm to - remove duplicates and special symbol. 
Prints wer and prediction examples to screen - Args: - tensors: A list of 3 tensors (predictions, targets, target_lengths) - labels: A list of labels - - Returns: - word error rate - """ - references = [] - - labels_map = dict([(i, labels[i]) for i in range(len(labels))]) - with torch.no_grad(): - targets_cpu_tensor = tensors[1].long().cpu() - tgt_lenths_cpu_tensor = tensors[2].long().cpu() - - # iterate over batch - for ind in range(targets_cpu_tensor.shape[0]): - tgt_len = tgt_lenths_cpu_tensor[ind].item() - target = targets_cpu_tensor[ind][:tgt_len].numpy().tolist() - reference = ''.join([labels_map[c] for c in target]) - references.append(reference) - hypotheses = __ctc_decoder_predictions_tensor(tensors[0], labels=labels) - tag = "training_batch_WER" - wer, _, _ = word_error_rate(hypotheses, references) - print_once('{0}: {1}'.format(tag, wer)) - print_once('Prediction: {0}'.format(hypotheses[0])) - print_once('Reference: {0}'.format(references[0])) - return wer - - -def __gather_losses(losses_list: list) -> list: - return [torch.mean(torch.stack(losses_list))] - - -def __gather_predictions(predictions_list: list, labels: list) -> list: - results = [] - for prediction in predictions_list: - results += __ctc_decoder_predictions_tensor(prediction, labels=labels) - return results - - -def __gather_transcripts(transcript_list: list, transcript_len_list: list, - labels: list) -> list: - results = [] - labels_map = dict([(i, labels[i]) for i in range(len(labels))]) - # iterate over workers - for t, ln in zip(transcript_list, transcript_len_list): - # iterate over batch - t_lc = t.long().cpu() - ln_lc = ln.long().cpu() - for ind in range(t.shape[0]): - tgt_len = ln_lc[ind].item() - target = t_lc[ind][:tgt_len].numpy().tolist() - reference = ''.join([labels_map[c] for c in target]) - results.append(reference) - return results - - -def process_evaluation_batch(tensors: dict, global_vars: dict, labels: list): - """ - Processes results of an iteration and saves it in global_vars - Args: - tensors: dictionary with results of an evaluation iteration, e.g. 
loss, predictions, transcript, and output - global_vars: dictionary where processes results of iteration are saved - labels: A list of labels - """ - for kv, v in tensors.items(): - if kv.startswith('loss'): - global_vars['EvalLoss'] += __gather_losses(v) - elif kv.startswith('predictions'): - global_vars['predictions'] += __gather_predictions(v, labels=labels) - elif kv.startswith('transcript_length'): - transcript_len_list = v - elif kv.startswith('transcript'): - - transcript_list = v - elif kv.startswith('output'): - global_vars['logits'] += v - - global_vars['transcripts'] += __gather_transcripts(transcript_list, - transcript_len_list, - labels=labels) - - -def process_evaluation_epoch(global_vars: dict, tag=None): - """ - Processes results from each worker at the end of evaluation and combine to final result - Args: - global_vars: dictionary containing information of entire evaluation - Return: - wer: final word error rate - loss: final loss - """ - if 'EvalLoss' in global_vars: - eloss = torch.mean(torch.stack(global_vars['EvalLoss'])).item() - else: - eloss = None - hypotheses = global_vars['predictions'] - references = global_vars['transcripts'] - - wer, scores, num_words = word_error_rate(hypotheses=hypotheses, references=references) - multi_gpu = torch.distributed.is_initialized() - if multi_gpu: - if eloss is not None: - eloss /= torch.distributed.get_world_size() - eloss_tensor = torch.tensor(eloss).cuda() - dist.all_reduce(eloss_tensor) - eloss = eloss_tensor.item() - del eloss_tensor - - scores_tensor = torch.tensor(scores).cuda() - dist.all_reduce(scores_tensor) - scores = scores_tensor.item() - del scores_tensor - num_words_tensor = torch.tensor(num_words).cuda() - dist.all_reduce(num_words_tensor) - num_words = num_words_tensor.item() - del num_words_tensor - wer = scores *1.0/num_words - return wer, eloss - - - -def norm(x): - if not isinstance(x, list): - if not isinstance(x, tuple): - return x - return x[0] - - -def print_dict(d): - maxLen = max([len(ii) for ii in d.keys()]) - fmtString = '\t%' + str(maxLen) + 's : %s' - print('Arguments:') - for keyPair in sorted(d.items()): - print(fmtString % keyPair) - - - -def model_multi_gpu(model, multi_gpu=False): - if multi_gpu: - model = DDP(model) - print('DDP(model)') - return model diff --git a/PyTorch/SpeechRecognition/Jasper/images/static_fp16_16.7s.png b/PyTorch/SpeechRecognition/Jasper/images/static_fp16_16.7s.png new file mode 100644 index 00000000..8afdb6c3 Binary files /dev/null and b/PyTorch/SpeechRecognition/Jasper/images/static_fp16_16.7s.png differ diff --git a/PyTorch/SpeechRecognition/Jasper/images/static_fp16_2s.png b/PyTorch/SpeechRecognition/Jasper/images/static_fp16_2s.png new file mode 100644 index 00000000..cdd5a433 Binary files /dev/null and b/PyTorch/SpeechRecognition/Jasper/images/static_fp16_2s.png differ diff --git a/PyTorch/SpeechRecognition/Jasper/images/static_fp16_7s.png b/PyTorch/SpeechRecognition/Jasper/images/static_fp16_7s.png new file mode 100644 index 00000000..bcd14f50 Binary files /dev/null and b/PyTorch/SpeechRecognition/Jasper/images/static_fp16_7s.png differ diff --git a/PyTorch/SpeechRecognition/Jasper/images/tensorrt_16.7s.png b/PyTorch/SpeechRecognition/Jasper/images/tensorrt_16.7s.png new file mode 100644 index 00000000..a07b7387 Binary files /dev/null and b/PyTorch/SpeechRecognition/Jasper/images/tensorrt_16.7s.png differ diff --git a/PyTorch/SpeechRecognition/Jasper/images/tensorrt_2s.png b/PyTorch/SpeechRecognition/Jasper/images/tensorrt_2s.png new file mode 100644 index 
00000000..04df03ff Binary files /dev/null and b/PyTorch/SpeechRecognition/Jasper/images/tensorrt_2s.png differ diff --git a/PyTorch/SpeechRecognition/Jasper/images/tensorrt_7s.png b/PyTorch/SpeechRecognition/Jasper/images/tensorrt_7s.png new file mode 100644 index 00000000..a57b95f3 Binary files /dev/null and b/PyTorch/SpeechRecognition/Jasper/images/tensorrt_7s.png differ diff --git a/PyTorch/SpeechRecognition/Jasper/images/triton_dynamic_batching.png b/PyTorch/SpeechRecognition/Jasper/images/triton_dynamic_batching.png deleted file mode 100644 index 12116bf2..00000000 Binary files a/PyTorch/SpeechRecognition/Jasper/images/triton_dynamic_batching.png and /dev/null differ diff --git a/PyTorch/SpeechRecognition/Jasper/images/triton_static_batching_bs1.png b/PyTorch/SpeechRecognition/Jasper/images/triton_static_batching_bs1.png deleted file mode 100644 index 417a11ab..00000000 Binary files a/PyTorch/SpeechRecognition/Jasper/images/triton_static_batching_bs1.png and /dev/null differ diff --git a/PyTorch/SpeechRecognition/Jasper/images/triton_static_batching_bs8.png b/PyTorch/SpeechRecognition/Jasper/images/triton_static_batching_bs8.png deleted file mode 100644 index ad1cb14c..00000000 Binary files a/PyTorch/SpeechRecognition/Jasper/images/triton_static_batching_bs8.png and /dev/null differ diff --git a/PyTorch/SpeechRecognition/Jasper/images/triton_throughput_latency_summary.png b/PyTorch/SpeechRecognition/Jasper/images/triton_throughput_latency_summary.png deleted file mode 100644 index 295946cf..00000000 Binary files a/PyTorch/SpeechRecognition/Jasper/images/triton_throughput_latency_summary.png and /dev/null differ diff --git a/PyTorch/SpeechRecognition/Jasper/inference.py b/PyTorch/SpeechRecognition/Jasper/inference.py index f4d51a0b..92318985 100644 --- a/PyTorch/SpeechRecognition/Jasper/inference.py +++ b/PyTorch/SpeechRecognition/Jasper/inference.py @@ -13,328 +13,385 @@ # limitations under the License. 
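# The deleted helpers.py above documented the greedy CTC decoding rule used
# throughout this project: take the per-frame argmax, collapse repeated
# symbols, then drop the blank (by convention the index appended after the
# vocabulary). The rewritten inference.py below obtains the same behaviour
# from GreedyCTCDecoder and common.helpers; the function here is only a
# self-contained sketch of the collapse rule, with illustrative names.

def ctc_collapse(frame_ids, blank_id):
    out, prev = [], blank_id
    for p in frame_ids:
        if p != prev and p != blank_id:
            out.append(p)
        prev = p
    return out

labels = [' ', 'a', 'b', 'c']
blank_id = len(labels)                        # CTC blank appended after labels
ids = [1, 1, blank_id, 1, 2, 2, blank_id, 3]
print(''.join(labels[i] for i in ctc_collapse(ids, blank_id)))   # -> "aabc"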
import argparse -import itertools -from typing import List -from tqdm import tqdm import math -import toml -from dataset import AudioToTextDataLayer -from helpers import process_evaluation_batch, process_evaluation_epoch, add_ctc_labels, print_dict, model_multi_gpu, __ctc_decoder_predictions_tensor -from model import AudioPreprocessing, GreedyCTCDecoder, JasperEncoderDecoder -from parts.features import audio_from_file -import torch -import torch.nn as nn -import apex -from apex import amp -import random -import numpy as np -import pickle -import time import os +import random +import time +from heapq import nlargest +from itertools import chain, repeat +from pathlib import Path +from tqdm import tqdm -def parse_args(): +import dllogger +import torch +import numpy as np +import torch.distributed as distrib +from apex import amp +from apex.parallel import DistributedDataParallel +from dllogger import JSONStreamBackend, StdOutBackend, Verbosity + +from jasper import config +from common import helpers +from common.dali.data_loader import DaliDataLoader +from common.dataset import (AudioDataset, FilelistDataset, get_data_loader, + SingleAudioDataset) +from common.features import BaseFeatures, FilterbankFeatures +from common.helpers import print_once, process_evaluation_epoch +from jasper.model import GreedyCTCDecoder, Jasper +from common.tb_dllogger import stdout_metric_format, unique_log_fpath + + +def get_parser(): parser = argparse.ArgumentParser(description='Jasper') + parser.add_argument('--batch_size', default=16, type=int, + help='Data batch size') + parser.add_argument('--steps', default=0, type=int, + help='Eval this many steps for every worker') + parser.add_argument('--warmup_steps', default=0, type=int, + help='Burn-in period before measuring latencies') + parser.add_argument('--model_config', type=str, + help='Relative model config path given dataset folder') + parser.add_argument('--dataset_dir', type=str, + help='Absolute path to dataset folder') + parser.add_argument('--val_manifests', type=str, nargs='+', + help='Relative path to evaluation dataset manifest files') + parser.add_argument('--ckpt', default=None, type=str, + help='Path to model checkpoint') + parser.add_argument('--max_duration', default=None, type=float, + help='Filter out longer inputs (in seconds)') + parser.add_argument('--pad_to_max_duration', action='store_true', + help='Pads every batch to max_duration') + parser.add_argument('--amp', '--fp16', action='store_true', + help='Use FP16 precision') + parser.add_argument('--cudnn_benchmark', action='store_true', + help='Enable cudnn benchmark') + parser.add_argument('--cpu', action='store_true', + help='Run inference on CPU') + parser.add_argument("--seed", default=None, type=int, help='Random seed') + parser.add_argument('--local_rank', default=os.getenv('LOCAL_RANK', 0), + type=int, help='GPU id used for distributed training') - parser.register("type", "bool", lambda x: x.lower() in ("yes", "true", "t", "1")) - - parser.add_argument("--local_rank", default=None, type=int) - parser.add_argument("--batch_size", default=16, type=int, help='data batch size') - parser.add_argument("--steps", default=None, help='if not specified do evaluation on full dataset. 
otherwise only evaluates the specified number of iterations for each worker', type=int) - parser.add_argument("--model_toml", type=str, help='relative model configuration path given dataset folder') - parser.add_argument("--dataset_dir", type=str, help='absolute path to dataset folder') - parser.add_argument("--val_manifest", type=str, help='relative path to evaluation dataset manifest file') - parser.add_argument("--ckpt", default=None, type=str, required=True, help='path to model checkpoint') - parser.add_argument("--max_duration", default=None, type=float, help='maximum duration of sequences. if None uses attribute from model configuration file') - parser.add_argument("--pad_to", default=None, type=int, help="default is pad to value as specified in model configurations. if -1 pad to maximum duration. If > 0 pad batch to next multiple of value") - parser.add_argument("--amp", "--fp16", action='store_true', help='use half precision') - parser.add_argument("--cudnn_benchmark", action='store_true', help="enable cudnn benchmark") - parser.add_argument("--save_prediction", type=str, default=None, help="if specified saves predictions in text form at this location") - parser.add_argument("--logits_save_to", default=None, type=str, help="if specified will save logits to path") - parser.add_argument("--seed", default=42, type=int, help='seed') - parser.add_argument("--output_dir", default="results/", type=str, help="Output directory to store exported models. Only used if --export_model is used") - parser.add_argument("--export_model", action='store_true', help="Exports the audio_featurizer, encoder and decoder using torch.jit to the output_dir") - parser.add_argument("--wav", type=str, help='absolute path to .wav file (16KHz)') - parser.add_argument("--cpu", action="store_true", help="Run inference on CPU") - parser.add_argument("--ema", action="store_true", help="If available, load EMA model weights") - - # FIXME Unused, but passed by Triton helper scripts - parser.add_argument("--pyt_fp16", action='store_true', help='use half precision') - - return parser.parse_args() - -def calc_wer(data_layer, audio_processor, - encoderdecoder, greedy_decoder, - labels, args, device): - - encoderdecoder = encoderdecoder.module if hasattr(encoderdecoder, 'module') else encoderdecoder - with torch.no_grad(): - # reset global_var_dict - results of evaluation will be stored there - _global_var_dict = { - 'predictions': [], - 'transcripts': [], - 'logits' : [], - } - - # Evaluation mini-batch for loop - for it, data in enumerate(tqdm(data_layer.data_iterator)): - - tensors = [t.to(device) for t in data] - - t_audio_signal_e, t_a_sig_length_e, t_transcript_e, t_transcript_len_e = tensors - - t_processed_signal = audio_processor(t_audio_signal_e, t_a_sig_length_e) - t_log_probs_e, _ = encoderdecoder.infer(t_processed_signal) - t_predictions_e = greedy_decoder(t_log_probs_e) - - values_dict = dict( - predictions=[t_predictions_e], - transcript=[t_transcript_e], - transcript_length=[t_transcript_len_e], - output=[t_log_probs_e] - ) - # values_dict will contain results from all workers - process_evaluation_batch(values_dict, _global_var_dict, labels=labels) - - if args.steps is not None and it + 1 >= args.steps: - break - - # final aggregation (over minibatches) and logging of results - wer, _ = process_evaluation_epoch(_global_var_dict) - - return wer, _global_var_dict + io = parser.add_argument_group('feature and checkpointing setup') + io.add_argument('--dali_device', type=str, choices=['none', 'cpu', 'gpu'], + 
default='gpu', help='Use DALI pipeline for fast data processing') + io.add_argument('--save_predictions', type=str, default=None, + help='Save predictions in text form at this location') + io.add_argument('--save_logits', default=None, type=str, + help='Save output logits under specified path') + io.add_argument('--transcribe_wav', type=str, + help='Path to a single .wav file (16KHz)') + io.add_argument('--transcribe_filelist', type=str, + help='Path to a filelist with one .wav path per line') + io.add_argument('-o', '--output_dir', default='results/', + help='Output folder to save audio (file per phrase)') + io.add_argument('--log_file', type=str, default=None, + help='Path to a DLLogger log file') + io.add_argument('--ema', action='store_true', + help='Load averaged model weights') + io.add_argument('--torchscript', action='store_true', + help='Evaluate with a TorchScripted model') + io.add_argument('--torchscript_export', action='store_true', + help='Export the model with torch.jit to the output_dir') + return parser -def jit_export(audio, audio_len, audio_processor, encoderdecoder, greedy_decoder, args): +def durs_to_percentiles(durations, ratios): + durations = np.asarray(durations) * 1000 # in ms + latency = durations - print("##############") + latency = latency[5:] + mean_latency = np.mean(latency) - module_name = "{}_{}".format(os.path.basename(args.model_toml), "fp16" if args.amp else "fp32") - - if args.use_conv_mask: - module_name = module_name + "_noMaskConv" - - # Export just the featurizer - print("exporting featurizer ...") - traced_module_feat = torch.jit.script(audio_processor) - traced_module_feat.save(os.path.join(args.output_dir, module_name + "_feat.pt")) - - # Export just the acoustic model - print("exporting acoustic model ...") - inp_postFeat, _ = audio_processor(audio, audio_len) - traced_module_acoustic = torch.jit.trace(encoderdecoder, inp_postFeat) - traced_module_acoustic.save(os.path.join(args.output_dir, module_name + "_acoustic.pt")) - - # Export just the decoder - print("exporting decoder ...") - - inp_postAcoustic = encoderdecoder(inp_postFeat) - traced_module_decode = torch.jit.script(greedy_decoder, inp_postAcoustic) - traced_module_decode.save(os.path.join(args.output_dir, module_name + "_decoder.pt")) - print("JIT export complete") - - return traced_module_feat, traced_module_acoustic, traced_module_decode - -def run_once(audio_processor, encoderdecoder, greedy_decoder, audio, audio_len, labels, device): - features, lens = audio_processor(audio, audio_len) - if not device.type == 'cpu': - torch.cuda.synchronize() - t0 = time.perf_counter() - # TorchScripted model does not support (features, lengths) - if isinstance(encoderdecoder, torch.jit.TracedModule): - t_log_probs_e = encoderdecoder(features) - else: - t_log_probs_e, _ = encoderdecoder.infer((features, lens)) - if not device.type == 'cpu': - torch.cuda.synchronize() - t1 = time.perf_counter() - t_predictions_e = greedy_decoder(log_probs=t_log_probs_e) - hypotheses = __ctc_decoder_predictions_tensor(t_predictions_e, labels=labels) - print("INFERENCE TIME\t\t: {} ms".format((t1-t0)*1000.0)) - print("TRANSCRIPT\t\t:", hypotheses[0]) + latency_worst = nlargest(math.ceil((1 - min(ratios)) * len(latency)), latency) + latency_ranges = get_percentile(ratios, latency_worst, len(latency)) + latency_ranges[0.5] = mean_latency + return latency_ranges -def eval( - data_layer, - audio_processor, - encoderdecoder, - greedy_decoder, - labels, - multi_gpu, - device, - args): - """performs inference / evaluation - 
Args: - data_layer: data layer object that holds data loader - audio_processor: data processing module - encoderdecoder: acoustic model - greedy_decoder: greedy decoder - labels: list of labels as output vocabulary - multi_gpu: true if using multiple gpus - args: script input arguments - """ - logits_save_to=args.logits_save_to - - with torch.no_grad(): - if args.wav: - audio, audio_len = audio_from_file(args.wav) - run_once(audio_processor, encoderdecoder, greedy_decoder, audio, audio_len, labels, device) - if args.export_model: - jit_audio_processor, jit_encoderdecoder, jit_greedy_decoder = jit_export(audio, audio_len, audio_processor, encoderdecoder,greedy_decoder,args) - run_once(jit_audio_processor, jit_encoderdecoder, jit_greedy_decoder, audio, audio_len, labels, device) - return - wer, _global_var_dict = calc_wer(data_layer, audio_processor, encoderdecoder, greedy_decoder, labels, args, device) - if (not multi_gpu or (multi_gpu and torch.distributed.get_rank() == 0)): - print("==========>>>>>>Evaluation WER: {0}\n".format(wer)) - - if args.save_prediction is not None: - with open(args.save_prediction, 'w') as fp: - fp.write('\n'.join(_global_var_dict['predictions'])) - if logits_save_to is not None: - logits = [] - for batch in _global_var_dict["logits"]: - for i in range(batch.shape[0]): - logits.append(batch[i].cpu().numpy()) - with open(logits_save_to, 'wb') as f: - pickle.dump(logits, f, protocol=pickle.HIGHEST_PROTOCOL) - - # if args.export_model: - # feat, acoustic, decoder = jit_export(inp, audio_processor, encoderdecoder, greedy_decoder,args) - # wer_after = calc_wer(data_layer, feat, acoustic, decoder, labels, args) - # print("===>>>Before WER: {0}".format(wer)) - # print("===>>>Traced WER: {0}".format(wer_after)) - # print("===>>>Diff : {0} %".format((wer_after - wer_before) * 100.0 / wer_before)) - # print("") +def get_percentile(ratios, arr, nsamples): + res = {} + for a in ratios: + idx = max(int(nsamples * (1 - a)), 0) + res[a] = arr[idx] + return res -def main(args): - random.seed(args.seed) - np.random.seed(args.seed) - torch.manual_seed(args.seed) +def torchscript_export(data_loader, audio_processor, model, greedy_decoder, + output_dir, use_amp, use_conv_masks, model_config, device, + save): - multi_gpu = args.local_rank is not None + audio_processor.to(device) + + for batch in data_loader: + batch = [t.to(device, non_blocking=True) for t in batch] + audio, audio_len, _, _ = batch + feats, feat_lens = audio_processor(audio, audio_len) + break + + print("\nExporting featurizer...") + print("\nNOTE: Dithering causes warnings about non-determinism.\n") + ts_feat = torch.jit.trace(audio_processor, (audio, audio_len)) + + print("\nExporting acoustic model...") + model(feats, feat_lens) + ts_acoustic = torch.jit.trace(model, (feats, feat_lens)) + + print("\nExporting decoder...") + log_probs = model(feats, feat_lens) + ts_decoder = torch.jit.script(greedy_decoder, log_probs) + print("\nJIT export complete.") + + if save: + precision = "fp16" if use_amp else "fp32" + module_name = f'{os.path.basename(model_config)}_{precision}' + ts_feat.save(os.path.join(output_dir, module_name + "_feat.pt")) + ts_acoustic.save(os.path.join(output_dir, module_name + "_acoustic.pt")) + ts_decoder.save(os.path.join(output_dir, module_name + "_decoder.pt")) + + return ts_feat, ts_acoustic, ts_decoder + + +def main(): + + parser = get_parser() + args = parser.parse_args() + + log_fpath = args.log_file or str(Path(args.output_dir, 'nvlog_infer.json')) + log_fpath = unique_log_fpath(log_fpath) 
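    # Worked example of the latency helpers defined above (illustrative numbers):
    #
    #     durs = [0.100] * 5 + [0.010 + 0.001 * i for i in range(20)]   # seconds
    #     lat  = durs_to_percentiles(durs, ratios=[0.9, 0.95, 0.99])
    #
    # durs_to_percentiles converts the measurements to milliseconds, drops the
    # first five as warm-up, keeps only the ceil((1 - 0.9) * 20) = 2 largest
    # latencies, and lets get_percentile index into that worst-case tail; the
    # 0.5 entry is then overwritten with the mean, so lat maps each ratio to an
    # approximate tail latency in milliseconds.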
+ dllogger.init(backends=[JSONStreamBackend(Verbosity.DEFAULT, log_fpath), + StdOutBackend(Verbosity.VERBOSE, + metric_format=stdout_metric_format)]) + + [dllogger.log("PARAMETER", {k: v}) for k, v in vars(args).items()] + + for step in ['DNN', 'data+DNN', 'data']: + for c in [0.99, 0.95, 0.9, 0.5]: + cs = 'avg' if c == 0.5 else f'{int(100*c)}%' + dllogger.metadata(f'{step.lower()}_latency_{c}', + {'name': f'{step} latency {cs}', + 'format': ':>7.2f', 'unit': 'ms'}) + dllogger.metadata( + 'eval_wer', {'name': 'WER', 'format': ':>3.2f', 'unit': '%'}) if args.cpu: - assert(not multi_gpu) device = torch.device('cpu') else: - assert(torch.cuda.is_available()) + assert torch.cuda.is_available() device = torch.device('cuda') torch.backends.cudnn.benchmark = args.cudnn_benchmark - print("CUDNN BENCHMARK ", args.cudnn_benchmark) - if multi_gpu: - print("DISTRIBUTED with ", torch.distributed.get_world_size()) - torch.cuda.set_device(args.local_rank) - torch.distributed.init_process_group(backend='nccl', init_method='env://') + if args.seed is not None: + torch.manual_seed(args.seed + args.local_rank) + np.random.seed(args.seed + args.local_rank) + random.seed(args.seed + args.local_rank) - optim_level = 3 if args.amp else 0 + # set up distributed training + multi_gpu = not args.cpu and int(os.environ.get('WORLD_SIZE', 1)) > 1 + if multi_gpu: + torch.cuda.set_device(args.local_rank) + distrib.init_process_group(backend='nccl', init_method='env://') + print_once(f'Inference with {distrib.get_world_size()} GPUs') - jasper_model_definition = toml.load(args.model_toml) - dataset_vocab = jasper_model_definition['labels']['labels'] - ctc_vocab = add_ctc_labels(dataset_vocab) - - val_manifest = args.val_manifest - featurizer_config = jasper_model_definition['input_eval'] - featurizer_config["optimization_level"] = optim_level - featurizer_config["fp16"] = args.amp - - args.use_conv_mask = jasper_model_definition['encoder'].get('convmask', True) - if args.use_conv_mask and args.export_model: - print('WARNING: Masked convs currently not supported for TorchScript. 
Disabling.') - jasper_model_definition['encoder']['convmask'] = False + cfg = config.load(args.model_config) if args.max_duration is not None: - featurizer_config['max_duration'] = args.max_duration - if args.pad_to is not None: - featurizer_config['pad_to'] = args.pad_to + cfg['input_val']['audio_dataset']['max_duration'] = args.max_duration + cfg['input_val']['filterbank_features']['max_duration'] = args.max_duration - if featurizer_config['pad_to'] == "max": - featurizer_config['pad_to'] = -1 + if args.pad_to_max_duration: + assert cfg['input_val']['audio_dataset']['max_duration'] > 0 + cfg['input_val']['audio_dataset']['pad_to_max_duration'] = True + cfg['input_val']['filterbank_features']['pad_to_max_duration'] = True - print('=== model_config ===') - print_dict(jasper_model_definition) - print() - print('=== feature_config ===') - print_dict(featurizer_config) - print() - data_layer = None + symbols = helpers.add_ctc_blank(cfg['labels']) - if args.wav is None: - data_layer = AudioToTextDataLayer( - dataset_dir=args.dataset_dir, - featurizer_config=featurizer_config, - manifest_filepath=val_manifest, - labels=dataset_vocab, + use_dali = args.dali_device in ('cpu', 'gpu') + dataset_kw, features_kw = config.input(cfg, 'val') + + measure_perf = args.steps > 0 + + # dataset + if args.transcribe_wav or args.transcribe_filelist: + assert not use_dali, "DALI is not supported for a single audio" + assert not args.transcribe_filelist + assert not args.pad_to_max_duration + assert not (args.transcribe_wav and args.transcribe_filelist) + + if args.transcribe_wav: + dataset = SingleAudioDataset(args.transcribe_wav) + else: + dataset = FilelistDataset(args.transcribe_filelist) + + data_loader = get_data_loader(dataset, + batch_size=1, + multi_gpu=multi_gpu, + shuffle=False, + num_workers=0, + drop_last=(True if measure_perf else False)) + + _, features_kw = config.input(cfg, 'val') + feat_proc = FilterbankFeatures(**features_kw) + + elif use_dali: + # pad_to_max_duration is not supported by DALI - have simple padders + if features_kw['pad_to_max_duration']: + feat_proc = BaseFeatures( + pad_align=features_kw['pad_align'], + pad_to_max_duration=True, + max_duration=features_kw['max_duration'], + sample_rate=features_kw['sample_rate'], + window_size=features_kw['window_size'], + window_stride=features_kw['window_stride']) + features_kw['pad_to_max_duration'] = False + else: + feat_proc = None + + data_loader = DaliDataLoader( + gpu_id=args.local_rank or 0, + dataset_path=args.dataset_dir, + config_data=dataset_kw, + config_features=features_kw, + json_names=args.val_manifests, batch_size=args.batch_size, - pad_to_max=featurizer_config['pad_to'] == -1, - shuffle=False, - multi_gpu=multi_gpu) - audio_preprocessor = AudioPreprocessing(**featurizer_config) - encoderdecoder = JasperEncoderDecoder(jasper_model_definition=jasper_model_definition, feat_in=1024, num_classes=len(ctc_vocab)) + pipeline_type=("train" if measure_perf else "val"), # no drop_last + device_type=args.dali_device, + symbols=symbols) + + else: + dataset = AudioDataset(args.dataset_dir, + args.val_manifests, + symbols, + **dataset_kw) + + data_loader = get_data_loader(dataset, + args.batch_size, + multi_gpu=multi_gpu, + shuffle=False, + num_workers=4, + drop_last=False) + + feat_proc = FilterbankFeatures(**features_kw) + + model = Jasper(encoder_kw=config.encoder(cfg), + decoder_kw=config.decoder(cfg, n_classes=len(symbols))) if args.ckpt is not None: - print("loading model from ", args.ckpt) + print(f'Loading the model from 
{args.ckpt} ...') + checkpoint = torch.load(args.ckpt, map_location="cpu") + key = 'ema_state_dict' if args.ema else 'state_dict' + state_dict = helpers.convert_v1_state_dict(checkpoint[key]) + model.load_state_dict(state_dict, strict=True) - if os.path.isdir(args.ckpt): - exit(0) - else: - checkpoint = torch.load(args.ckpt, map_location="cpu") - if args.ema and 'ema_state_dict' in checkpoint: - print('Loading EMA state dict') - sd = 'ema_state_dict' - else: - sd = 'state_dict' + model.to(device) + model.eval() - for k in audio_preprocessor.state_dict().keys(): - checkpoint[sd][k] = checkpoint[sd].pop("audio_preprocessor." + k) - audio_preprocessor.load_state_dict(checkpoint[sd], strict=False) - encoderdecoder.load_state_dict(checkpoint[sd], strict=False) - - greedy_decoder = GreedyCTCDecoder() - - # print("Number of parameters in encoder: {0}".format(model.jasper_encoder.num_weights())) - if args.wav is None: - N = len(data_layer) - step_per_epoch = math.ceil(N / (args.batch_size * (1 if not torch.distributed.is_initialized() else torch.distributed.get_world_size()))) - - if args.steps is not None: - print('-----------------') - print('Have {0} examples to eval on.'.format(args.steps * args.batch_size * (1 if not torch.distributed.is_initialized() else torch.distributed.get_world_size()))) - print('Have {0} steps / (gpu * epoch).'.format(args.steps)) - print('-----------------') - else: - print('-----------------') - print('Have {0} examples to eval on.'.format(N)) - print('Have {0} steps / (gpu * epoch).'.format(step_per_epoch)) - print('-----------------') - - print ("audio_preprocessor.normalize: ", audio_preprocessor.featurizer.normalize) - - audio_preprocessor.to(device) - encoderdecoder.to(device) + if feat_proc is not None: + feat_proc.to(device) + feat_proc.eval() if args.amp: - encoderdecoder = amp.initialize(models=encoderdecoder, - opt_level='O'+str(optim_level)) + model = model.half() - encoderdecoder = model_multi_gpu(encoderdecoder, multi_gpu) - audio_preprocessor.eval() - encoderdecoder.eval() - greedy_decoder.eval() + if args.torchscript: + greedy_decoder = GreedyCTCDecoder() - eval( - data_layer=data_layer, - audio_processor=audio_preprocessor, - encoderdecoder=encoderdecoder, - greedy_decoder=greedy_decoder, - labels=ctc_vocab, - args=args, - device=device, - multi_gpu=multi_gpu) + feat_proc, model, greedy_decoder = torchscript_export( + data_loader, feat_proc, model, greedy_decoder, args.output_dir, + use_amp=args.amp, use_conv_masks=True, model_toml=args.model_toml, + device=device, save=args.torchscript_export) -if __name__=="__main__": - args = parse_args() + if multi_gpu: + model = DistributedDataParallel(model) - print_dict(vars(args)) + agg = {'txts': [], 'preds': [], 'logits': []} + dur = {'data': [], 'dnn': [], 'data+dnn': []} - main(args) + looped_loader = chain.from_iterable(repeat(data_loader)) + greedy_decoder = GreedyCTCDecoder() + + sync = lambda: torch.cuda.synchronize() if device.type == 'cuda' else None + + steps = args.steps + args.warmup_steps or len(data_loader) + with torch.no_grad(): + + for it, batch in enumerate(tqdm(looped_loader, initial=1, total=steps)): + + if use_dali: + feats, feat_lens, txt, txt_lens = batch + if feat_proc is not None: + feats, feat_lens = feat_proc(feats, feat_lens) + else: + batch = [t.cuda(non_blocking=True) for t in batch] + audio, audio_lens, txt, txt_lens = batch + feats, feat_lens = feat_proc(audio, audio_lens) + + sync() + t1 = time.perf_counter() + + if args.amp: + feats = feats.half() + + if 
model.encoder.use_conv_masks: + log_probs, log_prob_lens = model(feats, feat_lens) + else: + log_probs = model(feats, feat_lens) + + preds = greedy_decoder(log_probs) + + sync() + t2 = time.perf_counter() + + # burn-in period; wait for a new loader due to num_workers + if it >= 1 and (args.steps == 0 or it >= args.warmup_steps): + dur['data'].append(t1 - t0) + dur['dnn'].append(t2 - t1) + dur['data+dnn'].append(t2 - t0) + + if txt is not None: + agg['txts'] += helpers.gather_transcripts([txt], [txt_lens], + symbols) + agg['preds'] += helpers.gather_predictions([preds], symbols) + agg['logits'].append(log_probs) + + if it + 1 == steps: + break + + sync() + t0 = time.perf_counter() + + # communicate the results + if args.transcribe_wav: + for idx, p in enumerate(agg['preds']): + print_once(f'Prediction {idx+1: >3}: {p}') + + elif args.transcribe_filelist: + pass + + elif not multi_gpu or distrib.get_rank() == 0: + wer, _ = process_evaluation_epoch(agg) + + dllogger.log(step=(), data={'eval_wer': 100 * wer}) + + if args.save_predictions: + with open(args.save_predictions, 'w') as f: + f.write('\n'.join(agg['preds'])) + + if args.save_logits: + logits = torch.cat(agg['logits'], dim=0).cpu() + torch.save(logits, args.save_logits) + + # report timings + if len(dur['data']) >= 20: + ratios = [0.9, 0.95, 0.99] + for stage in dur: + lat = durs_to_percentiles(dur[stage], ratios) + for k in [0.99, 0.95, 0.9, 0.5]: + kk = str(k).replace('.', '_') + dllogger.log(step=(), data={f'{stage.lower()}_latency_{kk}': lat[k]}) + + else: + print_once('Not enough samples to measure latencies.') + + +if __name__ == "__main__": + main() diff --git a/PyTorch/SpeechRecognition/Jasper/inference_benchmark.py b/PyTorch/SpeechRecognition/Jasper/inference_benchmark.py deleted file mode 100644 index c0be36cd..00000000 --- a/PyTorch/SpeechRecognition/Jasper/inference_benchmark.py +++ /dev/null @@ -1,301 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import argparse -import itertools -import os -import sys -import time -import random -import numpy as np -from heapq import nlargest -import math -from tqdm import tqdm -import toml -import torch -from apex import amp -from dataset import AudioToTextDataLayer -from helpers import process_evaluation_batch, process_evaluation_epoch, add_ctc_labels, print_dict -from model import AudioPreprocessing, GreedyCTCDecoder, JasperEncoderDecoder -from parts.features import audio_from_file - -def parse_args(): - parser = argparse.ArgumentParser(description='Jasper') - parser.add_argument("--steps", default=None, help='if not specified do evaluation on full dataset. otherwise only evaluates the specified number of iterations for each worker', type=int) - parser.add_argument("--batch_size", default=16, type=int, help='data batch size') - parser.add_argument("--max_duration", default=None, type=float, help='maximum duration of sequences. 
if None uses attribute from model configuration file') - parser.add_argument("--pad_to", default=None, type=int, help="default is pad to value as specified in model configurations. if -1 pad to maximum duration. If > 0 pad batch to next multiple of value") - parser.add_argument("--model_toml", type=str, help='relative model configuration path given dataset folder') - parser.add_argument("--dataset_dir", type=str, help='absolute path to dataset folder') - parser.add_argument("--val_manifest", type=str, help='relative path to evaluation dataset manifest file') - parser.add_argument("--cudnn_benchmark", action='store_true', help="enable cudnn benchmark") - parser.add_argument("--ckpt", default=None, type=str, required=True, help='path to model checkpoint') - parser.add_argument("--amp", "--fp16", action='store_true', help='use half precision') - parser.add_argument("--seed", default=42, type=int, help='seed') - parser.add_argument("--cpu", action='store_true', help='run inference on CPU') - parser.add_argument("--torch_script", action='store_true', help='export model') - parser.add_argument("--sample_audio", default="/datasets/LibriSpeech/dev-clean-wav/1272/128104/1272-128104-0000.wav", type=str, help='audio sample path for torchscript, points to one of the files in /datasets/LibriSpeech/dev-clean-wav/ if not defined') - return parser.parse_args() - -def jit_export( - audio, - audio_len, - audio_processor, - encoderdecoder, - greedy_decoder, - args): - """applies torchscript - Args: - audio: - audio_len: - audio_processor: data processing module - encoderdecoder: acoustic model - greedy_decoder: greedy decoder - args: script input arguments - """ - # Export just the featurizer - print("torchscripting featurizer ...") - traced_module_feat = torch.jit.script(audio_processor) - - # Export just the acoustic model - print("torchscripting acoustic model ...") - inp_postFeat, _ = audio_processor(audio, audio_len) - traced_module_acoustic = torch.jit.trace(encoderdecoder, inp_postFeat) - - # Export just the decoder - print("torchscripting decoder ...") - inp_postAcoustic = encoderdecoder(inp_postFeat) - traced_module_decode = torch.jit.script(greedy_decoder, inp_postAcoustic) - print("JIT process complete") - - return traced_module_feat, traced_module_acoustic, traced_module_decode - -def eval( - data_layer, - audio_processor, - encoderdecoder, - greedy_decoder, - labels, - device, - args): - """performs evaluation and prints performance statistics - Args: - data_layer: data layer object that holds data loader - audio_processor: data processing module - encoderdecoder: acoustic model - greedy_decoder: greedy decoder - labels: list of labels as output vocabulary - args: script input arguments - """ - batch_size=args.batch_size - steps=args.steps - audio_processor.eval() - encoderdecoder.eval() - greedy_decoder.eval() - - if args.torch_script: - audio, audio_len = audio_from_file(args.sample_audio, device=device) - audio_processor, encoderdecoder, greedy_decoder = jit_export(audio, audio_len, audio_processor, encoderdecoder, greedy_decoder, args) - - with torch.no_grad(): - _global_var_dict = { - 'predictions': [], - 'transcripts': [], - } - - it = 0 - ep = 0 - - if steps is None: - steps = math.ceil(len(data_layer) / batch_size) - durations_dnn = [] - durations_dnn_and_prep = [] - seq_lens = [] - - sync = lambda: torch.cuda.synchronize() if device.type == 'cuda' else None - - while True: - ep += 1 - for data in tqdm(data_layer.data_iterator): - it += 1 - if it > steps: - break - tensors = 
[t.to(device) for t in data] - - t_audio_signal_e, t_a_sig_length_e, t_transcript_e, t_transcript_len_e = tensors - - sync() - t0 = time.perf_counter() - features, lens = audio_processor(t_audio_signal_e, t_a_sig_length_e) - - sync() - t1 = time.perf_counter() - if isinstance(encoderdecoder, torch.jit.TracedModule): - t_log_probs_e = encoderdecoder(features) - else: - t_log_probs_e, _ = encoderdecoder.infer((features, lens)) - - sync() - stop_time = time.perf_counter() - time_prep_and_dnn = stop_time - t0 - time_dnn = stop_time - t1 - t_predictions_e = greedy_decoder(log_probs=t_log_probs_e) - - values_dict = dict( - predictions=[t_predictions_e], - transcript=[t_transcript_e], - transcript_length=[t_transcript_len_e], - ) - process_evaluation_batch(values_dict, _global_var_dict, labels=labels) - durations_dnn.append(time_dnn) - durations_dnn_and_prep.append(time_prep_and_dnn) - seq_lens.append(features[0].shape[-1]) - - if it >= steps: - - wer, _ = process_evaluation_epoch(_global_var_dict) - print("==========>>>>>>Evaluation of all iterations WER: {0}\n".format(wer)) - break - - ratios = [0.9, 0.95,0.99, 1.] - latencies_dnn = take_durations_and_output_percentile(durations_dnn, ratios) - latencies_dnn_and_prep = take_durations_and_output_percentile(durations_dnn_and_prep, ratios) - print("\n using batch size {} and {} frames ".format(batch_size, seq_lens[-1])) - print("\n".join(["dnn latency {} : {} ".format(k, v) for k, v in latencies_dnn.items()])) - print("\n".join(["prep + dnn latency {} : {} ".format(k, v) for k, v in latencies_dnn_and_prep.items()])) - -def take_durations_and_output_percentile(durations, ratios): - durations = np.asarray(durations) * 1000 # in ms - latency = durations - - latency = latency[5:] - mean_latency = np.mean(latency) - - latency_worst = nlargest(math.ceil( (1 - min(ratios))* len(latency)), latency) - latency_ranges=get_percentile(ratios, latency_worst, len(latency)) - latency_ranges["0.5"] = mean_latency - return latency_ranges - -def get_percentile(ratios, arr, nsamples): - res = {} - for a in ratios: - idx = max(int(nsamples * (1 - a)), 0) - res[a] = arr[idx] - return res - -def main(args): - random.seed(args.seed) - np.random.seed(args.seed) - torch.manual_seed(args.seed) - assert(args.steps is None or args.steps > 5) - - if args.cpu: - device = torch.device('cpu') - else: - assert(torch.cuda.is_available()) - device = torch.device('cuda') - torch.backends.cudnn.benchmark = args.cudnn_benchmark - print("CUDNN BENCHMARK ", args.cudnn_benchmark) - - optim_level = 3 if args.amp else 0 - batch_size = args.batch_size - - jasper_model_definition = toml.load(args.model_toml) - dataset_vocab = jasper_model_definition['labels']['labels'] - ctc_vocab = add_ctc_labels(dataset_vocab) - - val_manifest = args.val_manifest - featurizer_config = jasper_model_definition['input_eval'] - featurizer_config["optimization_level"] = optim_level - - if args.max_duration is not None: - featurizer_config['max_duration'] = args.max_duration - - # TORCHSCRIPT: Cant use mixed types. Using -1 for "max" - if args.pad_to is not None: - featurizer_config['pad_to'] = args.pad_to if args.pad_to >= 0 else -1 - - if featurizer_config['pad_to'] == "max": - featurizer_config['pad_to'] = -1 - - args.use_conv_mask = jasper_model_definition['encoder'].get('convmask', True) - if args.use_conv_mask and args.torch_script: - print('WARNING: Masked convs currently not supported for TorchScript. 
Disabling.') - jasper_model_definition['encoder']['convmask'] = False - - print('model_config') - print_dict(jasper_model_definition) - print('feature_config') - print_dict(featurizer_config) - - data_layer = AudioToTextDataLayer( - dataset_dir=args.dataset_dir, - featurizer_config=featurizer_config, - manifest_filepath=val_manifest, - labels=dataset_vocab, - batch_size=batch_size, - pad_to_max=featurizer_config['pad_to'] == -1, - shuffle=False, - multi_gpu=False) - - audio_preprocessor = AudioPreprocessing(**featurizer_config) - - encoderdecoder = JasperEncoderDecoder(jasper_model_definition=jasper_model_definition, feat_in=1024, num_classes=len(ctc_vocab)) - - if args.ckpt is not None: - print("loading model from ", args.ckpt) - checkpoint = torch.load(args.ckpt, map_location="cpu") - for k in audio_preprocessor.state_dict().keys(): - checkpoint['state_dict'][k] = checkpoint['state_dict'].pop("audio_preprocessor." + k) - audio_preprocessor.load_state_dict(checkpoint['state_dict'], strict=False) - encoderdecoder.load_state_dict(checkpoint['state_dict'], strict=False) - - greedy_decoder = GreedyCTCDecoder() - - # print("Number of parameters in encoder: {0}".format(model.jasper_encoder.num_weights())) - - N = len(data_layer) - step_per_epoch = math.ceil(N / args.batch_size) - - print('-----------------') - if args.steps is None: - print('Have {0} examples to eval on.'.format(N)) - print('Have {0} steps / (epoch).'.format(step_per_epoch)) - else: - print('Have {0} examples to eval on.'.format(args.steps * args.batch_size)) - print('Have {0} steps / (epoch).'.format(args.steps)) - print('-----------------') - - audio_preprocessor.to(device) - encoderdecoder.to(device) - - if args.amp: - encoderdecoder = amp.initialize( - models=encoderdecoder, opt_level='O'+str(optim_level)) - - eval( - data_layer=data_layer, - audio_processor=audio_preprocessor, - encoderdecoder=encoderdecoder, - greedy_decoder=greedy_decoder, - labels=ctc_vocab, - device=device, - args=args) - -if __name__=="__main__": - args = parse_args() - - print_dict(vars(args)) - - main(args) diff --git a/PyTorch/SpeechRecognition/Jasper/jasper/config.py b/PyTorch/SpeechRecognition/Jasper/jasper/config.py new file mode 100644 index 00000000..e283b6d8 --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/jasper/config.py @@ -0,0 +1,110 @@ +import copy +import inspect +import yaml + +from .model import JasperDecoderForCTC, JasperBlock, JasperEncoder +from common.audio import GainPerturbation, ShiftPerturbation, SpeedPerturbation +from common.dataset import AudioDataset +from common.features import CutoutAugment, FilterbankFeatures, SpecAugment +from common.helpers import print_once + + +def default_args(klass): + sig = inspect.signature(klass.__init__) + return {k: v.default for k,v in sig.parameters.items() if k != 'self'} + + +def load(fpath): + if fpath.endswith('.toml'): + raise ValueError('.toml config format has been changed to .yaml') + + cfg = yaml.safe_load(open(fpath, 'r')) + + # Reload to deep copy shallow copies, which were made with yaml anchors + yaml.Dumper.ignore_aliases = lambda *args: True + cfg = yaml.dump(cfg) + cfg = yaml.safe_load(cfg) + return cfg + + +def validate_and_fill(klass, user_conf, ignore_unk=[], optional=[]): + conf = default_args(klass) + + for k,v in user_conf.items(): + assert k in conf or k in ignore_unk, f'Unknown parameter {k} for {klass}' + conf[k] = v + + # Keep only mandatory or optional-nonempty + conf = {k:v for k,v in conf.items() + if k not in optional or v is not inspect.Parameter.empty} + 
+ # Validate + for k,v in conf.items(): + assert v is not inspect.Parameter.empty, \ + f'Value for {k} not specified for {klass}' + return conf + + +def input(conf_yaml, split='train'): + conf = copy.deepcopy(conf_yaml[f'input_{split}']) + conf_dataset = conf.pop('audio_dataset') + conf_features = conf.pop('filterbank_features') + + # Validate known inner classes + inner_classes = [ + (conf_dataset, 'speed_perturbation', SpeedPerturbation), + (conf_dataset, 'gain_perturbation', GainPerturbation), + (conf_dataset, 'shift_perturbation', ShiftPerturbation), + (conf_features, 'spec_augment', SpecAugment), + (conf_features, 'cutout_augment', CutoutAugment), + ] + for conf_tgt, key, klass in inner_classes: + if key in conf_tgt: + conf_tgt[key] = validate_and_fill(klass, conf_tgt[key]) + + for k in conf: + raise ValueError(f'Unknown key {k}') + + # Validate outer classes + conf_dataset = validate_and_fill( + AudioDataset, conf_dataset, + optional=['data_dir', 'labels', 'manifest_fpaths']) + + conf_features = validate_and_fill( + FilterbankFeatures, conf_features) + + # Check params shared between classes + shared = ['sample_rate', 'max_duration', 'pad_to_max_duration'] + for sh in shared: + assert conf_dataset[sh] == conf_features[sh], ( + f'{sh} should match in Dataset and FeatureProcessor: ' + f'{conf_dataset[sh]}, {conf_features[sh]}') + + return conf_dataset, conf_features + + +def encoder(conf): + """Validate config for JasperEncoder and subsequent JasperBlocks""" + + # Validate, but don't overwrite with defaults + for blk in conf['jasper']['encoder']['blocks']: + validate_and_fill(JasperBlock, blk, optional=['infilters'], + ignore_unk=['residual_dense']) + + return validate_and_fill(JasperEncoder, conf['jasper']['encoder']) + + +def decoder(conf, n_classes): + decoder_kw = {'n_classes': n_classes, **conf['jasper']['decoder']} + return validate_and_fill(JasperDecoderForCTC, decoder_kw) + + +def apply_duration_flags(cfg, max_duration, pad_to_max_duration): + if max_duration is not None: + cfg['input_train']['audio_dataset']['max_duration'] = max_duration + cfg['input_train']['filterbank_features']['max_duration'] = max_duration + + if pad_to_max_duration: + assert cfg['input_train']['audio_dataset']['max_duration'] > 0 + cfg['input_train']['audio_dataset']['pad_to_max_duration'] = True + cfg['input_train']['filterbank_features']['pad_to_max_duration'] = True diff --git a/PyTorch/SpeechRecognition/Jasper/jasper/model.py b/PyTorch/SpeechRecognition/Jasper/jasper/model.py new file mode 100644 index 00000000..dd38ce4b --- /dev/null +++ b/PyTorch/SpeechRecognition/Jasper/jasper/model.py @@ -0,0 +1,275 @@ +# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import torch +import torch.nn as nn +import torch.nn.functional as F + + +activations = { + "hardtanh": nn.Hardtanh, + "relu": nn.ReLU, + "selu": nn.SELU, +} + + +def init_weights(m, mode='xavier_uniform'): + if type(m) == nn.Conv1d or type(m) == MaskedConv1d: + if mode == 'xavier_uniform': + nn.init.xavier_uniform_(m.weight, gain=1.0) + elif mode == 'xavier_normal': + nn.init.xavier_normal_(m.weight, gain=1.0) + elif mode == 'kaiming_uniform': + nn.init.kaiming_uniform_(m.weight, nonlinearity="relu") + elif mode == 'kaiming_normal': + nn.init.kaiming_normal_(m.weight, nonlinearity="relu") + else: + raise ValueError("Unknown Initialization mode: {0}".format(mode)) + + elif type(m) == nn.BatchNorm1d: + if m.track_running_stats: + m.running_mean.zero_() + m.running_var.fill_(1) + m.num_batches_tracked.zero_() + if m.affine: + nn.init.ones_(m.weight) + nn.init.zeros_(m.bias) + + +def get_same_padding(kernel_size, stride, dilation): + if stride > 1 and dilation > 1: + raise ValueError("Only stride OR dilation may be greater than 1") + return (kernel_size // 2) * dilation + + +class MaskedConv1d(nn.Conv1d): + """1D convolution with sequence masking + """ + __constants__ = ["masked"] + def __init__(self, in_channels, out_channels, kernel_size, stride=1, + padding=0, dilation=1, groups=1, bias=False, masked=True): + super(MaskedConv1d, self).__init__( + in_channels, out_channels, kernel_size, stride=stride, + padding=padding, dilation=dilation, groups=groups, bias=bias) + + self.masked = masked + + def get_seq_len(self, lens): + return ((lens + 2 * self.padding[0] - self.dilation[0] + * (self.kernel_size[0] - 1) - 1) // self.stride[0] + 1) + + def forward(self, x, x_lens=None): + if self.masked: + max_len = x.size(2) + idxs = torch.arange(max_len, dtype=x_lens.dtype, device=x_lens.device) + mask = idxs.expand(x_lens.size(0), max_len) >= x_lens.unsqueeze(1) + x = x.masked_fill(mask.unsqueeze(1).to(device=x.device), 0) + x_lens = self.get_seq_len(x_lens) + + return super(MaskedConv1d, self).forward(x), x_lens + + +class JasperBlock(nn.Module): + __constants__ = ["use_conv_masks"] + + """Jasper Block. See https://arxiv.org/pdf/1904.03288.pdf + """ + def __init__(self, infilters, filters, repeat=3, kernel_size=11, stride=1, + dilation=1, padding='same', dropout=0.2, activation=None, + residual=True, residual_panes=[], use_conv_masks=False): + super(JasperBlock, self).__init__() + + assert padding == "same", "Only 'same' padding is supported." 
+ + padding_val = get_same_padding(kernel_size[0], stride[0], dilation[0]) + self.use_conv_masks = use_conv_masks + self.conv = nn.ModuleList() + for i in range(repeat): + self.conv.extend(self._conv_bn(infilters if i == 0 else filters, + filters, + kernel_size=kernel_size, + stride=stride, + dilation=dilation, + padding=padding_val)) + if i < repeat - 1: + self.conv.extend(self._act_dropout(dropout, activation)) + + self.res = nn.ModuleList() if residual else None + res_panes = residual_panes.copy() + self.dense_residual = residual + + if residual: + if len(residual_panes) == 0: + res_panes = [infilters] + self.dense_residual = False + + for ip in res_panes: + self.res.append(nn.ModuleList( + self._conv_bn(ip, filters, kernel_size=1))) + + self.out = nn.Sequential(*self._act_dropout(dropout, activation)) + + def _conv_bn(self, in_channels, out_channels, **kw): + return [MaskedConv1d(in_channels, out_channels, + masked=self.use_conv_masks, **kw), + nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.1)] + + def _act_dropout(self, dropout=0.2, activation=None): + return [activation or nn.Hardtanh(min_val=0.0, max_val=20.0), + nn.Dropout(p=dropout)] + + def forward(self, xs, xs_lens=None): + if not self.use_conv_masks: + xs_lens = 0 + + # forward convolutions + out = xs[-1] + lens = xs_lens + for i, l in enumerate(self.conv): + if isinstance(l, MaskedConv1d): + out, lens = l(out, lens) + else: + out = l(out) + + # residuals + if self.res is not None: + for i, layer in enumerate(self.res): + res_out = xs[i] + for j, res_layer in enumerate(layer): + if j == 0: # and self.use_conv_mask: + res_out, _ = res_layer(res_out, xs_lens) + else: + res_out = res_layer(res_out) + out += res_out + + # output + out = self.out(out) + if self.res is not None and self.dense_residual: + out = xs + [out] + else: + out = [out] + + if self.use_conv_masks: + return out, lens + else: + return out, None + + +class JasperEncoder(nn.Module): + __constants__ = ["use_conv_masks"] + + def __init__(self, in_feats, activation, frame_splicing=1, + init='xavier_uniform', use_conv_masks=False, blocks=[]): + super(JasperEncoder, self).__init__() + + self.use_conv_masks = use_conv_masks + self.layers = nn.ModuleList() + + in_feats *= frame_splicing + all_residual_panes = [] + for i,blk in enumerate(blocks): + + blk['activation'] = activations[activation]() + + has_residual_dense = blk.pop('residual_dense', False) + if has_residual_dense: + all_residual_panes += [in_feats] + blk['residual_panes'] = all_residual_panes + else: + blk['residual_panes'] = [] + + self.layers.append( + JasperBlock(in_feats, use_conv_masks=use_conv_masks, **blk)) + + in_feats = blk['filters'] + + self.apply(lambda x: init_weights(x, mode=init)) + + def forward(self, x, x_lens=None): + out, out_lens = [x], x_lens + for l in self.layers: + out, out_lens = l(out, out_lens) + + return out, out_lens + + +class JasperDecoderForCTC(nn.Module): + def __init__(self, in_feats, n_classes, init='xavier_uniform'): + super(JasperDecoderForCTC, self).__init__() + + self.layers = nn.Sequential( + nn.Conv1d(in_feats, n_classes, kernel_size=1, bias=True),) + self.apply(lambda x: init_weights(x, mode=init)) + + def forward(self, enc_out): + out = self.layers(enc_out[-1]).transpose(1, 2) + return F.log_softmax(out, dim=2) + + +class GreedyCTCDecoder(nn.Module): + @torch.no_grad() + def forward(self, log_probs, log_prob_lens=None): + + if log_prob_lens is not None: + max_len = log_probs.size(1) + idxs = torch.arange(max_len, dtype=log_prob_lens.dtype, + 
device=log_prob_lens.device) + mask = idxs.unsqueeze(0) >= log_prob_lens.unsqueeze(1) + log_probs[:,:,-1] = log_probs[:,:,-1].masked_fill(mask, float("Inf")) + + return log_probs.argmax(dim=-1, keepdim=False).int() + + +class Jasper(nn.Module): + def __init__(self, encoder_kw, decoder_kw, transpose_in=False): + super(Jasper, self).__init__() + self.transpose_in = transpose_in + self.encoder = JasperEncoder(**encoder_kw) + self.decoder = JasperDecoderForCTC(**decoder_kw) + + def forward(self, x, x_lens=None): + if self.encoder.use_conv_masks: + assert x_lens is not None + enc, enc_lens = self.encoder(x, x_lens) + out = self.decoder(enc) + return out, enc_lens + else: + if self.transpose_in: + x = x.transpose(1, 2) + enc, _ = self.encoder(x) + out = self.decoder(enc) + return out # torchscript refuses to output None + + # TODO Explicitly add x_lens=None for inference (now x can be a Tensor or tuple) + def infer(self, x, x_lens=None): + if self.encoder.use_conv_masks: + return self.forward(x, x_lens) + else: + ret = self.forward(x) + return ret, len(ret) + + +class CTCLossNM: + def __init__(self, n_classes): + self._criterion = nn.CTCLoss(blank=n_classes-1, reduction='none') + + def __call__(self, log_probs, targets, input_length, target_length): + input_length = input_length.long() + target_length = target_length.long() + targets = targets.long() + loss = self._criterion(log_probs.transpose(1, 0), targets, input_length, + target_length) + # note that this is different from reduction = 'mean' + # because we are not dividing by target lengths + return torch.mean(loss) diff --git a/PyTorch/SpeechRecognition/Jasper/model.py b/PyTorch/SpeechRecognition/Jasper/model.py deleted file mode 100644 index b594e43a..00000000 --- a/PyTorch/SpeechRecognition/Jasper/model.py +++ /dev/null @@ -1,423 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -from apex import amp -import torch -import torch.nn as nn -from parts.features import FeatureFactory -import random - - -jasper_activations = { - "hardtanh": nn.Hardtanh, - "relu": nn.ReLU, - "selu": nn.SELU, -} - -def init_weights(m, mode='xavier_uniform'): - if type(m) == nn.Conv1d or type(m) == MaskedConv1d: - if mode == 'xavier_uniform': - nn.init.xavier_uniform_(m.weight, gain=1.0) - elif mode == 'xavier_normal': - nn.init.xavier_normal_(m.weight, gain=1.0) - elif mode == 'kaiming_uniform': - nn.init.kaiming_uniform_(m.weight, nonlinearity="relu") - elif mode == 'kaiming_normal': - nn.init.kaiming_normal_(m.weight, nonlinearity="relu") - else: - raise ValueError("Unknown Initialization mode: {0}".format(mode)) - elif type(m) == nn.BatchNorm1d: - if m.track_running_stats: - m.running_mean.zero_() - m.running_var.fill_(1) - m.num_batches_tracked.zero_() - if m.affine: - nn.init.ones_(m.weight) - nn.init.zeros_(m.bias) - -def get_same_padding(kernel_size, stride, dilation): - if stride > 1 and dilation > 1: - raise ValueError("Only stride OR dilation may be greater than 1") - return (kernel_size // 2) * dilation - -class AudioPreprocessing(nn.Module): - """GPU accelerated audio preprocessing - """ - __constants__ = ["optim_level"] - def __init__(self, **kwargs): - nn.Module.__init__(self) # For PyTorch API - self.optim_level = kwargs.get('optimization_level', 0) - self.featurizer = FeatureFactory.from_config(kwargs) - self.transpose_out = kwargs.get("transpose_out", False) - - @torch.no_grad() - def forward(self, input_signal, length): - processed_signal = self.featurizer(input_signal, length) - processed_length = self.featurizer.get_seq_len(length) - if self.transpose_out: - processed_signal.transpose_(2,1) - return processed_signal, processed_length - else: - return processed_signal, processed_length - -class SpectrogramAugmentation(nn.Module): - """Spectrogram augmentation - """ - def __init__(self, **kwargs): - nn.Module.__init__(self) - self.spec_cutout_regions = SpecCutoutRegions(kwargs) - self.spec_augment = SpecAugment(kwargs) - - @torch.no_grad() - def forward(self, input_spec): - augmented_spec = self.spec_cutout_regions(input_spec) - augmented_spec = self.spec_augment(augmented_spec) - return augmented_spec - -class SpecAugment(nn.Module): - """Spec augment. refer to https://arxiv.org/abs/1904.08779 - """ - def __init__(self, cfg): - super(SpecAugment, self).__init__() - self.cutout_x_regions = cfg.get('cutout_x_regions', 0) - self.cutout_y_regions = cfg.get('cutout_y_regions', 0) - - self.cutout_x_width = cfg.get('cutout_x_width', 10) - self.cutout_y_width = cfg.get('cutout_y_width', 10) - - @torch.no_grad() - def forward(self, x): - sh = x.shape - - mask = torch.zeros(x.shape, dtype=torch.bool) - for idx in range(sh[0]): - for _ in range(self.cutout_x_regions): - cutout_x_left = int(random.uniform(0, sh[1] - self.cutout_x_width)) - - mask[idx, cutout_x_left:cutout_x_left + self.cutout_x_width, :] = 1 - - for _ in range(self.cutout_y_regions): - cutout_y_left = int(random.uniform(0, sh[2] - self.cutout_y_width)) - - mask[idx, :, cutout_y_left:cutout_y_left + self.cutout_y_width] = 1 - - x = x.masked_fill(mask.to(device=x.device), 0) - - return x - -class SpecCutoutRegions(nn.Module): - """Cutout. 
refer to https://arxiv.org/pdf/1708.04552.pdf - """ - def __init__(self, cfg): - super(SpecCutoutRegions, self).__init__() - - self.cutout_rect_regions = cfg.get('cutout_rect_regions', 0) - self.cutout_rect_time = cfg.get('cutout_rect_time', 5) - self.cutout_rect_freq = cfg.get('cutout_rect_freq', 20) - - @torch.no_grad() - def forward(self, x): - sh = x.shape - - mask = torch.zeros(x.shape, dtype=torch.bool) - - for idx in range(sh[0]): - for i in range(self.cutout_rect_regions): - cutout_rect_x = int(random.uniform( - 0, sh[1] - self.cutout_rect_freq)) - cutout_rect_y = int(random.uniform( - 0, sh[2] - self.cutout_rect_time)) - - mask[idx, cutout_rect_x:cutout_rect_x + self.cutout_rect_freq, - cutout_rect_y:cutout_rect_y + self.cutout_rect_time] = 1 - - x = x.masked_fill(mask.to(device=x.device), 0) - - return x - -class JasperEncoder(nn.Module): - __constants__ = ["use_conv_mask"] - """Jasper encoder - """ - def __init__(self, **kwargs): - cfg = {} - for key, value in kwargs.items(): - cfg[key] = value - - nn.Module.__init__(self) - self._cfg = cfg - - activation = jasper_activations[cfg['encoder']['activation']]() - self.use_conv_mask = cfg['encoder'].get('convmask', False) - feat_in = cfg['input']['features'] * cfg['input'].get('frame_splicing', 1) - init_mode = cfg.get('init_mode', 'xavier_uniform') - - residual_panes = [] - encoder_layers = [] - self.dense_residual = False - for lcfg in cfg['jasper']: - dense_res = [] - if lcfg.get('residual_dense', False): - residual_panes.append(feat_in) - dense_res = residual_panes - self.dense_residual = True - encoder_layers.append( - JasperBlock(feat_in, lcfg['filters'], repeat=lcfg['repeat'], - kernel_size=lcfg['kernel'], stride=lcfg['stride'], - dilation=lcfg['dilation'], dropout=lcfg['dropout'], - residual=lcfg['residual'], activation=activation, - residual_panes=dense_res, use_conv_mask=self.use_conv_mask)) - feat_in = lcfg['filters'] - - self.encoder = nn.Sequential(*encoder_layers) - self.apply(lambda x: init_weights(x, mode=init_mode)) - - def num_weights(self): - return sum(p.numel() for p in self.parameters() if p.requires_grad) - - def forward(self, x): - if self.use_conv_mask: - audio_signal, length = x - return self.encoder(([audio_signal], length)) - else: - return self.encoder([x]) - -class JasperDecoderForCTC(nn.Module): - """Jasper decoder - """ - def __init__(self, **kwargs): - nn.Module.__init__(self) - self._feat_in = kwargs.get("feat_in") - self._num_classes = kwargs.get("num_classes") - init_mode = kwargs.get('init_mode', 'xavier_uniform') - - self.decoder_layers = nn.Sequential( - nn.Conv1d(self._feat_in, self._num_classes, kernel_size=1, bias=True),) - self.apply(lambda x: init_weights(x, mode=init_mode)) - - def num_weights(self): - return sum(p.numel() for p in self.parameters() if p.requires_grad) - - def forward(self, encoder_output): - out = self.decoder_layers(encoder_output[-1]).transpose(1, 2) - return nn.functional.log_softmax(out, dim=2) - -class JasperEncoderDecoder(nn.Module): - """Contains jasper encoder and decoder - """ - def __init__(self, **kwargs): - nn.Module.__init__(self) - self.transpose_in=kwargs.get("transpose_in", False) - self.jasper_encoder = JasperEncoder(**kwargs.get("jasper_model_definition")) - self.jasper_decoder = JasperDecoderForCTC(feat_in=kwargs.get("feat_in"), - num_classes=kwargs.get("num_classes")) - - def num_weights(self): - return sum(p.numel() for p in self.parameters() if p.requires_grad) - - - def forward(self, x): - if self.jasper_encoder.use_conv_mask: - t_encoded_t, 
t_encoded_len_t = self.jasper_encoder(x) - else: - if self.transpose_in: - x = x.transpose(1, 2) - t_encoded_t = self.jasper_encoder(x) - - out = self.jasper_decoder(t_encoded_t) - if self.jasper_encoder.use_conv_mask: - return out, t_encoded_len_t - else: - return out - - def infer(self, x): - if self.jasper_encoder.use_conv_mask: - return self.forward(x) - else: - ret = self.forward(x[0]) - return ret, len(ret) - - -class Jasper(JasperEncoderDecoder): - """Contains data preprocessing, spectrogram augmentation, jasper encoder and decoder - """ - def __init__(self, **kwargs): - JasperEncoderDecoder.__init__(self, **kwargs) - feature_config = kwargs.get("feature_config") - if self.transpose_in: - feature_config["transpose"] = True - self.audio_preprocessor = AudioPreprocessing(**feature_config) - self.data_spectr_augmentation = SpectrogramAugmentation(**feature_config) - - -class MaskedConv1d(nn.Conv1d): - """1D convolution with sequence masking - """ - __constants__ = ["use_conv_mask"] - def __init__(self, in_channels, out_channels, kernel_size, stride=1, - padding=0, dilation=1, groups=1, bias=False, use_conv_mask=True): - super(MaskedConv1d, self).__init__(in_channels, out_channels, kernel_size, - stride=stride, - padding=padding, dilation=dilation, - groups=groups, bias=bias) - self.use_conv_mask = use_conv_mask - - def get_seq_len(self, lens): - return ((lens + 2 * self.padding[0] - self.dilation[0] * ( - self.kernel_size[0] - 1) - 1) // self.stride[0] + 1) - - def forward(self, inp): - if self.use_conv_mask: - x, lens = inp - max_len = x.size(2) - idxs = torch.arange(max_len).to(lens.dtype).to(lens.device).expand(len(lens), max_len) - mask = idxs >= lens.unsqueeze(1) - x = x.masked_fill(mask.unsqueeze(1).to(device=x.device), 0) - del mask - del idxs - lens = self.get_seq_len(lens) - return super(MaskedConv1d, self).forward(x), lens - else: - return super(MaskedConv1d, self).forward(inp) - - -class JasperBlock(nn.Module): - __constants__ = ["use_conv_mask", "conv"] - - """Jasper Block. 
See https://arxiv.org/pdf/1904.03288.pdf - """ - def __init__(self, inplanes, planes, repeat=3, kernel_size=11, stride=1, - dilation=1, padding='same', dropout=0.2, activation=None, - residual=True, residual_panes=[], use_conv_mask=False): - super(JasperBlock, self).__init__() - - if padding != "same": - raise ValueError("currently only 'same' padding is supported") - - - padding_val = get_same_padding(kernel_size[0], stride[0], dilation[0]) - self.use_conv_mask = use_conv_mask - self.conv = nn.ModuleList() - inplanes_loop = inplanes - for _ in range(repeat - 1): - self.conv.extend( - self._get_conv_bn_layer(inplanes_loop, planes, kernel_size=kernel_size, - stride=stride, dilation=dilation, - padding=padding_val)) - self.conv.extend( - self._get_act_dropout_layer(drop_prob=dropout, activation=activation)) - inplanes_loop = planes - self.conv.extend( - self._get_conv_bn_layer(inplanes_loop, planes, kernel_size=kernel_size, - stride=stride, dilation=dilation, - padding=padding_val)) - - self.res = nn.ModuleList() if residual else None - res_panes = residual_panes.copy() - self.dense_residual = residual - if residual: - if len(residual_panes) == 0: - res_panes = [inplanes] - self.dense_residual = False - for ip in res_panes: - self.res.append(nn.ModuleList( - modules=self._get_conv_bn_layer(ip, planes, kernel_size=1))) - self.out = nn.Sequential( - *self._get_act_dropout_layer(drop_prob=dropout, activation=activation)) - - def _get_conv_bn_layer(self, in_channels, out_channels, kernel_size=11, - stride=1, dilation=1, padding=0, bias=False): - layers = [ - MaskedConv1d(in_channels, out_channels, kernel_size, stride=stride, - dilation=dilation, padding=padding, bias=bias, - use_conv_mask=self.use_conv_mask), - nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.1) - ] - return layers - - def _get_act_dropout_layer(self, drop_prob=0.2, activation=None): - if activation is None: - activation = nn.Hardtanh(min_val=0.0, max_val=20.0) - layers = [ - activation, - nn.Dropout(p=drop_prob) - ] - return layers - - def num_weights(self): - return sum(p.numel() for p in self.parameters() if p.requires_grad) - - def forward(self, input_): - if self.use_conv_mask: - xs, lens_orig = input_ - else: - xs = input_ - lens_orig = 0 - # compute forward convolutions - out = xs[-1] - lens = lens_orig - for i, l in enumerate(self.conv): - if self.use_conv_mask and isinstance(l, MaskedConv1d): - out, lens = l((out, lens)) - else: - out = l(out) - # compute the residuals - if self.res is not None: - for i, layer in enumerate(self.res): - res_out = xs[i] - for j, res_layer in enumerate(layer): - if j == 0 and self.use_conv_mask: - res_out, _ = res_layer((res_out, lens_orig)) - else: - res_out = res_layer(res_out) - out += res_out - - # compute the output - out = self.out(out) - if self.res is not None and self.dense_residual: - out = xs + [out] - else: - out = [out] - - if self.use_conv_mask: - return out, lens - else: - return out - -class GreedyCTCDecoder(nn.Module): - """ Greedy CTC Decoder - """ - def __init__(self, **kwargs): - nn.Module.__init__(self) # For PyTorch API - @torch.no_grad() - def forward(self, log_probs): - argmx = log_probs.argmax(dim=-1, keepdim=False).int() - return argmx - -class CTCLossNM: - """ CTC loss - """ - def __init__(self, **kwargs): - self._blank = kwargs['num_classes'] - 1 - self._criterion = nn.CTCLoss(blank=self._blank, reduction='none') - - def __call__(self, log_probs, targets, input_length, target_length): - input_length = input_length.long() - target_length = target_length.long() - 
targets = targets.long() - loss = self._criterion(log_probs.transpose(1, 0), targets, input_length, - target_length) - # note that this is different from reduction = 'mean' - # because we are not dividing by target lengths - return torch.mean(loss) diff --git a/PyTorch/SpeechRecognition/Jasper/notebooks/Colab_Jasper_TRT_inference_demo.ipynb b/PyTorch/SpeechRecognition/Jasper/notebooks/Colab_Jasper_TRT_inference_demo.ipynb index 1aa85c97..cd1c639d 100644 --- a/PyTorch/SpeechRecognition/Jasper/notebooks/Colab_Jasper_TRT_inference_demo.ipynb +++ b/PyTorch/SpeechRecognition/Jasper/notebooks/Colab_Jasper_TRT_inference_demo.ipynb @@ -1,4835 +1,981 @@ { - "nbformat": 4, - "nbformat_minor": 0, - "metadata": { - "accelerator": "GPU", - "colab": { - "name": "Copy of Colab_Jasper_TRT_inference_demo.ipynb", - "provenance": [], - "include_colab_link": true - }, - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.8" - } + "cells": [ + { + "cell_type": "raw", + "metadata": { + "colab_type": "text", + "id": "view-in-github" + }, + "source": [ + "\"Open" + ] }, - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "id": "view-in-github", - "colab_type": "text" - }, - "source": [ - "\"Open" - ] + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "Gwt7z7qdmTbW" + }, + "outputs": [], + "source": [ + "# Copyright 2019 NVIDIA Corporation. All Rights Reserved.\n", + "#\n", + "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# http://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License.\n", + "# ==============================================================================" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "i4NKCp2VmTbn" + }, + "source": [ + "\n", + "\n", + "# Jasper Inference Demo with NVIDIA TensorRT on Google Colab" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "fW0OKDzvmTbt" + }, + "source": [ + "## Overview\n", + "\n", + "\n", + "In this notebook, we will demo the process of carrying out inference on new audio segment using a pre-trained Pytorch Jasper model downloaded from the NVIDIA NGC Model registry with TensorRT (TRT). NVIDIA TensorRT is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high-throughput for deep learning inference applications. 
After optimizing the compute-intensive acoustic model with NVIDIA TensorRT, inference throughput increased by up to 1.8x over native PyTorch.\n", + "\n", + "The Jasper model is an end-to-end neural acoustic model for automatic speech recognition (ASR) that provides near state-of-the-art results on LibriSpeech among end-to-end ASR models without any external data. The Jasper architecture of convolutional layers was designed to facilitate fast GPU inference, by allowing whole sub-blocks to be fused into a single GPU kernel. This is important for meeting strict real-time requirements of ASR systems in deployment. The results of the acoustic model are combined with the results of external language models to get the top-ranked word sequences corresponding to a given audio segment. This post-processing step is called decoding.\n", + "\n", + "The original paper is Jasper: An End-to-End Convolutional Neural Acoustic Model https://arxiv.org/pdf/1904.03288.pdf.\n", + "\n", + "### Model architecture\n", + "By default the model configuration is Jasper 10x5 with dense residuals. A Jasper BxR model has B blocks, each consisting of R repeating sub-blocks.\n", + "Each sub-block applies the following operations in sequence: 1D-Convolution, Batch Normalization, ReLU activation, and Dropout. \n", + "In the original paper Jasper is trained with masked convolutions, which mask out the padded part of an input sequence in a batch before the 1D-Convolution.\n", + "For inference, masking is not used. The reason is that masking does not improve accuracy on the test and development sets, while omitting it yields better inference performance, especially after TensorRT optimization.\n", + "More information on the model architecture can be found in the [root folder](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechRecognition/Jasper).\n", + "\n", + "### TensorRT Inference pipeline\n", + "The Jasper inference pipeline consists of 3 components: data preprocessor, acoustic model and greedy decoder. The acoustic model is the most compute-intensive, taking more than 90% of the entire end-to-end pipeline. It is also the only component with learnable parameters and what differentiates Jasper from the competition, so we focus on the acoustic model for the most part.\n", + "For the non-TRT Jasper inference pipeline, all 3 components are implemented and run with native PyTorch. For the TensorRT inference pipeline, we show the speedup of running the acoustic model with TensorRT, while preprocessing and decoding are reused from the native PyTorch pipeline.\n", + "To run a model with TensorRT, we first construct the model in PyTorch, export it to an ONNX file, and then build a TensorRT engine from the ONNX file, serialize it to a TRT plan file, and launch it to do inference.\n", + "Note that the TensorRT engine is runtime-optimized before serialization: TRT tries a vast set of options to find the strategy that performs best on the user's GPU, so this takes a few minutes. After the TRT plan file is created, it can be reused.\n", + "\n", + "\n", + "### Requirement\n", + "1. Before running this notebook, please set the Colab runtime environment to GPU via the menu *Runtime => Change runtime type => GPU*.\n", + "\n", + "For TRT FP16 and INT8 inference, an NVIDIA Volta, Turing, or newer generation GPU is required. On Google Colab, this normally means a T4 GPU."
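The flow described above (PyTorch model, then ONNX export, then a runtime-optimized TensorRT engine serialized to a reusable plan file) can be outlined in a few lines. This is only a sketch against the TensorRT 7.x Python API with assumed file names (`jasper.onnx`, `jasper.plan`); the repository's own TensorRT scripts additionally handle dynamic shapes, input/output naming and precision calibration, which are omitted here.

```python
# Illustrative sketch only: ONNX export followed by a TensorRT engine build (TensorRT 7.x API).
import torch
import tensorrt as trt

def build_trt_engine(acoustic_model, example_feats,
                     onnx_path="jasper.onnx", plan_path="jasper.plan", fp16=True):
    # 1) Export the PyTorch acoustic model to ONNX.
    torch.onnx.export(acoustic_model, example_feats, onnx_path, opset_version=11)

    # 2) Parse the ONNX file and build a runtime-optimized TensorRT engine.
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        parser.parse(f.read())
    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # scratch space TRT may use while searching tactics
    if fp16:
        config.set_flag(trt.BuilderFlag.FP16)
    engine = builder.build_engine(network, config)

    # 3) Serialize the plan file so the slow optimization step can be reused later.
    with open(plan_path, "wb") as f:
        f.write(engine.serialize())
    return engine
```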
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 316 }, + "colab_type": "code", + "id": "HVsrGkj4Zn2L", + "outputId": "7061e16f-7d7c-4be9-cba4-4c8f67fdafb0" + }, + "outputs": [ { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "Gwt7z7qdmTbW", - "colab": {} - }, - "source": [ - "# Copyright 2019 NVIDIA Corporation. All Rights Reserved.\n", - "#\n", - "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# http://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License.\n", - "# ==============================================================================" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "i4NKCp2VmTbn" - }, - "source": [ - "\n", - "\n", - "# Jasper Inference Demo with NVIDIA TensorRT on Google Colab" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "fW0OKDzvmTbt" - }, - "source": [ - "## Overview\n", - "\n", - "\n", - "In this notebook, we will demo the process of carrying out inference on new audio segment using a pre-trained Pytorch Jasper model downloaded from the NVIDIA NGC Model registry with TensorRT (TRT). NVIDIA TensorRT is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high-throughput for deep learning inference applications. After optimizing the compute-intensive acoustic model with NVIDIA TensorRT, inference throughput increased by up to 1.8x over native PyTorch.\n", - "\n", - "The Jasper model is an end-to-end neural acoustic model for automatic speech recognition (ASR) that provides near state-of-the-art results on LibriSpeech among end-to-end ASR models without any external data. The Jasper architecture of convolutional layers was designed to facilitate fast GPU inference, by allowing whole sub-blocks to be fused into a single GPU kernel. This is important for meeting strict real-time requirements of ASR systems in deployment.The results of the acoustic model are combined with the results of external language models to get the top-ranked word sequences corresponding to a given audio segment. This post-processing step is called decoding.\n", - "\n", - "The original paper is Jasper: An End-to-End Convolutional Neural Acoustic Model https://arxiv.org/pdf/1904.03288.pdf.\n", - "\n", - "### Model architecture\n", - "By default the model configuration is Jasper 10x5 with dense residuals. A Jasper BxR model has B blocks, each consisting of R repeating sub-blocks.\n", - "Each sub-block applies the following operations in sequence: 1D-Convolution, Batch Normalization, ReLU activation, and Dropout. \n", - "In the original paper Jasper is trained with masked convolutions, which masks out the padded part of an input sequence in a batch before the 1D-Convolution.\n", - "For inference masking is not used. 
The reason for this is that in inference, the original mask operation does not achieve better accuracy than without the mask operation on the test and development dataset. However, no masking achieves better inference performance especially after TensorRT optimization.\n", - "More information on the model architecture can be found in the [root folder](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechRecognition/Jasper)\n", - "\n", - "### TensorRT Inference pipeline\n", - "The Jasper inference pipeline consists of 3 components: data preprocessor, acoustic model and greedy decoder. The acoustic model is the most compute intensive, taking more than 90% of the entire end-to-end pipeline. The acoustic model is the only component with learnable parameters and also what differentiates Jasper from the competition. So, we focus on the acoustic model for the most part.\n", - "For the non-TRT Jasper inference pipeline, all 3 components are implemented and run with native PyTorch. For the TensorRT inference pipeline, we show the speedup of running the acoustic model with TensorRT, while preprocessing and decoding are reused from the native PyTorch pipeline.\n", - "To run a model with TensorRT, we first construct the model in PyTorch, which is then exported into an ONNX file. Finally, a TensorRT engine is constructed from the ONNX file, serialized to TRT plan file, and also launched to do inference.\n", - "Note that TensorRT engine is being runtime optimized before serialization. TRT tries a vast set of options to find the strategy that performs best on user’s GPU - so it takes a few minutes. After the TRT plan file is created, it can be reused.\n", - "\n", - "\n", - "### Requirement\n", - "1. Before running this notebook, please set the Colab runtime environment to GPU via the menu *Runtime => Change runtime type => GPU*.\n", - "\n", - "For TRT FP16 and INT8 inference, an NVIDIA Volta, Turing or newer GPU generations is required. On Google Colab, this normally means a T4 GPU." - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "HVsrGkj4Zn2L", - "outputId": "c1222ff2-abb7-48a6-f067-cffedbae4b9c", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 326 - } - }, - "source": [ - "!nvidia-smi" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Sat May 30 01:05:13 2020 \n", - "+-----------------------------------------------------------------------------+\n", - "| NVIDIA-SMI 440.82 Driver Version: 418.67 CUDA Version: 10.1 |\n", - "|-------------------------------+----------------------+----------------------+\n", - "| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n", - "| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n", - "|===============================+======================+======================|\n", - "| 0 Tesla P100-PCIE... 
Off | 00000000:00:04.0 Off | 0 |\n", - "| N/A 33C P0 26W / 250W | 0MiB / 16280MiB | 0% Default |\n", - "+-------------------------------+----------------------+----------------------+\n", - " \n", - "+-----------------------------------------------------------------------------+\n", - "| Processes: GPU Memory |\n", - "| GPU PID Type Process name Usage |\n", - "|=============================================================================|\n", - "| No running processes found |\n", - "+-----------------------------------------------------------------------------+\n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "pV3rzgO8-tSK" - }, - "source": [ - "The below code check whether a Tensor core GPU is present." - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "Djyvo8mm9poq", - "outputId": "b56a3142-b4ba-43dc-c0a6-02b1f62282db", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 53 - } - }, - "source": [ - "%tensorflow_version 1.x\n", - "from tensorflow.python.client import device_lib\n", - "\n", - "def check_tensor_core_gpu_present():\n", - " local_device_protos = device_lib.list_local_devices()\n", - " for line in local_device_protos:\n", - " if \"compute capability\" in str(line):\n", - " compute_capability = float(line.physical_device_desc.split(\"compute capability: \")[-1])\n", - " if compute_capability>=7.0:\n", - " return True\n", - " \n", - "print(\"Tensor Core GPU Present:\", check_tensor_core_gpu_present())\n", - "tensor_core_gpu = check_tensor_core_gpu_present()" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "TensorFlow 1.x selected.\n", - "Tensor Core GPU Present: None\n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "FCEfkBAbbaLI" - }, - "source": [ - "2. Next, we clone the NVIDIA Github Deep Learning Example repository and set up the workspace." 
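As a side note, the Tensor Core check above goes through TensorFlow's `device_lib` and prints `None` when no Tensor Core GPU is found; an equivalent check can be done with PyTorch alone. The helper below is not part of the notebook, just a minimal sketch:

```python
import torch

def has_tensor_core_gpu():
    # Volta (compute capability 7.0) and newer architectures provide Tensor Cores.
    if not torch.cuda.is_available():
        return False
    major, _minor = torch.cuda.get_device_capability(0)
    return major >= 7

print("Tensor Core GPU Present:", has_tensor_core_gpu())
```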
- ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "y3u_VMjXtAto", - "outputId": "f921ecbd-9d9a-49cf-b74b-a4cfab8c7790", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 144 - } - }, - "source": [ - "!git clone https://github.com/NVIDIA/DeepLearningExamples" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Cloning into 'DeepLearningExamples'...\n", - "remote: Enumerating objects: 51, done.\u001b[K\n", - "remote: Counting objects: 100% (51/51), done.\u001b[K\n", - "remote: Compressing objects: 100% (44/44), done.\u001b[K\n", - "remote: Total 7214 (delta 14), reused 18 (delta 5), pack-reused 7163\u001b[K\n", - "Receiving objects: 100% (7214/7214), 42.37 MiB | 21.77 MiB/s, done.\n", - "Resolving deltas: 100% (3605/3605), done.\n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "-rE46y-ftAuQ", - "outputId": "99508d2c-0ba3-4976-8f42-0441f04d4656", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 35 - } - }, - "source": [ - "import os\n", - "\n", - "WORKSPACE_DIR='/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/notebooks'\n", - "os.chdir(WORKSPACE_DIR)\n", - "print (os.getcwd())" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/notebooks\n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "yBeZjO4JtAwL" - }, - "source": [ - "## Install NVIDIA TensorRT\n", - "\n", - "We will need to install NVIDIA TensorRT 7.0 runtime environment on Colab. First, check the Colab CUDA installed version." - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "LfygzbP1Lz2b", - "outputId": "1d78be6c-ea46-4ec7-ba31-c6ff5cd94ddf", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 53 - } - }, - "source": [ - "!ls /usr/local/" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "bin cuda-10.0 etc\tinclude LICENSE.txt sbin\t share\txgboost\n", - "cuda cuda-10.1 games\tlib\t man\t setup.cfg src\n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "yl4PyTsnQP3U" - }, - "source": [ - "Next, we will need to install the NVIDIA TensorRT version that match the current Colab CUDA version, following the instruction at https://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html#maclearn-net-repo-install." - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "3WA9N43UTq_c", - "outputId": "df78e3b5-4af9-425c-bc1b-2c7c12d39dc5", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 709 - } - }, - "source": [ - "%%bash\n", - "wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb\n", - "\n", - "dpkg -i nvidia-machine-learning-repo-*.deb\n", - "apt-get update" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Selecting previously unselected package nvidia-machine-learning-repo-ubuntu1804.\n", - "(Reading database ... 
144439 files and directories currently installed.)\n", - "Preparing to unpack nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb ...\n", - "Unpacking nvidia-machine-learning-repo-ubuntu1804 (1.0.0-1) ...\n", - "Setting up nvidia-machine-learning-repo-ubuntu1804 (1.0.0-1) ...\n", - "Hit:1 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease\n", - "Hit:2 http://archive.ubuntu.com/ubuntu bionic InRelease\n", - "Get:3 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]\n", - "Get:4 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]\n", - "Get:5 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease [3,626 B]\n", - "Get:6 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic InRelease [15.4 kB]\n", - "Get:7 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]\n", - "Ign:8 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 InRelease\n", - "Hit:9 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release\n", - "Ign:10 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 InRelease\n", - "Hit:11 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Release\n", - "Get:12 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main Sources [1,819 kB]\n", - "Get:13 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main amd64 Packages [877 kB]\n", - "Get:14 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [931 kB]\n", - "Get:15 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [854 kB]\n", - "Get:16 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [1,228 kB]\n", - "Get:17 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [1,385 kB]\n", - "Fetched 7,366 kB in 2s (3,843 kB/s)\n", - "Reading package lists...\n" - ], - "name": "stdout" - }, - { - "output_type": "stream", - "text": [ - "--2020-05-30 01:06:38-- https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb\n", - "Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.199.20.126\n", - "Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.199.20.126|:443... connected.\n", - "HTTP request sent, awaiting response... 200 OK\n", - "Length: 2926 (2.9K) [application/x-deb]\n", - "Saving to: ‘nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb’\n", - "\n", - " 0K .. 100% 156M=0s\n", - "\n", - "2020-05-30 01:06:38 (156 MB/s) - ‘nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb’ saved [2926/2926]\n", - "\n", - "W: Target Packages (Packages) is configured multiple times in /etc/apt/sources.list.d/nvidia-machine-learning.list:1 and /etc/apt/sources.list.d/nvidia-ml.list:1\n", - "W: Target Packages (Packages) is configured multiple times in /etc/apt/sources.list.d/nvidia-machine-learning.list:1 and /etc/apt/sources.list.d/nvidia-ml.list:1\n" - ], - "name": "stderr" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "XPsJhzqLXDAm" - }, - "source": [ - "When using the NVIDIA Machine Learning network repository, Ubuntu will be default install TensorRT for the latest CUDA version. Replace `7.0.0` with your version of TensorRT and `cuda10.0` with the CUDA version on your Colab environment. 
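To pick the matching package suffix (for example `cuda10.0` versus `cuda10.1`), the CUDA toolkit version visible to the runtime can also be checked directly; a minimal check, assuming PyTorch is already installed in the Colab environment:

```python
import torch
print("CUDA toolkit used by PyTorch:", torch.version.cuda)  # e.g. '10.1'
```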
Browse https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 for a list of available TensorRT versions." - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "EbF-JWGfK9Lo", - "outputId": "42caec29-ac37-4bda-c8dc-df066583c575", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 1000 - } - }, - "source": [ - "%%bash\n", - "version=\"7.0.0-1+cuda10.0\"\n", - "sudo apt-get install libnvinfer7=${version} libnvonnxparsers7=${version} libnvparsers7=${version} libnvinfer-plugin7=${version} libnvinfer-dev=${version} libnvonnxparsers-dev=${version} libnvparsers-dev=${version} libnvinfer-plugin-dev=${version} python-libnvinfer=${version} python3-libnvinfer=${version}\n", - "\n" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Reading package lists...\n", - "Building dependency tree...\n", - "Reading state information...\n", - "The following NEW packages will be installed:\n", - " libnvinfer-dev libnvinfer-plugin-dev libnvinfer-plugin7 libnvinfer7\n", - " libnvonnxparsers-dev libnvonnxparsers7 libnvparsers-dev libnvparsers7\n", - " python-libnvinfer python3-libnvinfer\n", - "0 upgraded, 10 newly installed, 0 to remove and 38 not upgraded.\n", - "Need to get 148 MB of archives.\n", - "After this operation, 544 MB of additional disk space will be used.\n", - "Get:1 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 libnvinfer7 7.0.0-1+cuda10.0 [69.6 MB]\n", - "Get:2 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 libnvinfer-dev 7.0.0-1+cuda10.0 [71.6 MB]\n", - "Get:3 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 libnvinfer-plugin7 7.0.0-1+cuda10.0 [2,109 kB]\n", - "Get:4 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 libnvinfer-plugin-dev 7.0.0-1+cuda10.0 [2,177 kB]\n", - "Get:5 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 libnvonnxparsers7 7.0.0-1+cuda10.0 [593 kB]\n", - "Get:6 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 libnvonnxparsers-dev 7.0.0-1+cuda10.0 [295 kB]\n", - "Get:7 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 libnvparsers7 7.0.0-1+cuda10.0 [791 kB]\n", - "Get:8 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 libnvparsers-dev 7.0.0-1+cuda10.0 [540 kB]\n", - "Get:9 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 python-libnvinfer 7.0.0-1+cuda10.0 [359 kB]\n", - "Get:10 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 python3-libnvinfer 7.0.0-1+cuda10.0 [355 kB]\n", - "Fetched 148 MB in 15s (9,607 kB/s)\n", - "Selecting previously unselected package libnvinfer7.\r\n", - "(Reading database ... \r(Reading database ... 5%\r(Reading database ... 10%\r(Reading database ... 15%\r(Reading database ... 20%\r(Reading database ... 25%\r(Reading database ... 30%\r(Reading database ... 35%\r(Reading database ... 40%\r(Reading database ... 45%\r(Reading database ... 50%\r(Reading database ... 55%\r(Reading database ... 60%\r(Reading database ... 65%\r(Reading database ... 70%\r(Reading database ... 75%\r(Reading database ... 80%\r(Reading database ... 85%\r(Reading database ... 90%\r(Reading database ... 95%\r(Reading database ... 100%\r(Reading database ... 
144442 files and directories currently installed.)\r\n", - "Preparing to unpack .../0-libnvinfer7_7.0.0-1+cuda10.0_amd64.deb ...\r\n", - "Unpacking libnvinfer7 (7.0.0-1+cuda10.0) ...\r\n", - "Selecting previously unselected package libnvinfer-dev.\r\n", - "Preparing to unpack .../1-libnvinfer-dev_7.0.0-1+cuda10.0_amd64.deb ...\r\n", - "Unpacking libnvinfer-dev (7.0.0-1+cuda10.0) ...\r\n", - "Selecting previously unselected package libnvinfer-plugin7.\r\n", - "Preparing to unpack .../2-libnvinfer-plugin7_7.0.0-1+cuda10.0_amd64.deb ...\r\n", - "Unpacking libnvinfer-plugin7 (7.0.0-1+cuda10.0) ...\r\n", - "Selecting previously unselected package libnvinfer-plugin-dev.\r\n", - "Preparing to unpack .../3-libnvinfer-plugin-dev_7.0.0-1+cuda10.0_amd64.deb ...\r\n", - "Unpacking libnvinfer-plugin-dev (7.0.0-1+cuda10.0) ...\r\n", - "Selecting previously unselected package libnvonnxparsers7.\r\n", - "Preparing to unpack .../4-libnvonnxparsers7_7.0.0-1+cuda10.0_amd64.deb ...\r\n", - "Unpacking libnvonnxparsers7 (7.0.0-1+cuda10.0) ...\r\n", - "Selecting previously unselected package libnvonnxparsers-dev.\r\n", - "Preparing to unpack .../5-libnvonnxparsers-dev_7.0.0-1+cuda10.0_amd64.deb ...\r\n", - "Unpacking libnvonnxparsers-dev (7.0.0-1+cuda10.0) ...\r\n", - "Selecting previously unselected package libnvparsers7.\r\n", - "Preparing to unpack .../6-libnvparsers7_7.0.0-1+cuda10.0_amd64.deb ...\r\n", - "Unpacking libnvparsers7 (7.0.0-1+cuda10.0) ...\r\n", - "Selecting previously unselected package libnvparsers-dev.\r\n", - "Preparing to unpack .../7-libnvparsers-dev_7.0.0-1+cuda10.0_amd64.deb ...\r\n", - "Unpacking libnvparsers-dev (7.0.0-1+cuda10.0) ...\r\n", - "Selecting previously unselected package python-libnvinfer.\r\n", - "Preparing to unpack .../8-python-libnvinfer_7.0.0-1+cuda10.0_amd64.deb ...\r\n", - "Unpacking python-libnvinfer (7.0.0-1+cuda10.0) ...\r\n", - "Selecting previously unselected package python3-libnvinfer.\r\n", - "Preparing to unpack .../9-python3-libnvinfer_7.0.0-1+cuda10.0_amd64.deb ...\r\n", - "Unpacking python3-libnvinfer (7.0.0-1+cuda10.0) ...\r\n", - "Setting up libnvinfer7 (7.0.0-1+cuda10.0) ...\r\n", - "Setting up libnvinfer-dev (7.0.0-1+cuda10.0) ...\r\n", - "Setting up libnvinfer-plugin7 (7.0.0-1+cuda10.0) ...\r\n", - "Setting up libnvparsers7 (7.0.0-1+cuda10.0) ...\r\n", - "Setting up libnvonnxparsers7 (7.0.0-1+cuda10.0) ...\r\n", - "Setting up python-libnvinfer (7.0.0-1+cuda10.0) ...\r\n", - "Setting up libnvinfer-plugin-dev (7.0.0-1+cuda10.0) ...\r\n", - "Setting up libnvparsers-dev (7.0.0-1+cuda10.0) ...\r\n", - "Setting up python3-libnvinfer (7.0.0-1+cuda10.0) ...\r\n", - "Setting up libnvonnxparsers-dev (7.0.0-1+cuda10.0) ...\r\n", - "Processing triggers for libc-bin (2.27-3ubuntu1) ...\r\n", - "/sbin/ldconfig.real: /usr/local/lib/python3.6/dist-packages/ideep4py/lib/libmkldnn.so.0 is not a symbolic link\r\n", - "\r\n" - ], - "name": "stdout" - }, - { - "output_type": "stream", - "text": [ - "debconf: unable to initialize frontend: Dialog\n", - "debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. 
at /usr/share/perl5/Debconf/FrontEnd/Dialog.pm line 76, <> line 10.)\n", - "debconf: falling back to frontend: Readline\n", - "debconf: unable to initialize frontend: Readline\n", - "debconf: (This frontend requires a controlling tty.)\n", - "debconf: falling back to frontend: Teletype\n", - "dpkg-preconfigure: unable to re-open stdin: \n", - "W: Target Packages (Packages) is configured multiple times in /etc/apt/sources.list.d/nvidia-machine-learning.list:1 and /etc/apt/sources.list.d/nvidia-ml.list:1\n" - ], - "name": "stderr" - } - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "wOo7YbuhLcUU", - "outputId": "cc0c04e6-1ee6-4089-dfd4-f644f6e065c8", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 219 - } - }, - "source": [ - "!dpkg -l | grep TensorRT" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "ii libnvinfer-dev 7.0.0-1+cuda10.0 amd64 TensorRT development libraries and headers\n", - "ii libnvinfer-plugin-dev 7.0.0-1+cuda10.0 amd64 TensorRT plugin libraries\n", - "ii libnvinfer-plugin7 7.0.0-1+cuda10.0 amd64 TensorRT plugin libraries\n", - "ii libnvinfer7 7.0.0-1+cuda10.0 amd64 TensorRT runtime libraries\n", - "ii libnvonnxparsers-dev 7.0.0-1+cuda10.0 amd64 TensorRT ONNX libraries\n", - "ii libnvonnxparsers7 7.0.0-1+cuda10.0 amd64 TensorRT ONNX libraries\n", - "ii libnvparsers-dev 7.0.0-1+cuda10.0 amd64 TensorRT parsers libraries\n", - "ii libnvparsers7 7.0.0-1+cuda10.0 amd64 TensorRT parsers libraries\n", - "ii python-libnvinfer 7.0.0-1+cuda10.0 amd64 Python bindings for TensorRT\n", - "ii python3-libnvinfer 7.0.0-1+cuda10.0 amd64 Python 3 bindings for TensorRT\n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "XRMZiFjdUPCZ" - }, - "source": [ - "A successful TensorRT installation should look like:\n", - "\n", - "```\n", - "ii libnvinfer-dev 7.0.0-1+cuda10.0 amd64 TensorRT development libraries and headers\n", - "ii libnvinfer-plugin-dev 7.0.0-1+cuda10.0 amd64 TensorRT plugin libraries\n", - "ii libnvinfer-plugin7 7.0.0-1+cuda10.0 amd64 TensorRT plugin libraries\n", - "ii libnvinfer7 7.0.0-1+cuda10.0 amd64 TensorRT runtime libraries\n", - "ii libnvonnxparsers-dev 7.0.0-1+cuda10.0 amd64 TensorRT ONNX libraries\n", - "ii libnvonnxparsers7 7.0.0-1+cuda10.0 amd64 TensorRT ONNX libraries\n", - "ii libnvparsers-dev 7.0.0-1+cuda10.0 amd64 TensorRT parsers libraries\n", - "ii libnvparsers7 7.0.0-1+cuda10.0 amd64 TensorRT parsers libraries\n", - "ii python-libnvinfer 7.0.0-1+cuda10.0 amd64 Python bindings for TensorRT\n", - "ii python3-libnvinfer 7.0.0-1+cuda10.0 amd64 Python 3 bindings for TensorRT\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "lwpwNNw8FS8F" - }, - "source": [ - "## Download pretrained Jasper model from NVIDIA GPU Cloud model repository\n", - "\n", - "NVIDIA provides pretrained Jasper models along with many other deep learning models such as ResNet, BERT, Transformer, SSD... at https://ngc.nvidia.com/catalog/models. Here, we will download and unzip pretrained Jasper Pytorch [models](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16)." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "np8WaN_FFaTF", - "outputId": "ecd47795-2c21-4369-a7d9-64f6d3ab52f3", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 199 - } - }, - "source": [ - "%%bash \n", - "wget -nc -q --show-progress -O jasper_model.zip \\\n", - "https://api.ngc.nvidia.com/v2/models/nvidia/jasperpyt_fp16/versions/1/zip" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "IOPub data rate exceeded.\n", - "The notebook server will temporarily stop sending output\n", - "to the client in order to avoid crashing it.\n", - "To change this limit, set the config variable\n", - "`--NotebookApp.iopub_data_rate_limit`.\n", - "\n", - "Current values:\n", - "NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)\n", - "NotebookApp.rate_limit_window=3.0 (secs)\n", - "\n" - ], - "name": "stderr" - } - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "gsJBwUgXHEkE", - "outputId": "82ee9e38-8aa9-4ee3-9671-d5e01c772ca4", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 53 - } - }, - "source": [ - "!unzip -o ./jasper_model.zip" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Archive: ./jasper_model.zip\n", - " inflating: jasper_fp16.pt \n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "hbafoHBMXr0E" - }, - "source": [ - "After a successful download, a PyTorch checkpoint named `jasper_fp16.pt` should exist in the notebook's current directory." - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "YC2Fu9rWG70U", - "outputId": "95cdf660-c016-4764-bcfe-a7b90afcf215", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 35 - } - }, - "source": [ - "!ls -l jasper_fp16.pt " - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "-rw-r--r-- 1 root root 2661855989 Sep 10 2019 jasper_fp16.pt\n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "z-1-63MOG0um" - }, - "source": [ - "## Install extra dependencies\n", - "\n", - "Before creating the TensorRT execution engine from the PyTorch checkpoint, we install a few extra dependencies to load and convert the PyTorch model and to process input audio files.\n", - "\n", - "- [Apex](https://nvidia.github.io/apex/): NVIDIA's library for automatic mixed precision training in PyTorch\n", - "- [Onnx](https://github.com/onnx/onnx): for processing ONNX models.\n", - "- unidecode, soundfile, toml, pycuda: miscellaneous helper libraries\n", - "\n" - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "A6QC4ngRHw3a", - "outputId": "da11c0b6-45bc-4f0c-d43b-995a556055e4", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 1000 - } - }, - "source": [ - "%%bash \n", - "pip uninstall -y apex\n", - "git clone https://www.github.com/nvidia/apex\n", - "cd apex\n", - "python setup.py install\n" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "torch.__version__ = 1.5.0+cu101\n", - "running install\n", - "running bdist_egg\n", - "running egg_info\n", - "creating apex.egg-info\n", - "writing apex.egg-info/PKG-INFO\n", - "writing dependency_links to apex.egg-info/dependency_links.txt\n", - "writing top-level names to apex.egg-info/top_level.txt\n", - "writing manifest file 
'apex.egg-info/SOURCES.txt'\n", - "writing manifest file 'apex.egg-info/SOURCES.txt'\n", - "installing library code to build/bdist.linux-x86_64/egg\n", - "running install_lib\n", - "running build_py\n", - "creating build\n", - "creating build/lib\n", - "creating build/lib/apex\n", - "copying apex/__init__.py -> build/lib/apex\n", - "creating build/lib/apex/reparameterization\n", - "copying apex/reparameterization/reparameterization.py -> build/lib/apex/reparameterization\n", - "copying apex/reparameterization/weight_norm.py -> build/lib/apex/reparameterization\n", - "copying apex/reparameterization/__init__.py -> build/lib/apex/reparameterization\n", - "creating build/lib/apex/RNN\n", - "copying apex/RNN/RNNBackend.py -> build/lib/apex/RNN\n", - "copying apex/RNN/__init__.py -> build/lib/apex/RNN\n", - "copying apex/RNN/cells.py -> build/lib/apex/RNN\n", - "copying apex/RNN/models.py -> build/lib/apex/RNN\n", - "creating build/lib/apex/contrib\n", - "copying apex/contrib/__init__.py -> build/lib/apex/contrib\n", - "creating build/lib/apex/amp\n", - "copying apex/amp/_amp_state.py -> build/lib/apex/amp\n", - "copying apex/amp/utils.py -> build/lib/apex/amp\n", - "copying apex/amp/handle.py -> build/lib/apex/amp\n", - "copying apex/amp/amp.py -> build/lib/apex/amp\n", - "copying apex/amp/rnn_compat.py -> build/lib/apex/amp\n", - "copying apex/amp/compat.py -> build/lib/apex/amp\n", - "copying apex/amp/opt.py -> build/lib/apex/amp\n", - "copying apex/amp/_process_optimizer.py -> build/lib/apex/amp\n", - "copying apex/amp/scaler.py -> build/lib/apex/amp\n", - "copying apex/amp/__init__.py -> build/lib/apex/amp\n", - "copying apex/amp/wrap.py -> build/lib/apex/amp\n", - "copying apex/amp/_initialize.py -> build/lib/apex/amp\n", - "copying apex/amp/frontend.py -> build/lib/apex/amp\n", - "copying apex/amp/__version__.py -> build/lib/apex/amp\n", - "creating build/lib/apex/mlp\n", - "copying apex/mlp/mlp.py -> build/lib/apex/mlp\n", - "copying apex/mlp/__init__.py -> build/lib/apex/mlp\n", - "creating build/lib/apex/pyprof\n", - "copying apex/pyprof/__init__.py -> build/lib/apex/pyprof\n", - "creating build/lib/apex/parallel\n", - "copying apex/parallel/multiproc.py -> build/lib/apex/parallel\n", - "copying apex/parallel/sync_batchnorm_kernel.py -> build/lib/apex/parallel\n", - "copying apex/parallel/sync_batchnorm.py -> build/lib/apex/parallel\n", - "copying apex/parallel/optimized_sync_batchnorm.py -> build/lib/apex/parallel\n", - "copying apex/parallel/distributed.py -> build/lib/apex/parallel\n", - "copying apex/parallel/optimized_sync_batchnorm_kernel.py -> build/lib/apex/parallel\n", - "copying apex/parallel/__init__.py -> build/lib/apex/parallel\n", - "copying apex/parallel/LARC.py -> build/lib/apex/parallel\n", - "creating build/lib/apex/normalization\n", - "copying apex/normalization/fused_layer_norm.py -> build/lib/apex/normalization\n", - "copying apex/normalization/__init__.py -> build/lib/apex/normalization\n", - "creating build/lib/apex/optimizers\n", - "copying apex/optimizers/fused_sgd.py -> build/lib/apex/optimizers\n", - "copying apex/optimizers/fused_adagrad.py -> build/lib/apex/optimizers\n", - "copying apex/optimizers/fused_novograd.py -> build/lib/apex/optimizers\n", - "copying apex/optimizers/fused_adam.py -> build/lib/apex/optimizers\n", - "copying apex/optimizers/__init__.py -> build/lib/apex/optimizers\n", - "copying apex/optimizers/fused_lamb.py -> build/lib/apex/optimizers\n", - "creating build/lib/apex/multi_tensor_apply\n", - "copying 
apex/multi_tensor_apply/__init__.py -> build/lib/apex/multi_tensor_apply\n", - "copying apex/multi_tensor_apply/multi_tensor_apply.py -> build/lib/apex/multi_tensor_apply\n", - "creating build/lib/apex/fp16_utils\n", - "copying apex/fp16_utils/fp16util.py -> build/lib/apex/fp16_utils\n", - "copying apex/fp16_utils/loss_scaler.py -> build/lib/apex/fp16_utils\n", - "copying apex/fp16_utils/fp16_optimizer.py -> build/lib/apex/fp16_utils\n", - "copying apex/fp16_utils/__init__.py -> build/lib/apex/fp16_utils\n", - "creating build/lib/apex/contrib/multihead_attn\n", - "copying apex/contrib/multihead_attn/self_multihead_attn_func.py -> build/lib/apex/contrib/multihead_attn\n", - "copying apex/contrib/multihead_attn/fast_self_multihead_attn_func.py -> build/lib/apex/contrib/multihead_attn\n", - "copying apex/contrib/multihead_attn/encdec_multihead_attn_func.py -> build/lib/apex/contrib/multihead_attn\n", - "copying apex/contrib/multihead_attn/fast_encdec_multihead_attn_norm_add_func.py -> build/lib/apex/contrib/multihead_attn\n", - "copying apex/contrib/multihead_attn/mask_softmax_dropout_func.py -> build/lib/apex/contrib/multihead_attn\n", - "copying apex/contrib/multihead_attn/encdec_multihead_attn.py -> build/lib/apex/contrib/multihead_attn\n", - "copying apex/contrib/multihead_attn/__init__.py -> build/lib/apex/contrib/multihead_attn\n", - "copying apex/contrib/multihead_attn/fast_self_multihead_attn_norm_add_func.py -> build/lib/apex/contrib/multihead_attn\n", - "copying apex/contrib/multihead_attn/self_multihead_attn.py -> build/lib/apex/contrib/multihead_attn\n", - "copying apex/contrib/multihead_attn/fast_encdec_multihead_attn_func.py -> build/lib/apex/contrib/multihead_attn\n", - "creating build/lib/apex/contrib/optimizers\n", - "copying apex/contrib/optimizers/fused_sgd.py -> build/lib/apex/contrib/optimizers\n", - "copying apex/contrib/optimizers/distributed_fused_adam_v3.py -> build/lib/apex/contrib/optimizers\n", - "copying apex/contrib/optimizers/fused_adam.py -> build/lib/apex/contrib/optimizers\n", - "copying apex/contrib/optimizers/fp16_optimizer.py -> build/lib/apex/contrib/optimizers\n", - "copying apex/contrib/optimizers/distributed_fused_adam_v2.py -> build/lib/apex/contrib/optimizers\n", - "copying apex/contrib/optimizers/__init__.py -> build/lib/apex/contrib/optimizers\n", - "copying apex/contrib/optimizers/distributed_fused_adam.py -> build/lib/apex/contrib/optimizers\n", - "copying apex/contrib/optimizers/fused_lamb.py -> build/lib/apex/contrib/optimizers\n", - "creating build/lib/apex/contrib/groupbn\n", - "copying apex/contrib/groupbn/batch_norm.py -> build/lib/apex/contrib/groupbn\n", - "copying apex/contrib/groupbn/__init__.py -> build/lib/apex/contrib/groupbn\n", - "creating build/lib/apex/contrib/xentropy\n", - "copying apex/contrib/xentropy/__init__.py -> build/lib/apex/contrib/xentropy\n", - "copying apex/contrib/xentropy/softmax_xentropy.py -> build/lib/apex/contrib/xentropy\n", - "creating build/lib/apex/amp/lists\n", - "copying apex/amp/lists/tensor_overrides.py -> build/lib/apex/amp/lists\n", - "copying apex/amp/lists/torch_overrides.py -> build/lib/apex/amp/lists\n", - "copying apex/amp/lists/functional_overrides.py -> build/lib/apex/amp/lists\n", - "copying apex/amp/lists/__init__.py -> build/lib/apex/amp/lists\n", - "creating build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/pointwise.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/utility.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/usage.py -> 
build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/__main__.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/prof.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/misc.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/linear.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/reduction.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/softmax.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/activation.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/pooling.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/loss.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/dropout.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/recurrentCell.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/optim.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/index_slice_join_mutate.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/base.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/randomSample.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/output.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/blas.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/data.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/__init__.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/normalization.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/conv.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/convert.py -> build/lib/apex/pyprof/prof\n", - "copying apex/pyprof/prof/embedding.py -> build/lib/apex/pyprof/prof\n", - "creating build/lib/apex/pyprof/nvtx\n", - "copying apex/pyprof/nvtx/__init__.py -> build/lib/apex/pyprof/nvtx\n", - "copying apex/pyprof/nvtx/nvmarker.py -> build/lib/apex/pyprof/nvtx\n", - "creating build/lib/apex/pyprof/parse\n", - "copying apex/pyprof/parse/__main__.py -> build/lib/apex/pyprof/parse\n", - "copying apex/pyprof/parse/db.py -> build/lib/apex/pyprof/parse\n", - "copying apex/pyprof/parse/nvvp.py -> build/lib/apex/pyprof/parse\n", - "copying apex/pyprof/parse/parse.py -> build/lib/apex/pyprof/parse\n", - "copying apex/pyprof/parse/kernel.py -> build/lib/apex/pyprof/parse\n", - "copying apex/pyprof/parse/__init__.py -> build/lib/apex/pyprof/parse\n", - "creating build/bdist.linux-x86_64\n", - "creating build/bdist.linux-x86_64/egg\n", - "creating build/bdist.linux-x86_64/egg/apex\n", - "creating build/bdist.linux-x86_64/egg/apex/reparameterization\n", - "copying build/lib/apex/reparameterization/reparameterization.py -> build/bdist.linux-x86_64/egg/apex/reparameterization\n", - "copying build/lib/apex/reparameterization/weight_norm.py -> build/bdist.linux-x86_64/egg/apex/reparameterization\n", - "copying build/lib/apex/reparameterization/__init__.py -> build/bdist.linux-x86_64/egg/apex/reparameterization\n", - "creating build/bdist.linux-x86_64/egg/apex/RNN\n", - "copying build/lib/apex/RNN/RNNBackend.py -> build/bdist.linux-x86_64/egg/apex/RNN\n", - "copying build/lib/apex/RNN/__init__.py -> build/bdist.linux-x86_64/egg/apex/RNN\n", - "copying build/lib/apex/RNN/cells.py -> build/bdist.linux-x86_64/egg/apex/RNN\n", - "copying build/lib/apex/RNN/models.py -> build/bdist.linux-x86_64/egg/apex/RNN\n", - "creating build/bdist.linux-x86_64/egg/apex/contrib\n", - "creating build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn\n", - "copying 
build/lib/apex/contrib/multihead_attn/self_multihead_attn_func.py -> build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn\n", - "copying build/lib/apex/contrib/multihead_attn/fast_self_multihead_attn_func.py -> build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn\n", - "copying build/lib/apex/contrib/multihead_attn/encdec_multihead_attn_func.py -> build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn\n", - "copying build/lib/apex/contrib/multihead_attn/fast_encdec_multihead_attn_norm_add_func.py -> build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn\n", - "copying build/lib/apex/contrib/multihead_attn/mask_softmax_dropout_func.py -> build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn\n", - "copying build/lib/apex/contrib/multihead_attn/encdec_multihead_attn.py -> build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn\n", - "copying build/lib/apex/contrib/multihead_attn/__init__.py -> build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn\n", - "copying build/lib/apex/contrib/multihead_attn/fast_self_multihead_attn_norm_add_func.py -> build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn\n", - "copying build/lib/apex/contrib/multihead_attn/self_multihead_attn.py -> build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn\n", - "copying build/lib/apex/contrib/multihead_attn/fast_encdec_multihead_attn_func.py -> build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn\n", - "creating build/bdist.linux-x86_64/egg/apex/contrib/optimizers\n", - "copying build/lib/apex/contrib/optimizers/fused_sgd.py -> build/bdist.linux-x86_64/egg/apex/contrib/optimizers\n", - "copying build/lib/apex/contrib/optimizers/distributed_fused_adam_v3.py -> build/bdist.linux-x86_64/egg/apex/contrib/optimizers\n", - "copying build/lib/apex/contrib/optimizers/fused_adam.py -> build/bdist.linux-x86_64/egg/apex/contrib/optimizers\n", - "copying build/lib/apex/contrib/optimizers/fp16_optimizer.py -> build/bdist.linux-x86_64/egg/apex/contrib/optimizers\n", - "copying build/lib/apex/contrib/optimizers/distributed_fused_adam_v2.py -> build/bdist.linux-x86_64/egg/apex/contrib/optimizers\n", - "copying build/lib/apex/contrib/optimizers/__init__.py -> build/bdist.linux-x86_64/egg/apex/contrib/optimizers\n", - "copying build/lib/apex/contrib/optimizers/distributed_fused_adam.py -> build/bdist.linux-x86_64/egg/apex/contrib/optimizers\n", - "copying build/lib/apex/contrib/optimizers/fused_lamb.py -> build/bdist.linux-x86_64/egg/apex/contrib/optimizers\n", - "copying build/lib/apex/contrib/__init__.py -> build/bdist.linux-x86_64/egg/apex/contrib\n", - "creating build/bdist.linux-x86_64/egg/apex/contrib/groupbn\n", - "copying build/lib/apex/contrib/groupbn/batch_norm.py -> build/bdist.linux-x86_64/egg/apex/contrib/groupbn\n", - "copying build/lib/apex/contrib/groupbn/__init__.py -> build/bdist.linux-x86_64/egg/apex/contrib/groupbn\n", - "creating build/bdist.linux-x86_64/egg/apex/contrib/xentropy\n", - "copying build/lib/apex/contrib/xentropy/__init__.py -> build/bdist.linux-x86_64/egg/apex/contrib/xentropy\n", - "copying build/lib/apex/contrib/xentropy/softmax_xentropy.py -> build/bdist.linux-x86_64/egg/apex/contrib/xentropy\n", - "creating build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/_amp_state.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/utils.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/handle.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/amp.py -> 
build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/rnn_compat.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/compat.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/opt.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/_process_optimizer.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/scaler.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/__init__.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "creating build/bdist.linux-x86_64/egg/apex/amp/lists\n", - "copying build/lib/apex/amp/lists/tensor_overrides.py -> build/bdist.linux-x86_64/egg/apex/amp/lists\n", - "copying build/lib/apex/amp/lists/torch_overrides.py -> build/bdist.linux-x86_64/egg/apex/amp/lists\n", - "copying build/lib/apex/amp/lists/functional_overrides.py -> build/bdist.linux-x86_64/egg/apex/amp/lists\n", - "copying build/lib/apex/amp/lists/__init__.py -> build/bdist.linux-x86_64/egg/apex/amp/lists\n", - "copying build/lib/apex/amp/wrap.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/_initialize.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/frontend.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "copying build/lib/apex/amp/__version__.py -> build/bdist.linux-x86_64/egg/apex/amp\n", - "creating build/bdist.linux-x86_64/egg/apex/mlp\n", - "copying build/lib/apex/mlp/mlp.py -> build/bdist.linux-x86_64/egg/apex/mlp\n", - "copying build/lib/apex/mlp/__init__.py -> build/bdist.linux-x86_64/egg/apex/mlp\n", - "creating build/bdist.linux-x86_64/egg/apex/pyprof\n", - "creating build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/pointwise.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/utility.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/usage.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/__main__.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/prof.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/misc.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/linear.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/reduction.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/softmax.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/activation.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/pooling.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/loss.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/dropout.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/recurrentCell.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/optim.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/index_slice_join_mutate.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/base.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/randomSample.py -> 
build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/output.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/blas.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/data.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/__init__.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/normalization.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/conv.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/convert.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/prof/embedding.py -> build/bdist.linux-x86_64/egg/apex/pyprof/prof\n", - "copying build/lib/apex/pyprof/__init__.py -> build/bdist.linux-x86_64/egg/apex/pyprof\n", - "creating build/bdist.linux-x86_64/egg/apex/pyprof/nvtx\n", - "copying build/lib/apex/pyprof/nvtx/__init__.py -> build/bdist.linux-x86_64/egg/apex/pyprof/nvtx\n", - "copying build/lib/apex/pyprof/nvtx/nvmarker.py -> build/bdist.linux-x86_64/egg/apex/pyprof/nvtx\n", - "creating build/bdist.linux-x86_64/egg/apex/pyprof/parse\n", - "copying build/lib/apex/pyprof/parse/__main__.py -> build/bdist.linux-x86_64/egg/apex/pyprof/parse\n", - "copying build/lib/apex/pyprof/parse/db.py -> build/bdist.linux-x86_64/egg/apex/pyprof/parse\n", - "copying build/lib/apex/pyprof/parse/nvvp.py -> build/bdist.linux-x86_64/egg/apex/pyprof/parse\n", - "copying build/lib/apex/pyprof/parse/parse.py -> build/bdist.linux-x86_64/egg/apex/pyprof/parse\n", - "copying build/lib/apex/pyprof/parse/kernel.py -> build/bdist.linux-x86_64/egg/apex/pyprof/parse\n", - "copying build/lib/apex/pyprof/parse/__init__.py -> build/bdist.linux-x86_64/egg/apex/pyprof/parse\n", - "creating build/bdist.linux-x86_64/egg/apex/parallel\n", - "copying build/lib/apex/parallel/multiproc.py -> build/bdist.linux-x86_64/egg/apex/parallel\n", - "copying build/lib/apex/parallel/sync_batchnorm_kernel.py -> build/bdist.linux-x86_64/egg/apex/parallel\n", - "copying build/lib/apex/parallel/sync_batchnorm.py -> build/bdist.linux-x86_64/egg/apex/parallel\n", - "copying build/lib/apex/parallel/optimized_sync_batchnorm.py -> build/bdist.linux-x86_64/egg/apex/parallel\n", - "copying build/lib/apex/parallel/distributed.py -> build/bdist.linux-x86_64/egg/apex/parallel\n", - "copying build/lib/apex/parallel/optimized_sync_batchnorm_kernel.py -> build/bdist.linux-x86_64/egg/apex/parallel\n", - "copying build/lib/apex/parallel/__init__.py -> build/bdist.linux-x86_64/egg/apex/parallel\n", - "copying build/lib/apex/parallel/LARC.py -> build/bdist.linux-x86_64/egg/apex/parallel\n", - "creating build/bdist.linux-x86_64/egg/apex/normalization\n", - "copying build/lib/apex/normalization/fused_layer_norm.py -> build/bdist.linux-x86_64/egg/apex/normalization\n", - "copying build/lib/apex/normalization/__init__.py -> build/bdist.linux-x86_64/egg/apex/normalization\n", - "creating build/bdist.linux-x86_64/egg/apex/optimizers\n", - "copying build/lib/apex/optimizers/fused_sgd.py -> build/bdist.linux-x86_64/egg/apex/optimizers\n", - "copying build/lib/apex/optimizers/fused_adagrad.py -> build/bdist.linux-x86_64/egg/apex/optimizers\n", - "copying build/lib/apex/optimizers/fused_novograd.py -> build/bdist.linux-x86_64/egg/apex/optimizers\n", - "copying build/lib/apex/optimizers/fused_adam.py -> 
build/bdist.linux-x86_64/egg/apex/optimizers\n", - "copying build/lib/apex/optimizers/__init__.py -> build/bdist.linux-x86_64/egg/apex/optimizers\n", - "copying build/lib/apex/optimizers/fused_lamb.py -> build/bdist.linux-x86_64/egg/apex/optimizers\n", - "copying build/lib/apex/__init__.py -> build/bdist.linux-x86_64/egg/apex\n", - "creating build/bdist.linux-x86_64/egg/apex/multi_tensor_apply\n", - "copying build/lib/apex/multi_tensor_apply/__init__.py -> build/bdist.linux-x86_64/egg/apex/multi_tensor_apply\n", - "copying build/lib/apex/multi_tensor_apply/multi_tensor_apply.py -> build/bdist.linux-x86_64/egg/apex/multi_tensor_apply\n", - "creating build/bdist.linux-x86_64/egg/apex/fp16_utils\n", - "copying build/lib/apex/fp16_utils/fp16util.py -> build/bdist.linux-x86_64/egg/apex/fp16_utils\n", - "copying build/lib/apex/fp16_utils/loss_scaler.py -> build/bdist.linux-x86_64/egg/apex/fp16_utils\n", - "copying build/lib/apex/fp16_utils/fp16_optimizer.py -> build/bdist.linux-x86_64/egg/apex/fp16_utils\n", - "copying build/lib/apex/fp16_utils/__init__.py -> build/bdist.linux-x86_64/egg/apex/fp16_utils\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/reparameterization/reparameterization.py to reparameterization.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/reparameterization/weight_norm.py to weight_norm.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/reparameterization/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/RNN/RNNBackend.py to RNNBackend.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/RNN/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/RNN/cells.py to cells.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/RNN/models.py to models.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn/self_multihead_attn_func.py to self_multihead_attn_func.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn/fast_self_multihead_attn_func.py to fast_self_multihead_attn_func.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn/encdec_multihead_attn_func.py to encdec_multihead_attn_func.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn/fast_encdec_multihead_attn_norm_add_func.py to fast_encdec_multihead_attn_norm_add_func.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn/mask_softmax_dropout_func.py to mask_softmax_dropout_func.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn/encdec_multihead_attn.py to encdec_multihead_attn.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn/fast_self_multihead_attn_norm_add_func.py to fast_self_multihead_attn_norm_add_func.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn/self_multihead_attn.py to self_multihead_attn.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/multihead_attn/fast_encdec_multihead_attn_func.py to fast_encdec_multihead_attn_func.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/optimizers/fused_sgd.py to fused_sgd.cpython-36.pyc\n", - "byte-compiling 
build/bdist.linux-x86_64/egg/apex/contrib/optimizers/distributed_fused_adam_v3.py to distributed_fused_adam_v3.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/optimizers/fused_adam.py to fused_adam.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/optimizers/fp16_optimizer.py to fp16_optimizer.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/optimizers/distributed_fused_adam_v2.py to distributed_fused_adam_v2.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/optimizers/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/optimizers/distributed_fused_adam.py to distributed_fused_adam.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/optimizers/fused_lamb.py to fused_lamb.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/groupbn/batch_norm.py to batch_norm.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/groupbn/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/xentropy/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/contrib/xentropy/softmax_xentropy.py to softmax_xentropy.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/_amp_state.py to _amp_state.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/utils.py to utils.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/handle.py to handle.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/amp.py to amp.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/rnn_compat.py to rnn_compat.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/compat.py to compat.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/opt.py to opt.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/_process_optimizer.py to _process_optimizer.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/scaler.py to scaler.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/lists/tensor_overrides.py to tensor_overrides.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/lists/torch_overrides.py to torch_overrides.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/lists/functional_overrides.py to functional_overrides.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/lists/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/wrap.py to wrap.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/_initialize.py to _initialize.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/frontend.py to frontend.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/amp/__version__.py to __version__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/mlp/mlp.py to mlp.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/mlp/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling 
build/bdist.linux-x86_64/egg/apex/pyprof/prof/pointwise.py to pointwise.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/utility.py to utility.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/usage.py to usage.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/__main__.py to __main__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/prof.py to prof.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/misc.py to misc.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/linear.py to linear.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/reduction.py to reduction.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/softmax.py to softmax.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/activation.py to activation.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/pooling.py to pooling.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/loss.py to loss.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/dropout.py to dropout.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/recurrentCell.py to recurrentCell.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/optim.py to optim.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/index_slice_join_mutate.py to index_slice_join_mutate.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/base.py to base.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/randomSample.py to randomSample.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/output.py to output.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/blas.py to blas.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/data.py to data.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/normalization.py to normalization.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/conv.py to conv.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/convert.py to convert.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/prof/embedding.py to embedding.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/nvtx/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/nvtx/nvmarker.py to nvmarker.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/parse/__main__.py to __main__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/parse/db.py to db.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/parse/nvvp.py to nvvp.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/parse/parse.py to parse.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/parse/kernel.py to 
kernel.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/pyprof/parse/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/parallel/multiproc.py to multiproc.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/parallel/sync_batchnorm_kernel.py to sync_batchnorm_kernel.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/parallel/sync_batchnorm.py to sync_batchnorm.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/parallel/optimized_sync_batchnorm.py to optimized_sync_batchnorm.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/parallel/distributed.py to distributed.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/parallel/optimized_sync_batchnorm_kernel.py to optimized_sync_batchnorm_kernel.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/parallel/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/parallel/LARC.py to LARC.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/normalization/fused_layer_norm.py to fused_layer_norm.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/normalization/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/optimizers/fused_sgd.py to fused_sgd.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/optimizers/fused_adagrad.py to fused_adagrad.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/optimizers/fused_novograd.py to fused_novograd.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/optimizers/fused_adam.py to fused_adam.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/optimizers/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/optimizers/fused_lamb.py to fused_lamb.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/multi_tensor_apply/__init__.py to __init__.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/multi_tensor_apply/multi_tensor_apply.py to multi_tensor_apply.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/fp16_utils/fp16util.py to fp16util.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/fp16_utils/loss_scaler.py to loss_scaler.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/fp16_utils/fp16_optimizer.py to fp16_optimizer.cpython-36.pyc\n", - "byte-compiling build/bdist.linux-x86_64/egg/apex/fp16_utils/__init__.py to __init__.cpython-36.pyc\n", - "creating build/bdist.linux-x86_64/egg/EGG-INFO\n", - "copying apex.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO\n", - "copying apex.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO\n", - "copying apex.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO\n", - "copying apex.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO\n", - "creating dist\n", - "creating 'dist/apex-0.1-py3.6.egg' and adding 'build/bdist.linux-x86_64/egg' to it\n", - "removing 'build/bdist.linux-x86_64/egg' (and everything under it)\n", - "Processing apex-0.1-py3.6.egg\n", - "creating /usr/local/lib/python3.6/dist-packages/apex-0.1-py3.6.egg\n", - "Extracting apex-0.1-py3.6.egg to /usr/local/lib/python3.6/dist-packages\n", - "Adding apex 0.1 to 
easy-install.pth file\n", - "\n", - "Installed /usr/local/lib/python3.6/dist-packages/apex-0.1-py3.6.egg\n", - "Processing dependencies for apex==0.1\n", - "Finished processing dependencies for apex==0.1\n" - ], - "name": "stdout" - }, - { - "output_type": "stream", - "text": [ - "WARNING: Skipping apex as it is not installed.\n", - "Cloning into 'apex'...\n", - "warning: redirecting to https://github.com/nvidia/apex.git/\n", - "setup.py:46: UserWarning: Option --pyprof not specified. Not installing PyProf dependencies!\n", - " warnings.warn(\"Option --pyprof not specified. Not installing PyProf dependencies!\")\n", - "zip_safe flag not set; analyzing archive contents...\n", - "apex.pyprof.nvtx.__pycache__.nvmarker.cpython-36: module references __file__\n", - "apex.pyprof.nvtx.__pycache__.nvmarker.cpython-36: module references __path__\n" - ], - "name": "stderr" - } - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "QPJbKeigIdOC", - "outputId": "2a4c86f2-31f1-46a8-eae7-c79827c258ca", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 655 - } - }, - "source": [ - "!pip install unidecode soundfile toml pycuda" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Collecting unidecode\n", - "\u001b[?25l Downloading https://files.pythonhosted.org/packages/d0/42/d9edfed04228bacea2d824904cae367ee9efd05e6cce7ceaaedd0b0ad964/Unidecode-1.1.1-py2.py3-none-any.whl (238kB)\n", - "\r\u001b[K |█▍ | 10kB 21.2MB/s eta 0:00:01\r\u001b[K |██▊ | 20kB 5.1MB/s eta 0:00:01\r\u001b[K |████▏ | 30kB 5.4MB/s eta 0:00:01\r\u001b[K |█████▌ | 40kB 6.2MB/s eta 0:00:01\r\u001b[K |██████▉ | 51kB 6.8MB/s eta 0:00:01\r\u001b[K |████████▎ | 61kB 7.2MB/s eta 0:00:01\r\u001b[K |█████████▋ | 71kB 6.9MB/s eta 0:00:01\r\u001b[K |███████████ | 81kB 6.8MB/s eta 0:00:01\r\u001b[K |████████████▍ | 92kB 6.8MB/s eta 0:00:01\r\u001b[K |█████████████▊ | 102kB 7.2MB/s eta 0:00:01\r\u001b[K |███████████████▏ | 112kB 7.2MB/s eta 0:00:01\r\u001b[K |████████████████▌ | 122kB 7.2MB/s eta 0:00:01\r\u001b[K |█████████████████▉ | 133kB 7.2MB/s eta 0:00:01\r\u001b[K |███████████████████▎ | 143kB 7.2MB/s eta 0:00:01\r\u001b[K |████████████████████▋ | 153kB 7.2MB/s eta 0:00:01\r\u001b[K |██████████████████████ | 163kB 7.2MB/s eta 0:00:01\r\u001b[K |███████████████████████▍ | 174kB 7.2MB/s eta 0:00:01\r\u001b[K |████████████████████████▊ | 184kB 7.2MB/s eta 0:00:01\r\u001b[K |██████████████████████████▏ | 194kB 7.2MB/s eta 0:00:01\r\u001b[K |███████████████████████████▌ | 204kB 7.2MB/s eta 0:00:01\r\u001b[K |████████████████████████████▉ | 215kB 7.2MB/s eta 0:00:01\r\u001b[K |██████████████████████████████▎ | 225kB 7.2MB/s eta 0:00:01\r\u001b[K |███████████████████████████████▋| 235kB 7.2MB/s eta 0:00:01\r\u001b[K |████████████████████████████████| 245kB 7.2MB/s \n", - "\u001b[?25hCollecting soundfile\n", - " Downloading https://files.pythonhosted.org/packages/eb/f2/3cbbbf3b96fb9fa91582c438b574cff3f45b29c772f94c400e2c99ef5db9/SoundFile-0.10.3.post1-py2.py3-none-any.whl\n", - "Collecting toml\n", - " Downloading https://files.pythonhosted.org/packages/9f/e1/1b40b80f2e1663a6b9f497123c11d7d988c0919abbf3c3f2688e448c5363/toml-0.10.1-py2.py3-none-any.whl\n", - "Collecting pycuda\n", - "\u001b[?25l Downloading https://files.pythonhosted.org/packages/5e/3f/5658c38579b41866ba21ee1b5020b8225cec86fe717e4b1c5c972de0a33c/pycuda-2019.1.2.tar.gz (1.6MB)\n", - "\u001b[K |████████████████████████████████| 1.6MB 13.0MB/s \n", - "\u001b[?25hRequirement already satisfied: cffi>=1.0 in 
/usr/local/lib/python3.6/dist-packages (from soundfile) (1.14.0)\n", - "Collecting pytools>=2011.2\n", - "\u001b[?25l Downloading https://files.pythonhosted.org/packages/56/4c/a04ed1882ae0fd756b787be4d0f15d81c137952d83cf9b991bba0bbb54ba/pytools-2020.2.tar.gz (63kB)\n", - "\u001b[K |████████████████████████████████| 71kB 8.8MB/s \n", - "\u001b[?25hRequirement already satisfied: decorator>=3.2.0 in /usr/local/lib/python3.6/dist-packages (from pycuda) (4.4.2)\n", - "Collecting appdirs>=1.4.0\n", - " Downloading https://files.pythonhosted.org/packages/3b/00/2344469e2084fb287c2e0b57b72910309874c3245463acd6cf5e3db69324/appdirs-1.4.4-py2.py3-none-any.whl\n", - "Collecting mako\n", - "\u001b[?25l Downloading https://files.pythonhosted.org/packages/a6/37/0e706200d22172eb8fa17d68a7ae22dec7631a0a92266634fb518a88a5b2/Mako-1.1.3-py2.py3-none-any.whl (75kB)\n", - "\u001b[K |████████████████████████████████| 81kB 9.6MB/s \n", - "\u001b[?25hRequirement already satisfied: pycparser in /usr/local/lib/python3.6/dist-packages (from cffi>=1.0->soundfile) (2.20)\n", - "Requirement already satisfied: six>=1.8.0 in /usr/local/lib/python3.6/dist-packages (from pytools>=2011.2->pycuda) (1.12.0)\n", - "Requirement already satisfied: numpy>=1.6.0 in /usr/local/lib/python3.6/dist-packages (from pytools>=2011.2->pycuda) (1.18.4)\n", - "Requirement already satisfied: MarkupSafe>=0.9.2 in /usr/local/lib/python3.6/dist-packages (from mako->pycuda) (1.1.1)\n", - "Building wheels for collected packages: pycuda, pytools\n", - " Building wheel for pycuda (setup.py) ... \u001b[?25l\u001b[?25hdone\n", - " Created wheel for pycuda: filename=pycuda-2019.1.2-cp36-cp36m-linux_x86_64.whl size=4535577 sha256=59be2ae6c720d30baa53abfc01ae1e32d9b1d724a9bccb003138c7f3a400bf02\n", - " Stored in directory: /root/.cache/pip/wheels/a6/60/f0/b1c430c73d281ac3e46070480db50f7907364eb6f6d3188396\n", - " Building wheel for pytools (setup.py) ... 
\u001b[?25l\u001b[?25hdone\n", - " Created wheel for pytools: filename=pytools-2020.2-py2.py3-none-any.whl size=62338 sha256=38fb5f88b826536d83f0103e1ad56bed3705844d240f3989d4869537f73b7d70\n", - " Stored in directory: /root/.cache/pip/wheels/a7/d6/ac/03a67d071bde6d272d1f7c9ab7f4344fa9d7b9d98bda7fd127\n", - "Successfully built pycuda pytools\n", - "Installing collected packages: unidecode, soundfile, toml, appdirs, pytools, mako, pycuda\n", - "Successfully installed appdirs-1.4.4 mako-1.1.3 pycuda-2019.1.2 pytools-2020.2 soundfile-0.10.3.post1 toml-0.10.1 unidecode-1.1.1\n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "UfMZMGbEMFXE", - "outputId": "e5d0bec2-774a-43ba-f27e-5a79e8cfd2d8", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 346 - } - }, - "source": [ - "!pip install onnx==1.5.0 onnxruntime==0.5.0 torch==1.3.0" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Requirement already satisfied: onnx==1.5.0 in /usr/local/lib/python3.6/dist-packages (1.5.0)\n", - "Requirement already satisfied: onnxruntime==0.5.0 in /usr/local/lib/python3.6/dist-packages (0.5.0)\n", - "Collecting torch==1.3.0\n", - "\u001b[?25l Downloading https://files.pythonhosted.org/packages/ae/05/50a05de5337f7a924bb8bd70c6936230642233e424d6a9747ef1cfbde353/torch-1.3.0-cp36-cp36m-manylinux1_x86_64.whl (773.1MB)\n", - "\u001b[K |████████████████████████████████| 773.1MB 23kB/s \n", - "\u001b[?25hRequirement already satisfied: protobuf in /usr/local/lib/python3.6/dist-packages (from onnx==1.5.0) (3.10.0)\n", - "Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from onnx==1.5.0) (1.12.0)\n", - "Requirement already satisfied: typing>=3.6.4 in /usr/local/lib/python3.6/dist-packages (from onnx==1.5.0) (3.6.6)\n", - "Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from onnx==1.5.0) (1.18.4)\n", - "Requirement already satisfied: typing-extensions>=3.6.2.1 in /usr/local/lib/python3.6/dist-packages (from onnx==1.5.0) (3.6.6)\n", - "Requirement already satisfied: setuptools in /usr/local/lib/python3.6/dist-packages (from protobuf->onnx==1.5.0) (46.4.0)\n", - "\u001b[31mERROR: torchvision 0.6.0+cu101 has requirement torch==1.5.0, but you'll have torch 1.3.0 which is incompatible.\u001b[0m\n", - "Installing collected packages: torch\n", - " Found existing installation: torch 1.5.0+cu101\n", - " Uninstalling torch-1.5.0+cu101:\n", - " Successfully uninstalled torch-1.5.0+cu101\n", - "Successfully installed torch-1.3.0\n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "Bup2MbIjStm2" - }, - "source": [ - "## Play with audio examples\n", - "\n", - "You can perform inference using a pre-trained checkpoint, which takes an audio file (in .wav format) as input and produces the corresponding text file. You can customize the content of the input .wav file. 
For example, there are several example input files in the \"notebooks\" directory, and we can listen to example1.wav:" - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "u7J2WjikSxgu", - "outputId": "b0c06bc0-ecf0-4165-e193-a7372cfbb4a8", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 61 - } - }, - "source": [ - "import IPython.display as ipd\n", - "ipd.Audio('./example1.wav', rate=22050)" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "execute_result", - "data": { - "text/html": [ - "\n", - " \n", - " " - ], - "text/plain": [ - "" - ] - }, - "metadata": { - "tags": [] - }, - "execution_count": 21 - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "la9uEY8ja9iV" - }, - "source": [ - "You can also download your own audio sample to Colab with\n", - "\n", - "```!wget ```" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "9Sa3IbQ3NawI", - "colab_type": "text" - }, - "source": [ - "## Modifying the batch size\n", - "\n", - "Before proceeding with TRT engine creation, we will override the default maximum batch size to reduce memory usage. In the cell below, if you later run into memory issues, edit:\n", - "\n", - "```builder.max_batch_size = 16```\n", - "\n", - "to a smaller value." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "W5Nqxw8FMetE", - "colab_type": "code", - "outputId": "12c0c21d-6f32-43fb-eb08-ff03f0e315b6", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 35 - } - }, - "source": [ - "%%writefile /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/trt/trtutils.py\n", - "# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.\n", - "#\n", - "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# http://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License.\n", - "'''Contains helper functions for TRT components of JASPER inference\n", - "'''\n", - "import pycuda.driver as cuda\n", - "import tensorrt as trt\n", - "import onnxruntime as ort\n", - "import numpy as np\n", - "\n", - "class HostDeviceMem(object):\n", - " '''Type for managing host and device buffers\n", - "\n", - " A simple class which is more explicit than dealing with a 2-tuple.\n", - " '''\n", - " def __init__(self, host_mem, device_mem):\n", - " self.host = host_mem\n", - " self.device = device_mem\n", - "\n", - " def __str__(self):\n", - " return \"Host:\\n\" + str(self.host) + \"\\nDevice:\\n\" + str(self.device)\n", - "\n", - " def __repr__(self):\n", - " return self.__str__()\n", - "\n", - "def build_engine_from_parser(args):\n", - " '''Builds TRT engine from an ONNX file\n", - " Note that network output 1 is unmarked so that the engine will not use\n", - " vestigial length calculations associated with masked_fill\n", - " '''\n", - " TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE) if args.verbose else trt.Logger(trt.Logger.WARNING)\n", - " builder = trt.Builder(TRT_LOGGER)\n", - " builder.max_batch_size = 16\n", - "\n", - " if args.trt_fp16:\n", - " builder.fp16_mode = 
True\n", - " print(\"Optimizing for FP16\")\n", - " config_flags = 1 << int(trt.BuilderFlag.FP16) # | 1 << int(trt.BuilderFlag.STRICT_TYPES)\n", - " max_size = 4*1024*1024*1024\n", - " max_len = args.max_seq_len\n", - " else:\n", - " config_flags = 0\n", - " max_size = 4*1024*1024*1024\n", - " max_len = args.max_seq_len\n", - " if args.max_workspace_size > 0:\n", - " builder.max_workspace_size = args.max_workspace_size\n", - " else:\n", - " builder.max_workspace_size = max_size\n", - " \n", - " config = builder.create_builder_config()\n", - " config.flags = config_flags\n", - " \n", - " if not args.static_shape:\n", - " profile = builder.create_optimization_profile()\n", - " if args.transpose:\n", - " profile.set_shape(\"FEATURES\", min=(1,192,64), opt=(args.engine_batch_size,256,64), max=(builder.max_batch_size, max_len, 64))\n", - " else:\n", - " profile.set_shape(\"FEATURES\", min=(1,64,192), opt=(args.engine_batch_size,64,256), max=(builder.max_batch_size, 64, max_len)) \n", - " config.add_optimization_profile(profile) \n", - " explicit_batch = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)\n", - " network = builder.create_network(explicit_batch)\n", - "\n", - " with trt.OnnxParser(network, TRT_LOGGER) as parser:\n", - " with open(args.onnx_path, 'rb') as model:\n", - " parsed = parser.parse(model.read())\n", - " print (\"Parsing returned \", parsed, \"dynamic_shape= \" , not args.static_shape, \"\\n\")\n", - " return builder.build_engine(network, config=config)\n", - "\n", - "def deserialize_engine(engine_path, is_verbose):\n", - " '''Deserializes TRT engine at engine_path\n", - " '''\n", - " TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE) if is_verbose else trt.Logger(trt.Logger.WARNING)\n", - " with open(engine_path, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:\n", - " engine = runtime.deserialize_cuda_engine(f.read())\n", - " return engine\n", - "\n", - "\n", - "def allocate_buffers_with_existing_inputs(context, inp):\n", - " '''\n", - " allocate_buffers() (see TRT python samples) but uses an existing inputs on device\n", - "\n", - " inp: List of pointers to device memory. Pointers are in the same order as\n", - " would be produced by allocate_buffers(). That is, inputs are in the\n", - " order defined by iterating through `engine`\n", - " '''\n", - " # Add input to bindings\n", - " bindings = [0,0]\n", - " outputs = []\n", - " engine = context.engine\n", - " batch_size = inp[0].shape\n", - " inp_idx = engine.get_binding_index(\"FEATURES\") \n", - " inp_b = inp[0].data_ptr()\n", - " assert(inp[0].is_contiguous())\n", - " bindings[inp_idx] = inp_b\n", - " sh = inp[0].shape\n", - " batch_size = sh[0]\n", - " orig_shape = context.get_binding_shape(inp_idx)\n", - " if orig_shape[0]==-1:\n", - " context.set_binding_shape(inp_idx, trt.Dims([batch_size, sh[1], sh[2]]))\n", - "\n", - " assert context.all_binding_shapes_specified\n", - "\n", - " out_idx = engine.get_binding_index(\"LOGITS\")\n", - " # Allocate output buffer by querying the size from the context. 
This may be different for different input shapes.\n", - " out_shape = context.get_binding_shape(out_idx)\n", - " #print (\"Out_shape: \", out_shape)\n", - " h_output = cuda.pagelocked_empty(tuple(out_shape), dtype=np.float32())\n", - " # print (\"Out bytes: \" , h_output.nbytes)\n", - " d_output = cuda.mem_alloc(h_output.nbytes)\n", - " bindings[out_idx] = int(d_output)\n", - " hdm = HostDeviceMem(h_output, d_output)\n", - " outputs.append(hdm)\n", - " return outputs, bindings, out_shape\n", - "\n", - "def get_engine(args):\n", - " '''Get a TRT engine\n", - "\n", - " If --use_existing_engine is set, deserialize the engine stored at --engine_path.\n", - " Else, if both --engine_path and --onnx_path are given, build a new engine from ONNX and serialize it to --engine_path.\n", - " Otherwise, if only --onnx_path is given, return an onnxruntime InferenceSession instead.\n", - " '''\n", - " engine = None\n", - "\n", - " if args.engine_path is not None and args.use_existing_engine:\n", - " engine = deserialize_engine(args.engine_path, args.verbose)\n", - " elif args.engine_path is not None and args.onnx_path is not None:\n", - " # Build a new engine and serialize it.\n", - " print(\"Building TRT engine ....\") \n", - " engine = build_engine_from_parser(args)\n", - " if engine is not None:\n", - " with open(args.engine_path, 'wb') as f:\n", - " f.write(engine.serialize())\n", - " print(\"TRT engine saved at \" + args.engine_path + \" ...\") \n", - " elif args.onnx_path is not None:\n", - " ort_session = ort.InferenceSession(args.onnx_path)\n", - " return ort_session\n", - " else:\n", - " raise Exception(\"One of the following sets of arguments must be provided:\\n\"+\n", - " \" + --use_existing_engine\\n\"+\n", - " \" + \\n\"+\n", - " \"in order to construct a TRT engine\")\n", - " if engine is None:\n", - " raise Exception(\"Failed to acquire TRT engine\")\n", - "\n", - " return engine\n" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Overwriting /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/trt/trtutils.py\n" - ], - "name": "stdout" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "571hMhhGRnOB" - }, - "source": [ - "## FP32 Inference with TensorRT\n", - "\n", - "\n", - "### Creating TensorRT FP32 execution plan\n", - "\n", - "You can run inference using the trt/perf.py script:\n", - "* the checkpoint is passed as the `--ckpt_path` argument\n", - "* `--model_toml` specifies the path to the network configuration file (see examples in the \"configs\" directory)\n", - "* `--make_onnx` exports the model to an ONNX file at the given path, if set\n", - "* `--engine_path` saves the engine file (*.plan)\n", - "\n", - "To create a new engine file (jasper.plan) for TensorRT and run it using FP32 (building the engine for the first time can take several minutes):" - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "aJrN9pmdG4C8", - "outputId": "51e11054-567c-4d83-949f-be901073eaa0", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 1000 - } - }, - "source": [ - "%%bash\n", - "export PYTHONPATH=/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper \n", - "python ../trt/perf.py \\\n", - "--ckpt_path ./jasper_fp16.pt --wav=example1.wav \\\n", - "--model_toml=../configs/jasper10x5dr_nomask.toml \\\n", - "--make_onnx --onnx_path jasper.onnx \\\n", - "--engine_path jasper.plan" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Getting component\n", - "graph(%FEATURES : Float(16, 64, 1006),\n", - " 
%jasper_encoder.encoder.0.conv.0.weight : Float(256, 64, 11),\n", - " %jasper_encoder.encoder.0.conv.1.weight : Float(256),\n", - " %jasper_encoder.encoder.0.conv.1.bias : Float(256),\n", - " %jasper_encoder.encoder.0.conv.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.0.conv.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.conv.0.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.1.conv.1.weight : Float(256),\n", - " %jasper_encoder.encoder.1.conv.1.bias : Float(256),\n", - " %jasper_encoder.encoder.1.conv.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.conv.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.conv.4.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.1.conv.5.weight : Float(256),\n", - " %jasper_encoder.encoder.1.conv.5.bias : Float(256),\n", - " %jasper_encoder.encoder.1.conv.5.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.conv.5.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.conv.8.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.1.conv.9.weight : Float(256),\n", - " %jasper_encoder.encoder.1.conv.9.bias : Float(256),\n", - " %jasper_encoder.encoder.1.conv.9.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.conv.9.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.conv.12.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.1.conv.13.weight : Float(256),\n", - " %jasper_encoder.encoder.1.conv.13.bias : Float(256),\n", - " %jasper_encoder.encoder.1.conv.13.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.conv.13.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.conv.16.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.1.conv.17.weight : Float(256),\n", - " %jasper_encoder.encoder.1.conv.17.bias : Float(256),\n", - " %jasper_encoder.encoder.1.conv.17.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.conv.17.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.res.0.0.weight : Float(256, 256, 1),\n", - " %jasper_encoder.encoder.1.res.0.1.weight : Float(256),\n", - " %jasper_encoder.encoder.1.res.0.1.bias : Float(256),\n", - " %jasper_encoder.encoder.1.res.0.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.res.0.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.conv.0.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.2.conv.1.weight : Float(256),\n", - " %jasper_encoder.encoder.2.conv.1.bias : Float(256),\n", - " %jasper_encoder.encoder.2.conv.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.conv.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.conv.4.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.2.conv.5.weight : Float(256),\n", - " %jasper_encoder.encoder.2.conv.5.bias : Float(256),\n", - " %jasper_encoder.encoder.2.conv.5.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.conv.5.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.conv.8.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.2.conv.9.weight : Float(256),\n", - " %jasper_encoder.encoder.2.conv.9.bias : Float(256),\n", - " %jasper_encoder.encoder.2.conv.9.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.conv.9.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.conv.12.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.2.conv.13.weight : Float(256),\n", - " %jasper_encoder.encoder.2.conv.13.bias : Float(256),\n", - " %jasper_encoder.encoder.2.conv.13.running_mean : Float(256),\n", 
- " %jasper_encoder.encoder.2.conv.13.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.conv.16.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.2.conv.17.weight : Float(256),\n", - " %jasper_encoder.encoder.2.conv.17.bias : Float(256),\n", - " %jasper_encoder.encoder.2.conv.17.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.conv.17.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.res.0.0.weight : Float(256, 256, 1),\n", - " %jasper_encoder.encoder.2.res.0.1.weight : Float(256),\n", - " %jasper_encoder.encoder.2.res.0.1.bias : Float(256),\n", - " %jasper_encoder.encoder.2.res.0.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.res.0.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.res.1.0.weight : Float(256, 256, 1),\n", - " %jasper_encoder.encoder.2.res.1.1.weight : Float(256),\n", - " %jasper_encoder.encoder.2.res.1.1.bias : Float(256),\n", - " %jasper_encoder.encoder.2.res.1.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.res.1.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.3.conv.0.weight : Float(384, 256, 13),\n", - " %jasper_encoder.encoder.3.conv.1.weight : Float(384),\n", - " %jasper_encoder.encoder.3.conv.1.bias : Float(384),\n", - " %jasper_encoder.encoder.3.conv.1.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.conv.1.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.conv.4.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.3.conv.5.weight : Float(384),\n", - " %jasper_encoder.encoder.3.conv.5.bias : Float(384),\n", - " %jasper_encoder.encoder.3.conv.5.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.conv.5.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.conv.8.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.3.conv.9.weight : Float(384),\n", - " %jasper_encoder.encoder.3.conv.9.bias : Float(384),\n", - " %jasper_encoder.encoder.3.conv.9.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.conv.9.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.conv.12.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.3.conv.13.weight : Float(384),\n", - " %jasper_encoder.encoder.3.conv.13.bias : Float(384),\n", - " %jasper_encoder.encoder.3.conv.13.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.conv.13.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.conv.16.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.3.conv.17.weight : Float(384),\n", - " %jasper_encoder.encoder.3.conv.17.bias : Float(384),\n", - " %jasper_encoder.encoder.3.conv.17.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.conv.17.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.res.0.0.weight : Float(384, 256, 1),\n", - " %jasper_encoder.encoder.3.res.0.1.weight : Float(384),\n", - " %jasper_encoder.encoder.3.res.0.1.bias : Float(384),\n", - " %jasper_encoder.encoder.3.res.0.1.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.res.0.1.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.res.1.0.weight : Float(384, 256, 1),\n", - " %jasper_encoder.encoder.3.res.1.1.weight : Float(384),\n", - " %jasper_encoder.encoder.3.res.1.1.bias : Float(384),\n", - " %jasper_encoder.encoder.3.res.1.1.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.res.1.1.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.res.2.0.weight : Float(384, 256, 1),\n", - " %jasper_encoder.encoder.3.res.2.1.weight : Float(384),\n", - " 
%jasper_encoder.encoder.3.res.2.1.bias : Float(384),\n", - " %jasper_encoder.encoder.3.res.2.1.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.res.2.1.running_var : Float(384),\n", - " %jasper_encoder.encoder.4.conv.0.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.4.conv.1.weight : Float(384),\n", - " %jasper_encoder.encoder.4.conv.1.bias : Float(384),\n", - " %jasper_encoder.encoder.4.conv.1.running_mean : Float(384),\n", - " %jasper_encoder.encoder.4.conv.1.running_var : Float(384),\n", - " %jasper_encoder.encoder.4.conv.4.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.4.conv.5.weight : Float(384),\n", - " %jasper_encoder.encoder.4.conv.5.bias : Float(384),\n", - " %jasper_encoder.encoder.4.conv.5.running_mean : Float(384),\n", - " %jasper_encoder.encoder.4.conv.5.running_var : Float(384),\n", - " %jasper_encoder.encoder.4.conv.8.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.4.conv.9.weight : Float(384),\n", - " %jasper_encoder.encoder.4.conv.9.bias : Float(384),\n", - " %jasper_encoder.encoder.4.conv.9.running_mean : Float(384),\n", - " %jasper_encoder.encoder.4.conv.9.running_var : Float(384),\n", - " %jasper_encoder.encoder.4.conv.12.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.4.conv.13.weight : Float(384),\n", - " %jasper_encoder.encoder.4.conv.13.bias : Float(384),\n", - " %jasper_encoder.encoder.4.conv.13.running_mean : Float(384),\n", - " %jasper_encoder.encoder.4.conv.13.running_var : Float(384),\n", - " %jasper_encoder.encoder.4.conv.16.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.4.conv.17.weight : Float(384),\n", - " %jasper_encoder.encoder.4.conv.17.bias : Float(384),\n", - " %jasper_encoder.encoder.4.conv.17.running_mean : Float(384),\n", - " %jasper_encoder.encoder.4.conv.17.running_var : Float(384),\n", - " %jasper_encoder.encoder.4.res.0.0.weight : Float(384, 256, 1),\n", - " %jasper_encoder.encoder.4.res.0.1.weight : Float(384),\n", - " %jasper_encoder.encoder.4.res.0.1.bias : Float(384),\n", - " %jasper_encoder.encoder.4.res.0.1.running_mean : Float(384),\n", - " %jasper_encoder.encoder.4.res.0.1.running_var : Float(384),\n", - " %jasper_encoder.encoder.4.res.1.0.weight : Float(384, 256, 1),\n", - " %jasper_encoder.encoder.4.res.1.1.weight : Float(384),\n", - " %jasper_encoder.encoder.4.res.1.1.bias : Float(384),\n", - " %jasper_encoder.encoder.4.res.1.1.running_mean : Float(384),\n", - " %jasper_encoder.encoder.4.res.1.1.running_var : Float(384),\n", - " %jasper_encoder.encoder.4.res.2.0.weight : Float(384, 256, 1),\n", - " %jasper_encoder.encoder.4.res.2.1.weight : Float(384),\n", - " %jasper_encoder.encoder.4.res.2.1.bias : Float(384),\n", - " %jasper_encoder.encoder.4.res.2.1.running_mean : Float(384),\n", - " %jasper_encoder.encoder.4.res.2.1.running_var : Float(384),\n", - " %jasper_encoder.encoder.4.res.3.0.weight : Float(384, 384, 1),\n", - " %jasper_encoder.encoder.4.res.3.1.weight : Float(384),\n", - " %jasper_encoder.encoder.4.res.3.1.bias : Float(384),\n", - " %jasper_encoder.encoder.4.res.3.1.running_mean : Float(384),\n", - " %jasper_encoder.encoder.4.res.3.1.running_var : Float(384),\n", - " %jasper_encoder.encoder.5.conv.0.weight : Float(512, 384, 17),\n", - " %jasper_encoder.encoder.5.conv.1.weight : Float(512),\n", - " %jasper_encoder.encoder.5.conv.1.bias : Float(512),\n", - " %jasper_encoder.encoder.5.conv.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.5.conv.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.5.conv.4.weight : 
Float(512, 512, 17),\n", - " %jasper_encoder.encoder.5.conv.5.weight : Float(512),\n", - " %jasper_encoder.encoder.5.conv.5.bias : Float(512),\n", - " %jasper_encoder.encoder.5.conv.5.running_mean : Float(512),\n", - " %jasper_encoder.encoder.5.conv.5.running_var : Float(512),\n", - " %jasper_encoder.encoder.5.conv.8.weight : Float(512, 512, 17),\n", - " %jasper_encoder.encoder.5.conv.9.weight : Float(512),\n", - " %jasper_encoder.encoder.5.conv.9.bias : Float(512),\n", - " %jasper_encoder.encoder.5.conv.9.running_mean : Float(512),\n", - " %jasper_encoder.encoder.5.conv.9.running_var : Float(512),\n", - " %jasper_encoder.encoder.5.conv.12.weight : Float(512, 512, 17),\n", - " %jasper_encoder.encoder.5.conv.13.weight : Float(512),\n", - " %jasper_encoder.encoder.5.conv.13.bias : Float(512),\n", - " %jasper_encoder.encoder.5.conv.13.running_mean : Float(512),\n", - " %jasper_encoder.encoder.5.conv.13.running_var : Float(512),\n", - " %jasper_encoder.encoder.5.conv.16.weight : Float(512, 512, 17),\n", - " %jasper_encoder.encoder.5.conv.17.weight : Float(512),\n", - " %jasper_encoder.encoder.5.conv.17.bias : Float(512),\n", - " %jasper_encoder.encoder.5.conv.17.running_mean : Float(512),\n", - " %jasper_encoder.encoder.5.conv.17.running_var : Float(512),\n", - " %jasper_encoder.encoder.5.res.0.0.weight : Float(512, 256, 1),\n", - " %jasper_encoder.encoder.5.res.0.1.weight : Float(512),\n", - " %jasper_encoder.encoder.5.res.0.1.bias : Float(512),\n", - " %jasper_encoder.encoder.5.res.0.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.5.res.0.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.5.res.1.0.weight : Float(512, 256, 1),\n", - " %jasper_encoder.encoder.5.res.1.1.weight : Float(512),\n", - " %jasper_encoder.encoder.5.res.1.1.bias : Float(512),\n", - " %jasper_encoder.encoder.5.res.1.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.5.res.1.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.5.res.2.0.weight : Float(512, 256, 1),\n", - " %jasper_encoder.encoder.5.res.2.1.weight : Float(512),\n", - " %jasper_encoder.encoder.5.res.2.1.bias : Float(512),\n", - " %jasper_encoder.encoder.5.res.2.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.5.res.2.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.5.res.3.0.weight : Float(512, 384, 1),\n", - " %jasper_encoder.encoder.5.res.3.1.weight : Float(512),\n", - " %jasper_encoder.encoder.5.res.3.1.bias : Float(512),\n", - " %jasper_encoder.encoder.5.res.3.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.5.res.3.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.5.res.4.0.weight : Float(512, 384, 1),\n", - " %jasper_encoder.encoder.5.res.4.1.weight : Float(512),\n", - " %jasper_encoder.encoder.5.res.4.1.bias : Float(512),\n", - " %jasper_encoder.encoder.5.res.4.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.5.res.4.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.6.conv.0.weight : Float(512, 512, 17),\n", - " %jasper_encoder.encoder.6.conv.1.weight : Float(512),\n", - " %jasper_encoder.encoder.6.conv.1.bias : Float(512),\n", - " %jasper_encoder.encoder.6.conv.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.6.conv.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.6.conv.4.weight : Float(512, 512, 17),\n", - " %jasper_encoder.encoder.6.conv.5.weight : Float(512),\n", - " %jasper_encoder.encoder.6.conv.5.bias : Float(512),\n", - " %jasper_encoder.encoder.6.conv.5.running_mean : Float(512),\n", - " 
%jasper_encoder.encoder.6.conv.5.running_var : Float(512),\n", - " %jasper_encoder.encoder.6.conv.8.weight : Float(512, 512, 17),\n", - " %jasper_encoder.encoder.6.conv.9.weight : Float(512),\n", - " %jasper_encoder.encoder.6.conv.9.bias : Float(512),\n", - " %jasper_encoder.encoder.6.conv.9.running_mean : Float(512),\n", - " %jasper_encoder.encoder.6.conv.9.running_var : Float(512),\n", - " %jasper_encoder.encoder.6.conv.12.weight : Float(512, 512, 17),\n", - " %jasper_encoder.encoder.6.conv.13.weight : Float(512),\n", - " %jasper_encoder.encoder.6.conv.13.bias : Float(512),\n", - " %jasper_encoder.encoder.6.conv.13.running_mean : Float(512),\n", - " %jasper_encoder.encoder.6.conv.13.running_var : Float(512),\n", - " %jasper_encoder.encoder.6.conv.16.weight : Float(512, 512, 17),\n", - " %jasper_encoder.encoder.6.conv.17.weight : Float(512),\n", - " %jasper_encoder.encoder.6.conv.17.bias : Float(512),\n", - " %jasper_encoder.encoder.6.conv.17.running_mean : Float(512),\n", - " %jasper_encoder.encoder.6.conv.17.running_var : Float(512),\n", - " %jasper_encoder.encoder.6.res.0.0.weight : Float(512, 256, 1),\n", - " %jasper_encoder.encoder.6.res.0.1.weight : Float(512),\n", - " %jasper_encoder.encoder.6.res.0.1.bias : Float(512),\n", - " %jasper_encoder.encoder.6.res.0.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.6.res.0.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.6.res.1.0.weight : Float(512, 256, 1),\n", - " %jasper_encoder.encoder.6.res.1.1.weight : Float(512),\n", - " %jasper_encoder.encoder.6.res.1.1.bias : Float(512),\n", - " %jasper_encoder.encoder.6.res.1.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.6.res.1.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.6.res.2.0.weight : Float(512, 256, 1),\n", - " %jasper_encoder.encoder.6.res.2.1.weight : Float(512),\n", - " %jasper_encoder.encoder.6.res.2.1.bias : Float(512),\n", - " %jasper_encoder.encoder.6.res.2.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.6.res.2.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.6.res.3.0.weight : Float(512, 384, 1),\n", - " %jasper_encoder.encoder.6.res.3.1.weight : Float(512),\n", - " %jasper_encoder.encoder.6.res.3.1.bias : Float(512),\n", - " %jasper_encoder.encoder.6.res.3.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.6.res.3.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.6.res.4.0.weight : Float(512, 384, 1),\n", - " %jasper_encoder.encoder.6.res.4.1.weight : Float(512),\n", - " %jasper_encoder.encoder.6.res.4.1.bias : Float(512),\n", - " %jasper_encoder.encoder.6.res.4.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.6.res.4.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.6.res.5.0.weight : Float(512, 512, 1),\n", - " %jasper_encoder.encoder.6.res.5.1.weight : Float(512),\n", - " %jasper_encoder.encoder.6.res.5.1.bias : Float(512),\n", - " %jasper_encoder.encoder.6.res.5.1.running_mean : Float(512),\n", - " %jasper_encoder.encoder.6.res.5.1.running_var : Float(512),\n", - " %jasper_encoder.encoder.7.conv.0.weight : Float(640, 512, 21),\n", - " %jasper_encoder.encoder.7.conv.1.weight : Float(640),\n", - " %jasper_encoder.encoder.7.conv.1.bias : Float(640),\n", - " %jasper_encoder.encoder.7.conv.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.conv.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.7.conv.4.weight : Float(640, 640, 21),\n", - " %jasper_encoder.encoder.7.conv.5.weight : Float(640),\n", - " %jasper_encoder.encoder.7.conv.5.bias : 
Float(640),\n", - " %jasper_encoder.encoder.7.conv.5.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.conv.5.running_var : Float(640),\n", - " %jasper_encoder.encoder.7.conv.8.weight : Float(640, 640, 21),\n", - " %jasper_encoder.encoder.7.conv.9.weight : Float(640),\n", - " %jasper_encoder.encoder.7.conv.9.bias : Float(640),\n", - " %jasper_encoder.encoder.7.conv.9.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.conv.9.running_var : Float(640),\n", - " %jasper_encoder.encoder.7.conv.12.weight : Float(640, 640, 21),\n", - " %jasper_encoder.encoder.7.conv.13.weight : Float(640),\n", - " %jasper_encoder.encoder.7.conv.13.bias : Float(640),\n", - " %jasper_encoder.encoder.7.conv.13.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.conv.13.running_var : Float(640),\n", - " %jasper_encoder.encoder.7.conv.16.weight : Float(640, 640, 21),\n", - " %jasper_encoder.encoder.7.conv.17.weight : Float(640),\n", - " %jasper_encoder.encoder.7.conv.17.bias : Float(640),\n", - " %jasper_encoder.encoder.7.conv.17.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.conv.17.running_var : Float(640),\n", - " %jasper_encoder.encoder.7.res.0.0.weight : Float(640, 256, 1),\n", - " %jasper_encoder.encoder.7.res.0.1.weight : Float(640),\n", - " %jasper_encoder.encoder.7.res.0.1.bias : Float(640),\n", - " %jasper_encoder.encoder.7.res.0.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.res.0.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.7.res.1.0.weight : Float(640, 256, 1),\n", - " %jasper_encoder.encoder.7.res.1.1.weight : Float(640),\n", - " %jasper_encoder.encoder.7.res.1.1.bias : Float(640),\n", - " %jasper_encoder.encoder.7.res.1.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.res.1.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.7.res.2.0.weight : Float(640, 256, 1),\n", - " %jasper_encoder.encoder.7.res.2.1.weight : Float(640),\n", - " %jasper_encoder.encoder.7.res.2.1.bias : Float(640),\n", - " %jasper_encoder.encoder.7.res.2.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.res.2.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.7.res.3.0.weight : Float(640, 384, 1),\n", - " %jasper_encoder.encoder.7.res.3.1.weight : Float(640),\n", - " %jasper_encoder.encoder.7.res.3.1.bias : Float(640),\n", - " %jasper_encoder.encoder.7.res.3.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.res.3.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.7.res.4.0.weight : Float(640, 384, 1),\n", - " %jasper_encoder.encoder.7.res.4.1.weight : Float(640),\n", - " %jasper_encoder.encoder.7.res.4.1.bias : Float(640),\n", - " %jasper_encoder.encoder.7.res.4.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.res.4.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.7.res.5.0.weight : Float(640, 512, 1),\n", - " %jasper_encoder.encoder.7.res.5.1.weight : Float(640),\n", - " %jasper_encoder.encoder.7.res.5.1.bias : Float(640),\n", - " %jasper_encoder.encoder.7.res.5.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.res.5.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.7.res.6.0.weight : Float(640, 512, 1),\n", - " %jasper_encoder.encoder.7.res.6.1.weight : Float(640),\n", - " %jasper_encoder.encoder.7.res.6.1.bias : Float(640),\n", - " %jasper_encoder.encoder.7.res.6.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.7.res.6.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.conv.0.weight : Float(640, 640, 21),\n", - " 
%jasper_encoder.encoder.8.conv.1.weight : Float(640),\n", - " %jasper_encoder.encoder.8.conv.1.bias : Float(640),\n", - " %jasper_encoder.encoder.8.conv.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.conv.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.conv.4.weight : Float(640, 640, 21),\n", - " %jasper_encoder.encoder.8.conv.5.weight : Float(640),\n", - " %jasper_encoder.encoder.8.conv.5.bias : Float(640),\n", - " %jasper_encoder.encoder.8.conv.5.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.conv.5.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.conv.8.weight : Float(640, 640, 21),\n", - " %jasper_encoder.encoder.8.conv.9.weight : Float(640),\n", - " %jasper_encoder.encoder.8.conv.9.bias : Float(640),\n", - " %jasper_encoder.encoder.8.conv.9.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.conv.9.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.conv.12.weight : Float(640, 640, 21),\n", - " %jasper_encoder.encoder.8.conv.13.weight : Float(640),\n", - " %jasper_encoder.encoder.8.conv.13.bias : Float(640),\n", - " %jasper_encoder.encoder.8.conv.13.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.conv.13.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.conv.16.weight : Float(640, 640, 21),\n", - " %jasper_encoder.encoder.8.conv.17.weight : Float(640),\n", - " %jasper_encoder.encoder.8.conv.17.bias : Float(640),\n", - " %jasper_encoder.encoder.8.conv.17.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.conv.17.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.res.0.0.weight : Float(640, 256, 1),\n", - " %jasper_encoder.encoder.8.res.0.1.weight : Float(640),\n", - " %jasper_encoder.encoder.8.res.0.1.bias : Float(640),\n", - " %jasper_encoder.encoder.8.res.0.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.res.0.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.res.1.0.weight : Float(640, 256, 1),\n", - " %jasper_encoder.encoder.8.res.1.1.weight : Float(640),\n", - " %jasper_encoder.encoder.8.res.1.1.bias : Float(640),\n", - " %jasper_encoder.encoder.8.res.1.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.res.1.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.res.2.0.weight : Float(640, 256, 1),\n", - " %jasper_encoder.encoder.8.res.2.1.weight : Float(640),\n", - " %jasper_encoder.encoder.8.res.2.1.bias : Float(640),\n", - " %jasper_encoder.encoder.8.res.2.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.res.2.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.res.3.0.weight : Float(640, 384, 1),\n", - " %jasper_encoder.encoder.8.res.3.1.weight : Float(640),\n", - " %jasper_encoder.encoder.8.res.3.1.bias : Float(640),\n", - " %jasper_encoder.encoder.8.res.3.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.res.3.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.res.4.0.weight : Float(640, 384, 1),\n", - " %jasper_encoder.encoder.8.res.4.1.weight : Float(640),\n", - " %jasper_encoder.encoder.8.res.4.1.bias : Float(640),\n", - " %jasper_encoder.encoder.8.res.4.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.res.4.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.res.5.0.weight : Float(640, 512, 1),\n", - " %jasper_encoder.encoder.8.res.5.1.weight : Float(640),\n", - " %jasper_encoder.encoder.8.res.5.1.bias : Float(640),\n", - " %jasper_encoder.encoder.8.res.5.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.res.5.1.running_var : 
Float(640),\n", - " %jasper_encoder.encoder.8.res.6.0.weight : Float(640, 512, 1),\n", - " %jasper_encoder.encoder.8.res.6.1.weight : Float(640),\n", - " %jasper_encoder.encoder.8.res.6.1.bias : Float(640),\n", - " %jasper_encoder.encoder.8.res.6.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.res.6.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.8.res.7.0.weight : Float(640, 640, 1),\n", - " %jasper_encoder.encoder.8.res.7.1.weight : Float(640),\n", - " %jasper_encoder.encoder.8.res.7.1.bias : Float(640),\n", - " %jasper_encoder.encoder.8.res.7.1.running_mean : Float(640),\n", - " %jasper_encoder.encoder.8.res.7.1.running_var : Float(640),\n", - " %jasper_encoder.encoder.9.conv.0.weight : Float(768, 640, 25),\n", - " %jasper_encoder.encoder.9.conv.1.weight : Float(768),\n", - " %jasper_encoder.encoder.9.conv.1.bias : Float(768),\n", - " %jasper_encoder.encoder.9.conv.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.conv.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.conv.4.weight : Float(768, 768, 25),\n", - " %jasper_encoder.encoder.9.conv.5.weight : Float(768),\n", - " %jasper_encoder.encoder.9.conv.5.bias : Float(768),\n", - " %jasper_encoder.encoder.9.conv.5.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.conv.5.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.conv.8.weight : Float(768, 768, 25),\n", - " %jasper_encoder.encoder.9.conv.9.weight : Float(768),\n", - " %jasper_encoder.encoder.9.conv.9.bias : Float(768),\n", - " %jasper_encoder.encoder.9.conv.9.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.conv.9.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.conv.12.weight : Float(768, 768, 25),\n", - " %jasper_encoder.encoder.9.conv.13.weight : Float(768),\n", - " %jasper_encoder.encoder.9.conv.13.bias : Float(768),\n", - " %jasper_encoder.encoder.9.conv.13.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.conv.13.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.conv.16.weight : Float(768, 768, 25),\n", - " %jasper_encoder.encoder.9.conv.17.weight : Float(768),\n", - " %jasper_encoder.encoder.9.conv.17.bias : Float(768),\n", - " %jasper_encoder.encoder.9.conv.17.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.conv.17.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.res.0.0.weight : Float(768, 256, 1),\n", - " %jasper_encoder.encoder.9.res.0.1.weight : Float(768),\n", - " %jasper_encoder.encoder.9.res.0.1.bias : Float(768),\n", - " %jasper_encoder.encoder.9.res.0.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.res.0.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.res.1.0.weight : Float(768, 256, 1),\n", - " %jasper_encoder.encoder.9.res.1.1.weight : Float(768),\n", - " %jasper_encoder.encoder.9.res.1.1.bias : Float(768),\n", - " %jasper_encoder.encoder.9.res.1.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.res.1.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.res.2.0.weight : Float(768, 256, 1),\n", - " %jasper_encoder.encoder.9.res.2.1.weight : Float(768),\n", - " %jasper_encoder.encoder.9.res.2.1.bias : Float(768),\n", - " %jasper_encoder.encoder.9.res.2.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.res.2.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.res.3.0.weight : Float(768, 384, 1),\n", - " %jasper_encoder.encoder.9.res.3.1.weight : Float(768),\n", - " %jasper_encoder.encoder.9.res.3.1.bias : Float(768),\n", - " 
%jasper_encoder.encoder.9.res.3.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.res.3.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.res.4.0.weight : Float(768, 384, 1),\n", - " %jasper_encoder.encoder.9.res.4.1.weight : Float(768),\n", - " %jasper_encoder.encoder.9.res.4.1.bias : Float(768),\n", - " %jasper_encoder.encoder.9.res.4.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.res.4.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.res.5.0.weight : Float(768, 512, 1),\n", - " %jasper_encoder.encoder.9.res.5.1.weight : Float(768),\n", - " %jasper_encoder.encoder.9.res.5.1.bias : Float(768),\n", - " %jasper_encoder.encoder.9.res.5.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.res.5.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.res.6.0.weight : Float(768, 512, 1),\n", - " %jasper_encoder.encoder.9.res.6.1.weight : Float(768),\n", - " %jasper_encoder.encoder.9.res.6.1.bias : Float(768),\n", - " %jasper_encoder.encoder.9.res.6.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.res.6.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.res.7.0.weight : Float(768, 640, 1),\n", - " %jasper_encoder.encoder.9.res.7.1.weight : Float(768),\n", - " %jasper_encoder.encoder.9.res.7.1.bias : Float(768),\n", - " %jasper_encoder.encoder.9.res.7.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.res.7.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.9.res.8.0.weight : Float(768, 640, 1),\n", - " %jasper_encoder.encoder.9.res.8.1.weight : Float(768),\n", - " %jasper_encoder.encoder.9.res.8.1.bias : Float(768),\n", - " %jasper_encoder.encoder.9.res.8.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.9.res.8.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.conv.0.weight : Float(768, 768, 25),\n", - " %jasper_encoder.encoder.10.conv.1.weight : Float(768),\n", - " %jasper_encoder.encoder.10.conv.1.bias : Float(768),\n", - " %jasper_encoder.encoder.10.conv.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.conv.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.conv.4.weight : Float(768, 768, 25),\n", - " %jasper_encoder.encoder.10.conv.5.weight : Float(768),\n", - " %jasper_encoder.encoder.10.conv.5.bias : Float(768),\n", - " %jasper_encoder.encoder.10.conv.5.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.conv.5.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.conv.8.weight : Float(768, 768, 25),\n", - " %jasper_encoder.encoder.10.conv.9.weight : Float(768),\n", - " %jasper_encoder.encoder.10.conv.9.bias : Float(768),\n", - " %jasper_encoder.encoder.10.conv.9.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.conv.9.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.conv.12.weight : Float(768, 768, 25),\n", - " %jasper_encoder.encoder.10.conv.13.weight : Float(768),\n", - " %jasper_encoder.encoder.10.conv.13.bias : Float(768),\n", - " %jasper_encoder.encoder.10.conv.13.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.conv.13.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.conv.16.weight : Float(768, 768, 25),\n", - " %jasper_encoder.encoder.10.conv.17.weight : Float(768),\n", - " %jasper_encoder.encoder.10.conv.17.bias : Float(768),\n", - " %jasper_encoder.encoder.10.conv.17.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.conv.17.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.res.0.0.weight : Float(768, 256, 1),\n", - " 
%jasper_encoder.encoder.10.res.0.1.weight : Float(768),\n", - " %jasper_encoder.encoder.10.res.0.1.bias : Float(768),\n", - " %jasper_encoder.encoder.10.res.0.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.res.0.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.res.1.0.weight : Float(768, 256, 1),\n", - " %jasper_encoder.encoder.10.res.1.1.weight : Float(768),\n", - " %jasper_encoder.encoder.10.res.1.1.bias : Float(768),\n", - " %jasper_encoder.encoder.10.res.1.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.res.1.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.res.2.0.weight : Float(768, 256, 1),\n", - " %jasper_encoder.encoder.10.res.2.1.weight : Float(768),\n", - " %jasper_encoder.encoder.10.res.2.1.bias : Float(768),\n", - " %jasper_encoder.encoder.10.res.2.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.res.2.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.res.3.0.weight : Float(768, 384, 1),\n", - " %jasper_encoder.encoder.10.res.3.1.weight : Float(768),\n", - " %jasper_encoder.encoder.10.res.3.1.bias : Float(768),\n", - " %jasper_encoder.encoder.10.res.3.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.res.3.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.res.4.0.weight : Float(768, 384, 1),\n", - " %jasper_encoder.encoder.10.res.4.1.weight : Float(768),\n", - " %jasper_encoder.encoder.10.res.4.1.bias : Float(768),\n", - " %jasper_encoder.encoder.10.res.4.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.res.4.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.res.5.0.weight : Float(768, 512, 1),\n", - " %jasper_encoder.encoder.10.res.5.1.weight : Float(768),\n", - " %jasper_encoder.encoder.10.res.5.1.bias : Float(768),\n", - " %jasper_encoder.encoder.10.res.5.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.res.5.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.res.6.0.weight : Float(768, 512, 1),\n", - " %jasper_encoder.encoder.10.res.6.1.weight : Float(768),\n", - " %jasper_encoder.encoder.10.res.6.1.bias : Float(768),\n", - " %jasper_encoder.encoder.10.res.6.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.res.6.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.res.7.0.weight : Float(768, 640, 1),\n", - " %jasper_encoder.encoder.10.res.7.1.weight : Float(768),\n", - " %jasper_encoder.encoder.10.res.7.1.bias : Float(768),\n", - " %jasper_encoder.encoder.10.res.7.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.res.7.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.res.8.0.weight : Float(768, 640, 1),\n", - " %jasper_encoder.encoder.10.res.8.1.weight : Float(768),\n", - " %jasper_encoder.encoder.10.res.8.1.bias : Float(768),\n", - " %jasper_encoder.encoder.10.res.8.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.res.8.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.10.res.9.0.weight : Float(768, 768, 1),\n", - " %jasper_encoder.encoder.10.res.9.1.weight : Float(768),\n", - " %jasper_encoder.encoder.10.res.9.1.bias : Float(768),\n", - " %jasper_encoder.encoder.10.res.9.1.running_mean : Float(768),\n", - " %jasper_encoder.encoder.10.res.9.1.running_var : Float(768),\n", - " %jasper_encoder.encoder.11.conv.0.weight : Float(896, 768, 29),\n", - " %jasper_encoder.encoder.11.conv.1.weight : Float(896),\n", - " %jasper_encoder.encoder.11.conv.1.bias : Float(896),\n", - " %jasper_encoder.encoder.11.conv.1.running_mean : 
Float(896),\n", - " %jasper_encoder.encoder.11.conv.1.running_var : Float(896),\n", - " %jasper_encoder.encoder.12.conv.0.weight : Float(1024, 896, 1),\n", - " %jasper_encoder.encoder.12.conv.1.weight : Float(1024),\n", - " %jasper_encoder.encoder.12.conv.1.bias : Float(1024),\n", - " %jasper_encoder.encoder.12.conv.1.running_mean : Float(1024),\n", - " %jasper_encoder.encoder.12.conv.1.running_var : Float(1024),\n", - " %jasper_decoder.decoder_layers.0.weight : Float(29, 1024, 1),\n", - " %jasper_decoder.decoder_layers.0.bias : Float(29)):\n", - " %651 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[11], pads=[5, 5], strides=[2]](%FEATURES, %jasper_encoder.encoder.0.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[0]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %652 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%651, %jasper_encoder.encoder.0.conv.1.weight, %jasper_encoder.encoder.0.conv.1.bias, %jasper_encoder.encoder.0.conv.1.running_mean, %jasper_encoder.encoder.0.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[0]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %653 : Float(16, 256, 503) = onnx::Relu(%652), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[0]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %654 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[11], pads=[5, 5], strides=[1]](%653, %jasper_encoder.encoder.1.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %655 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%654, %jasper_encoder.encoder.1.conv.1.weight, %jasper_encoder.encoder.1.conv.1.bias, %jasper_encoder.encoder.1.conv.1.running_mean, %jasper_encoder.encoder.1.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %656 : Float(16, 256, 503) = onnx::Relu(%655), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %657 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[11], pads=[5, 5], strides=[1]](%656, %jasper_encoder.encoder.1.conv.4.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %658 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%657, %jasper_encoder.encoder.1.conv.5.weight, %jasper_encoder.encoder.1.conv.5.bias, %jasper_encoder.encoder.1.conv.5.running_mean, %jasper_encoder.encoder.1.conv.5.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %659 : Float(16, 256, 503) = onnx::Relu(%658), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/Dropout # 
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %660 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[11], pads=[5, 5], strides=[1]](%659, %jasper_encoder.encoder.1.conv.8.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %661 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%660, %jasper_encoder.encoder.1.conv.9.weight, %jasper_encoder.encoder.1.conv.9.bias, %jasper_encoder.encoder.1.conv.9.running_mean, %jasper_encoder.encoder.1.conv.9.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %662 : Float(16, 256, 503) = onnx::Relu(%661), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %663 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[11], pads=[5, 5], strides=[1]](%662, %jasper_encoder.encoder.1.conv.12.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %664 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%663, %jasper_encoder.encoder.1.conv.13.weight, %jasper_encoder.encoder.1.conv.13.bias, %jasper_encoder.encoder.1.conv.13.running_mean, %jasper_encoder.encoder.1.conv.13.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %665 : Float(16, 256, 503) = onnx::Relu(%664), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %666 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[11], pads=[5, 5], strides=[1]](%665, %jasper_encoder.encoder.1.conv.16.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %667 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%666, %jasper_encoder.encoder.1.conv.17.weight, %jasper_encoder.encoder.1.conv.17.bias, %jasper_encoder.encoder.1.conv.17.running_mean, %jasper_encoder.encoder.1.conv.17.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %668 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%653, %jasper_encoder.encoder.1.res.0.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %669 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%668, %jasper_encoder.encoder.1.res.0.1.weight, %jasper_encoder.encoder.1.res.0.1.bias, %jasper_encoder.encoder.1.res.0.1.running_mean, %jasper_encoder.encoder.1.res.0.1.running_var), scope: 
JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %670 : Float(16, 256, 503) = onnx::Add(%667, %669), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %671 : Float(16, 256, 503) = onnx::Relu(%670), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[1]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %672 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[11], pads=[5, 5], strides=[1]](%671, %jasper_encoder.encoder.2.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %673 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%672, %jasper_encoder.encoder.2.conv.1.weight, %jasper_encoder.encoder.2.conv.1.bias, %jasper_encoder.encoder.2.conv.1.running_mean, %jasper_encoder.encoder.2.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %674 : Float(16, 256, 503) = onnx::Relu(%673), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %675 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[11], pads=[5, 5], strides=[1]](%674, %jasper_encoder.encoder.2.conv.4.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %676 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%675, %jasper_encoder.encoder.2.conv.5.weight, %jasper_encoder.encoder.2.conv.5.bias, %jasper_encoder.encoder.2.conv.5.running_mean, %jasper_encoder.encoder.2.conv.5.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %677 : Float(16, 256, 503) = onnx::Relu(%676), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %678 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[11], pads=[5, 5], strides=[1]](%677, %jasper_encoder.encoder.2.conv.8.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %679 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%678, %jasper_encoder.encoder.2.conv.9.weight, %jasper_encoder.encoder.2.conv.9.bias, %jasper_encoder.encoder.2.conv.9.running_mean, %jasper_encoder.encoder.2.conv.9.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %680 : Float(16, 256, 503) = onnx::Relu(%679), scope: 
JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %681 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[11], pads=[5, 5], strides=[1]](%680, %jasper_encoder.encoder.2.conv.12.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %682 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%681, %jasper_encoder.encoder.2.conv.13.weight, %jasper_encoder.encoder.2.conv.13.bias, %jasper_encoder.encoder.2.conv.13.running_mean, %jasper_encoder.encoder.2.conv.13.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %683 : Float(16, 256, 503) = onnx::Relu(%682), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %684 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[11], pads=[5, 5], strides=[1]](%683, %jasper_encoder.encoder.2.conv.16.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %685 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%684, %jasper_encoder.encoder.2.conv.17.weight, %jasper_encoder.encoder.2.conv.17.bias, %jasper_encoder.encoder.2.conv.17.running_mean, %jasper_encoder.encoder.2.conv.17.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %686 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%653, %jasper_encoder.encoder.2.res.0.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %687 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%686, %jasper_encoder.encoder.2.res.0.1.weight, %jasper_encoder.encoder.2.res.0.1.bias, %jasper_encoder.encoder.2.res.0.1.running_mean, %jasper_encoder.encoder.2.res.0.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %688 : Float(16, 256, 503) = onnx::Add(%685, %687), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %689 : Float(16, 256, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%671, %jasper_encoder.encoder.2.res.1.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[2]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %690 : Float(16, 256, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%689, %jasper_encoder.encoder.2.res.1.1.weight, %jasper_encoder.encoder.2.res.1.1.bias, 
[... removed notebook-output lines, shown truncated here: the verbose torch.onnx.export graph trace for JasperEncoder blocks 2 through 9. Each block repeats MaskedConv1d (kernel sizes 13, 17, 21, 25; channels 256 → 384 → 512 → 640 → 768, on tensors of shape 16 x C x 503) → BatchNorm1d (eps=0.001, ONNX momentum=0.9) → ReLU → Dropout, then sums dense-residual branches (1x1 Conv1d + BatchNorm1d applied to every earlier block output) via onnx::Add before the block's final ReLU/Dropout ...]
/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %908 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%743, %jasper_encoder.encoder.9.res.4.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %909 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%908, %jasper_encoder.encoder.9.res.4.1.weight, %jasper_encoder.encoder.9.res.4.1.bias, %jasper_encoder.encoder.9.res.4.1.running_mean, %jasper_encoder.encoder.9.res.4.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %910 : Float(16, 768, 503) = onnx::Add(%907, %909), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %911 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%773, %jasper_encoder.encoder.9.res.5.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %912 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%911, %jasper_encoder.encoder.9.res.5.1.weight, %jasper_encoder.encoder.9.res.5.1.bias, %jasper_encoder.encoder.9.res.5.1.running_mean, %jasper_encoder.encoder.9.res.5.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %913 : Float(16, 768, 503) = onnx::Add(%910, %912), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %914 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%806, %jasper_encoder.encoder.9.res.6.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %915 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%914, %jasper_encoder.encoder.9.res.6.1.weight, %jasper_encoder.encoder.9.res.6.1.bias, %jasper_encoder.encoder.9.res.6.1.running_mean, %jasper_encoder.encoder.9.res.6.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %916 : Float(16, 768, 503) = onnx::Add(%913, %915), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %917 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%842, %jasper_encoder.encoder.9.res.7.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %918 : Float(16, 768, 503) = 
onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%917, %jasper_encoder.encoder.9.res.7.1.weight, %jasper_encoder.encoder.9.res.7.1.bias, %jasper_encoder.encoder.9.res.7.1.running_mean, %jasper_encoder.encoder.9.res.7.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %919 : Float(16, 768, 503) = onnx::Add(%916, %918), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %920 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%881, %jasper_encoder.encoder.9.res.8.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %921 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%920, %jasper_encoder.encoder.9.res.8.1.weight, %jasper_encoder.encoder.9.res.8.1.bias, %jasper_encoder.encoder.9.res.8.1.running_mean, %jasper_encoder.encoder.9.res.8.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %922 : Float(16, 768, 503) = onnx::Add(%919, %921), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %923 : Float(16, 768, 503) = onnx::Relu(%922), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %924 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%923, %jasper_encoder.encoder.10.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %925 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%924, %jasper_encoder.encoder.10.conv.1.weight, %jasper_encoder.encoder.10.conv.1.bias, %jasper_encoder.encoder.10.conv.1.running_mean, %jasper_encoder.encoder.10.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %926 : Float(16, 768, 503) = onnx::Relu(%925), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %927 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%926, %jasper_encoder.encoder.10.conv.4.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %928 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%927, %jasper_encoder.encoder.10.conv.5.weight, %jasper_encoder.encoder.10.conv.5.bias, %jasper_encoder.encoder.10.conv.5.running_mean, 
%jasper_encoder.encoder.10.conv.5.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %929 : Float(16, 768, 503) = onnx::Relu(%928), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %930 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%929, %jasper_encoder.encoder.10.conv.8.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %931 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%930, %jasper_encoder.encoder.10.conv.9.weight, %jasper_encoder.encoder.10.conv.9.bias, %jasper_encoder.encoder.10.conv.9.running_mean, %jasper_encoder.encoder.10.conv.9.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %932 : Float(16, 768, 503) = onnx::Relu(%931), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %933 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%932, %jasper_encoder.encoder.10.conv.12.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %934 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%933, %jasper_encoder.encoder.10.conv.13.weight, %jasper_encoder.encoder.10.conv.13.bias, %jasper_encoder.encoder.10.conv.13.running_mean, %jasper_encoder.encoder.10.conv.13.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %935 : Float(16, 768, 503) = onnx::Relu(%934), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %936 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%935, %jasper_encoder.encoder.10.conv.16.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %937 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%936, %jasper_encoder.encoder.10.conv.17.weight, %jasper_encoder.encoder.10.conv.17.bias, %jasper_encoder.encoder.10.conv.17.running_mean, %jasper_encoder.encoder.10.conv.17.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %938 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%653, %jasper_encoder.encoder.10.res.0.0.weight), scope: 
JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %939 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%938, %jasper_encoder.encoder.10.res.0.1.weight, %jasper_encoder.encoder.10.res.0.1.bias, %jasper_encoder.encoder.10.res.0.1.running_mean, %jasper_encoder.encoder.10.res.0.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %940 : Float(16, 768, 503) = onnx::Add(%937, %939), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %941 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%671, %jasper_encoder.encoder.10.res.1.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %942 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%941, %jasper_encoder.encoder.10.res.1.1.weight, %jasper_encoder.encoder.10.res.1.1.bias, %jasper_encoder.encoder.10.res.1.1.running_mean, %jasper_encoder.encoder.10.res.1.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %943 : Float(16, 768, 503) = onnx::Add(%940, %942), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %944 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%692, %jasper_encoder.encoder.10.res.2.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %945 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%944, %jasper_encoder.encoder.10.res.2.1.weight, %jasper_encoder.encoder.10.res.2.1.bias, %jasper_encoder.encoder.10.res.2.1.running_mean, %jasper_encoder.encoder.10.res.2.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %946 : Float(16, 768, 503) = onnx::Add(%943, %945), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %947 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%716, %jasper_encoder.encoder.10.res.3.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %948 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%947, %jasper_encoder.encoder.10.res.3.1.weight, %jasper_encoder.encoder.10.res.3.1.bias, %jasper_encoder.encoder.10.res.3.1.running_mean, 
%jasper_encoder.encoder.10.res.3.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %949 : Float(16, 768, 503) = onnx::Add(%946, %948), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %950 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%743, %jasper_encoder.encoder.10.res.4.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %951 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%950, %jasper_encoder.encoder.10.res.4.1.weight, %jasper_encoder.encoder.10.res.4.1.bias, %jasper_encoder.encoder.10.res.4.1.running_mean, %jasper_encoder.encoder.10.res.4.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %952 : Float(16, 768, 503) = onnx::Add(%949, %951), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %953 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%773, %jasper_encoder.encoder.10.res.5.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %954 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%953, %jasper_encoder.encoder.10.res.5.1.weight, %jasper_encoder.encoder.10.res.5.1.bias, %jasper_encoder.encoder.10.res.5.1.running_mean, %jasper_encoder.encoder.10.res.5.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %955 : Float(16, 768, 503) = onnx::Add(%952, %954), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %956 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%806, %jasper_encoder.encoder.10.res.6.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %957 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%956, %jasper_encoder.encoder.10.res.6.1.weight, %jasper_encoder.encoder.10.res.6.1.bias, %jasper_encoder.encoder.10.res.6.1.running_mean, %jasper_encoder.encoder.10.res.6.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %958 : Float(16, 768, 503) = onnx::Add(%955, %957), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # 
/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %959 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%842, %jasper_encoder.encoder.10.res.7.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %960 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%959, %jasper_encoder.encoder.10.res.7.1.weight, %jasper_encoder.encoder.10.res.7.1.bias, %jasper_encoder.encoder.10.res.7.1.running_mean, %jasper_encoder.encoder.10.res.7.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %961 : Float(16, 768, 503) = onnx::Add(%958, %960), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %962 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%881, %jasper_encoder.encoder.10.res.8.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %963 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%962, %jasper_encoder.encoder.10.res.8.1.weight, %jasper_encoder.encoder.10.res.8.1.bias, %jasper_encoder.encoder.10.res.8.1.running_mean, %jasper_encoder.encoder.10.res.8.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %964 : Float(16, 768, 503) = onnx::Add(%961, %963), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %965 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%923, %jasper_encoder.encoder.10.res.9.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %966 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%965, %jasper_encoder.encoder.10.res.9.1.weight, %jasper_encoder.encoder.10.res.9.1.bias, %jasper_encoder.encoder.10.res.9.1.running_mean, %jasper_encoder.encoder.10.res.9.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %967 : Float(16, 768, 503) = onnx::Add(%964, %966), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %968 : Float(16, 768, 503) = onnx::Relu(%967), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %969 : Float(16, 896, 503) = onnx::Conv[dilations=[2], group=1, kernel_shape=[29], pads=[28, 28], strides=[1]](%968, 
%jasper_encoder.encoder.11.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[11]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %970 : Float(16, 896, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%969, %jasper_encoder.encoder.11.conv.1.weight, %jasper_encoder.encoder.11.conv.1.bias, %jasper_encoder.encoder.11.conv.1.running_mean, %jasper_encoder.encoder.11.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[11]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %971 : Float(16, 896, 503) = onnx::Relu(%970), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[11]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %972 : Float(16, 1024, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%971, %jasper_encoder.encoder.12.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[12]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %973 : Float(16, 1024, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%972, %jasper_encoder.encoder.12.conv.1.weight, %jasper_encoder.encoder.12.conv.1.bias, %jasper_encoder.encoder.12.conv.1.running_mean, %jasper_encoder.encoder.12.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[12]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %974 : Float(16, 1024, 503) = onnx::Relu(%973), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[12]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %975 : Float(16, 29, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%974, %jasper_decoder.decoder_layers.0.weight, %jasper_decoder.decoder_layers.0.bias), scope: JasperEncoderDecoder/JasperDecoderForCTC[jasper_decoder]/Sequential[decoder_layers]/Conv1d[0] # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %976 : Float(16, 503, 29) = onnx::Transpose[perm=[0, 2, 1]](%975), scope: JasperEncoderDecoder/JasperDecoderForCTC[jasper_decoder] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:213:0\n", - " %LOGITS : Float(16, 503, 29) = onnx::LogSoftmax[axis=2](%976), scope: JasperEncoderDecoder/JasperDecoderForCTC[jasper_decoder] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1317:0\n", - " return (%LOGITS)\n", - "\n", - "Getting engine\n", - "Building TRT engine ....\n", - "Parsing returned True dynamic_shape= True \n", - "\n", - "TRT engine saved at jasper.plan ...\n", - "Got engine.\n", - "INTERENCE TIME: 144.90605099990717 ms\n", - "TRANSCRIPT: ['when these two souls perceived each other they recognized each other as necessary to each other and embraced each other closely']\n" - ], - "name": "stdout" - }, - { - "output_type": "stream", - "text": [ - "tcmalloc: large alloc 1331093504 bytes == 0x177018000 @ 0x7f4aeb036887 0x7f4ae992cc29 0x7f4ae992dafb 0x7f4ae992dbb4 0x7f4ae992df9c 0x7f4aa322f52f 0x7f4aa322f7b4 0x7f4a989e5390 0x7f4adfbafe91 0x7f4adf873014 0x5669ac 0x50a5c3 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50cd96 0x507d64 0x509a90 0x50a48d 0x50bfb4 
0x507d64 0x509a90 0x50a48d 0x50cd96 0x509758 0x50a48d 0x50bfb4 0x509758 0x50a48d 0x50bfb4\n", - "tcmalloc: large alloc 1331093504 bytes == 0x7f48e6a92000 @ 0x7f4aeb0341e7 0x59203c 0x7f4adfbb026d 0x7f4adf873014 0x5669ac 0x50a5c3 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50cd96 0x507d64 0x509a90 0x50a48d 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50cd96 0x509758 0x50a48d 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x507d64 0x50ae13 0x634c82\n", - "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.\n", - "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1331092289\n", - "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.\n", - "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 1331092289\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "tcmalloc: large alloc 1846083584 bytes == 0x7f482bf70000 @ 0x7f4aeb036887 0x7f4a669da95a 0x7f4a669cd422 0x7f4a66bc59c4 0x7f4a669b823f 0x7f4a74406e3a 0x7f4a74454652 0x5669ac 0x50a5c3 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x507d64 0x50ae13 0x634c82 0x634d37 0x6384ef 0x639091 0x4b0d00 0x7f4aeac31b97 0x5b250a\n", - "bash: line 6: 1642 Segmentation fault (core dumped) python ../trt/perf.py --ckpt_path ./jasper_fp16.pt --wav=example1.wav --model_toml=../configs/jasper10x5dr_nomask.toml --make_onnx --onnx_path jasper.onnx --engine_path jasper.plan\n" - ], - "name": "stderr" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "OFYVKESRdf-y" - }, - "source": [ - "### Inference from existing TensorRT FP32 plan\n", - "Inference with an existing plan can be launched with the `--use_existing_engine` flag."
- ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "vzQOo6QLODZd", - "outputId": "2eabe8a0-4ff9-4f8c-caea-6df51fe42b50", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 182 - } - }, - "source": [ - "%%bash\n", - "export PYTHONPATH=/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper \n", - "python ../trt/perf.py \\\n", - "--wav=./example1.wav \\\n", - "--model_toml=../configs/jasper10x5dr_nomask.toml \\\n", - "--use_existing_engine --engine_path jasper.plan" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Getting component\n", - "Getting engine\n", - "Got engine.\n", - "INTERENCE TIME: 138.72180600014872 ms\n", - "TRANSCRIPT: ['when these two souls perceived each other they recognized each other as necessary to each other and embraced each other closely']\n" - ], - "name": "stdout" - }, - { - "output_type": "stream", - "text": [ - "tcmalloc: large alloc 1331838976 bytes == 0xca68e000 @ 0x7f3ca42a11e7 0x59203c 0x4ca610 0x56697a 0x5a4be1 0x5a5cda 0x4ce182 0x50a2bf 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x507d64 0x50ae13 0x634c82 0x634d37 0x6384ef 0x639091 0x4b0d00 0x7f3ca3e9eb97 0x5b250a\n", - "tcmalloc: large alloc 1243955200 bytes == 0x11a4b2000 @ 0x7f3ca42a3887 0x7f3c1fe82667 0x7f3c1fe71ec7 0x7f3c1fc2ff83 0x7f3c1fc39dc8 0x7f3c2d6603c6 0x7f3c2d6c1652 0x5669ac 0x50a5c3 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x507d64 0x50ae13 0x634c82 0x634d37 0x6384ef 0x639091 0x4b0d00 0x7f3ca3e9eb97 0x5b250a\n", - "bash: line 5: 1736 Segmentation fault (core dumped) python ../trt/perf.py --wav=./example1.wav --model_toml=../configs/jasper10x5dr_nomask.toml --use_existing_engine --engine_path jasper.plan\n" - ], - "name": "stderr" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "Pg4X1odgOS1i" - }, - "source": [ - "## FP16 Inference with TensorRT\n", - "### Creating TensorRT FP16 execution plan\n", - "\n", - "We will next create an FP16 TRT inference plan.\n", - "\n", - "To run inference on the input audio file using automatic mixed precision, add the argument `--trt_fp16`. With automatic mixed precision, the inference time can be reduced significantly compared to FP32 (building the engine for the first time can take several minutes).\n", - "\n", - "**Important Note:** Efficient FP16 inference requires a Volta, Turing, or newer generation GPU. On Google Colab, this normally means a T4 GPU. On the older K80 GPUs, FP16 performance might actually be worse than that of an FP32 TRT model."
- ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "x2n_2cZYdGOg", - "outputId": "e6d3e454-ae39-470a-f21f-e88fc0663a0d", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 1000 - } - }, - "source": [ - "%%bash\n", - "PYTHONPATH=/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper \n", - "python ../trt/perf.py \\\n", - "--ckpt_path ./jasper_fp16.pt --wav=example1.wav \\\n", - "--model_toml=../configs/jasper10x5dr_nomask.toml \\\n", - "--make_onnx --onnx_path jasper.onnx \\\n", - "--engine_path jasper_fp16.plan \\\n", - "--trt_fp16" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Getting component\n", - "graph(%FEATURES : Float(16, 64, 1006),\n", - " %jasper_encoder.encoder.0.conv.0.weight : Float(256, 64, 11),\n", - " %jasper_encoder.encoder.0.conv.1.weight : Float(256),\n", - " %jasper_encoder.encoder.0.conv.1.bias : Float(256),\n", - " %jasper_encoder.encoder.0.conv.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.0.conv.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.conv.0.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.1.conv.1.weight : Float(256),\n", - " %jasper_encoder.encoder.1.conv.1.bias : Float(256),\n", - " %jasper_encoder.encoder.1.conv.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.conv.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.conv.4.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.1.conv.5.weight : Float(256),\n", - " %jasper_encoder.encoder.1.conv.5.bias : Float(256),\n", - " %jasper_encoder.encoder.1.conv.5.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.conv.5.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.conv.8.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.1.conv.9.weight : Float(256),\n", - " %jasper_encoder.encoder.1.conv.9.bias : Float(256),\n", - " %jasper_encoder.encoder.1.conv.9.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.conv.9.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.conv.12.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.1.conv.13.weight : Float(256),\n", - " %jasper_encoder.encoder.1.conv.13.bias : Float(256),\n", - " %jasper_encoder.encoder.1.conv.13.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.conv.13.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.conv.16.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.1.conv.17.weight : Float(256),\n", - " %jasper_encoder.encoder.1.conv.17.bias : Float(256),\n", - " %jasper_encoder.encoder.1.conv.17.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.conv.17.running_var : Float(256),\n", - " %jasper_encoder.encoder.1.res.0.0.weight : Float(256, 256, 1),\n", - " %jasper_encoder.encoder.1.res.0.1.weight : Float(256),\n", - " %jasper_encoder.encoder.1.res.0.1.bias : Float(256),\n", - " %jasper_encoder.encoder.1.res.0.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.1.res.0.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.conv.0.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.2.conv.1.weight : Float(256),\n", - " %jasper_encoder.encoder.2.conv.1.bias : Float(256),\n", - " %jasper_encoder.encoder.2.conv.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.conv.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.conv.4.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.2.conv.5.weight : Float(256),\n", - " 
%jasper_encoder.encoder.2.conv.5.bias : Float(256),\n", - " %jasper_encoder.encoder.2.conv.5.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.conv.5.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.conv.8.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.2.conv.9.weight : Float(256),\n", - " %jasper_encoder.encoder.2.conv.9.bias : Float(256),\n", - " %jasper_encoder.encoder.2.conv.9.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.conv.9.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.conv.12.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.2.conv.13.weight : Float(256),\n", - " %jasper_encoder.encoder.2.conv.13.bias : Float(256),\n", - " %jasper_encoder.encoder.2.conv.13.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.conv.13.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.conv.16.weight : Float(256, 256, 11),\n", - " %jasper_encoder.encoder.2.conv.17.weight : Float(256),\n", - " %jasper_encoder.encoder.2.conv.17.bias : Float(256),\n", - " %jasper_encoder.encoder.2.conv.17.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.conv.17.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.res.0.0.weight : Float(256, 256, 1),\n", - " %jasper_encoder.encoder.2.res.0.1.weight : Float(256),\n", - " %jasper_encoder.encoder.2.res.0.1.bias : Float(256),\n", - " %jasper_encoder.encoder.2.res.0.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.res.0.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.2.res.1.0.weight : Float(256, 256, 1),\n", - " %jasper_encoder.encoder.2.res.1.1.weight : Float(256),\n", - " %jasper_encoder.encoder.2.res.1.1.bias : Float(256),\n", - " %jasper_encoder.encoder.2.res.1.1.running_mean : Float(256),\n", - " %jasper_encoder.encoder.2.res.1.1.running_var : Float(256),\n", - " %jasper_encoder.encoder.3.conv.0.weight : Float(384, 256, 13),\n", - " %jasper_encoder.encoder.3.conv.1.weight : Float(384),\n", - " %jasper_encoder.encoder.3.conv.1.bias : Float(384),\n", - " %jasper_encoder.encoder.3.conv.1.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.conv.1.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.conv.4.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.3.conv.5.weight : Float(384),\n", - " %jasper_encoder.encoder.3.conv.5.bias : Float(384),\n", - " %jasper_encoder.encoder.3.conv.5.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.conv.5.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.conv.8.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.3.conv.9.weight : Float(384),\n", - " %jasper_encoder.encoder.3.conv.9.bias : Float(384),\n", - " %jasper_encoder.encoder.3.conv.9.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.conv.9.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.conv.12.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.3.conv.13.weight : Float(384),\n", - " %jasper_encoder.encoder.3.conv.13.bias : Float(384),\n", - " %jasper_encoder.encoder.3.conv.13.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.conv.13.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.conv.16.weight : Float(384, 384, 13),\n", - " %jasper_encoder.encoder.3.conv.17.weight : Float(384),\n", - " %jasper_encoder.encoder.3.conv.17.bias : Float(384),\n", - " %jasper_encoder.encoder.3.conv.17.running_mean : Float(384),\n", - " %jasper_encoder.encoder.3.conv.17.running_var : Float(384),\n", - " %jasper_encoder.encoder.3.res.0.0.weight : 
[... several hundred deleted notebook-output lines elided: the verbose ONNX export trace of the Jasper model. It first lists every `jasper_encoder` / `jasper_decoder` parameter — convolution weights, batch-norm weights, biases, `running_mean` and `running_var` tensors — with its `Float(...)` shape, then prints the traced graph nodes (`onnx::Conv`, `onnx::BatchNormalization`, `onnx::Relu`, `onnx::Add`) emitted for each `JasperBlock` of the encoder and for the final decoder layer, with their `(16, channels, 503)` activation shapes and source locations in `model.py` ...]
%jasper_encoder.encoder.6.conv.9.running_mean, %jasper_encoder.encoder.6.conv.9.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %782 : Float(16, 512, 503) = onnx::Relu(%781), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %783 : Float(16, 512, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[17], pads=[8, 8], strides=[1]](%782, %jasper_encoder.encoder.6.conv.12.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %784 : Float(16, 512, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%783, %jasper_encoder.encoder.6.conv.13.weight, %jasper_encoder.encoder.6.conv.13.bias, %jasper_encoder.encoder.6.conv.13.running_mean, %jasper_encoder.encoder.6.conv.13.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %785 : Float(16, 512, 503) = onnx::Relu(%784), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %786 : Float(16, 512, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[17], pads=[8, 8], strides=[1]](%785, %jasper_encoder.encoder.6.conv.16.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %787 : Float(16, 512, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%786, %jasper_encoder.encoder.6.conv.17.weight, %jasper_encoder.encoder.6.conv.17.bias, %jasper_encoder.encoder.6.conv.17.running_mean, %jasper_encoder.encoder.6.conv.17.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %788 : Float(16, 512, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%653, %jasper_encoder.encoder.6.res.0.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %789 : Float(16, 512, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%788, %jasper_encoder.encoder.6.res.0.1.weight, %jasper_encoder.encoder.6.res.0.1.bias, %jasper_encoder.encoder.6.res.0.1.running_mean, %jasper_encoder.encoder.6.res.0.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %790 : Float(16, 512, 503) = onnx::Add(%787, %789), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %791 : Float(16, 512, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%671, %jasper_encoder.encoder.6.res.1.0.weight), scope: 
JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %792 : Float(16, 512, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%791, %jasper_encoder.encoder.6.res.1.1.weight, %jasper_encoder.encoder.6.res.1.1.bias, %jasper_encoder.encoder.6.res.1.1.running_mean, %jasper_encoder.encoder.6.res.1.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %793 : Float(16, 512, 503) = onnx::Add(%790, %792), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %794 : Float(16, 512, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%692, %jasper_encoder.encoder.6.res.2.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %795 : Float(16, 512, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%794, %jasper_encoder.encoder.6.res.2.1.weight, %jasper_encoder.encoder.6.res.2.1.bias, %jasper_encoder.encoder.6.res.2.1.running_mean, %jasper_encoder.encoder.6.res.2.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %796 : Float(16, 512, 503) = onnx::Add(%793, %795), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %797 : Float(16, 512, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%716, %jasper_encoder.encoder.6.res.3.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %798 : Float(16, 512, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%797, %jasper_encoder.encoder.6.res.3.1.weight, %jasper_encoder.encoder.6.res.3.1.bias, %jasper_encoder.encoder.6.res.3.1.running_mean, %jasper_encoder.encoder.6.res.3.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %799 : Float(16, 512, 503) = onnx::Add(%796, %798), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %800 : Float(16, 512, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%743, %jasper_encoder.encoder.6.res.4.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %801 : Float(16, 512, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%800, %jasper_encoder.encoder.6.res.4.1.weight, %jasper_encoder.encoder.6.res.4.1.bias, %jasper_encoder.encoder.6.res.4.1.running_mean, %jasper_encoder.encoder.6.res.4.1.running_var), scope: 
JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %802 : Float(16, 512, 503) = onnx::Add(%799, %801), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %803 : Float(16, 512, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%773, %jasper_encoder.encoder.6.res.5.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %804 : Float(16, 512, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%803, %jasper_encoder.encoder.6.res.5.1.weight, %jasper_encoder.encoder.6.res.5.1.bias, %jasper_encoder.encoder.6.res.5.1.running_mean, %jasper_encoder.encoder.6.res.5.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %805 : Float(16, 512, 503) = onnx::Add(%802, %804), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %806 : Float(16, 512, 503) = onnx::Relu(%805), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[6]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %807 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[21], pads=[10, 10], strides=[1]](%806, %jasper_encoder.encoder.7.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %808 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%807, %jasper_encoder.encoder.7.conv.1.weight, %jasper_encoder.encoder.7.conv.1.bias, %jasper_encoder.encoder.7.conv.1.running_mean, %jasper_encoder.encoder.7.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %809 : Float(16, 640, 503) = onnx::Relu(%808), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %810 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[21], pads=[10, 10], strides=[1]](%809, %jasper_encoder.encoder.7.conv.4.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %811 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%810, %jasper_encoder.encoder.7.conv.5.weight, %jasper_encoder.encoder.7.conv.5.bias, %jasper_encoder.encoder.7.conv.5.running_mean, %jasper_encoder.encoder.7.conv.5.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %812 : Float(16, 640, 503) = onnx::Relu(%811), scope: 
JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %813 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[21], pads=[10, 10], strides=[1]](%812, %jasper_encoder.encoder.7.conv.8.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %814 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%813, %jasper_encoder.encoder.7.conv.9.weight, %jasper_encoder.encoder.7.conv.9.bias, %jasper_encoder.encoder.7.conv.9.running_mean, %jasper_encoder.encoder.7.conv.9.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %815 : Float(16, 640, 503) = onnx::Relu(%814), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %816 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[21], pads=[10, 10], strides=[1]](%815, %jasper_encoder.encoder.7.conv.12.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %817 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%816, %jasper_encoder.encoder.7.conv.13.weight, %jasper_encoder.encoder.7.conv.13.bias, %jasper_encoder.encoder.7.conv.13.running_mean, %jasper_encoder.encoder.7.conv.13.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %818 : Float(16, 640, 503) = onnx::Relu(%817), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %819 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[21], pads=[10, 10], strides=[1]](%818, %jasper_encoder.encoder.7.conv.16.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %820 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%819, %jasper_encoder.encoder.7.conv.17.weight, %jasper_encoder.encoder.7.conv.17.bias, %jasper_encoder.encoder.7.conv.17.running_mean, %jasper_encoder.encoder.7.conv.17.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %821 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%653, %jasper_encoder.encoder.7.res.0.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %822 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%821, %jasper_encoder.encoder.7.res.0.1.weight, %jasper_encoder.encoder.7.res.0.1.bias, 
%jasper_encoder.encoder.7.res.0.1.running_mean, %jasper_encoder.encoder.7.res.0.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %823 : Float(16, 640, 503) = onnx::Add(%820, %822), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %824 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%671, %jasper_encoder.encoder.7.res.1.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %825 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%824, %jasper_encoder.encoder.7.res.1.1.weight, %jasper_encoder.encoder.7.res.1.1.bias, %jasper_encoder.encoder.7.res.1.1.running_mean, %jasper_encoder.encoder.7.res.1.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %826 : Float(16, 640, 503) = onnx::Add(%823, %825), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %827 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%692, %jasper_encoder.encoder.7.res.2.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %828 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%827, %jasper_encoder.encoder.7.res.2.1.weight, %jasper_encoder.encoder.7.res.2.1.bias, %jasper_encoder.encoder.7.res.2.1.running_mean, %jasper_encoder.encoder.7.res.2.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %829 : Float(16, 640, 503) = onnx::Add(%826, %828), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %830 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%716, %jasper_encoder.encoder.7.res.3.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %831 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%830, %jasper_encoder.encoder.7.res.3.1.weight, %jasper_encoder.encoder.7.res.3.1.bias, %jasper_encoder.encoder.7.res.3.1.running_mean, %jasper_encoder.encoder.7.res.3.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %832 : Float(16, 640, 503) = onnx::Add(%829, %831), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7] # 
/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %833 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%743, %jasper_encoder.encoder.7.res.4.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %834 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%833, %jasper_encoder.encoder.7.res.4.1.weight, %jasper_encoder.encoder.7.res.4.1.bias, %jasper_encoder.encoder.7.res.4.1.running_mean, %jasper_encoder.encoder.7.res.4.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %835 : Float(16, 640, 503) = onnx::Add(%832, %834), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %836 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%773, %jasper_encoder.encoder.7.res.5.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %837 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%836, %jasper_encoder.encoder.7.res.5.1.weight, %jasper_encoder.encoder.7.res.5.1.bias, %jasper_encoder.encoder.7.res.5.1.running_mean, %jasper_encoder.encoder.7.res.5.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %838 : Float(16, 640, 503) = onnx::Add(%835, %837), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %839 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%806, %jasper_encoder.encoder.7.res.6.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %840 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%839, %jasper_encoder.encoder.7.res.6.1.weight, %jasper_encoder.encoder.7.res.6.1.bias, %jasper_encoder.encoder.7.res.6.1.running_mean, %jasper_encoder.encoder.7.res.6.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %841 : Float(16, 640, 503) = onnx::Add(%838, %840), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %842 : Float(16, 640, 503) = onnx::Relu(%841), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[7]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %843 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[21], pads=[10, 10], strides=[1]](%842, 
%jasper_encoder.encoder.8.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %844 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%843, %jasper_encoder.encoder.8.conv.1.weight, %jasper_encoder.encoder.8.conv.1.bias, %jasper_encoder.encoder.8.conv.1.running_mean, %jasper_encoder.encoder.8.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %845 : Float(16, 640, 503) = onnx::Relu(%844), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %846 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[21], pads=[10, 10], strides=[1]](%845, %jasper_encoder.encoder.8.conv.4.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %847 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%846, %jasper_encoder.encoder.8.conv.5.weight, %jasper_encoder.encoder.8.conv.5.bias, %jasper_encoder.encoder.8.conv.5.running_mean, %jasper_encoder.encoder.8.conv.5.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %848 : Float(16, 640, 503) = onnx::Relu(%847), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %849 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[21], pads=[10, 10], strides=[1]](%848, %jasper_encoder.encoder.8.conv.8.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %850 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%849, %jasper_encoder.encoder.8.conv.9.weight, %jasper_encoder.encoder.8.conv.9.bias, %jasper_encoder.encoder.8.conv.9.running_mean, %jasper_encoder.encoder.8.conv.9.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %851 : Float(16, 640, 503) = onnx::Relu(%850), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %852 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[21], pads=[10, 10], strides=[1]](%851, %jasper_encoder.encoder.8.conv.12.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %853 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%852, %jasper_encoder.encoder.8.conv.13.weight, %jasper_encoder.encoder.8.conv.13.bias, %jasper_encoder.encoder.8.conv.13.running_mean, %jasper_encoder.encoder.8.conv.13.running_var), 
scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %854 : Float(16, 640, 503) = onnx::Relu(%853), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %855 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[21], pads=[10, 10], strides=[1]](%854, %jasper_encoder.encoder.8.conv.16.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %856 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%855, %jasper_encoder.encoder.8.conv.17.weight, %jasper_encoder.encoder.8.conv.17.bias, %jasper_encoder.encoder.8.conv.17.running_mean, %jasper_encoder.encoder.8.conv.17.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %857 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%653, %jasper_encoder.encoder.8.res.0.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %858 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%857, %jasper_encoder.encoder.8.res.0.1.weight, %jasper_encoder.encoder.8.res.0.1.bias, %jasper_encoder.encoder.8.res.0.1.running_mean, %jasper_encoder.encoder.8.res.0.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %859 : Float(16, 640, 503) = onnx::Add(%856, %858), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %860 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%671, %jasper_encoder.encoder.8.res.1.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %861 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%860, %jasper_encoder.encoder.8.res.1.1.weight, %jasper_encoder.encoder.8.res.1.1.bias, %jasper_encoder.encoder.8.res.1.1.running_mean, %jasper_encoder.encoder.8.res.1.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %862 : Float(16, 640, 503) = onnx::Add(%859, %861), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %863 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%692, %jasper_encoder.encoder.8.res.2.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # 
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %864 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%863, %jasper_encoder.encoder.8.res.2.1.weight, %jasper_encoder.encoder.8.res.2.1.bias, %jasper_encoder.encoder.8.res.2.1.running_mean, %jasper_encoder.encoder.8.res.2.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %865 : Float(16, 640, 503) = onnx::Add(%862, %864), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %866 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%716, %jasper_encoder.encoder.8.res.3.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %867 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%866, %jasper_encoder.encoder.8.res.3.1.weight, %jasper_encoder.encoder.8.res.3.1.bias, %jasper_encoder.encoder.8.res.3.1.running_mean, %jasper_encoder.encoder.8.res.3.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %868 : Float(16, 640, 503) = onnx::Add(%865, %867), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %869 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%743, %jasper_encoder.encoder.8.res.4.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %870 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%869, %jasper_encoder.encoder.8.res.4.1.weight, %jasper_encoder.encoder.8.res.4.1.bias, %jasper_encoder.encoder.8.res.4.1.running_mean, %jasper_encoder.encoder.8.res.4.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %871 : Float(16, 640, 503) = onnx::Add(%868, %870), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %872 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%773, %jasper_encoder.encoder.8.res.5.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %873 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%872, %jasper_encoder.encoder.8.res.5.1.weight, %jasper_encoder.encoder.8.res.5.1.bias, %jasper_encoder.encoder.8.res.5.1.running_mean, %jasper_encoder.encoder.8.res.5.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # 
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %874 : Float(16, 640, 503) = onnx::Add(%871, %873), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %875 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%806, %jasper_encoder.encoder.8.res.6.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %876 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%875, %jasper_encoder.encoder.8.res.6.1.weight, %jasper_encoder.encoder.8.res.6.1.bias, %jasper_encoder.encoder.8.res.6.1.running_mean, %jasper_encoder.encoder.8.res.6.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %877 : Float(16, 640, 503) = onnx::Add(%874, %876), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %878 : Float(16, 640, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%842, %jasper_encoder.encoder.8.res.7.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %879 : Float(16, 640, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%878, %jasper_encoder.encoder.8.res.7.1.weight, %jasper_encoder.encoder.8.res.7.1.bias, %jasper_encoder.encoder.8.res.7.1.running_mean, %jasper_encoder.encoder.8.res.7.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %880 : Float(16, 640, 503) = onnx::Add(%877, %879), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %881 : Float(16, 640, 503) = onnx::Relu(%880), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[8]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %882 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%881, %jasper_encoder.encoder.9.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %883 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%882, %jasper_encoder.encoder.9.conv.1.weight, %jasper_encoder.encoder.9.conv.1.bias, %jasper_encoder.encoder.9.conv.1.running_mean, %jasper_encoder.encoder.9.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %884 : Float(16, 768, 503) = onnx::Relu(%883), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/Dropout # 
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %885 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%884, %jasper_encoder.encoder.9.conv.4.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %886 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%885, %jasper_encoder.encoder.9.conv.5.weight, %jasper_encoder.encoder.9.conv.5.bias, %jasper_encoder.encoder.9.conv.5.running_mean, %jasper_encoder.encoder.9.conv.5.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %887 : Float(16, 768, 503) = onnx::Relu(%886), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %888 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%887, %jasper_encoder.encoder.9.conv.8.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %889 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%888, %jasper_encoder.encoder.9.conv.9.weight, %jasper_encoder.encoder.9.conv.9.bias, %jasper_encoder.encoder.9.conv.9.running_mean, %jasper_encoder.encoder.9.conv.9.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %890 : Float(16, 768, 503) = onnx::Relu(%889), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %891 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%890, %jasper_encoder.encoder.9.conv.12.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %892 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%891, %jasper_encoder.encoder.9.conv.13.weight, %jasper_encoder.encoder.9.conv.13.bias, %jasper_encoder.encoder.9.conv.13.running_mean, %jasper_encoder.encoder.9.conv.13.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %893 : Float(16, 768, 503) = onnx::Relu(%892), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %894 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%893, %jasper_encoder.encoder.9.conv.16.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %895 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, 
momentum=0.9](%894, %jasper_encoder.encoder.9.conv.17.weight, %jasper_encoder.encoder.9.conv.17.bias, %jasper_encoder.encoder.9.conv.17.running_mean, %jasper_encoder.encoder.9.conv.17.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %896 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%653, %jasper_encoder.encoder.9.res.0.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %897 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%896, %jasper_encoder.encoder.9.res.0.1.weight, %jasper_encoder.encoder.9.res.0.1.bias, %jasper_encoder.encoder.9.res.0.1.running_mean, %jasper_encoder.encoder.9.res.0.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %898 : Float(16, 768, 503) = onnx::Add(%895, %897), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %899 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%671, %jasper_encoder.encoder.9.res.1.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %900 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%899, %jasper_encoder.encoder.9.res.1.1.weight, %jasper_encoder.encoder.9.res.1.1.bias, %jasper_encoder.encoder.9.res.1.1.running_mean, %jasper_encoder.encoder.9.res.1.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %901 : Float(16, 768, 503) = onnx::Add(%898, %900), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %902 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%692, %jasper_encoder.encoder.9.res.2.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %903 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%902, %jasper_encoder.encoder.9.res.2.1.weight, %jasper_encoder.encoder.9.res.2.1.bias, %jasper_encoder.encoder.9.res.2.1.running_mean, %jasper_encoder.encoder.9.res.2.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %904 : Float(16, 768, 503) = onnx::Add(%901, %903), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %905 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 
0], strides=[1]](%716, %jasper_encoder.encoder.9.res.3.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %906 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%905, %jasper_encoder.encoder.9.res.3.1.weight, %jasper_encoder.encoder.9.res.3.1.bias, %jasper_encoder.encoder.9.res.3.1.running_mean, %jasper_encoder.encoder.9.res.3.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %907 : Float(16, 768, 503) = onnx::Add(%904, %906), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %908 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%743, %jasper_encoder.encoder.9.res.4.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %909 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%908, %jasper_encoder.encoder.9.res.4.1.weight, %jasper_encoder.encoder.9.res.4.1.bias, %jasper_encoder.encoder.9.res.4.1.running_mean, %jasper_encoder.encoder.9.res.4.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %910 : Float(16, 768, 503) = onnx::Add(%907, %909), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %911 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%773, %jasper_encoder.encoder.9.res.5.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %912 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%911, %jasper_encoder.encoder.9.res.5.1.weight, %jasper_encoder.encoder.9.res.5.1.bias, %jasper_encoder.encoder.9.res.5.1.running_mean, %jasper_encoder.encoder.9.res.5.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %913 : Float(16, 768, 503) = onnx::Add(%910, %912), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %914 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%806, %jasper_encoder.encoder.9.res.6.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %915 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%914, %jasper_encoder.encoder.9.res.6.1.weight, %jasper_encoder.encoder.9.res.6.1.bias, %jasper_encoder.encoder.9.res.6.1.running_mean, 
%jasper_encoder.encoder.9.res.6.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %916 : Float(16, 768, 503) = onnx::Add(%913, %915), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %917 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%842, %jasper_encoder.encoder.9.res.7.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %918 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%917, %jasper_encoder.encoder.9.res.7.1.weight, %jasper_encoder.encoder.9.res.7.1.bias, %jasper_encoder.encoder.9.res.7.1.running_mean, %jasper_encoder.encoder.9.res.7.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %919 : Float(16, 768, 503) = onnx::Add(%916, %918), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %920 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%881, %jasper_encoder.encoder.9.res.8.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %921 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%920, %jasper_encoder.encoder.9.res.8.1.weight, %jasper_encoder.encoder.9.res.8.1.bias, %jasper_encoder.encoder.9.res.8.1.running_mean, %jasper_encoder.encoder.9.res.8.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %922 : Float(16, 768, 503) = onnx::Add(%919, %921), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %923 : Float(16, 768, 503) = onnx::Relu(%922), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[9]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %924 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%923, %jasper_encoder.encoder.10.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %925 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%924, %jasper_encoder.encoder.10.conv.1.weight, %jasper_encoder.encoder.10.conv.1.bias, %jasper_encoder.encoder.10.conv.1.running_mean, %jasper_encoder.encoder.10.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # 
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %926 : Float(16, 768, 503) = onnx::Relu(%925), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %927 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%926, %jasper_encoder.encoder.10.conv.4.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %928 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%927, %jasper_encoder.encoder.10.conv.5.weight, %jasper_encoder.encoder.10.conv.5.bias, %jasper_encoder.encoder.10.conv.5.running_mean, %jasper_encoder.encoder.10.conv.5.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %929 : Float(16, 768, 503) = onnx::Relu(%928), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %930 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%929, %jasper_encoder.encoder.10.conv.8.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %931 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%930, %jasper_encoder.encoder.10.conv.9.weight, %jasper_encoder.encoder.10.conv.9.bias, %jasper_encoder.encoder.10.conv.9.running_mean, %jasper_encoder.encoder.10.conv.9.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %932 : Float(16, 768, 503) = onnx::Relu(%931), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %933 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%932, %jasper_encoder.encoder.10.conv.12.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %934 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%933, %jasper_encoder.encoder.10.conv.13.weight, %jasper_encoder.encoder.10.conv.13.bias, %jasper_encoder.encoder.10.conv.13.running_mean, %jasper_encoder.encoder.10.conv.13.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %935 : Float(16, 768, 503) = onnx::Relu(%934), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/Dropout # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %936 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[25], pads=[12, 12], strides=[1]](%935, %jasper_encoder.encoder.10.conv.16.weight), scope: 
JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %937 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%936, %jasper_encoder.encoder.10.conv.17.weight, %jasper_encoder.encoder.10.conv.17.bias, %jasper_encoder.encoder.10.conv.17.running_mean, %jasper_encoder.encoder.10.conv.17.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %938 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%653, %jasper_encoder.encoder.10.res.0.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %939 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%938, %jasper_encoder.encoder.10.res.0.1.weight, %jasper_encoder.encoder.10.res.0.1.bias, %jasper_encoder.encoder.10.res.0.1.running_mean, %jasper_encoder.encoder.10.res.0.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %940 : Float(16, 768, 503) = onnx::Add(%937, %939), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %941 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%671, %jasper_encoder.encoder.10.res.1.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %942 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%941, %jasper_encoder.encoder.10.res.1.1.weight, %jasper_encoder.encoder.10.res.1.1.bias, %jasper_encoder.encoder.10.res.1.1.running_mean, %jasper_encoder.encoder.10.res.1.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %943 : Float(16, 768, 503) = onnx::Add(%940, %942), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %944 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%692, %jasper_encoder.encoder.10.res.2.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %945 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%944, %jasper_encoder.encoder.10.res.2.1.weight, %jasper_encoder.encoder.10.res.2.1.bias, %jasper_encoder.encoder.10.res.2.1.running_mean, %jasper_encoder.encoder.10.res.2.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %946 : Float(16, 768, 503) = onnx::Add(%943, 
%945), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %947 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%716, %jasper_encoder.encoder.10.res.3.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %948 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%947, %jasper_encoder.encoder.10.res.3.1.weight, %jasper_encoder.encoder.10.res.3.1.bias, %jasper_encoder.encoder.10.res.3.1.running_mean, %jasper_encoder.encoder.10.res.3.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %949 : Float(16, 768, 503) = onnx::Add(%946, %948), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %950 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%743, %jasper_encoder.encoder.10.res.4.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %951 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%950, %jasper_encoder.encoder.10.res.4.1.weight, %jasper_encoder.encoder.10.res.4.1.bias, %jasper_encoder.encoder.10.res.4.1.running_mean, %jasper_encoder.encoder.10.res.4.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %952 : Float(16, 768, 503) = onnx::Add(%949, %951), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %953 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%773, %jasper_encoder.encoder.10.res.5.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %954 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%953, %jasper_encoder.encoder.10.res.5.1.weight, %jasper_encoder.encoder.10.res.5.1.bias, %jasper_encoder.encoder.10.res.5.1.running_mean, %jasper_encoder.encoder.10.res.5.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %955 : Float(16, 768, 503) = onnx::Add(%952, %954), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %956 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%806, %jasper_encoder.encoder.10.res.6.0.weight), scope: 
JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %957 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%956, %jasper_encoder.encoder.10.res.6.1.weight, %jasper_encoder.encoder.10.res.6.1.bias, %jasper_encoder.encoder.10.res.6.1.running_mean, %jasper_encoder.encoder.10.res.6.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %958 : Float(16, 768, 503) = onnx::Add(%955, %957), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %959 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%842, %jasper_encoder.encoder.10.res.7.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %960 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%959, %jasper_encoder.encoder.10.res.7.1.weight, %jasper_encoder.encoder.10.res.7.1.bias, %jasper_encoder.encoder.10.res.7.1.running_mean, %jasper_encoder.encoder.10.res.7.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %961 : Float(16, 768, 503) = onnx::Add(%958, %960), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %962 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%881, %jasper_encoder.encoder.10.res.8.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %963 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%962, %jasper_encoder.encoder.10.res.8.1.weight, %jasper_encoder.encoder.10.res.8.1.bias, %jasper_encoder.encoder.10.res.8.1.running_mean, %jasper_encoder.encoder.10.res.8.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %964 : Float(16, 768, 503) = onnx::Add(%961, %963), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %965 : Float(16, 768, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%923, %jasper_encoder.encoder.10.res.9.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %966 : Float(16, 768, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%965, %jasper_encoder.encoder.10.res.9.1.weight, %jasper_encoder.encoder.10.res.9.1.bias, %jasper_encoder.encoder.10.res.9.1.running_mean, 
%jasper_encoder.encoder.10.res.9.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %967 : Float(16, 768, 503) = onnx::Add(%964, %966), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:384:0\n", - " %968 : Float(16, 768, 503) = onnx::Relu(%967), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[10]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %969 : Float(16, 896, 503) = onnx::Conv[dilations=[2], group=1, kernel_shape=[29], pads=[28, 28], strides=[1]](%968, %jasper_encoder.encoder.11.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[11]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %970 : Float(16, 896, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%969, %jasper_encoder.encoder.11.conv.1.weight, %jasper_encoder.encoder.11.conv.1.bias, %jasper_encoder.encoder.11.conv.1.running_mean, %jasper_encoder.encoder.11.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[11]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %971 : Float(16, 896, 503) = onnx::Relu(%970), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[11]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %972 : Float(16, 1024, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%971, %jasper_encoder.encoder.12.conv.0.weight), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[12]/MaskedConv1d # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %973 : Float(16, 1024, 503) = onnx::BatchNormalization[epsilon=0.001, momentum=0.9](%972, %jasper_encoder.encoder.12.conv.1.weight, %jasper_encoder.encoder.12.conv.1.bias, %jasper_encoder.encoder.12.conv.1.running_mean, %jasper_encoder.encoder.12.conv.1.running_var), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[12]/BatchNorm1d # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1670:0\n", - " %974 : Float(16, 1024, 503) = onnx::Relu(%973), scope: JasperEncoderDecoder/JasperEncoder[jasper_encoder]/Sequential[encoder]/JasperBlock[12]/Sequential[out]/Dropout[1] # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:807:0\n", - " %975 : Float(16, 29, 503) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%974, %jasper_decoder.decoder_layers.0.weight, %jasper_decoder.decoder_layers.0.bias), scope: JasperEncoderDecoder/JasperDecoderForCTC[jasper_decoder]/Sequential[decoder_layers]/Conv1d[0] # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0\n", - " %976 : Float(16, 503, 29) = onnx::Transpose[perm=[0, 2, 1]](%975), scope: JasperEncoderDecoder/JasperDecoderForCTC[jasper_decoder] # /content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/model.py:213:0\n", - " %LOGITS : Float(16, 503, 29) = onnx::LogSoftmax[axis=2](%976), scope: JasperEncoderDecoder/JasperDecoderForCTC[jasper_decoder] # 
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1317:0\n", - " return (%LOGITS)\n", - "\n", - "Getting engine\n", - "Building TRT engine ....\n", - "Optimizing for FP16\n", - "Parsing returned True dynamic_shape= True \n", - "\n", - "TRT engine saved at jasper_fp16.plan ...\n", - "Got engine.\n", - "INTERENCE TIME: 78.19635899977584 ms\n", - "TRANSCRIPT: ['when these two souls perceived each other they recognized each other as necessary to each other and embraced each other closely']\n" - ], - "name": "stdout" - }, - { - "output_type": "stream", - "text": [ - "tcmalloc: large alloc 1331093504 bytes == 0x176d14000 @ 0x7f83f2370887 0x7f83f0c66c29 0x7f83f0c67afb 0x7f83f0c67bb4 0x7f83f0c67f9c 0x7f83aa56952f 0x7f83aa5697b4 0x7f839fd1f390 0x7f83e6ee9e91 0x7f83e6bad014 0x5669ac 0x50a5c3 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50cd96 0x507d64 0x509a90 0x50a48d 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50cd96 0x509758 0x50a48d 0x50bfb4 0x509758 0x50a48d 0x50bfb4\n", - "tcmalloc: large alloc 1331093504 bytes == 0x11bb6c000 @ 0x7f83f236e1e7 0x59203c 0x7f83e6eea26d 0x7f83e6bad014 0x5669ac 0x50a5c3 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50cd96 0x507d64 0x509a90 0x50a48d 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50cd96 0x509758 0x50a48d 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x509758 0x50a48d 0x50bfb4 0x507d64 0x50ae13 0x634c82\n", - "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.\n", - "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1331092289\n", - "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.\n", - "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 1331092289\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.\n", - "[TensorRT] WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. 
Attempting to cast down to INT32.\n", - "bash: line 7: 1769 Segmentation fault (core dumped) python ../trt/perf.py --ckpt_path ./jasper_fp16.pt --wav=example1.wav --model_toml=../configs/jasper10x5dr_nomask.toml --make_onnx --onnx_path jasper.onnx --engine_path jasper_fp16.plan --trt_fp16\n" - ], - "name": "stderr" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "912MBa0BdyTZ" - }, - "source": [ - "### Inference from existing TensorRT FP16 plan\n", - "Inference with an existing plan can be launched with the `--use_existing_engine` flag." - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "C9oVQa_zOV1u", - "outputId": "81751fb0-1368-4984-b40f-7e2475320108", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 146 - } - }, - "source": [ - "%%bash\n", - "PYTHONPATH=/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper \n", - "python ../trt/perf.py \\\n", - "--wav=./example1.wav \\\n", - "--model_toml=../configs/jasper10x5dr_nomask.toml \\\n", - "--use_existing_engine --engine_path jasper_fp16.plan \\\n", - "--trt_fp16" - ], - "execution_count": 0, - "outputs": [ - { - "output_type": "stream", - "text": [ - "Getting component\n", - "Getting engine\n", - "Got engine.\n", - "INTERENCE TIME: 83.8561909999953 ms\n", - "TRANSCRIPT: ['when these two souls perceived each other they recognized each other as necessary to each other and embraced each other closely']\n" - ], - "name": "stdout" - }, - { - "output_type": "stream", - "text": [ - "bash: line 6: 1863 Segmentation fault (core dumped) python ../trt/perf.py --wav=./example1.wav --model_toml=../configs/jasper10x5dr_nomask.toml --use_existing_engine --engine_path jasper_fp16.plan --trt_fp16\n" - ], - "name": "stderr" - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "g8MxXY5GmTc8" - }, - "source": [ - "## Conclusion\n", - "\n", - "In this notebook, we have walked through the complete process of carrying out inference using a pretrained Jasper Pytorch model using NVIDIA TensorRT on Google Colab.\n", - "### What's next\n", - "Now that you are familiar with running Jasper inference with TensorRT using full and automatic mixed precision, you may want to play with your own audio samples.\n", - "\n", - "For information on training a Jasper model using your own data, please check out our Github repo: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechRecognition/Jasper" - ] - }, - { - "cell_type": "code", - "metadata": { - "colab_type": "code", - "id": "Q4_PtLSdMnw3", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "P6KFALzoMVpd", - "colab_type": "code", - "colab": {} - }, - "source": [ - "" - ], - "execution_count": 0, - "outputs": [] + "name": "stdout", + "output_type": "stream", + "text": [ + "Wed Oct 2 02:42:12 2019 \n", + "+-----------------------------------------------------------------------------+\n", + "| NVIDIA-SMI 430.40 Driver Version: 418.67 CUDA Version: 10.1 |\n", + "|-------------------------------+----------------------+----------------------+\n", + "| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n", + "| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. 
|\n", + "|===============================+======================+======================|\n", + "| 0 Tesla K80 Off | 00000000:00:04.0 Off | 0 |\n", + "| N/A 60C P0 70W / 149W | 69MiB / 11441MiB | 0% Default |\n", + "+-------------------------------+----------------------+----------------------+\n", + " \n", + "+-----------------------------------------------------------------------------+\n", + "| Processes: GPU Memory |\n", + "| GPU PID Type Process name Usage |\n", + "|=============================================================================|\n", + "+-----------------------------------------------------------------------------+\n" + ] + } + ], + "source": [ + "!nvidia-smi" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "pV3rzgO8-tSK" + }, + "source": [ + "The code below checks whether a Tensor Core GPU is present (note that it returns `None` rather than an explicit `False` when no such GPU is found; a variant returning an explicit boolean is sketched below)." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "colab_type": "code", + "id": "Djyvo8mm9poq", + "outputId": "4ec6cda8-1e68-40b9-de79-54e4285f0b7a" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Tensor Core GPU Present: None\n" + ] + } + ], + "source": [ + "from tensorflow.python.client import device_lib\n", + "\n", + "def check_tensor_core_gpu_present():\n", + " local_device_protos = device_lib.list_local_devices()\n", + " for line in local_device_protos:\n", + " if \"compute capability\" in str(line):\n", + " compute_capability = float(line.physical_device_desc.split(\"compute capability: \")[-1])\n", + " if compute_capability>=7.0:\n", + " return True\n", + " \n", + "print(\"Tensor Core GPU Present:\", check_tensor_core_gpu_present())\n", + "tensor_core_gpu = check_tensor_core_gpu_present()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "FCEfkBAbbaLI" + }, + "source": [ + "2. Next, we clone the NVIDIA GitHub Deep Learning Examples repository and set up the workspace."
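For reference, here is a small variant of the Tensor Core check above that returns an explicit boolean instead of falling through to `None` when no suitable GPU is found. This is only a sketch: it assumes the same TensorFlow `device_lib` helper used in the cell above is available in the Colab runtime, and the function name `has_tensor_core_gpu` is ours, not part of the notebook.

```
from tensorflow.python.client import device_lib

def has_tensor_core_gpu() -> bool:
    # Tensor Cores are available on GPUs with compute capability 7.0 or higher
    # (Volta, Turing, and newer generations).
    for device in device_lib.list_local_devices():
        desc = device.physical_device_desc
        if "compute capability" in desc:
            capability = float(desc.split("compute capability: ")[-1])
            if capability >= 7.0:
                return True
    return False  # explicit False instead of an implicit None

print("Tensor Core GPU Present:", has_tensor_core_gpu())
```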
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 148 + }, + "colab_type": "code", + "id": "y3u_VMjXtAto", + "outputId": "84f4b72c-71cd-4415-d94f-1b87266eab74" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Cloning into 'DeepLearningExamples'...\n", + "remote: Enumerating objects: 110, done.\u001b[K\n", + "remote: Counting objects: 100% (110/110), done.\u001b[K\n", + "remote: Compressing objects: 100% (90/90), done.\u001b[K\n", + "remote: Total 4049 (delta 65), reused 35 (delta 17), pack-reused 3939\u001b[K\n", + "Receiving objects: 100% (4049/4049), 32.29 MiB | 26.48 MiB/s, done.\n", + "Resolving deltas: 100% (1875/1875), done.\n" + ] + } + ], + "source": [ + "!git clone https://github.com/NVIDIA/DeepLearningExamples" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "colab_type": "code", + "id": "-rE46y-ftAuQ", + "outputId": "1bd44631-d98f-40a2-a999-3cc861c63d92" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/notebooks\n" + ] + } + ], + "source": [ + "import os\n", + "\n", + "WORKSPACE_DIR='/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/notebooks'\n", + "os.chdir(WORKSPACE_DIR)\n", + "print (os.getcwd())" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "yBeZjO4JtAwL" + }, + "source": [ + "## Install NVIDIA TensorRT\n", + "\n", + "We will need to install the NVIDIA TensorRT 6.0 runtime environment on Colab. First, check which CUDA version is installed. As of 2nd Oct 2019, `cuda-10.0` is the CUDA version on Google Colab." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 54 + }, + "colab_type": "code", + "id": "LfygzbP1Lz2b", + "outputId": "a0c985a4-c6f8-47f7-a37a-a5c7e5168fef" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "bin cuda-10.0 games\t lib\t man setup.cfg\tsrc\n", + "cuda etc\t include LICENSE.txt sbin share\txgboost\n" + ] + } + ], + "source": [ + "!ls /usr/local/" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "yl4PyTsnQP3U" + }, + "source": [ + "Next, we will need to install the NVIDIA TensorRT version that matches the current Colab CUDA version, following the instructions at https://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html#maclearn-net-repo-install." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 935 + }, + "colab_type": "code", + "id": "3WA9N43UTq_c", + "outputId": "13db6406-2eea-45a0-d3b2-222b7668f951" + }, + "outputs": [], + "source": [ + "%%bash\n", + "wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb\n", + "\n", + "dpkg -i nvidia-machine-learning-repo-*.deb\n", + "apt-get update" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "XPsJhzqLXDAm" + }, + "source": [ + "When using the NVIDIA Machine Learning network repository, Ubuntu will by default install TensorRT for the latest CUDA version.
The following commands will install libnvinfer6 for an older CUDA version and hold the libnvinfer6 package at this version. Replace 6.0.1 with your version of TensorRT and cuda10.0 with your CUDA version for your Colab environment." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 1000 + }, + "colab_type": "code", + "id": "EbF-JWGfK9Lo", + "outputId": "41268923-4997-4f92-80bb-c718b6e56a64" + }, + "outputs": [], + "source": [ + "%%bash\n", + "version=\"6.0.1-1+cuda10.0\"\n", + "sudo apt-get install libnvinfer6=${version} libnvonnxparsers6=${version} libnvparsers6=${version} libnvinfer-plugin6=${version} libnvinfer-dev=${version} libnvonnxparsers-dev=${version} libnvparsers-dev=${version} libnvinfer-plugin-dev=${version} python-libnvinfer=${version} python3-libnvinfer=${version}\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 243 + }, + "colab_type": "code", + "id": "zZO9cPHHLGoc", + "outputId": "3f4e1307-5fae-4052-993c-0f237b6d4957" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "libnvinfer6 set on hold.\n", + "libnvonnxparsers6 set on hold.\n", + "libnvparsers6 set on hold.\n", + "libnvinfer-plugin6 set on hold.\n", + "libnvinfer-dev set on hold.\n", + "libnvonnxparsers-dev set on hold.\n", + "libnvparsers-dev set on hold.\n", + "libnvinfer-plugin-dev set on hold.\n", + "python-libnvinfer set on hold.\n", + "python3-libnvinfer set on hold.\n", + "W: Target Packages (Packages) is configured multiple times in /etc/apt/sources.list.d/nvidia-machine-learning.list:1 and /etc/apt/sources.list.d/nvidia-ml.list:1\n" + ] + } + ], + "source": [ + "!sudo apt-mark hold libnvinfer6 libnvonnxparsers6 libnvparsers6 libnvinfer-plugin6 libnvinfer-dev libnvonnxparsers-dev libnvparsers-dev libnvinfer-plugin-dev python-libnvinfer python3-libnvinfer" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 224 + }, + "colab_type": "code", + "id": "wOo7YbuhLcUU", + "outputId": "8497ea38-9a86-41a2-d68c-a2310c2a1899" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "hi libnvinfer-dev 6.0.1-1+cuda10.0 amd64 TensorRT development libraries and headers\n", + "hi libnvinfer-plugin-dev 6.0.1-1+cuda10.0 amd64 TensorRT plugin libraries\n", + "hi libnvinfer-plugin6 6.0.1-1+cuda10.0 amd64 TensorRT plugin libraries\n", + "hi libnvinfer6 6.0.1-1+cuda10.0 amd64 TensorRT runtime libraries\n", + "hi libnvonnxparsers-dev 6.0.1-1+cuda10.0 amd64 TensorRT ONNX libraries\n", + "hi libnvonnxparsers6 6.0.1-1+cuda10.0 amd64 TensorRT ONNX libraries\n", + "hi libnvparsers-dev 6.0.1-1+cuda10.0 amd64 TensorRT parsers libraries\n", + "hi libnvparsers6 6.0.1-1+cuda10.0 amd64 TensorRT parsers libraries\n", + "hi python-libnvinfer 6.0.1-1+cuda10.0 amd64 Python bindings for TensorRT\n", + "hi python3-libnvinfer 6.0.1-1+cuda10.0 amd64 Python 3 bindings for TensorRT\n" + ] + } + ], + "source": [ + "!dpkg -l | grep TensorRT" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "XRMZiFjdUPCZ" + }, + "source": [ + "A successful TensorRT installation should look like:\n", + "\n", + "```\n", + "hi libnvinfer-dev 6.0.1-1+cuda10.0 amd64 TensorRT development libraries and headers\n", + "hi libnvinfer-plugin-dev 6.0.1-1+cuda10.0 amd64 TensorRT plugin libraries\n", + 
"hi libnvinfer-plugin6 6.0.1-1+cuda10.0 amd64 TensorRT plugin libraries\n", + "hi libnvinfer6 6.0.1-1+cuda10.0 amd64 TensorRT runtime libraries\n", + "hi libnvonnxparsers-dev 6.0.1-1+cuda10.0 amd64 TensorRT ONNX libraries\n", + "hi libnvonnxparsers6 6.0.1-1+cuda10.0 amd64 TensorRT ONNX libraries\n", + "hi libnvparsers-dev 6.0.1-1+cuda10.0 amd64 TensorRT parsers libraries\n", + "hi libnvparsers6 6.0.1-1+cuda10.0 amd64 TensorRT parsers libraries\n", + "hi python-libnvinfer 6.0.1-1+cuda10.0 amd64 Python bindings for TensorRT\n", + "hi python3-libnvinfer 6.0.1-1+cuda10.0 amd64 Python 3 bindings for TensorRT\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "lwpwNNw8FS8F" + }, + "source": [ + "## Download pretrained Jasper model from NVIDIA GPU Cloud model repository\n", + "\n", + "NVIDIA provides pretrained Jasper models along with many other deep learning models such as ResNet, BERT, Transformer, SSD... at https://ngc.nvidia.com/catalog/models. Here, we will download and unzip pretrained Jasper Pytorch models." + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 204 + }, + "colab_type": "code", + "id": "np8WaN_FFaTF", + "outputId": "34801d4c-a621-4e13-9f69-16b429ebeccf" + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "IOPub data rate exceeded.\n", + "The notebook server will temporarily stop sending output\n", + "to the client in order to avoid crashing it.\n", + "To change this limit, set the config variable\n", + "`--NotebookApp.iopub_data_rate_limit`.\n", + "\n", + "Current values:\n", + "NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)\n", + "NotebookApp.rate_limit_window=3.0 (secs)\n", + "\n" + ] + } + ], + "source": [ + "%%bash \n", + "wget -nc -q --show-progress -O jasper_model.zip \\\n", + "https://api.ngc.nvidia.com/v2/models/nvidia/jasperpyt_fp16/versions/1/zip" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 54 + }, + "colab_type": "code", + "id": "gsJBwUgXHEkE", + "outputId": "8e2214fc-6068-498e-aa73-9564642768ec" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Archive: ./jasper_model.zip\n", + " inflating: jasper_fp16.pt \n" + ] + } + ], + "source": [ + "!unzip -o ./jasper_model.zip" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "hbafoHBMXr0E" + }, + "source": [ + "After a successful download, a Pytorch checkpoint named ` jasper_fp16.pt` should exist in the current notebooks directory." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 36 + }, + "colab_type": "code", + "id": "YC2Fu9rWG70U", + "outputId": "8d8d8c9d-2ac3-454e-e2ae-72dcd8d56b81" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "-rw-r--r-- 1 root root 2661855989 Sep 10 00:33 jasper_fp16.pt\n" + ] + } + ], + "source": [ + "!ls -l jasper_fp16.pt " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "z-1-63MOG0um" + }, + "source": [ + "## Install extra dependencies\n", + "\n", + "Before proceeding to create the TensorRT execution engine from the Pytorch checkpoint, we shall install some extra dependencies to load and convert the Pytorch model and to process input audio files.\n", + "\n", + "- [Apex](https://nvidia.github.io/apex/): NVIDIA's library for automatic mixed precision training in Pytorch\n", + "- [Onnx](https://github.com/onnx/onnx): for processing ONNX models.\n", + "- unidecode, soundfile, toml, pycuda: miscellaneous helper libraries\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 1000 + }, + "colab_type": "code", + "id": "A6QC4ngRHw3a", + "outputId": "ba7195a5-b235-47aa-8565-b671e40af831" + }, + "outputs": [], + "source": [ + "%%bash \n", + "pip uninstall -y apex\n", + "git clone https://www.github.com/nvidia/apex\n", + "cd apex\n", + "python setup.py install\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 729 + }, + "colab_type": "code", + "id": "QPJbKeigIdOC", + "outputId": "97f9cb47-e8ea-407e-e252-0ff46684a9e2" + }, + "outputs": [], + "source": [ + "!pip install unidecode soundfile toml pycuda" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 243 + }, + "colab_type": "code", + "id": "UfMZMGbEMFXE", + "outputId": "0724ae09-b195-4354-d16a-76d01ea81f82" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Collecting onnx\n", + "\u001b[?25l Downloading https://files.pythonhosted.org/packages/f5/f4/e126b60d109ad1e80020071484b935980b7cce1e4796073aab086a2d6902/onnx-1.6.0-cp36-cp36m-manylinux1_x86_64.whl (4.8MB)\n", + "\u001b[K |████████████████████████████████| 4.8MB 27kB/s \n", + "\u001b[?25hRequirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from onnx) (1.12.0)\n", + "Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from onnx) (1.16.5)\n", + "Collecting typing-extensions>=3.6.2.1 (from onnx)\n", + " Downloading https://files.pythonhosted.org/packages/27/aa/bd1442cfb0224da1b671ab334d3b0a4302e4161ea916e28904ff9618d471/typing_extensions-3.7.4-py3-none-any.whl\n", + "Requirement already satisfied: protobuf in /usr/local/lib/python3.6/dist-packages (from onnx) (3.7.1)\n", + "Requirement already satisfied: setuptools in /usr/local/lib/python3.6/dist-packages (from protobuf->onnx) (41.2.0)\n", + "Installing collected packages: typing-extensions, onnx\n", + "Successfully installed onnx-1.6.0 typing-extensions-3.7.4\n" + ] + } + ], + "source": [ + "!pip install onnx" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "Bup2MbIjStm2" + }, + "source": [ + "## Play with audio examples\n", + "\n", + "You can perform inference
using pre-trained checkpoints, which take an audio file (in .wav format) as input and produce the corresponding text transcript. You can customize the content of the input .wav file. For example, there are several example input files in the \"notebooks\" directory, and we can listen to example1.wav:" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 61 + }, + "colab_type": "code", + "id": "u7J2WjikSxgu", + "outputId": "3a862d2a-7fa6-4a65-9e84-817af265d74e" + }, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 19, + "metadata": { + "tags": [] + }, + "output_type": "execute_result" + } + ], + "source": [ + "import IPython.display as ipd\n", + "ipd.Audio('./example1.wav', rate=22050)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "la9uEY8ja9iV" + }, + "source": [ + "You can also download your own audio sample to Colab with\n", + "\n", + "```!wget ```" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "571hMhhGRnOB" + }, + "source": [ + "## FP32 Inference with TensorRT\n", + "\n", + "\n", + "### Creating TensorRT FP32 execution plan\n", + "\n", + "You can run inference using the `trt/perf.py` script:\n", + "* the checkpoint is passed with the `--ckpt_path` argument \n", + "* `--model_toml` specifies the path to the network configuration file (see examples in the \"configs\" directory)\n", + "* `--make_onnx` exports an ONNX file to the path given by `--onnx_path` if set\n", + "* `--engine_path` saves the engine file (*.plan) \n", + "\n", + "To create a new engine file (jasper.plan) for TensorRT and run it using fp32 (building the engine for the first time can take several minutes):" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 224 + }, + "colab_type": "code", + "id": "aJrN9pmdG4C8", + "outputId": "35ce0bd5-ead4-4ab1-fd50-8f7062c14e77" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "tcmalloc: large alloc 1331142656 bytes == 0x15c680000 @ 0x7f5e9070c887 0x7f5e8f002bf9 0x7f5e8f003acb 0x7f5e8f003b84 0x7f5e8f003f6c 0x7f5e4a95216f 0x7f5e4a9523f4 0x7f5e4053a411 0x7f5e8684837d 0x7f5e8657def4 0x56204c 0x4f88ba 0x4f98c7 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f6128 0x4f7d60 0x4f876d 0x4f98c7 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f7a28 0x4f876d\n", + "tcmalloc: large alloc 1331142656 bytes == 0x1abbfa000 @ 0x7f5e9070a1e7 0x5a1c5c 0x7f5e868486da 0x7f5e8657def4 0x56204c 0x4f88ba 0x4f98c7 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f6128 0x4f7d60 0x4f876d 0x4f98c7 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f7a28 0x4f876d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f6128 0x4f9023\n", + "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.
To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.\n", + "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 1331138693\n", + "[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.6.2\n", + "[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.6.2\n", + "tcmalloc: large alloc 1800912896 bytes == 0x7f5d74a84000 @ 0x7f5e9070c887 0x7f5e11f173ea 0x7f5e11f0a632 0x7f5e120df6d4 0x7f5e11ef638f 0x7f5e1ebca86a 0x7f5e1ec2194a 0x56204c 0x4f88ba 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f6128 0x4f9023 0x6415b2 0x64166a 0x643730 0x62b26e 0x4b4cb0 0x7f5e90307b97 0x5bdf6a\n", + "[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.6.2\n", + "INTERENCE TIME: 239.51307100014674 ms\n", + "TRANSCRIPT: when these two souls perceived each other they recognized each other as necessary to each other and embraced each other closely\n" + ] + } + ], + "source": [ + "%%bash\n", + "PYTHONPATH=/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper \n", + "python ../trt/perf.py \\\n", + "--ckpt_path ./jasper_fp16.pt --wav=example1.wav \\\n", + "--model_toml=../configs/jasper10x5dr_nomask.toml \\\n", + "--make_onnx --onnx_path jasper.onnx \\\n", + "--engine_path jasper.plan" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "OFYVKESRdf-y" + }, + "source": [ + "### Inference from existing TensorRT FP32 plan\n", + "Inference with an existing plan can be launched with the `--use_existing_engine` flag." + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 149 + }, + "colab_type": "code", + "id": "vzQOo6QLODZd", + "outputId": "54bbe948-02f5-4a3f-eeae-bd4fd8da290c" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "INTERENCE TIME: 289.92610499994953 ms\n", + "TRANSCRIPT: when these two souls perceived each other they recognized each other as necessary to each other and embraced each other closely\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "tcmalloc: large alloc 1331036160 bytes == 0x62440000 @ 0x7fd170b6f1e7 0x5a1c5c 0x578954 0x561fca 0x57c961 0x57e6ae 0x4bb666 0x4f858d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f6128 0x4f9023 0x6415b2 0x64166a 0x643730 0x62b26e 0x4b4cb0 0x7fd17076cb97 0x5bdf6a\n", + "tcmalloc: large alloc 1330552832 bytes == 0x1010f0000 @ 0x7fd170b71887 0x7fd0f255dce7 0x7fd0f254d05f 0x7fd0f2364ee3 0x7fd0f236efd8 0x7fd0ff01f82e 0x7fd0ff08694a 0x56204c 0x4f88ba 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f6128 0x4f9023 0x6415b2 0x64166a 0x643730 0x62b26e 0x4b4cb0 0x7fd17076cb97 0x5bdf6a\n", + "[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.6.2\n", + "[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.6.2\n" + ] + } + ], + "source": [ + "%%bash\n", + "PYTHONPATH=/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper \n", + "python ../trt/perf.py \\\n", + "--wav=./example1.wav \\\n", + "--model_toml=../configs/jasper10x5dr_nomask.toml \\\n", + "--use_existing_engine --engine_path jasper.plan" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "Pg4X1odgOS1i" + }, + "source": [ + "## FP16
Inference with TensorRT\n", + "### Creating TensorRT FP16 execution plan\n", + "\n", + "We will next create an FP16 TRT inference plan. \n", + "\n", + "To run inference on the input audio file using automatic mixed precision, add the argument `--trt_fp16`. Using automatic mixed precision, the inference time can be reduced compared to FP32 (building the engine for the first time can take several minutes).\n", + "\n", + "**Important Note:** Efficient FP16 inference requires a Volta, Turing, or newer generation GPU. On Google Colab, this normally means a T4 GPU. On the older K80 GPUs, FP16 performance might actually be worse than with an FP32 TRT model." + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 243 + }, + "colab_type": "code", + "id": "x2n_2cZYdGOg", + "outputId": "0f02ca21-d5b0-4c16-89db-a46bdf57e438" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "INTERENCE TIME: 334.61581900019155 ms\n", + "TRANSCRIPT: when these two souls perceived each other they recognized each other as necessary to each other and embraced each other closely\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "tcmalloc: large alloc 1331142656 bytes == 0x15bbec000 @ 0x7f1bf09e6887 0x7f1bef2dcbf9 0x7f1bef2ddacb 0x7f1bef2ddb84 0x7f1bef2ddf6c 0x7f1baac2c16f 0x7f1baac2c3f4 0x7f1ba0814411 0x7f1be6b2237d 0x7f1be6857ef4 0x56204c 0x4f88ba 0x4f98c7 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f6128 0x4f7d60 0x4f876d 0x4f98c7 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f7a28 0x4f876d\n", + "tcmalloc: large alloc 1331142656 bytes == 0x106ce2000 @ 0x7f1bf09e41e7 0x5a1c5c 0x7f1be6b226da 0x7f1be6857ef4 0x56204c 0x4f88ba 0x4f98c7 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f6128 0x4f7d60 0x4f876d 0x4f98c7 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f6128 0x4f7d60 0x4f876d 0x4fa6c0 0x4f7a28 0x4f876d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f6128 0x4f9023\n", + "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.
To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.\n", + "[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 1331138693\n", + "[TensorRT] WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected.\n", + "[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.6.2\n", + "[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.6.2\n", + "tcmalloc: large alloc 1817280512 bytes == 0x7f1ad3ae8000 @ 0x7f1bf09e6887 0x7f1b721f13ea 0x7f1b721e4632 0x7f1b723b96d4 0x7f1b721d038f 0x7f1b7eea486a 0x7f1b7eefb94a 0x56204c 0x4f88ba 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f6128 0x4f9023 0x6415b2 0x64166a 0x643730 0x62b26e 0x4b4cb0 0x7f1bf05e1b97 0x5bdf6a\n", + "[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.6.2\n" + ] + } + ], + "source": [ + "%%bash\n", + "PYTHONPATH=/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper \n", + "python ../trt/perf.py \\\n", + "--ckpt_path ./jasper_fp16.pt --wav=example1.wav \\\n", + "--model_toml=../configs/jasper10x5dr_nomask.toml \\\n", + "--make_onnx --onnx_path jasper.onnx \\\n", + "--engine_path jasper_fp16.plan \\\n", + "--trt_fp16" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "912MBa0BdyTZ" + }, + "source": [ + "### Inference from existing TensorRT FP16 plan\n", + "Inference with an existing plan can be launched with the `--use_existing_engine` flag." + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 131 + }, + "colab_type": "code", + "id": "C9oVQa_zOV1u", + "outputId": "2de02886-853c-4deb-9a9c-234a338abec0" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "INTERENCE TIME: 301.42106899984356 ms\n", + "TRANSCRIPT: when these two souls perceived each other they recognized each other as necessary to each other and embraced each other closely\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "tcmalloc: large alloc 1116463104 bytes == 0xb1754000 @ 0x7f0533d0e1e7 0x5a1c5c 0x578954 0x561fca 0x57c961 0x57e6ae 0x4bb666 0x4f858d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f7a28 0x4f876d 0x4f98c7 0x4f6128 0x4f9023 0x6415b2 0x64166a 0x643730 0x62b26e 0x4b4cb0 0x7f053390bb97 0x5bdf6a\n", + "[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.6.2\n", + "[TensorRT] WARNING: TensorRT was linked against cuDNN 7.6.3 but loaded cuDNN 7.6.2\n" + ] + } + ], + "source": [ + "%%bash\n", + "PYTHONPATH=/content/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper \n", + "python ../trt/perf.py \\\n", + "--wav=./example1.wav \\\n", + "--model_toml=../configs/jasper10x5dr_nomask.toml \\\n", + "--use_existing_engine --engine_path jasper_fp16.plan \\\n", + "--trt_fp16" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text", + "id": "g8MxXY5GmTc8" + }, + "source": [ + "## Conclusion\n", + "\n", + "In this notebook, we have walked through the complete process of carrying out inference using a pretrained Jasper Pytorch model using NVIDIA TensorRT on Google Colab.\n", + "### What's next\n", + "Now that you are familiar with running Jasper inference with TensorRT using full and automatic mixed precision, you may want to play with your own 
audio samples.\n", + "\n", + "For information on training a Jasper model using your own data, please check out our Github repo: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechRecognition/Jasper" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab": {}, + "colab_type": "code", + "id": "Q4_PtLSdMnw3" + }, + "outputs": [], + "source": [] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "include_colab_link": true, + "name": "Colab_Jasper_TRT_inference_demo.ipynb", + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.8" + } + }, + "nbformat": 4, + "nbformat_minor": 1 +} diff --git a/PyTorch/SpeechRecognition/Jasper/notebooks/JasperTRTIS.ipynb b/PyTorch/SpeechRecognition/Jasper/notebooks/JasperTRTIS.ipynb deleted file mode 100644 index beb1d19e..00000000 --- a/PyTorch/SpeechRecognition/Jasper/notebooks/JasperTRTIS.ipynb +++ /dev/null @@ -1,274 +0,0 @@ -{ - "cells": [ - { - "cell_type": "raw", - "metadata": {}, - "source": [ - "# Copyright 2019 NVIDIA Corporation. All Rights Reserved.\n", - "#\n", - "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# http://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License.\n", - "# ==============================================================================" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "# Jasper inference using TensorRT Inference Server\n", - "This Jupyter notebook provides scripts to deploy high-performance inference in NVIDIA TensorRT Inference Server offering different options for the model backend, among others NVIDIA TensorRT. \n", - "Jasper is a neural acoustic model for speech recognition. Its network architecture is designed to facilitate fast GPU inference. \n", - "NVIDIA TensorRT Inference Server provides a datacenter and cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or gRPC endpoint, allowing remote clients to request inferencing for any number of GPU or CPU models being managed by the server\n", - "NVIDIA TensorRT is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high-throughput for deep learning inference applications.\n", - "## 1. Overview\n", - "\n", - "The Jasper model is an end-to-end neural acoustic model for automatic speech recognition (ASR) that provides near state-of-the-art results on LibriSpeech among end-to-end ASR models without any external data. The Jasper architecture of convolutional layers was designed to facilitate fast GPU inference, by allowing whole sub-blocks to be fused into a single GPU kernel. 
This is important for meeting strict real-time requirements of ASR systems in deployment.The results of the acoustic model are combined with the results of external language models to get the top-ranked word sequences corresponding to a given audio segment. This post-processing step is called decoding.\n", - "\n", - "The original paper is Jasper: An End-to-End Convolutional Neural Acoustic Model https://arxiv.org/pdf/1904.03288.pdf.\n", - "\n", - "### 1.1 Model architecture\n", - "By default the model configuration is Jasper 10x5 with dense residuals. A Jasper BxR model has B blocks, each consisting of R repeating sub-blocks.\n", - "Each sub-block applies the following operations in sequence: 1D-Convolution, Batch Normalization, ReLU activation, and Dropout. \n", - "In the original paper Jasper is trained with masked convolutions, which masks out the padded part of an input sequence in a batch before the 1D-Convolution.\n", - "For inference masking is not used. The reason for this is that in inference, the original mask operation does not achieve better accuracy than without the mask operation on the test and development dataset. However, no masking achieves better inference performance especially after TensorRT optimization.\n", - "More information on the model architecture can be found in the [Jasper Pytorch README](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechRecognition/Jasper)\n", - "\n", - "### 1.2 TensorRT Inference Server Overview\n", - "\n", - "A typical TensorRT Inference Server pipeline can be broken down into the following 8 steps:\n", - "1. Client serializes the inference request into a message and sends it to the server (Client Send)\n", - "2. Message travels over the network from the client to the server (Network)\n", - "3. Message arrives at server, and is deserialized (Server Receive)\n", - "4. Request is placed on the queue (Server Queue)\n", - "5. Request is removed from the queue and computed (Server Compute)\n", - "6. Completed request is serialized in a message and sent back to the client (Server Send)\n", - "7. Completed message travels over network from the server to the client (Network)\n", - "8. Completed message is deserialized by the client and processed as a completed inference request (Client Receive)\n", - "\n", - "Generally, for local clients, steps 1-4 and 6-8 will only occupy a small fraction of time, compared to steps 5-6. As backend deep learning systems like Jasper are rarely exposed directly to end users, but instead only interfacing with local front-end servers, for the sake of Jasper, we can consider that all clients are local.\n", - "In this section, we will go over how to launch TensorRT Inference Server and client and get the best performant solution that fits your specific application needs.\n", - "\n", - "Note: The following instructions are run from outside the container and call `docker run` commands as required.\n", - "\n", - "### 1.3 Inference Pipeline in TensorRT Inference Server\n", - "The Jasper model pipeline consists of 3 components, where each part can be customized to be a different backend: \n", - "\n", - "**Data preprocessor**\n", - "\n", - "The data processor transforms an input raw audio file into a spectrogram. By default the pipeline uses mel filter banks as spectrogram features. This part does not have any learnable weights.\n", - "\n", - "**Acoustic model**\n", - "\n", - "The acoustic model takes in the spectrogram and outputs a probability over a list of characters. 
This part is the most compute intensive, taking more than 90% of the entire end-to-end pipeline. The acoustic model is the only component with learnable parameters and what differentiates Jasper from other end-to-end neural speech recognition models. In the original paper, the acoustic model contains a masking operation for training (More details in [Jasper PyTorch README](https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/SpeechRecognition/Jasper/README.md)). We do not use masking for inference . \n", - "\n", - "**Greedy decoder**\n", - "\n", - "The decoder takes the probabilities over the list of characters and outputs the final transcription. Greedy decoding is a fast and simple way of doing this by always choosing the character with the maximum probability. \n", - "\n", - "To run a model with TensorRT, we first construct the model in PyTorch, which is then exported into a ONNX static graph. Finally, a TensorRT engine is constructed from the ONNX file and can be launched to do inference. The following table shows which backends are supported for each part along the model pipeline.\n", - "\n", - "|Backend\\Pipeline component|Data preprocessor|Acoustic Model|Decoder|\n", - "|---|---|---|---|\n", - "|PyTorch JIT|x|x|x|\n", - "|ONNX|-|x|-|\n", - "|TensorRT|-|x|-|\n", - "\n", - "In order to run inference with TensorRT outside of the inference server, refer to the [Jasper TensorRT README](https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/SpeechRecognition/Jasper/trt/README.md)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1.3 Learning objectives\n", - "\n", - "This notebook demonstrates:\n", - "- Speed up Jasper Inference with TensorRT in TensorRT Inference Server\n", - "- Use of Mixed Precision for Inference" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 2. Requirements\n", - "\n", - "Please refer to Jasper TensorRT Inference Server README.md" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 3. Jasper Inference\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.1 Prepare Working Directory" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "if not 'workbookDir' in globals():\n", - " workbookDir = os.getcwd() + \"/../\"\n", - "print('workbookDir: ' + workbookDir)\n", - "os.chdir(workbookDir)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.2 Generate TRTIS Model Checkpoints\n", - "Use the PyTorch model checkpoint to generate all 3 model backends. You can find a pretrained checkpoint at https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16.\n", - "\n", - "Set the following parameters:\n", - "\n", - "* `ARCH`: hardware architecture. use 70 for Volta, 75 for Turing.\n", - "* `CHECKPOINT_DIR`: absolute path to model checkpoint directory.\n", - "* `CHECKPOINT`: model checkpoint name. (default: jasper10x5dr.pt)\n", - "* `PRECISION`: model precision. 
Default is using mixed precision.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%env ARCH=70\n", - "# replace with absolute path to checkpoint directory, which should include CHECKPOINT file\n", - "%env CHECKPOINT_DIR= \n", - "# CHECKPOINT file name\n", - "%env CHECKPOINT=jasper_fp16.pt \n", - "%env PRECISION=fp16\n", - "!echo \"ARCH=${ARCH} CHECKPOINT_DIR=${CHECKPOINT_DIR} CHECKPOINT=${CHECKPOINT} PRECISION=${PRECISION} trtis/scripts/export_model.sh\"\n", - "!ARCH=${ARCH} CHECKPOINT_DIR=${CHECKPOINT_DIR} CHECKPOINT=${CHECKPOINT} PRECISION=${PRECISION} trtis/scripts/export_model.sh" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!bash trtis/scripts/prepare_model_repository.sh" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.3 Start the TensorRT Inference Server using Docker" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!bash trtis/scripts/run_server.sh" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.4. Start inference prediction in TRTIS\n", - "\n", - "Use the following script to run inference with TensorRT Inference Server.\n", - "You will need to set the parameters such as: \n", - "\n", - "\n", - "* `MODEL_TYPE`: Model pipeline type. Choose from [pyt, onnx, trt] for Pytorch JIT, ONNX, or TensorRT model pipeline.\n", - "* `DATA_DIR`: absolute path to directory with audio files\n", - "* `FILE`: relative path of audio file to `DATA_DIR`\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "MODEL_TYPE=\"trt\"\n", - "DATA_DIR=os.path.join(workbookDir, \"notebooks/\")\n", - "FILE=\"example1.wav\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!bash trtis/scripts/run_client.sh $MODEL_TYPE $DATA_DIR $FILE" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can play with other examples from the 'notebooks' directory. You can also add your own audio files and generate the output text files in this way." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.5. 
Stop your container in the end" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!docker stop jasper-trtis" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.3" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/PyTorch/SpeechRecognition/Jasper/notebooks/README.md b/PyTorch/SpeechRecognition/Jasper/notebooks/README.md index e8574d54..072ed740 100644 --- a/PyTorch/SpeechRecognition/Jasper/notebooks/README.md +++ b/PyTorch/SpeechRecognition/Jasper/notebooks/README.md @@ -150,4 +150,54 @@ Use the token listed in the output from running the jupyter command to log in, f ## Jasper Jupyter Notebook for TensorRT Inference Server -This notebook can be executed from Google [Colab](https://colab.research.google.com) by supplying the notebook Github [URL](https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/SpeechRecognition/Jasper/notebooks/Colab_Jasper_TRT_inference_demo.ipynb) or by open this [link](https://colab.research.google.com/github/NVIDIA/DeepLearningExamples/blob/master/PyTorch/SpeechRecognition/Jasper/notebooks/Colab_Jasper_TRT_inference_demo.ipynb) directly. +### Requirements + +`./trtis/` contains a Dockerfile which extends the PyTorch 19.09-py3 NGC container and encapsulates some dependencies. Aside from these dependencies, ensure you have the following components: + +* [NVIDIA Turing](https://www.nvidia.com/en-us/geforce/turing/) or [Volta](https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/) based GPU +* [NVIDIA Docker](https://github.com/NVIDIA/nvidia-docker) +* [PyTorch 19.09-py3 NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:pytorch) +* [TensorRT Inference Server 19.09 NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:tensorrtserver) +* [NVIDIA machine learning repository](https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb) and [NVIDIA cuda repository](https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.243-1_amd64.deb) for NVIDIA TensorRT 6 +* [NVIDIA Volta](https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/) or [Turing](https://www.nvidia.com/en-us/geforce/turing/) based GPU +* [Pretrained Jasper Model Checkpoint](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16) + +### Quick Start Guide + + +#### 1. Clone the repository. + +``` +git clone https://github.com/NVIDIA/DeepLearningExamples +cd DeepLearningExamples/PyTorch/SpeechRecognition/Jasper +``` + +#### 2. Build a container that extends NGC PyTorch 19.09, TensorRT, TensorRT Inference Server, and TensorRT Inference Client. + +``` +bash trtis/scripts/docker/build.sh +``` + +#### 3. Download the checkpoint +Download the checkpoint file jasper_fp16.pt from NGC Model Repository: +- https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16 + +to an user specified directory _CHECKPOINT_DIR_ + +#### 4. 
Run the notebook + +For running the notebook on your local machine, run: + +``` +jupyter notebook -- notebooks/JasperTRTIS.ipynb +``` + +For running the notebook on another machine remotely, run: + +``` +jupyter notebook --ip=0.0.0.0 --allow-root +``` + +And navigate a web browser to the IP address or hostname of the host machine at port 8888: `http://[host machine]:8888` + +Use the token listed in the output from running the jupyter command to log in, for example: `http://[host machine]:8888/?token=aae96ae9387cd28151868fee318c3b3581a2d794f3b25c6b` diff --git a/PyTorch/SpeechRecognition/Jasper/parts/features.py b/PyTorch/SpeechRecognition/Jasper/parts/features.py deleted file mode 100644 index 1d0611f2..00000000 --- a/PyTorch/SpeechRecognition/Jasper/parts/features.py +++ /dev/null @@ -1,368 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import torch -import torch.nn as nn -import math -import librosa -from .perturb import AudioAugmentor -from .segment import AudioSegment -from apex import amp - - -def audio_from_file(file_path, offset=0, duration=0, trim=False, target_sr=16000, - device=torch.device('cuda')): - audio = AudioSegment.from_file(file_path, - target_sr=target_sr, - int_values=False, - offset=offset, duration=duration, trim=trim) - samples=torch.tensor(audio.samples, dtype=torch.float, device=device) - num_samples = torch.tensor(samples.shape[0], device=device).int() - return (samples.unsqueeze(0), num_samples.unsqueeze(0)) - -class WaveformFeaturizer(object): - def __init__(self, input_cfg, augmentor=None): - self.augmentor = augmentor if augmentor is not None else AudioAugmentor() - self.cfg = input_cfg - - def max_augmentation_length(self, length): - return self.augmentor.max_augmentation_length(length) - - def process(self, file_path, offset=0, duration=0, trim=False): - audio = AudioSegment.from_file(file_path, - target_sr=self.cfg['sample_rate'], - int_values=self.cfg.get('int_values', False), - offset=offset, duration=duration, trim=trim) - return self.process_segment(audio) - - def process_segment(self, audio_segment): - self.augmentor.perturb(audio_segment) - return torch.tensor(audio_segment.samples, dtype=torch.float) - - @classmethod - def from_config(cls, input_config, perturbation_configs=None): - if perturbation_configs is not None: - aa = AudioAugmentor.from_config(perturbation_configs) - else: - aa = None - - return cls(input_config, augmentor=aa) - - -# @torch.jit.script -# def normalize_batch_per_feature(x, seq_len): -# x_mean = torch.zeros((seq_len.shape[0], x.shape[1]), dtype=x.dtype, device=x.device) -# x_std = torch.zeros((seq_len.shape[0], x.shape[1]), dtype=x.dtype, device=x.device) - -# for i in range(x.shape[0]): -# x_mean[i, :] = x[i, :, :seq_len[i]].mean(dim=1) -# x_std[i, :] = x[i, :, :seq_len[i]].std(dim=1) -# # make sure x_std is not zero -# x_std += 1e-5 -# return (x - x_mean.unsqueeze(2)) / x_std.unsqueeze(2) - -# @torch.jit.script -# def normalize_batch_all_features(x, 
seq_len): -# x_mean = torch.zeros(seq_len.shape, dtype=x.dtype, device=x.device) -# x_std = torch.zeros(seq_len.shape, dtype=x.dtype, device=x.device) -# for i in range(x.shape[0]): -# x_mean[i] = x[i, :, :int(seq_len[i])].mean() -# x_std[i] = x[i, :, :int(seq_len[i])].std() -# # make sure x_std is not zero -# x_std += 1e-5 -# return (x - x_mean.view(-1, 1, 1)) / x_std.view(-1, 1, 1) - -@torch.jit.script -def normalize_batch(x, seq_len, normalize_type: str): -# print ("normalize_batch: x, seq_len, shapes: ", x.shape, seq_len, seq_len.shape) - if normalize_type == "per_feature": - x_mean = torch.zeros((seq_len.shape[0], x.shape[1]), dtype=x.dtype, - device=x.device) - x_std = torch.zeros((seq_len.shape[0], x.shape[1]), dtype=x.dtype, - device=x.device) - for i in range(x.shape[0]): - x_mean[i, :] = x[i, :, :seq_len[i]].mean(dim=1) - x_std[i, :] = x[i, :, :seq_len[i]].std(dim=1) - # make sure x_std is not zero - x_std += 1e-5 - return (x - x_mean.unsqueeze(2)) / x_std.unsqueeze(2) - elif normalize_type == "all_features": - x_mean = torch.zeros(seq_len.shape, dtype=x.dtype, device=x.device) - x_std = torch.zeros(seq_len.shape, dtype=x.dtype, device=x.device) - for i in range(x.shape[0]): - x_mean[i] = x[i, :, :int(seq_len[i])].mean() - x_std[i] = x[i, :, :int(seq_len[i])].std() - # make sure x_std is not zero - x_std += 1e-5 - return (x - x_mean.view(-1, 1, 1)) / x_std.view(-1, 1, 1) - else: - return x - -@torch.jit.script -def splice_frames(x, frame_splicing: int): - """ Stacks frames together across feature dim - - input is batch_size, feature_dim, num_frames - output is batch_size, feature_dim*frame_splicing, num_frames - - """ - seq = [x] - # TORCHSCRIPT: JIT doesnt like range(start, stop) - for n in range(frame_splicing - 1): - seq.append(torch.cat([x[:, :, :n + 1], x[:, :, n + 1:]], dim=2)) - return torch.cat(seq, dim=1) - -class SpectrogramFeatures(nn.Module): - # For JIT. 
See https://pytorch.org/docs/stable/jit.html#python-defined-constants - __constants__ = ["dither", "preemph", "n_fft", "hop_length", "win_length", "log", "frame_splicing", "window", "normalize", "pad_to", "max_duration", "do_normalize"] - - def __init__(self, sample_rate=8000, window_size=0.02, window_stride=0.01, - n_fft=None, - window="hamming", normalize="per_feature", log=True, - dither=1e-5, pad_to=8, max_duration=16.7, - frame_splicing=1): - super(SpectrogramFeatures, self).__init__() - torch_windows = { - 'hann': torch.hann_window, - 'hamming': torch.hamming_window, - 'blackman': torch.blackman_window, - 'bartlett': torch.bartlett_window, - 'none': None, - } - self.win_length = int(sample_rate * window_size) - self.hop_length = int(sample_rate * window_stride) - self.n_fft = n_fft or 2 ** math.ceil(math.log2(self.win_length)) - - window_fn = torch_windows.get(window, None) - window_tensor = window_fn(self.win_length, - periodic=False) if window_fn else None - self.window = window_tensor - - self.normalize = normalize - self.log = log - self.dither = dither - self.pad_to = pad_to - self.frame_splicing = frame_splicing - - max_length = 1 + math.ceil( - (max_duration * sample_rate - self.win_length) / self.hop_length - ) - max_pad = 16 - (max_length % 16) - self.max_length = max_length + max_pad - - def get_seq_len(self, seq_len): - return torch.ceil(seq_len.to(dtype=torch.float) / self.hop_length).to( - dtype=torch.int) - - @torch.no_grad() - def forward(self, x, seq_len): - dtype = x.dtype - - seq_len = self.get_seq_len(seq_len) - - # dither - if self.dither > 0: - x += self.dither * torch.randn_like(x) - - # do preemphasis - if hasattr(self,'preemph') and self.preemph is not None: - x = torch.cat((x[:, 0].unsqueeze(1), x[:, 1:] - self.preemph * x[:, :-1]), - dim=1) - - # get spectrogram - x = torch.stft(x, n_fft=self.n_fft, hop_length=self.hop_length, - win_length=self.win_length, - window=self.window.to(torch.float)) - x = torch.sqrt(x.pow(2).sum(-1)) - - # log features if required - if self.log: - x = torch.log(x + 1e-20) - - # frame splicing if required - if self.frame_splicing > 1: - x = splice_frames(x, self.frame_splicing) - - # normalize if required - x = normalize_batch(x, seq_len, normalize_type=self.normalize) - - # mask to zero any values beyond seq_len in batch, pad to multiple of `pad_to` (for efficiency) - max_len = x.size(-1) - mask = torch.arange(max_len, dtype=seq_len.dtype).to(seq_len.device).expand(x.size(0), max_len) >= seq_len.unsqueeze(1) - x = x.masked_fill(mask.unsqueeze(1).to(device=x.device), 0) - - # TORCHSCRIPT: Is this del important? It breaks scripting - #del mask - - pad_to = self.pad_to - - # TORCHSCRIPT: Cant have mixed types. Using pad_to < 0 for "max" - if pad_to < 0: - x = nn.functional.pad(x, (0, self.max_length - x.size(-1))) - elif pad_to > 0: - pad_amt = x.size(-1) % pad_to - if pad_amt != 0: - x = nn.functional.pad(x, (0, pad_to - pad_amt)) - return x.to(dtype) - - @classmethod - def from_config(cls, cfg, log=False): - return cls(sample_rate=cfg['sample_rate'], window_size=cfg['window_size'], - window_stride=cfg['window_stride'], - n_fft=cfg['n_fft'], window=cfg['window'], - normalize=cfg['normalize'], - max_duration=cfg.get('max_duration', 16.7), - dither=cfg.get('dither', 1e-5), pad_to=cfg.get("pad_to", 0), - frame_splicing=cfg.get("frame_splicing", 1), log=log) -constant=1e-5 -class FilterbankFeatures(nn.Module): - # For JIT. 
See https://pytorch.org/docs/stable/jit.html#python-defined-constants - __constants__ = ["dither", "preemph", "n_fft", "hop_length", "win_length", "center", "log", "frame_splicing", "window", "normalize", "pad_to", "max_duration", "max_length"] - - def __init__(self, sample_rate=8000, window_size=0.02, window_stride=0.01, - window="hamming", normalize="per_feature", n_fft=None, - preemph=0.97, - nfilt=64, lowfreq=0, highfreq=None, log=True, dither=constant, - pad_to=8, - max_duration=16.7, - frame_splicing=1): - super(FilterbankFeatures, self).__init__() - - torch_windows = { - 'hann': torch.hann_window, - 'hamming': torch.hamming_window, - 'blackman': torch.blackman_window, - 'bartlett': torch.bartlett_window, - 'none': None, - } - - self.win_length = int(sample_rate * window_size) # frame size - self.hop_length = int(sample_rate * window_stride) - self.n_fft = n_fft or 2 ** math.ceil(math.log2(self.win_length)) - - self.normalize = normalize - self.log = log - #TORCHSCRIPT: Check whether or not we need this - self.dither = dither - self.frame_splicing = frame_splicing - self.nfilt = nfilt - self.preemph = preemph - self.pad_to = pad_to - highfreq = highfreq or sample_rate / 2 - window_fn = torch_windows.get(window, None) - window_tensor = window_fn(self.win_length, - periodic=False) if window_fn else None - filterbanks = torch.tensor( - librosa.filters.mel(sample_rate, self.n_fft, n_mels=nfilt, fmin=lowfreq, - fmax=highfreq), dtype=torch.float).unsqueeze(0) - # self.fb = filterbanks - # self.window = window_tensor - self.register_buffer("fb", filterbanks) - self.register_buffer("window", window_tensor) - # Calculate maximum sequence length (# frames) - max_length = 1 + math.ceil( - (max_duration * sample_rate - self.win_length) / self.hop_length - ) - max_pad = 16 - (max_length % 16) - self.max_length = max_length + max_pad - - def get_seq_len(self, seq_len): - return torch.ceil(seq_len.to(dtype=torch.float) / self.hop_length).to( - dtype=torch.int) - - # do stft - # TORCHSCRIPT: center removed due to bug - def stft(self, x): - return torch.stft(x, n_fft=self.n_fft, hop_length=self.hop_length, - win_length=self.win_length, - window=self.window.to(dtype=torch.float)) - def forward(self, x, seq_len): - dtype = x.dtype - - seq_len = self.get_seq_len(seq_len) - - # dither - if self.dither > 0: - x += self.dither * torch.randn_like(x) - - # do preemphasis - if self.preemph is not None: - x = torch.cat((x[:, 0].unsqueeze(1), x[:, 1:] - self.preemph * x[:, :-1]), - dim=1) - - x = self.stft(x) - - # get power spectrum - x = x.pow(2).sum(-1) - - # dot with filterbank energies - x = torch.matmul(self.fb.to(x.dtype), x) - - # log features if required - if self.log: - x = torch.log(x + 1e-20) - - # frame splicing if required - if self.frame_splicing > 1: - x = splice_frames(x, self.frame_splicing) - - # normalize if required - x = normalize_batch(x, seq_len, normalize_type=self.normalize) - - # mask to zero any values beyond seq_len in batch, pad to multiple of `pad_to` (for efficiency) - max_len = x.size(-1) - mask = torch.arange(max_len, dtype=seq_len.dtype).to(x.device).expand(x.size(0), - max_len) >= seq_len.unsqueeze(1) - - x = x.masked_fill(mask.unsqueeze(1), 0) - # TORCHSCRIPT: Is this del important? It breaks scripting - # del mask - # TORCHSCRIPT: Cant have mixed types. 
Using pad_to < 0 for "max" - if self.pad_to < 0: - x = nn.functional.pad(x, (0, self.max_length - x.size(-1))) - elif self.pad_to > 0: - pad_amt = x.size(-1) % self.pad_to - # if pad_amt != 0: - x = nn.functional.pad(x, (0, self.pad_to - pad_amt)) - - return x # .to(dtype) - - @classmethod - def from_config(cls, cfg, log=False): - return cls(sample_rate=cfg['sample_rate'], window_size=cfg['window_size'], - window_stride=cfg['window_stride'], n_fft=cfg['n_fft'], - nfilt=cfg['features'], window=cfg['window'], - normalize=cfg['normalize'], - max_duration=cfg.get('max_duration', 16.7), - dither=cfg['dither'], pad_to=cfg.get("pad_to", 0), - frame_splicing=cfg.get("frame_splicing", 1), log=log) - -class FeatureFactory(object): - featurizers = { - "logfbank": FilterbankFeatures, - "fbank": FilterbankFeatures, - "stft": SpectrogramFeatures, - "logspect": SpectrogramFeatures, - "logstft": SpectrogramFeatures - } - - def __init__(self): - pass - - @classmethod - def from_config(cls, cfg): - feat_type = cfg.get('feat_type', "logspect") - featurizer = cls.featurizers[feat_type] - #return featurizer.from_config(cfg, log="log" in cfg['feat_type']) - return featurizer.from_config(cfg, log="log" in feat_type) diff --git a/PyTorch/SpeechRecognition/Jasper/parts/manifest.py b/PyTorch/SpeechRecognition/Jasper/parts/manifest.py deleted file mode 100644 index 08cd7b56..00000000 --- a/PyTorch/SpeechRecognition/Jasper/parts/manifest.py +++ /dev/null @@ -1,170 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import json -import re -import string -import numpy as np -import os - -from .text import _clean_text - - -def normalize_string(s, labels, table, **unused_kwargs): - """ - Normalizes string. For example: - 'call me at 8:00 pm!' -> 'call me at eight zero pm' - - Args: - s: string to normalize - labels: labels used during model training. 
- - Returns: - Normalized string - """ - - def good_token(token, labels): - s = set(labels) - for t in token: - if not t in s: - return False - return True - - try: - text = _clean_text(s, ["english_cleaners"], table).strip() - return ''.join([t for t in text if good_token(t, labels=labels)]) - except: - print("WARNING: Normalizing {} failed".format(s)) - return None - -class Manifest(object): - def __init__(self, data_dir, manifest_paths, labels, blank_index, max_duration=None, pad_to_max=False, - min_duration=None, sort_by_duration=False, max_utts=0, - normalize=True, speed_perturbation=False, filter_speed=1.0): - self.labels_map = dict([(labels[i], i) for i in range(len(labels))]) - self.blank_index = blank_index - self.max_duration= max_duration - ids = [] - duration = 0.0 - filtered_duration = 0.0 - - # If removing punctuation, make a list of punctuation to remove - table = None - if normalize: - # Punctuation to remove - punctuation = string.punctuation - punctuation = punctuation.replace("+", "") - punctuation = punctuation.replace("&", "") - ### We might also want to consider: - ### @ -> at - ### # -> number, pound, hashtag - ### ~ -> tilde - ### _ -> underscore - ### % -> percent - # If a punctuation symbol is inside our vocab, we do not remove from text - for l in labels: - punctuation = punctuation.replace(l, "") - # Turn all punctuation to whitespace - table = str.maketrans(punctuation, " " * len(punctuation)) - for manifest_path in manifest_paths: - with open(manifest_path, "r", encoding="utf-8") as fh: - a=json.load(fh) - for data in a: - files_and_speeds = data['files'] - - if pad_to_max: - if not speed_perturbation: - min_speed = filter_speed - else: - min_speed = min(x['speed'] for x in files_and_speeds) - max_duration = self.max_duration * min_speed - - data['duration'] = data['original_duration'] - if min_duration is not None and data['duration'] < min_duration: - filtered_duration += data['duration'] - continue - if max_duration is not None and data['duration'] > max_duration: - filtered_duration += data['duration'] - continue - - # Prune and normalize according to transcript - transcript_text = data[ - 'transcript'] if "transcript" in data else self.load_transcript( - data['text_filepath']) - if normalize: - transcript_text = normalize_string(transcript_text, labels=labels, - table=table) - if not isinstance(transcript_text, str): - print( - "WARNING: Got transcript: {}. It is not a string. 
Dropping data point".format( - transcript_text)) - filtered_duration += data['duration'] - continue - data["transcript"] = self.parse_transcript(transcript_text) # convert to vocab indices - - if speed_perturbation: - audio_paths = [x['fname'] for x in files_and_speeds] - data['audio_duration'] = [x['duration'] for x in files_and_speeds] - else: - audio_paths = [x['fname'] for x in files_and_speeds if x['speed'] == filter_speed] - data['audio_duration'] = [x['duration'] for x in files_and_speeds if x['speed'] == filter_speed] - data['audio_filepath'] = [os.path.join(data_dir, x) for x in audio_paths] - data.pop('files') - data.pop('original_duration') - - ids.append(data) - duration += data['duration'] - - if max_utts > 0 and len(ids) >= max_utts: - print( - 'Stopping parsing %s as max_utts=%d' % (manifest_path, max_utts)) - break - - if sort_by_duration: - ids = sorted(ids, key=lambda x: x['duration']) - self._data = ids - self._size = len(ids) - self._duration = duration - self._filtered_duration = filtered_duration - - def load_transcript(self, transcript_path): - with open(transcript_path, 'r', encoding="utf-8") as transcript_file: - transcript = transcript_file.read().replace('\n', '') - return transcript - - def parse_transcript(self, transcript): - chars = [self.labels_map.get(x, self.blank_index) for x in list(transcript)] - transcript = list(filter(lambda x: x != self.blank_index, chars)) - return transcript - - def __getitem__(self, item): - return self._data[item] - - def __len__(self): - return self._size - - def __iter__(self): - return iter(self._data) - - @property - def duration(self): - return self._duration - - @property - def filtered_duration(self): - return self._filtered_duration - - @property - def data(self): - return list(self._data) diff --git a/PyTorch/SpeechRecognition/Jasper/parts/perturb.py b/PyTorch/SpeechRecognition/Jasper/parts/perturb.py deleted file mode 100644 index b8ff0f50..00000000 --- a/PyTorch/SpeechRecognition/Jasper/parts/perturb.py +++ /dev/null @@ -1,111 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import random -import librosa -from .manifest import Manifest -from .segment import AudioSegment - - -class Perturbation(object): - def max_augmentation_length(self, length): - return length - - def perturb(self, data): - raise NotImplementedError - - -class SpeedPerturbation(Perturbation): - def __init__(self, min_speed_rate=0.85, max_speed_rate=1.15, rng=None): - self._min_rate = min_speed_rate - self._max_rate = max_speed_rate - self._rng = random.Random() if rng is None else rng - - def max_augmentation_length(self, length): - return length * self._max_rate - - def perturb(self, data): - speed_rate = self._rng.uniform(self._min_rate, self._max_rate) - if speed_rate <= 0: - raise ValueError("speed_rate should be greater than zero.") - data._samples = librosa.effects.time_stretch(data._samples, speed_rate) - - -class GainPerturbation(Perturbation): - def __init__(self, min_gain_dbfs=-10, max_gain_dbfs=10, rng=None): - self._min_gain_dbfs = min_gain_dbfs - self._max_gain_dbfs = max_gain_dbfs - self._rng = random.Random() if rng is None else rng - - def perturb(self, data): - gain = self._rng.uniform(self._min_gain_dbfs, self._max_gain_dbfs) - data._samples = data._samples * (10. ** (gain / 20.)) - - - -class ShiftPerturbation(Perturbation): - def __init__(self, min_shift_ms=-5.0, max_shift_ms=5.0, rng=None): - self._min_shift_ms = min_shift_ms - self._max_shift_ms = max_shift_ms - self._rng = random.Random() if rng is None else rng - - def perturb(self, data): - shift_ms = self._rng.uniform(self._min_shift_ms, self._max_shift_ms) - if abs(shift_ms) / 1000 > data.duration: - # TODO: do something smarter than just ignore this condition - return - shift_samples = int(shift_ms * data.sample_rate // 1000) - # print("DEBUG: shift:", shift_samples) - if shift_samples < 0: - data._samples[-shift_samples:] = data._samples[:shift_samples] - data._samples[:-shift_samples] = 0 - elif shift_samples > 0: - data._samples[:-shift_samples] = data._samples[shift_samples:] - data._samples[-shift_samples:] = 0 - - -perturbation_types = { - "speed": SpeedPerturbation, - "gain": GainPerturbation, - "shift": ShiftPerturbation, -} - - -class AudioAugmentor(object): - def __init__(self, perturbations=None, rng=None): - self._rng = random.Random() if rng is None else rng - self._pipeline = perturbations if perturbations is not None else [] - - def perturb(self, segment): - for (prob, p) in self._pipeline: - if self._rng.random() < prob: - p.perturb(segment) - return - - def max_augmentation_length(self, length): - newlen = length - for (prob, p) in self._pipeline: - newlen = p.max_augmentation_length(newlen) - return newlen - - @classmethod - def from_config(cls, config): - ptbs = [] - for p in config: - if p['aug_type'] not in perturbation_types: - print(p['aug_type'], "perturbation not known. Skipping.") - continue - perturbation = perturbation_types[p['aug_type']] - ptbs.append((p['prob'], perturbation(**p['cfg']))) - return cls(perturbations=ptbs) diff --git a/PyTorch/SpeechRecognition/Jasper/parts/text/__init__.py b/PyTorch/SpeechRecognition/Jasper/parts/text/__init__.py deleted file mode 100644 index da9e021c..00000000 --- a/PyTorch/SpeechRecognition/Jasper/parts/text/__init__.py +++ /dev/null @@ -1,12 +0,0 @@ -# Copyright (c) 2017 Keith Ito -""" from https://github.com/keithito/tacotron """ -import re -from . 
import cleaners - -def _clean_text(text, cleaner_names, *args): - for name in cleaner_names: - cleaner = getattr(cleaners, name) - if not cleaner: - raise Exception('Unknown cleaner: %s' % name) - text = cleaner(text, *args) - return text diff --git a/PyTorch/SpeechRecognition/Jasper/platform/DGX1-16GB_Jasper_AMP_8GPU.sh b/PyTorch/SpeechRecognition/Jasper/platform/DGX1-16GB_Jasper_AMP_8GPU.sh index 4ac61b67..57bcd4c5 100644 --- a/PyTorch/SpeechRecognition/Jasper/platform/DGX1-16GB_Jasper_AMP_8GPU.sh +++ b/PyTorch/SpeechRecognition/Jasper/platform/DGX1-16GB_Jasper_AMP_8GPU.sh @@ -1,3 +1,3 @@ #!/bin/bash -NUM_GPUS=8 AMP=true BATCH_SIZE=64 GRADIENT_ACCUMULATION_STEPS=2 bash scripts/train.sh "$@" +NUM_GPUS=8 AMP=true BATCH_SIZE=64 GRADIENT_ACCUMULATION_STEPS=4 bash scripts/train.sh "$@" diff --git a/PyTorch/SpeechRecognition/Jasper/requirements.txt b/PyTorch/SpeechRecognition/Jasper/requirements.txt index 87e8b09c..d0395d62 100755 --- a/PyTorch/SpeechRecognition/Jasper/requirements.txt +++ b/PyTorch/SpeechRecognition/Jasper/requirements.txt @@ -1,9 +1,10 @@ -pandas==0.24.2 -tqdm==4.31.1 ascii-graph==1.5.1 -wrapt==1.10.11 -librosa -toml -soundfile ipdb -sox +librosa==0.8.0 +pandas==1.1.4 +pycuda==2020.1 +pyyaml +soundfile +sox==1.4.1 +tqdm==4.53.0 +wrapt==1.10.11 diff --git a/PyTorch/SpeechRecognition/Jasper/scripts/docker/launch.sh b/PyTorch/SpeechRecognition/Jasper/scripts/docker/launch.sh index 0ec7b990..8f9cb884 100755 --- a/PyTorch/SpeechRecognition/Jasper/scripts/docker/launch.sh +++ b/PyTorch/SpeechRecognition/Jasper/scripts/docker/launch.sh @@ -1,26 +1,30 @@ #!/bin/bash + SCRIPT_DIR=$(cd $(dirname $0); pwd) -JASPER_REPO=${JASPER_REPO:-"${SCRIPT_DIR}/../.."} +: ${JASPER_REPO:="$SCRIPT_DIR/../.."} -# Launch TRT JASPER container. +: ${DATA_DIR:=${1:-"$JASPER_REPO/datasets"}} +: ${CHECKPOINT_DIR:=${2:-"$JASPER_REPO/checkpoints"}} +: ${OUTPUT_DIR:=${3:-"$JASPER_REPO/results"}} +: ${SCRIPT:=${4:-}} -DATA_DIR=${1:-${DATA_DIR-"/datasets"}} -CHECKPOINT_DIR=${2:-${CHECKPOINT_DIR:-"/checkpoints"}} -RESULT_DIR=${3:-${RESULT_DIR:-"/results"}} -PROGRAM_PATH=${PROGRAM_PATH} +mkdir -p $DATA_DIR +mkdir -p $CHECKPOINT_DIR +mkdir -p $OUTPUT_DIR MOUNTS="" MOUNTS+=" -v $DATA_DIR:/datasets" MOUNTS+=" -v $CHECKPOINT_DIR:/checkpoints" -MOUNTS+=" -v $RESULT_DIR:/results" -MOUNTS+=" -v ${JASPER_REPO}:/jasper" +MOUNTS+=" -v $OUTPUT_DIR:/results" +MOUNTS+=" -v $JASPER_REPO:/workspace/jasper" echo $MOUNTS -nvidia-docker run -it --rm \ - --runtime=nvidia \ +docker run -it --rm --gpus all \ + --env PYTHONDONTWRITEBYTECODE=1 \ --shm-size=4g \ --ulimit memlock=-1 \ --ulimit stack=67108864 \ - ${MOUNTS} \ - ${EXTRA_JASPER_ENV} \ - jasper:latest bash $PROGRAM_PATH + $MOUNTS \ + $EXTRA_JASPER_ENV \ + -w /workspace/jasper \ + jasper:latest bash $SCRIPT diff --git a/PyTorch/SpeechRecognition/Jasper/scripts/evaluation.sh b/PyTorch/SpeechRecognition/Jasper/scripts/evaluation.sh index 6c5790dd..08009e51 100755 --- a/PyTorch/SpeechRecognition/Jasper/scripts/evaluation.sh +++ b/PyTorch/SpeechRecognition/Jasper/scripts/evaluation.sh @@ -14,58 +14,9 @@ # See the License for the specific language governing permissions and # limitations under the License. 
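A note on the idiom shared by these refactored scripts: lines of the form `: ${VAR:=default}` assign the default only when `VAR` is unset or empty, so every setting can be overridden from the environment, and the first few can also be supplied as positional arguments. A minimal sketch of the mechanism, using made-up variable names rather than the scripts' own:

```bash
#!/bin/bash
# ':' is the shell no-op; the parameter expansion inside it does the real work:
# ${GREETING:="hello"} assigns "hello" to GREETING only if it is unset or empty.
: ${GREETING:="hello"}
# Defaults can be nested: fall back to the first positional argument, then to "world".
: ${TARGET:=${1:-"world"}}

echo "$GREETING, $TARGET"
```

Run bare, the sketch prints `hello, world`; run as `GREETING=hi ./sketch.sh jasper`, it prints `hi, jasper`. The same mechanism lets `DATA_DIR`, `CHECKPOINT_DIR`, and the other variables in `launch.sh`, `inference.sh`, and `train.sh` be overridden per invocation.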
+set -a -echo "NVIDIA container build: ${NVIDIA_BUILD_ID}" +: ${PREDICTION_FILE:=} +: ${DATASET:="test-other"} -DATA_DIR=${1:-${DATA_DIR:-"/datasets/LibriSpeech"}} -DATASET=${2:-${DATASET:-"dev-clean"}} -MODEL_CONFIG=${3:-${MODEL_CONFIG:-"configs/jasper10x5dr_sp_offline_specaugment.toml"}} -RESULT_DIR=${4:-${RESULT_DIR:-"/results"}} -CHECKPOINT=${5:-${CHECKPOINT:-"/checkpoints/jasper_fp16.pt"}} -CREATE_LOGFILE=${6:-${CREATE_LOGFILE:-"true"}} -CUDNN_BENCHMARK=${7:-${CUDNN_BENCHMARK:-"false"}} -NUM_GPUS=${8:-${NUM_GPUS:-1}} -AMP=${9:-${AMP:-"false"}} -NUM_STEPS=${10:-${NUM_STEPS:-"-1"}} -SEED=${11:-${SEED:-0}} -BATCH_SIZE=${12:-${BATCH_SIZE:-64}} - -mkdir -p "$RESULT_DIR" - -CMD=" inference.py " -CMD+=" --batch_size $BATCH_SIZE " -CMD+=" --dataset_dir $DATA_DIR " -CMD+=" --val_manifest $DATA_DIR/librispeech-${DATASET}-wav.json " -CMD+=" --model_toml $MODEL_CONFIG " -CMD+=" --seed $SEED " -CMD+=" --ckpt $CHECKPOINT " -[ "$AMP" == "true" ] && \ -CMD+=" --amp" -[ "$NUM_STEPS" -gt 0 ] && \ -CMD+=" --steps $NUM_STEPS" -[ "$CUDNN_BENCHMARK" = "true" ] && \ -CMD+=" --cudnn" - -if [ "$CREATE_LOGFILE" = "true" ] ; then - export GBS=$(expr $BATCH_SIZE \* $NUM_GPUS) - printf -v TAG "jasper_train_benchmark_amp-%s_gbs%d" "$AMP" $GBS - DATESTAMP=`date +'%y%m%d%H%M%S'` - LOGFILE="${RESULT_DIR}/${TAG}.${DATESTAMP}.log" - printf "Logs written to %s\n" "$LOGFILE" -fi - -if [ "$NUM_GPUS" -gt 1 ] ; then - CMD="python3 -m torch.distributed.launch --nproc_per_node=$NUM_GPUS $CMD" -else - CMD="python3 $CMD" -fi - -set -x -if [ -z "$LOGFILE" ] ; then - $CMD -else - ( - $CMD - ) |& tee "$LOGFILE" -fi -set +x +bash ./scripts/inference.sh "$@" diff --git a/PyTorch/SpeechRecognition/Jasper/scripts/inference.sh b/PyTorch/SpeechRecognition/Jasper/scripts/inference.sh index ac9fcb94..b592650d 100755 --- a/PyTorch/SpeechRecognition/Jasper/scripts/inference.sh +++ b/PyTorch/SpeechRecognition/Jasper/scripts/inference.sh @@ -14,66 +14,48 @@ # See the License for the specific language governing permissions and # limitations under the License. 
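With the rewrite above, `scripts/evaluation.sh` becomes a thin wrapper: `set -a` exports its defaults (notably `DATASET=test-other`) and everything else is delegated to `scripts/inference.sh`, so evaluating a different split is a one-variable override. A small, assumed-typical invocation:

```bash
# Score the dev-clean split instead of the default test-other;
# all other settings fall through to the defaults in scripts/inference.sh.
DATASET=dev-clean bash ./scripts/evaluation.sh
```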
+: ${DATA_DIR:=${1:-"/datasets/LibriSpeech"}} +: ${MODEL_CONFIG:=${2:-"configs/jasper10x5dr_speedp-online_speca.yaml"}} +: ${OUTPUT_DIR:=${3:-"/results"}} +: ${CHECKPOINT:=${4:-"/checkpoints/jasper_fp16.pt"}} +: ${DATASET:="test-other"} +: ${LOG_FILE:=""} +: ${CUDNN_BENCHMARK:=false} +: ${MAX_DURATION:=""} +: ${PAD_TO_MAX_DURATION:=false} +: ${NUM_GPUS:=1} +: ${NUM_STEPS:=0} +: ${NUM_WARMUP_STEPS:=0} +: ${AMP:=false} +: ${BATCH_SIZE:=64} +: ${EMA:=true} +: ${SEED:=0} +: ${DALI_DEVICE:="gpu"} +: ${CPU:=false} +: ${LOGITS_FILE:=} +: ${PREDICTION_FILE:="${OUTPUT_DIR}/${DATASET}.predictions"} -echo "NVIDIA container build: ${NVIDIA_BUILD_ID}" +mkdir -p "$OUTPUT_DIR" -DATA_DIR=${1:-${DATA_DIR:-"/datasets/LibriSpeech"}} -DATASET=${2:-${DATASET:-"dev-clean"}} -MODEL_CONFIG=${3:-${MODEL_CONFIG:-"configs/jasper10x5dr_sp_offline_specaugment.toml"}} -RESULT_DIR=${4:-${RESULT_DIR:-"/results"}} -CHECKPOINT=${5:-${CHECKPOINT:-"/checkpoints/jasper_fp16.pt"}} -CREATE_LOGFILE=${6:-${CREATE_LOGFILE:-"true"}} -CUDNN_BENCHMARK=${7:-${CUDNN_BENCHMARK:-"false"}} -AMP=${8:-${AMP:-"false"}} -NUM_STEPS=${9:-${NUM_STEPS:-"-1"}} -SEED=${10:-${SEED:-0}} -BATCH_SIZE=${11:-${BATCH_SIZE:-64}} -LOGITS_FILE=${12:-${LOGITS_FILE:-""}} -PREDICTION_FILE=${13:-${PREDICTION_FILE:-"${RESULT_DIR}/${DATASET}.predictions"}} -CPU=${14:-${CPU:-"false"}} -EMA=${14:-${EMA:-"false"}} +ARGS="--dataset_dir=$DATA_DIR" +ARGS+=" --val_manifest=$DATA_DIR/librispeech-${DATASET}-wav.json" +ARGS+=" --model_config=$MODEL_CONFIG" +ARGS+=" --output_dir=$OUTPUT_DIR" +ARGS+=" --batch_size=$BATCH_SIZE" +ARGS+=" --seed=$SEED" +ARGS+=" --dali_device=$DALI_DEVICE" +ARGS+=" --steps $NUM_STEPS" +ARGS+=" --warmup_steps $NUM_WARMUP_STEPS" -mkdir -p "$RESULT_DIR" +[ "$AMP" = true ] && ARGS+=" --amp" +[ "$EMA" = true ] && ARGS+=" --ema" +[ "$CUDNN_BENCHMARK" = true ] && ARGS+=" --cudnn_benchmark" +[ -n "$CHECKPOINT" ] && ARGS+=" --ckpt=${CHECKPOINT}" +[ -n "$LOG_FILE" ] && ARGS+=" --log_file $LOG_FILE" +[ -n "$PREDICTION_FILE" ] && ARGS+=" --save_prediction $PREDICTION_FILE" +[ -n "$LOGITS_FILE" ] && ARGS+=" --logits_save_to $LOGITS_FILE" +[ "$CPU" == "true" ] && ARGS+=" --cpu" +[ -n "$MAX_DURATION" ] && ARGS+=" --max_duration $MAX_DURATION" +[ "$PAD_TO_MAX_DURATION" = true ] && ARGS+=" --pad_to_max_duration" -CMD="python inference.py " -CMD+=" --batch_size $BATCH_SIZE " -CMD+=" --dataset_dir $DATA_DIR " -CMD+=" --val_manifest $DATA_DIR/librispeech-${DATASET}-wav.json " -CMD+=" --model_toml $MODEL_CONFIG " -CMD+=" --seed $SEED " -[ "$NUM_STEPS" -gt 0 ] && \ -CMD+=" --steps $NUM_STEPS" -[ "$CUDNN_BENCHMARK" = "true" ] && \ -CMD+=" --cudnn" -[ "$AMP" == "true" ] && \ -CMD+=" --amp" -[ "$CPU" == "true" ] && \ -CMD+=" --cpu" -[ "$EMA" == "true" ] && \ -CMD+=" --ema" -[ -n "$CHECKPOINT" ] && \ -CMD+=" --ckpt=${CHECKPOINT}" -[ -n "$PREDICTION_FILE" ] && \ -CMD+=" --save_prediction $PREDICTION_FILE" -[ -n "$LOGITS_FILE" ] && \ -CMD+=" --logits_save_to $LOGITS_FILE" - -if [ "$CREATE_LOGFILE" = "true" ] ; then - export GBS=$(expr $BATCH_SIZE) - printf -v TAG "jasper_train_benchmark_amp-%s_gbs%d" "$AMP" $GBS - DATESTAMP=`date +'%y%m%d%H%M%S'` - LOGFILE="${RESULT_DIR}/${TAG}.${DATESTAMP}.log" - printf "Logs written to %s\n" "$LOGFILE" -fi - -set -x -if [ -z "$LOGFILE" ] ; then - $CMD -else - ( - $CMD - ) |& tee "$LOGFILE" -fi -set +x -[ -n "$PREDICTION_FILE" ] && echo "PREDICTION_FILE: ${PREDICTION_FILE}" -[ -n "$LOGITS_FILE" ] && echo "LOGITS_FILE: ${LOGITS_FILE}" +python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS inference.py $ARGS diff --git 
a/PyTorch/SpeechRecognition/Jasper/scripts/inference_benchmark.sh b/PyTorch/SpeechRecognition/Jasper/scripts/inference_benchmark.sh index 82d5ead5..39ea23dc 100755 --- a/PyTorch/SpeechRecognition/Jasper/scripts/inference_benchmark.sh +++ b/PyTorch/SpeechRecognition/Jasper/scripts/inference_benchmark.sh @@ -14,55 +14,24 @@ # See the License for the specific language governing permissions and # limitations under the License. +set -a -echo "NVIDIA container build: ${NVIDIA_BUILD_ID}" +: ${OUTPUT_DIR:=${3:-"/results"}} +: ${CUDNN_BENCHMARK:=true} +: ${PAD_TO_MAX_DURATION:=true} +: ${NUM_WARMUP_STEPS:=10} +: ${NUM_STEPS:=500} -DATA_DIR=${1:-${DATA_DIR:-"/datasets/LibriSpeech"}} -DATASET=${2:-${DATASET:-"dev-clean"}} -MODEL_CONFIG=${3:-${MODEL_CONFIG:-"configs/jasper10x5dr_sp_offline_specaugment.toml"}} -RESULT_DIR=${4:-${RESULT_DIR:-"/results"}} -CHECKPOINT=${5:-${CHECKPOINT:-"/checkpoints/jasper_fp16.pt"}} -CREATE_LOGFILE=${6:-${CREATE_LOGFILE:-"true"}} -CUDNN_BENCHMARK=${7:-${CUDNN_BENCHMARK:-"true"}} -AMP=${8:-${AMP:-"false"}} -NUM_STEPS=${9:-${NUM_STEPS:-"-1"}} -MAX_DURATION=${10:-${MAX_DURATION:-"36"}} -SEED=${11:-${SEED:-0}} -BATCH_SIZE=${12:-${BATCH_SIZE:-64}} +: ${AMP:=false} +: ${DALI_DEVICE:="cpu"} +: ${BATCH_SIZE_SEQ:="1 2 4 8 16"} +: ${MAX_DURATION_SEQ:="2 7 16.7"} -mkdir -p "$RESULT_DIR" +for MAX_DURATION in $MAX_DURATION_SEQ; do + for BATCH_SIZE in $BATCH_SIZE_SEQ; do -CMD=" python inference_benchmark.py" -CMD+=" --batch_size=$BATCH_SIZE" -CMD+=" --model_toml=$MODEL_CONFIG" -CMD+=" --seed=$SEED" -CMD+=" --dataset_dir=$DATA_DIR" -CMD+=" --val_manifest $DATA_DIR/librispeech-${DATASET}-wav.json " -CMD+=" --ckpt=$CHECKPOINT" -CMD+=" --max_duration=$MAX_DURATION" -CMD+=" --pad_to=-1" -[ "$AMP" == "true" ] && \ -CMD+=" --amp" -[ "$NUM_STEPS" -gt 0 ] && \ -CMD+=" --steps $NUM_STEPS" -[ "$CUDNN_BENCHMARK" = "true" ] && \ -CMD+=" --cudnn" + LOG_FILE="$OUTPUT_DIR/perf-infer_dali-${DALI_DEVICE}_amp-${AMP}_dur${MAX_DURATION}_bs${BATCH_SIZE}.json" + bash ./scripts/inference.sh "$@" -if [ "$CREATE_LOGFILE" = "true" ] ; then - export GBS=$(expr $BATCH_SIZE ) - printf -v TAG "jasper_train_benchmark_amp-%s_gbs%d" "$AMP" $GBS - DATESTAMP=`date +'%y%m%d%H%M%S'` - LOGFILE="${RESULT_DIR}/${TAG}.${DATESTAMP}.log" - printf "Logs written to %s\n" "$LOGFILE" -fi - -set -x -if [ -z "$LOGFILE" ] ; then - $CMD -else - ( - $CMD - ) |& tee "$LOGFILE" - grep 'latency' "$LOGFILE" -fi -set +x + done +done diff --git a/PyTorch/SpeechRecognition/Jasper/scripts/inference_benchmark_cpu.sh b/PyTorch/SpeechRecognition/Jasper/scripts/inference_benchmark_cpu.sh deleted file mode 100755 index 59810874..00000000 --- a/PyTorch/SpeechRecognition/Jasper/scripts/inference_benchmark_cpu.sh +++ /dev/null @@ -1,66 +0,0 @@ -#!/bin/bash - -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
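Because the new `scripts/inference_benchmark.sh` begins with `set -a`, every variable it assigns, including the loop variables `MAX_DURATION` and `BATCH_SIZE` and the per-run `LOG_FILE`, is exported and therefore visible to the `scripts/inference.sh` child process it launches. A hedged usage sketch; the grid values below are arbitrary examples, not recommendations:

```bash
# Sweep a reduced grid; anything not set here keeps the defaults from the scripts.
BATCH_SIZE_SEQ="1 8" \
MAX_DURATION_SEQ="2 16.7" \
AMP=true \
bash ./scripts/inference_benchmark.sh

# Each (duration, batch size) pair should then produce its own log, e.g.
# ${OUTPUT_DIR}/perf-infer_dali-cpu_amp-true_dur2_bs1.json
```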
- - -echo "NVIDIA container build: ${NVIDIA_BUILD_ID}" - -CUDA_VISIBLE_DEVICES="" -DATA_DIR=${1:-${DATA_DIR:-"/datasets/LibriSpeech"}} -DATASET=${2:-${DATASET:-"dev-clean"}} -MODEL_CONFIG=${3:-${MODEL_CONFIG:-"configs/jasper10x5dr_sp_offline_specaugment.toml"}} -RESULT_DIR=${4:-${RESULT_DIR:-"/results"}} -CHECKPOINT=${5:-${CHECKPOINT:-"/checkpoints/jasper_fp16.pt"}} -CREATE_LOGFILE=${6:-${CREATE_LOGFILE:-"true"}} -NUM_STEPS=${7:-${NUM_STEPS:-"-1"}} -MAX_DURATION=${8:-${MAX_DURATION:-"36"}} -SEED=${9:-${SEED:-0}} -BATCH_SIZE=${10:-${BATCH_SIZE:-32}} -SAMPLE_AUDIO=${11:-${SAMPLE_AUDIO:-"/datasets/LibriSpeech/dev-clean-wav/1272/128104/1272-128104-0000.wav"}} - -mkdir -p "$RESULT_DIR" - -CMD=" python inference_benchmark.py" -CMD+=" --cpu" -CMD+=" --batch_size=$BATCH_SIZE" -CMD+=" --model_toml=$MODEL_CONFIG" -CMD+=" --seed=$SEED" -CMD+=" --dataset_dir=$DATA_DIR" -CMD+=" --val_manifest $DATA_DIR/librispeech-${DATASET}-wav.json " -CMD+=" --ckpt=$CHECKPOINT" -CMD+=" --max_duration=$MAX_DURATION" -CMD+=" --pad_to=-1" -CMD+=" --sample_audio=$SAMPLE_AUDIO" -[ "$NUM_STEPS" -gt 0 ] && \ -CMD+=" --steps $NUM_STEPS" - -if [ "$CREATE_LOGFILE" = "true" ] ; then - export GBS=$(expr $BATCH_SIZE ) - printf -v TAG "jasper_train_benchmark_amp-%s_gbs%d" "$AMP" $GBS - DATESTAMP=`date +'%y%m%d%H%M%S'` - LOGFILE="${RESULT_DIR}/${TAG}.${DATESTAMP}.log" - printf "Logs written to %s\n" "$LOGFILE" -fi - -set -x -if [ -z "$LOGFILE" ] ; then - $CMD -else - ( - $CMD - ) |& tee "$LOGFILE" - grep 'latency' "$LOGFILE" -fi -set +x diff --git a/PyTorch/SpeechRecognition/Jasper/scripts/preprocess_librispeech.sh b/PyTorch/SpeechRecognition/Jasper/scripts/preprocess_librispeech.sh index 7cfe5cc6..9e5dc83a 100755 --- a/PyTorch/SpeechRecognition/Jasper/scripts/preprocess_librispeech.sh +++ b/PyTorch/SpeechRecognition/Jasper/scripts/preprocess_librispeech.sh @@ -14,21 +14,24 @@ #!/usr/bin/env bash +SPEEDS=$1 +[ -n "$SPEEDS" ] && SPEED_FLAG="--speed $SPEEDS" + python ./utils/convert_librispeech.py \ --input_dir /datasets/LibriSpeech/train-clean-100 \ --dest_dir /datasets/LibriSpeech/train-clean-100-wav \ --output_json /datasets/LibriSpeech/librispeech-train-clean-100-wav.json \ - --speed 0.9 1.1 + $SPEED_FLAG python ./utils/convert_librispeech.py \ --input_dir /datasets/LibriSpeech/train-clean-360 \ --dest_dir /datasets/LibriSpeech/train-clean-360-wav \ --output_json /datasets/LibriSpeech/librispeech-train-clean-360-wav.json \ - --speed 0.9 1.1 + $SPEED_FLAG python ./utils/convert_librispeech.py \ --input_dir /datasets/LibriSpeech/train-other-500 \ --dest_dir /datasets/LibriSpeech/train-other-500-wav \ --output_json /datasets/LibriSpeech/librispeech-train-other-500-wav.json \ - --speed 0.9 1.1 + $SPEED_FLAG python ./utils/convert_librispeech.py \ diff --git a/PyTorch/SpeechRecognition/Jasper/scripts/train.sh b/PyTorch/SpeechRecognition/Jasper/scripts/train.sh index 70079080..3c6aa41e 100755 --- a/PyTorch/SpeechRecognition/Jasper/scripts/train.sh +++ b/PyTorch/SpeechRecognition/Jasper/scripts/train.sh @@ -14,70 +14,74 @@ # See the License for the specific language governing permissions and # limitations under the License. 
+export OMP_NUM_THREADS=1 -echo "NVIDIA container build: ${NVIDIA_BUILD_ID}" +: ${DATA_DIR:=${1:-"/datasets/LibriSpeech"}} +: ${MODEL_CONFIG:=${2:-"configs/jasper10x5dr_speedp-online_speca.yaml"}} +: ${OUTPUT_DIR:=${3:-"/results"}} +: ${CHECKPOINT:=${4:-}} +: ${RESUME:=true} +: ${CUDNN_BENCHMARK:=true} +: ${NUM_GPUS:=8} +: ${AMP:=false} +: ${BATCH_SIZE:=64} +: ${GRAD_ACCUMULATION_STEPS:=2} +: ${LEARNING_RATE:=0.01} +: ${MIN_LEARNING_RATE:=0.00001} +: ${LR_POLICY:=exponential} +: ${LR_EXP_GAMMA:=0.981} +: ${EMA:=0.999} +: ${SEED:=0} +: ${EPOCHS:=440} +: ${WARMUP_EPOCHS:=2} +: ${HOLD_EPOCHS:=140} +: ${SAVE_FREQUENCY:=10} +: ${EPOCHS_THIS_JOB:=0} +: ${DALI_DEVICE:="gpu"} +: ${PAD_TO_MAX_DURATION:=false} +: ${EVAL_FREQUENCY:=544} +: ${PREDICTION_FREQUENCY:=544} +: ${TRAIN_MANIFESTS:="$DATA_DIR/librispeech-train-clean-100-wav.json \ + $DATA_DIR/librispeech-train-clean-360-wav.json \ + $DATA_DIR/librispeech-train-other-500-wav.json"} +: ${VAL_MANIFESTS:="$DATA_DIR/librispeech-dev-clean-wav.json"} -DATA_DIR=${1:-${DATA_DIR:-"/datasets/LibriSpeech"}} -MODEL_CONFIG=${2:-${MODEL_CONFIG:-"configs/jasper10x5dr_sp_offline_specaugment.toml"}} -RESULT_DIR=${3:-${RESULT_DIR:-"/results"}} -CHECKPOINT=${4:-${CHECKPOINT:-""}} -CREATE_LOGFILE=${5:-${CREATE_LOGFILE:-"true"}} -CUDNN_BENCHMARK=${6:-${CUDNN_BENCHMARK:-"true"}} -NUM_GPUS=${7:-${NUM_GPUS:-8}} -AMP=${8:-${AMP:-"false"}} -EPOCHS=${9:-${EPOCHS:-400}} -SEED=${10:-${SEED:-6}} -BATCH_SIZE=${11:-${BATCH_SIZE:-64}} -LEARNING_RATE=${12:-${LEARNING_RATE:-"0.015"}} -GRADIENT_ACCUMULATION_STEPS=${13:-${GRADIENT_ACCUMULATION_STEPS:-2}} -EMA=${EMA:-0.999} -SAVE_FREQUENCY=${SAVE_FREQUENCY:-10} +mkdir -p "$OUTPUT_DIR" -mkdir -p "$RESULT_DIR" +ARGS="--dataset_dir=$DATA_DIR" +ARGS+=" --val_manifests $VAL_MANIFESTS" +ARGS+=" --train_manifests $TRAIN_MANIFESTS" +ARGS+=" --model_config=$MODEL_CONFIG" +ARGS+=" --output_dir=$OUTPUT_DIR" +ARGS+=" --lr=$LEARNING_RATE" +ARGS+=" --batch_size=$BATCH_SIZE" +ARGS+=" --min_lr=$MIN_LEARNING_RATE" +ARGS+=" --lr_policy=$LR_POLICY" +ARGS+=" --lr_exp_gamma=$LR_EXP_GAMMA" +ARGS+=" --epochs=$EPOCHS" +ARGS+=" --warmup_epochs=$WARMUP_EPOCHS" +ARGS+=" --hold_epochs=$HOLD_EPOCHS" +ARGS+=" --epochs_this_job=$EPOCHS_THIS_JOB" +ARGS+=" --ema=$EMA" +ARGS+=" --seed=$SEED" +ARGS+=" --optimizer=novograd" +ARGS+=" --weight_decay=1e-3" +ARGS+=" --save_frequency=$SAVE_FREQUENCY" +ARGS+=" --keep_milestones 100 200 300 400" +ARGS+=" --save_best_from=380" +ARGS+=" --log_frequency=1" +ARGS+=" --eval_frequency=$EVAL_FREQUENCY" +ARGS+=" --prediction_frequency=$PREDICTION_FREQUENCY" +ARGS+=" --grad_accumulation_steps=$GRAD_ACCUMULATION_STEPS " +ARGS+=" --dali_device=$DALI_DEVICE" -CMD="python3 -m torch.distributed.launch --nproc_per_node=$NUM_GPUS" -CMD+=" train.py" -CMD+=" --batch_size=$BATCH_SIZE" -CMD+=" --num_epochs=$EPOCHS" -CMD+=" --output_dir=$RESULT_DIR" -CMD+=" --model_toml=$MODEL_CONFIG" -CMD+=" --lr=$LEARNING_RATE" -CMD+=" --ema=$EMA" -CMD+=" --seed=$SEED" -CMD+=" --optimizer=novograd" -CMD+=" --dataset_dir=$DATA_DIR" -CMD+=" --val_manifest=$DATA_DIR/librispeech-dev-clean-wav.json" -CMD+=" --train_manifest=$DATA_DIR/librispeech-train-clean-100-wav.json" -CMD+=",$DATA_DIR/librispeech-train-clean-360-wav.json" -CMD+=",$DATA_DIR/librispeech-train-other-500-wav.json" -CMD+=" --weight_decay=1e-3" -CMD+=" --save_freq=$SAVE_FREQUENCY" -CMD+=" --eval_freq=100" -CMD+=" --train_freq=1" -CMD+=" --lr_decay" -CMD+=" --gradient_accumulation_steps=$GRADIENT_ACCUMULATION_STEPS " +[ "$AMP" = true ] && ARGS+=" --amp" +[ "$RESUME" = true ] && ARGS+=" --resume" +[ 
"$CUDNN_BENCHMARK" = true ] && ARGS+=" --cudnn_benchmark" +[ "$PAD_TO_MAX_DURATION" = true ] && ARGS+=" --pad_to_max_duration" +[ -n "$CHECKPOINT" ] && ARGS+=" --ckpt=$CHECKPOINT" +[ -n "$LOG_FILE" ] && ARGS+=" --log_file $LOG_FILE" +[ -n "$PRE_ALLOCATE" ] && ARGS+=" --pre_allocate_range $PRE_ALLOCATE" -[ "$AMP" == "true" ] && \ -CMD+=" --amp" -[ "$CUDNN_BENCHMARK" = "true" ] && \ -CMD+=" --cudnn" -[ -n "$CHECKPOINT" ] && \ -CMD+=" --ckpt=${CHECKPOINT}" - -if [ "$CREATE_LOGFILE" = "true" ] ; then - export GBS=$(expr $BATCH_SIZE \* $NUM_GPUS) - printf -v TAG "jasper_train_benchmark_amp-%s_gbs%d" "$AMP" $GBS - DATESTAMP=`date +'%y%m%d%H%M%S'` - LOGFILE=$RESULT_DIR/$TAG.$DATESTAMP.log - printf "Logs written to %s\n" "$LOGFILE" -fi - -set -x -if [ -z "$LOGFILE" ] ; then - $CMD -else - ( - $CMD - ) |& tee $LOGFILE -fi -set +x +DISTRIBUTED="-m torch.distributed.launch --nproc_per_node=$NUM_GPUS" +python $DISTRIBUTED train.py $ARGS diff --git a/PyTorch/SpeechRecognition/Jasper/scripts/train_benchmark.sh b/PyTorch/SpeechRecognition/Jasper/scripts/train_benchmark.sh index c74f704b..f70760fe 100755 --- a/PyTorch/SpeechRecognition/Jasper/scripts/train_benchmark.sh +++ b/PyTorch/SpeechRecognition/Jasper/scripts/train_benchmark.sh @@ -14,100 +14,36 @@ # See the License for the specific language governing permissions and # limitations under the License. +set -a -echo "NVIDIA container build: ${NVIDIA_BUILD_ID}" +# measure on speed perturbed data, but so slightly that fbank length remains the same +# with pad_to_max_duration, this reduces cuDNN benchmak's burn-in period to a single step +: ${DATA_DIR:=${1:-"/datasets/LibriSpeech"}} +: ${OUTPUT_DIR:=${3:-"/results"}} +: ${TRAIN_MANIFESTS:="$DATA_DIR/librispeech-train-clean-100-wav.json"} -SCRIPT_DIR=$(cd $(dirname $0); pwd) -PROJECT_DIR=${SCRIPT_DIR}/.. 
+# run for a number of epochs, but don't finalize the training +: ${EPOCHS_THIS_JOB:=2} +: ${EPOCHS:=100000} +: ${RESUME:=false} +: ${SAVE_FREQUENCY:=100000} +: ${EVAL_FREQUENCY:=100000} +: ${GRAD_ACCUMULATION_STEPS:=1} -DATA_DIR=${1:-${DATA_DIR:-"/datasets/LibriSpeech"}} -MODEL_CONFIG=${2:-${MODEL_CONFIG:-"configs/jasper10x5dr_sp_offline_specaugment.toml"}} -RESULT_DIR=${3:-${RESULT_DIR:-"/results"}} -CREATE_LOGFILE=${4:-${CREATE_LOGFILE:-"true"}} -CUDNN_BENCHMARK=${5:-${CUDNN_BENCHMARK:-"true"}} -NUM_GPUS=${6:-${NUM_GPUS:-8}} -AMP=${7:-${AMP:-"false"}} -NUM_STEPS=${8:-${NUM_STEPS:-"-1"}} -MAX_DURATION=${9:-${MAX_DURATION:-16.7}} -SEED=${10:-${SEED:-0}} -BATCH_SIZE=${11:-${BATCH_SIZE:-32}} -LEARNING_RATE=${12:-${LEARNING_RATE:-"0.015"}} -GRADIENT_ACCUMULATION_STEPS=${13:-${GRADIENT_ACCUMULATION_STEPS:-1}} -PRINT_FREQUENCY=${14:-${PRINT_FREQUENCY:-1}} -USE_PROFILER=${USE_PROFILER:-"false"} +: ${AMP:=false} +: ${EMA:=0} +: ${DALI_DEVICE:="gpu"} +: ${NUM_GPUS_SEQ:="1 4 8"} +: ${BATCH_SIZE_SEQ:="32"} +# A probable range of batch lengths for LibriSpeech +# with BS=64 and continuous speed perturbation (0.85, 1.15) +: ${PRE_ALLOCATE:="1408 1920"} -mkdir -p "$RESULT_DIR" +for NUM_GPUS in $NUM_GPUS_SEQ; do + for BATCH_SIZE in $BATCH_SIZE_SEQ; do -[ "${USE_PROFILER}" = "true" ] && PYTHON_ARGS="-m cProfile -s cumtime" + LOG_FILE="$OUTPUT_DIR/perf-train_dali-${DALI_DEVICE}_amp-${AMP}_ngpus${NUM_GPUS}_bs${BATCH_SIZE}.json" + bash ./scripts/train.sh "$@" -CMD="${PYTHON_ARGS} ${PROJECT_DIR}/train.py" -CMD+=" --batch_size=$BATCH_SIZE" -CMD+=" --num_epochs=400" -CMD+=" --output_dir=$RESULT_DIR" -CMD+=" --model_toml=$MODEL_CONFIG" -CMD+=" --lr=$LEARNING_RATE" -CMD+=" --seed=$SEED" -CMD+=" --optimizer=novograd" -CMD+=" --gradient_accumulation_steps=$GRADIENT_ACCUMULATION_STEPS" -CMD+=" --dataset_dir=$DATA_DIR" -CMD+=" --val_manifest=$DATA_DIR/librispeech-dev-clean-wav.json" -CMD+=" --train_manifest=$DATA_DIR/librispeech-train-clean-100-wav.json," -CMD+="$DATA_DIR/librispeech-train-clean-360-wav.json," -CMD+="$DATA_DIR/librispeech-train-other-500-wav.json" -CMD+=" --weight_decay=1e-3" -CMD+=" --save_freq=100000" -CMD+=" --eval_freq=100000" -CMD+=" --max_duration=$MAX_DURATION" -CMD+=" --pad_to_max" -CMD+=" --train_freq=$PRINT_FREQUENCY" -CMD+=" --lr_decay " -[ "$AMP" == "true" ] && \ -CMD+=" --amp" -[ "$CUDNN_BENCHMARK" = "true" ] && \ -CMD+=" --cudnn" -[ "$NUM_STEPS" -gt 1 ] && \ -CMD+=" --num_steps=$NUM_STEPS" - -if [ "$NUM_GPUS" -gt 1 ] ; then - CMD="python3 -m torch.distributed.launch --nproc_per_node=$NUM_GPUS $CMD" -else - CMD="python3 $CMD" -fi - -if [ "$CREATE_LOGFILE" = "true" ] ; then - export GBS=$(expr $BATCH_SIZE \* $NUM_GPUS) - printf -v TAG "jasper_train_benchmark_amp-%s_gbs%d" "$AMP" $GBS - DATESTAMP=`date +'%y%m%d%H%M%S'` - LOGFILE="${RESULT_DIR}/${TAG}.${DATESTAMP}.log" - printf "Logs written to %s\n" "$LOGFILE" -fi - -if [ -z "$LOGFILE" ] ; then - - set -x - $CMD - set +x -else - - set -x - ( - $CMD - ) |& tee "$LOGFILE" - - set +x - - mean_latency=`cat "$LOGFILE" | grep 'Step time' | awk '{print $3}' | tail -n +2 | egrep -o '[0-9.]+'| awk 'BEGIN {total=0} {total+=$1} END {printf("%.2f\n",total/NR)}'` - mean_throughput=`python -c "print($BATCH_SIZE*$NUM_GPUS/${mean_latency})"` - training_wer_per_pgu=`cat "$LOGFILE" | grep 'training_batch_WER'| awk '{print $2}' | tail -n 1 | egrep -o '[0-9.]+'` - training_loss_per_pgu=`cat "$LOGFILE" | grep 'Loss@Step'| awk '{print $4}' | tail -n 1 | egrep -o '[0-9.]+'` - final_eval_wer=`cat "$LOGFILE" | grep 'Evaluation WER'| tail -n 1 | egrep -o 
'[0-9.]+'` - final_eval_loss=`cat "$LOGFILE" | grep 'Evaluation Loss'| tail -n 1 | egrep -o '[0-9.]+'` - - echo "max duration: $MAX_DURATION s" | tee -a "$LOGFILE" - echo "mean_latency: $mean_latency s" | tee -a "$LOGFILE" - echo "mean_throughput: $mean_throughput sequences/s" | tee -a "$LOGFILE" - echo "training_wer_per_pgu: $training_wer_per_pgu" | tee -a "$LOGFILE" - echo "training_loss_per_pgu: $training_loss_per_pgu" | tee -a "$LOGFILE" - echo "final_eval_loss: $final_eval_loss" | tee -a "$LOGFILE" - echo "final_eval_wer: $final_eval_wer" | tee -a "$LOGFILE" -fi + done +done diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/Dockerfile b/PyTorch/SpeechRecognition/Jasper/tensorrt/Dockerfile deleted file mode 100644 index 2858ff9c..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/Dockerfile +++ /dev/null @@ -1,14 +0,0 @@ -ARG FROM_IMAGE_NAME=nvcr.io/nvidia/pytorch:20.08-py3 -FROM ${FROM_IMAGE_NAME} - -# Here's a good place to install pip reqs from JoC repo. -# At the same step, also install TRT pip reqs -WORKDIR /tmp/pipReqs -COPY requirements.txt /tmp/pipReqs/jocRequirements.txt -COPY tensorrt/requirements.txt /tmp/pipReqs/trtRequirements.txt -RUN pip install --disable-pip-version-check -U -r jocRequirements.txt -r trtRequirements.txt - - -WORKDIR /workspace/jasper -COPY . . - diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/README.md b/PyTorch/SpeechRecognition/Jasper/tensorrt/README.md deleted file mode 100644 index 88335f51..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/README.md +++ /dev/null @@ -1,300 +0,0 @@ - -# Jasper Inference For TensorRT - -This is subfolder of the Jasper for PyTorch repository, tested and maintained by NVIDIA, and provides scripts to perform high-performance inference using NVIDIA TensorRT. Jasper is a neural acoustic model for speech recognition. Its network architecture is designed to facilitate fast GPU inference. More information about Jasper and its training and be found in the [Jasper PyTorch README](../README.md). -NVIDIA TensorRT is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high-throughput for deep learning inference applications. -After optimizing the compute-intensive acoustic model with NVIDIA TensorRT, inference throughput increased by up to 1.8x over native PyTorch. - - - -## Table Of Contents - -- [Model overview](#model-overview) - * [Model architecture](#model-architecture) - * [TensorRT Inference pipeline](#tensorrt-inference-pipeline) - * [Version Info](#version-info) -- [Setup](#setup) - * [Requirements](#requirements) -- [Quick Start Guide](#quick-start-guide) -- [Advanced](#advanced) - * [Scripts and sample code](#scripts-and-sample-code) - * [Parameters](#parameters) - * [TensorRT Inference Benchmark Process](#tensorrt-inference-benchmark-process) - * [TensorRT Inference Process](#tensorrt-inference-process) -- [Performance](#performance) - * [Results](#results) - * [Inference performance: NVIDIA T4](#inference-performance-nvidia-t4) - - -## Model overview - -### Model architecture -By default the model configuration is Jasper 10x5 with dense residuals. A Jasper BxR model has B blocks, each consisting of R repeating sub-blocks. -Each sub-block applies the following operations in sequence: 1D-Convolution, Batch Normalization, ReLU activation, and Dropout. 
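For illustration, the sub-block described above can be sketched in a few lines of PyTorch. This is a minimal sketch only; the module and parameter names are hypothetical and are not taken from this repository's code.

```python
import torch
import torch.nn as nn


class JasperSubBlock(nn.Module):
    """Illustrative Jasper sub-block: 1D-Convolution -> BatchNorm -> ReLU -> Dropout."""

    def __init__(self, in_channels, out_channels, kernel_size, dropout=0.2):
        super().__init__()
        # 'same'-style padding keeps the time dimension unchanged for odd kernel sizes
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size,
                              padding=kernel_size // 2)
        self.bn = nn.BatchNorm1d(out_channels)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # x has shape (batch, channels, time)
        return self.dropout(self.relu(self.bn(self.conv(x))))


if __name__ == "__main__":
    block = JasperSubBlock(in_channels=64, out_channels=256, kernel_size=11)
    feats = torch.randn(8, 64, 512)   # e.g. 8 utterances, 64 filterbank features, 512 frames
    print(block(feats).shape)         # torch.Size([8, 256, 512])
```

A Jasper BxR model stacks R such sub-blocks per block; the dense-residual connections mentioned above are omitted from this sketch for brevity.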
-In the original paper Jasper is trained with masked convolutions, which masks out the padded part of an input sequence in a batch before the 1D-Convolution. -For inference masking is not used. The reason for this is that in inference, the original mask operation does not achieve better accuracy than without the mask operation on the test and development dataset. However, no masking achieves better inference performance especially after TensorRT optimization. - - -### TensorRT Inference pipeline - -The Jasper inference pipeline consists of 3 components: data preprocessor, acoustic model and greedy decoder. The acoustic model is the most compute intensive, taking more than 90% of the entire end-to-end pipeline. The acoustic model is the only component with learnable parameters and also what differentiates Jasper from the competition. So, we focus on the acoustic model for the most part. - -For the non-TensorRT Jasper inference pipeline, all 3 components are implemented and run with native PyTorch. For the TensorRT inference pipeline, we show the speedup of running the acoustic model with TensorRT, while preprocessing and decoding are reused from the native PyTorch pipeline. - -To run a model with TensorRT, we first construct the model in PyTorch, which is then exported into an ONNX file. Finally, a TensorRT engine is constructed from the ONNX file, serialized to TensorRT engine file, and also launched to do inference. - -Note that TensorRT engine is being runtime optimized before serialization. TensorRT tries a vast set of options to find the strategy that performs best on user’s GPU - so it takes a few minutes. After the TensorRT engine file is created, it can be reused. - -### Version Info - -The following software version configuration has been tested and known to work: - -|Software|Version| -|--------|-------| -|Python|3.6.10| -|PyTorch|1.7.0a0+8deb4fe| -|TensorRT|7.1.3.4| -|CUDA|11.0.221| - -## Setup - -The following section lists the requirements in order to start inference on the Jasper model with TensorRT. - -### Requirements - -This repository contains a `Dockerfile` which extends the PyTorch 19.10-py3 NGC container and encapsulates some dependencies. Ensure you have the following components: - -* [NVIDIA Docker](https://github.com/NVIDIA/nvidia-docker) -* [PyTorch 20.08-py3 NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:pytorch) -* NVIDIA [Volta](https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/), [Turing](https://www.nvidia.com/en-us/geforce/turing/), or [Ampere](https://www.nvidia.com/en-us/data-center/nvidia-ampere-gpu-architecture/) based GPU -* [Pretrained Jasper Model Checkpoint](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16) - -Required Python packages are listed in `requirements.txt` and `tensorrt/requirements.txt`. These packages are automatically installed when the Docker container is built. To manually install them, run: - - -```bash -pip install -r requirements.txt -pip install -r tensorrt/requirements.txt -``` - - -## Quick Start Guide - - -Running the following scripts will build and launch the container containing all required dependencies for both TensorRT as well as native PyTorch. This is necessary for using inference with TensorRT and can also be used for data download, processing and training of the model. - -1. Clone the repository. - - ```bash - git clone https://github.com/NVIDIA/DeepLearningExamples - cd DeepLearningExamples/PyTorch/SpeechRecognition/Jasper - ``` -2. 
Build the Jasper PyTorch with TensorRT container: - - ```bash - bash tensorrt/scripts/docker/build.sh - ``` -3. Start an interactive session in the NGC docker container: - - ```bash - bash tensorrt/scripts/docker/launch.sh - ``` - - Alternatively, to start a script in the docker container: - - ```bash - bash tensorrt/scripts/docker/launch.sh - ``` - - The `/datasets`, `/checkpoints`, `/results` directories will be mounted as volumes and mapped to the corresponding directories ``, ``, `` on the host. **These three paths should be absolute and should already exist.** The contents of this repository will be mounted to the `/workspace/jasper` directory. Note that ``, ``, and `` directly correspond to the same arguments in `scripts/docker/launch.sh` mentioned in the [Jasper PyTorch README](../README.md). - - Briefly, `` should contain, or be prepared to contain a `LibriSpeech` sub-directory (created in [Acquiring Dataset](#acquiring-dataset)), `` should contain a PyTorch model checkpoint (`*.pt`) file obtained through training described in [Jasper PyTorch README](../README.md), and `` should be prepared to contain timing results, logs, serialized TensorRT engines, and ONNX files. - - 4. Acquiring dataset - - If LibriSpeech has already been downloaded and preprocessed as defined in the [Jasper PyTorch README](../README.md), no further steps in this subsection need to be taken. - - If LibriSpeech has not been downloaded already, note that only a subset of LibriSpeech is typically used for inference (`dev-*` and `test-*`). To acquire the inference subset of LibriSpeech run the following commands inside the container (does not require GPU): - - ```bash - bash tensorrt/scripts/download_inference_librispeech.sh - ``` - - Once the data download is complete, the following folders should exist: - - * `/datasets/LibriSpeech/` - * `dev-clean/` - * `dev-other/` - * `test-clean/` - * `test-other/` - - Next, preprocessing the data can be performed with the following command: - - ```bash - bash tensorrt/scripts/preprocess_inference_librispeech.sh - ``` - - Once the data is preprocessed, the following additional files should now exist: - * `/datasets/LibriSpeech/` - * `librispeech-dev-clean-wav.json` - * `librispeech-dev-other-wav.json` - * `librispeech-test-clean-wav.json` - * `librispeech-test-other-wav.json` - * `dev-clean-wav/` - * `dev-other-wav/` - * `test-clean-wav/` - * `test-other-wav/` - -5. Start TensorRT inference prediction - - Inside the container, use the following script to run inference with TensorRT. To learn more about the following env variables see `tensorrt/scripts/inference.sh`. - ```bash - export CHECKPOINT= - export TRT_PRECISION= - export PYTORCH_PRECISION= - export TRT_PREDICTION_PATH= - bash tensorrt/scripts/inference.sh - ``` - A pretrained model checkpoint can be downloaded from [NGC model repository](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16). - More details can be found in [Advanced](#advanced) under [Scripts and sample code](#scripts-and-sample-code), [Parameters](#parameters) and [TensorRT Inference process](#tensorrt-inference). - -6. Start TensorRT inference benchmark - - Inside the container, use the following script to run inference benchmark with TensorRT. 
- ```bash - export CHECKPOINT= - export NUM_STEPS= - export NUM_FRAMES= - export BATCH_SIZE= - export TRT_PRECISION= - export PYTORCH_PRECISION= - export CSV_PATH= - bash tensorrt/scripts/inference_benchmark.sh - ``` - A pretrained model checkpoint can be downloaded from the [NGC model repository](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16). - More details can be found in [Advanced](#advanced) under [Scripts and sample code](#scripts-and-sample-code), [Parameters](#parameters) and [TensorRT Inference Benchmark process](#tensorrt-inference-benchmark). - -7. Start Jupyter notebook to run inference interactively - - The Jupyter notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. - The notebook which is located at `notebooks/JasperTRT.ipynb` offers an interactive method to run the Steps 2,3,4,5. In addition, the notebook shows examples how to use TensorRT to transcribe a single audio file into text. To launch the application please follow the instructions under [../notebooks/README.md](../notebooks/README.md). - A pretrained model checkpoint can be downloaded from [NGC model repository](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16). - - -## Advanced -The following sections provide greater details on inference benchmarking with TensorRT and show inference results - -### Scripts and sample code -In the `tensorrt/` directory, the most important files are: -* `Dockerfile`: Container to run Jasper inference with TensorRT. -* `requirements.py`: Python package dependencies. Installed when building the Docker container. -* `perf.py`: Entry point for inference pipeline using TensorRT. -* `perfprocedures.py`: Contains functionality to run inference through both the PyTorch model and TensorRT Engine, taking runtime measurements of each component of the inference process for comparison. -* `trtutils.py`: Helper functions for TensorRT components of Jasper inference. -* `perfutils.py`: Helper functions for non-TensorRT components of Jasper inference. - -The `tensorrt/scripts/` directory has one-click scripts to run supported functionalities, such as: - -* `download_librispeech.sh`: Downloads LibriSpeech inference dataset. -* `preprocess_librispeech.sh`: Preprocess LibriSpeech raw data files to be ready for inference. -* `inference_benchmark.sh`: Benchmarks and compares TensorRT and PyTorch inference pipelines using the `perf.py` script. -* `inference.sh`: Runs TensorRT and PyTorch inference using the `inference_benchmark.sh` script. -* `walk_benchmark.sh`: Illustrates an example of using `tensorrt/scripts/inference_benchmark.sh`, which *walks* a variety of values for `BATCH_SIZE` and `NUM_FRAMES`. -* `docker/`: Contains the scripts for building and launching the container. 
- - -### Parameters - -The list of parameters available for `tensorrt/scripts/inference_benchmark.sh` is: - -``` -Required: --------- -CHECKPOINT: Model checkpoint path - -Arguments with Defaults: --------- -DATA_DIR: directory of the dataset (Default: `/datasets/Librispeech`) -DATASET: name of dataset to use (default: `dev-clean`) -RESULT_DIR: directory for results including TensorRT engines, ONNX files, logs, and CSVs (default: `/results`) -CREATE_LOGFILE: boolean that indicates whether to create log of session to be stored in `$RESULT_DIR` (default: "true") -CSV_PATH: file to store CSV results (default: `/results/res.csv`) -TRT_PREDICTION_PATH: file to store inference prediction results generated with TensorRT (default: `none`) -PYT_PREDICTION_PATH: file to store inference prediction results generated with native PyTorch (default: `none`) -VERBOSE: boolean that indicates whether to verbosely describe TensorRT engine building/deserialization and TensorRT inference (default: "false") -TRT_PRECISION: "fp32" or "fp16". Defines which precision kernels will be used for TensorRT engine (default: "fp32") -PYTORCH_PRECISION: "fp32" or "fp16". Defines which precision will be used for inference in PyTorch (default: "fp32") -NUM_STEPS: Number of inference steps. If -1 runs inference on entire dataset (default: 100) -BATCH_SIZE: data batch size (default: 64) -NUM_FRAMES: cuts/pads all pre-processed feature tensors to this length. 100 frames ~ 1 second of audio (default: 512) -FORCE_ENGINE_REBUILD: boolean that indicates whether an already-built TensorRT engine of equivalent precision, batch-size, and number of frames should not be used. Engines are specific to the GPU, library versions, TensorRT versions, and CUDA versions they were built in and cannot be used in a different environment. (default: "true") -USE_DYNAMIC_SHAPE: if 'yes' uses dynamic shapes (default: ‘yes’). Dynamic shape is always preferred since it allows to reuse engines. -``` - -The complete list of parameters available for `tensorrt/scripts/inference.sh` is the same as `tensorrt/scripts/inference_benchmark.sh` only with different default input arguments. In the following, only the parameters with different default values are listed: - -``` -TRT_PREDICTION_PATH: file to store inference prediction results generated with TensorRT (default: `/results/trt_predictions.txt`) -PYT_PREDICTION_PATH: file to store inference prediction results generated with native PyTorch (default: `/results/pyt_predictions.txtone`) -NUM_STEPS: Number of inference steps. If -1 runs inference on entire dataset (default: -1) -BATCH_SIZE: data batch size (default: 1) -NUM_FRAMES: cuts/pads all pre-processed feature tensors to this length. 100 frames ~ 1 second of audio (default: 3600) -``` - -### TensorRT Inference Benchmark process - -The inference benchmarking is performed on a single GPU by ‘tensorrt/scripts/inference_benchmark.sh’ which delegates to `tensorrt/perf.py`, which takes the following steps: - - -1. Construct Jasper acoustic model in PyTorch. - -2. Construct TensorRT Engine of Jasper acoustic model - - 1. Perform ONNX export on the PyTorch model, if its ONNX file does not already exist. - - 2. Construct TensorRT engine from ONNX export, if a saved engine file does not already exist or `FORCE_ENGINE_REBUILD` is `true`. - -3. For each batch in the dataset, run inference through both the PyTorch model and TensorRT Engine, taking runtime measurements of each component of the inference process. - -4. 
Compile performance and WER accuracy results in CSV format, written to `CSV_PATH` file. - -`tensorrt/perf.py` utilizes `tensorrt/trtutils.py` and `tensorrt/perfutils.py`, helper functions for TensorRT and non-TensorRT components of Jasper inference respectively. - -### TensorRT Inference process - -The inference is performed by `tensorrt/scripts/inference.sh` which delegates to `tensorrt/scripts/inference_benchmark.sh`. The script runs on a single GPU. To do inference prediction on the entire dataset `NUM_FRAMES` is set to 3600, which roughly corresponds to 36 seconds. This covers the longest sentences in both LibriSpeech dev and test dataset. By default, `BATCH_SET` is set to 1 to simulate the online inference scenario in deployment. Other batch sizes can be tried by setting a different value to this parameter. By default `TRT_PRECISION` is set to full precision and can be changed by setting `export TRT_PRECISION=fp16`. The prediction results are stored at `/results/trt_predictions.txt` and `/results/pyt_predictions.txt`. - - - -## Performance - -To benchmark the inference performance on a specific batch size and audio length refer to [Quick-Start-Guide](#quick-start-guide). To do a sweep over multiple batch sizes and audio durations run: -```bash -bash tensorrt/scripts/walk_benchmark.sh -``` -The results are obtained by running inference on LibriSpeech dev-clean dataset on a single T4 GPU using half precision with AMP. We compare the throughput of the acoustic model between TensorRT and native PyTorch. - -### Results - - - -#### Inference performance: NVIDIA T4 - -| Sequence Length (in seconds) | Batch size | TensorRT FP16 Throughput (#sequences/second) Percentiles | | | | PyTorch FP16 Throughput (#sequences/second) Percentiles | | | | TRT/PyT Speedup | -|---------------|------------|---------------------|---------|---------|---------|-----------------|---------|---------|---------|-----------------| -| | | 90% | 95% | 99% | Avg | 90% | 95% | 99% | Avg | | -|2|1|71.002|70.897|70.535|71.987|42.974|42.932|42.861|43.166|1.668| -||2|136.369|135.915|135.232|139.266|81.398|77.826|57.408|81.254|1.714| -||4|231.528|228.875|220.085|239.686|130.055|117.779|104.529|135.660|1.767| -||8|310.224|308.870|289.132|316.536|215.401|202.902|148.240|228.805|1.383| -||16|389.086|366.839|358.419|401.267|288.353|278.708|230.790|307.070|1.307| -|7|1|61.792|61.273|59.842|63.537|34.098|33.963|33.785|34.639|1.834| -||2|93.869|92.480|91.528|97.082|59.397|59.221|51.050|60.934|1.593| -||4|113.108|112.950|112.531|114.507|66.947|66.479|59.926|67.704|1.691| -||8|118.878|118.542|117.619|120.367|83.208|82.998|82.698|84.187|1.430| -||16|122.909|122.718|121.547|124.190|102.212|102.000|101.187|103.049|1.205| -|16.7|1|38.665|38.404|37.946|39.363|21.267|21.197|21.127|21.456|1.835| -||2|44.960|44.867|44.382|45.583|30.218|30.156|29.970|30.679|1.486| -||4|47.754|47.667|47.541|48.287|29.146|29.079|28.941|29.470|1.639| -||8|51.051|50.969|50.620|51.489|37.565|37.497|37.373|37.834|1.361| -||16|53.316|53.288|53.188|53.773|45.217|45.090|44.946|45.560|1.180| diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/perf.py b/PyTorch/SpeechRecognition/Jasper/tensorrt/perf.py deleted file mode 100755 index 18538935..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/perf.py +++ /dev/null @@ -1,132 +0,0 @@ -#!/usr/bin/env python3 -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -'''Constructs TensorRT engine for JASPER and evaluates inference latency''' -import argparse -import sys, os -# Get local modules in parent directory and current directory (assuming this was called from root of repository) -sys.path.append("./") -sys.path.append("./tensorrt") -import perfutils -import trtutils -import perfprocedures -from model import GreedyCTCDecoder -from helpers import __ctc_decoder_predictions_tensor -import caffe2.python.onnx.backend as c2backend -import onnxruntime as ort - -import torch -from torch import nn -from torch.nn import functional as F - - -def main(args): - print ("Getting component") - # Get shared utility across PyTorch and TRT - pyt_components, saved_onnx = perfutils.get_pytorch_components_and_onnx(args) - - print ("Getting engine") - # Get a TRT engine. See function for argument parsing logic - engine = trtutils.get_engine(args) - print ("Got engine.") - - if args.wav: - with torch.no_grad(): - audio_processor = pyt_components['audio_preprocessor'] - audio_processor.eval() - greedy_decoder = GreedyCTCDecoder() - input_wav, num_audio_samples = pyt_components['input_wav'] - features = audio_processor(input_wav, num_audio_samples) - features = perfutils.adjust_shape(features, args) - if not args.engine_path: - outputs = engine.run(None, {'FEATURES': features[0].data.cpu().numpy()}) - inference = 1.0 - t_log_probs_e = outputs[0] - t_log_probs_e=perfutils.torchify_trt_out(t_log_probs_e, t_log_probs_e.shape) - else: - with engine.create_execution_context() as context: - t_log_probs_e, copyto, inference, copyfrom= perfprocedures.do_inference(context, [features[0]]) - t_predictions_e = greedy_decoder(t_log_probs_e) - hypotheses = __ctc_decoder_predictions_tensor(t_predictions_e, labels=perfutils.get_vocab()) - print("INTERENCE TIME: {} ms".format(inference*1000.0)) - print("TRANSCRIPT: ", hypotheses) - return - - wer, preds, times = perfprocedures.compare_times_trt_pyt_exhaustive(engine, - pyt_components, - args) - - string_header, string_data = perfutils.do_csv_export(wer, times, args.batch_size, args.seq_len) - - if args.csv_path is not None: - print ("Exporting to " + args.csv_path) - with open(args.csv_path, 'a+') as f: - # See if header is there, if so, check that it matches - f.seek(0) # Read from start of file - existing_header = f.readline() - if existing_header == "": - f.write(string_header) - f.write("\n") - elif existing_header[:-1] != string_header: - raise Exception(f"Writing to existing CSV with incorrect format\nProduced:\n{string_header}\nFound:\n{existing_header}\nIf you intended to write to a new results csv, please change the csv_path argument") - f.seek(0,2) # Write to end of file - f.write(string_data) - f.write("\n") - else: - print(string_header) - print(string_data) - - if args.trt_prediction_path is not None: - with open(args.trt_prediction_path, 'w') as fp: - fp.write('\n'.join(preds['trt'])) - - if args.pyt_prediction_path is not None: - with open(args.pyt_prediction_path, 'w') as fp: - fp.write('\n'.join(preds['pyt'])) - - -def parse_args(): - parser = argparse.ArgumentParser(description="Performance test of TRT") - 
parser.add_argument("--engine_path", default=None, type=str, help="Path to serialized TRT engine") - parser.add_argument("--use_existing_engine", action="store_true", default=False, help="If set, will deserialize engine at --engine_path" ) - parser.add_argument("--engine_batch_size", default=16, type=int, help="Maximum batch size for constructed engine; needed when building") - parser.add_argument("--batch_size", default=16, type=int, help="Batch size for data when running inference.") - parser.add_argument("--dataset_dir", type=str, help="Root directory of dataset") - parser.add_argument("--model_toml", type=str, required=True, help="Config toml to use. A selection can be found in configs/") - parser.add_argument("--val_manifest", type=str, help="JSON manifest of dataset.") - parser.add_argument("--onnx_path", default=None, type=str, help="Path to onnx model for engine creation") - parser.add_argument("--seq_len", default=None, type=int, help="Generate an ONNX export with this fixed sequence length, and save to --onnx_path. Requires also using --onnx_path and --ckpt_path.") - parser.add_argument("--max_seq_len", default=3600, type=int, help="Max sequence length for TRT engine build. Default works with TRTIS benchmark. Set it larger than seq_len") - parser.add_argument("--ckpt_path", default=None, type=str, help="If provided, will also construct pytorch acoustic model") - parser.add_argument("--max_duration", default=None, type=float, help="Maximum possible length of audio data in seconds") - parser.add_argument("--num_steps", default=-1, type=int, help="Number of inference steps to run") - parser.add_argument("--trt_fp16", action="store_true", default=False, help="If set, will allow TRT engine builder to select fp16 kernels as well as fp32") - parser.add_argument("--pyt_fp16", action="store_true", default=False, help="If set, will construct pytorch model with fp16 weights") - parser.add_argument("--make_onnx", action="store_true", default=False, help="If set, will create an ONNX model and store it at the path specified by --onnx_path") - parser.add_argument("--csv_path", type=str, default=None, help="File to append csv info about inference time") - parser.add_argument("--trt_prediction_path", type=str, default=None, help="File to write predictions inferred with tensorrt") - parser.add_argument("--pyt_prediction_path", type=str, default=None, help="File to write predictions inferred with pytorch") - parser.add_argument("--verbose", action="store_true", default=False, help="If set, will verbosely describe TRT engine building and deserialization as well as TRT inference") - parser.add_argument("--wav", type=str, help='absolute path to .wav file (16KHz)') - parser.add_argument("--max_workspace_size", default=0, type=int, help="Maximum GPU memory workspace size for constructed engine; needed when building") - parser.add_argument("--transpose", action="store_true", default=False, help="If set, will transpose input") - parser.add_argument("--static_shape", action="store_true", default=False, help="If set, use static shape otherwise dynamic shape. 
Dynamic shape is always preferred.") - - return parser.parse_args() - -if __name__ == "__main__": - args = parse_args() - - main(args) diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/perfprocedures.py b/PyTorch/SpeechRecognition/Jasper/tensorrt/perfprocedures.py deleted file mode 100644 index e28ad28e..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/perfprocedures.py +++ /dev/null @@ -1,182 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -'''A collection of accuracy and latency evaluation procedures for JASPER on PyTorch and TRT. -''' - - -import pycuda.driver as cuda -import pycuda.autoinit -import perfutils -import trtutils -import time -import torch -from tqdm import tqdm - -def compare_times_trt_pyt_exhaustive(engine, pyt_components, args): - '''Compares execution times and WER between TRT and PyTorch''' - - # The engine has a fixed-size sequence length, which needs to be known for slicing/padding input - preprocess_times = [] - inputadjust_times = [] - outputadjust_times = [] - process_batch_times = [] - trt_solo_times = [] - trt_async_times = [] - tohost_sync_times =[] - pyt_infer_times = [] - step_counter = 0 - - with engine.create_execution_context() as context, torch.no_grad(): - for data in tqdm(pyt_components['data_layer'].data_iterator): - if args.num_steps >= 1: - if step_counter > args.num_steps: - break - step_counter +=1 - tensors = [] - for d in data: - tensors.append(d.cuda()) - preprocess_start = time.perf_counter() - am_input = pyt_components['audio_preprocessor'](tensors[0], tensors[1]) - - torch.cuda.synchronize() - preprocess_end = time.perf_counter() - - # Pad or cut to the neccessary engine length - inputadjust_start = time.perf_counter() - am_input = perfutils.adjust_shape(am_input, args) - torch.cuda.synchronize() - inputadjust_end = time.perf_counter() - - batch_size = am_input[0].shape[0] - - inp = [am_input[0]] - - # Run TRT inference 1: Async copying and inference - # import ipdb; ipdb.set_trace() - trt_out, time_taken= do_inference_overlap(context, inp) - torch.cuda.synchronize() - outputadjust_start = time.perf_counter() - outputadjust_end = time.perf_counter() - process_batch_start = time.perf_counter() - perfutils.global_process_batch(log_probs=trt_out, - original_tensors=tensors, - batch_size=batch_size, - is_trt=True) - torch.cuda.synchronize() - process_batch_end = time.perf_counter() - - # Create explicit stream so pytorch doesn't complete asynchronously - pyt_infer_start = time.perf_counter() - pyt_out = pyt_components['acoustic_model'](am_input[0]) - torch.cuda.synchronize() - pyt_infer_end = time.perf_counter() - perfutils.global_process_batch(log_probs=pyt_out, - original_tensors=tensors, - batch_size=batch_size, - is_trt=False) - # Run TRT inference 2: Synchronous copying and inference - sync_out, time_to, time_infer, time_from = do_inference(context,inp) - del sync_out - preprocess_times.append(preprocess_end - preprocess_start) - 
inputadjust_times.append(inputadjust_end - inputadjust_start) - outputadjust_times.append(outputadjust_end - outputadjust_start) - process_batch_times.append(process_batch_end - process_batch_start) - trt_solo_times.append(time_infer) - trt_async_times.append(time_taken) - tohost_sync_times.append(time_from) - pyt_infer_times.append(pyt_infer_end - pyt_infer_start) - - trt_wer = perfutils.global_process_epoch(is_trt=True) - pyt_wer = perfutils.global_process_epoch(is_trt=False) - trt_preds = perfutils._global_trt_dict['predictions'] - pyt_preds = perfutils._global_pyt_dict['predictions'] - times = { - 'preprocess': preprocess_times, # Time to go through preprocessing - 'pyt_infer': pyt_infer_times, # Time for batch completion through pytorch - 'input_adjust': inputadjust_times, # Time to pad/cut for TRT engine size requirements - 'output_adjust' : outputadjust_times, # Time to reshape output of TRT and copy from host to device - 'post_process': process_batch_times, # Time to run greedy decoding and do CTC conversion - 'trt_solo_infer': trt_solo_times, # Time to execute just TRT acoustic model - 'to_host': tohost_sync_times, # Time to execute device to host copy synchronously - 'trt_async_infer': trt_async_times, # Time to execute combined async TRT acoustic model + device to host copy - - } - wer = { - 'trt': trt_wer, - 'pyt': pyt_wer - } - preds = { - 'trt': trt_preds, - 'pyt': pyt_preds - } - return wer, preds, times - -def do_inference(context, inp): - '''Do inference using a TRT engine and time it - Execution and device-to-host copy are completed synchronously - ''' - # Typical Python-TRT used in samples would copy input data from host to device. - # Because the PyTorch Tensor is already on the device, such a copy is unneeded. - t0 = time.perf_counter() - stream = cuda.Stream() - # Create output buffers and stream - outputs, bindings, out_shape = trtutils.allocate_buffers_with_existing_inputs(context, inp) - t01 = time.perf_counter() - # simulate sync call here - context.execute_async_v2( - bindings=bindings, - stream_handle=stream.handle) - stream.synchronize() - - t2 = time.perf_counter() - # for out in outputs: - # cuda.memcpy_dtoh(out.host, out.device) - [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs] - stream.synchronize() - - t3 = time.perf_counter() - copyto = t01-t0 - inference = t2-t01 - copyfrom = t3-t2 - out = outputs[0].host - outputs[0].device.free() - out = perfutils.torchify_trt_out(outputs[0].host, out_shape) - return out, copyto, inference, copyfrom - -def do_inference_overlap(context, inp): - '''Do inference using a TRT engine and time it - Execution and device-to-host copy are completed asynchronously - ''' - # Typical Python-TRT used in samples would copy input data from host to device. - # Because the PyTorch Tensor is already on the device, such a copy is unneeded. 
- - t0 = time.perf_counter() - # Create output buffers and stream - stream = cuda.Stream() - outputs, bindings, out_shape = trtutils.allocate_buffers_with_existing_inputs(context, inp) - t01 = time.perf_counter() - t1 = time.perf_counter() - # Run inference and transfer outputs to host asynchronously - context.execute_async_v2( - bindings=bindings, - stream_handle=stream.handle) - [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs] - stream.synchronize() - t2 = time.perf_counter() - copyto = t1-t0 - inference = t2-t1 - outputs[0].device.free() - out = perfutils.torchify_trt_out(outputs[0].host, out_shape) - return out, t2-t1 diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/perfutils.py b/PyTorch/SpeechRecognition/Jasper/tensorrt/perfutils.py deleted file mode 100644 index 483c6411..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/perfutils.py +++ /dev/null @@ -1,317 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -'''Contains helper functions for non-TRT components of JASPER inference -''' - -from model import GreedyCTCDecoder, AudioPreprocessing, JasperEncoderDecoder -from dataset import AudioToTextDataLayer -from helpers import process_evaluation_batch, process_evaluation_epoch, add_ctc_labels, norm -from apex import amp -import torch -import torch.nn as nn -import toml -from parts.features import audio_from_file -import onnx -import os - -_global_ctc_labels = None -def get_vocab(): - ''' Gets the CTC vocab - - Requires calling get_pytorch_components_and_onnx() to setup global labels. - ''' - if _global_ctc_labels is None: - raise Exception("Feature labels have not been found. Execute `get_pytorch_components_and_onnx()` first") - - return _global_ctc_labels - -def get_results(log_probs, original_tensors, batch_size): - ''' Returns WER and predictions for the outputs of the acoustic model - - Used for one-off batches. Epoch-wide evaluation should use - global_process_batch and global_process_epoch - ''' - # Used to get WER and predictions for one-off batches - greedy_decoder = GreedyCTCDecoder() - predicts = norm(greedy_decoder(log_probs=log_probs)) - values_dict = dict( - predictions=[predicts], - transcript=[original_tensors[2][0:batch_size,...]], - transcript_length=[original_tensors[3][0:batch_size,...]], - ) - temp_dict = { - 'predictions': [], - 'transcripts': [], - } - process_evaluation_batch(values_dict, temp_dict, labels=get_vocab()) - predictions = temp_dict['predictions'] - wer, _ = process_evaluation_epoch(temp_dict) - return wer, predictions - - -_global_trt_dict = { - 'predictions': [], - 'transcripts': [], -} -_global_pyt_dict = { - 'predictions': [], - 'transcripts': [], -} - -def global_process_batch(log_probs, original_tensors, batch_size, is_trt=True): - '''Accumulates prediction evaluations for batches across an epoch - - is_trt determines which global dictionary will be used. - To get WER at any point, use global_process_epoch. 
- For one-off WER evaluations, use get_results() - ''' - # State-based approach for full WER comparison across a dataset. - greedy_decoder = GreedyCTCDecoder() - predicts = norm(greedy_decoder(log_probs=log_probs)) - values_dict = dict( - predictions=[predicts], - transcript=[original_tensors[2][0:batch_size,...]], - transcript_length=[original_tensors[3][0:batch_size,...]], - ) - dict_to_process = _global_trt_dict if is_trt else _global_pyt_dict - process_evaluation_batch(values_dict, dict_to_process, labels=get_vocab()) - - -def global_process_epoch(is_trt=True): - '''Returns WER in accumulated global dictionary - ''' - dict_to_process = _global_trt_dict if is_trt else _global_pyt_dict - wer, _ = process_evaluation_epoch(dict_to_process) - return wer - - - -def get_onnx(path, acoustic_model, args): - ''' Get an ONNX model with float weights - - Requires an --onnx_save_path and --ckpt_path (so that an acoustic model could be constructed). - Fixed-length --seq_len must be provided as well. - ''' - - dynamic_dim = 0 - if not args.static_shape: - dynamic_dim = 1 if args.transpose else 2 - - - if args.transpose: - signal_shape=(args.engine_batch_size, int(args.seq_len), 64) - else: - signal_shape=(args.engine_batch_size, 64, int(args.seq_len)) - - with torch.no_grad(): - phony_signal = torch.zeros(signal_shape, dtype=torch.float, device=torch.device("cuda")) - phony_len = torch.IntTensor(len(phony_signal)) - phony_out = acoustic_model.infer((phony_signal, phony_len)) - - input_names=["FEATURES"] - output_names=["LOGITS"] - - if acoustic_model.jasper_encoder.use_conv_mask: - input_names.append("FETURES_LEN") - output_names.append("LOGITS_LEN") - phony_signal = [phony_signal, phony_len] - - if dynamic_dim > 0: - dynamic_axes={ - "FEATURES" : {0 : "BATCHSIZE", dynamic_dim : "NUM_FEATURES"}, - "LOGITS" : { 0: "BATCHSIZE", 1 : "NUM_LOGITS"} - } - else: - dynamic_axes = None - - jitted_model = acoustic_model - - torch.onnx.export(jitted_model, phony_signal, path, - input_names=input_names, output_names=output_names, - opset_version=10, - do_constant_folding=True, - verbose=True, - dynamic_axes=dynamic_axes, - example_outputs = phony_out - ) - - fn=path+".readable" - with open(fn, 'w') as f: - #Write human-readable graph representation to file as well. 
- tempModel = onnx.load(path) - onnx.checker.check_model(tempModel) - pgraph = onnx.helper.printable_graph(tempModel.graph) - f.write(pgraph) - - return path - - -def get_pytorch_components_and_onnx(args): - '''Returns PyTorch components used for inference - ''' - model_definition = toml.load(args.model_toml) - dataset_vocab = model_definition['labels']['labels'] - # Set up global labels for future vocab calls - global _global_ctc_labels - _global_ctc_labels= add_ctc_labels(dataset_vocab) - featurizer_config = model_definition['input_eval'] - - optim_level = 3 if args.pyt_fp16 else 0 - - featurizer_config["optimization_level"] = optim_level - - audio_preprocessor = None - onnx_path = None - data_layer = None - wav = None - seq_len = None - - if args.max_duration is not None: - featurizer_config['max_duration'] = args.max_duration - if args.dataset_dir is not None: - data_layer = AudioToTextDataLayer(dataset_dir=args.dataset_dir, - featurizer_config=featurizer_config, - manifest_filepath=args.val_manifest, - labels=dataset_vocab, - batch_size=args.batch_size, - shuffle=False) - if args.wav is not None: - args.batch_size=1 - wav, seq_len = audio_from_file(args.wav) - if args.seq_len is None or args.seq_len == 0: - args.seq_len = seq_len/(featurizer_config['sample_rate']/100) - args.seq_len = int(args.seq_len) - - if args.transpose: - featurizer_config["transpose_out"] = True - model_definition["transpose_in"] = True - - model = JasperEncoderDecoder(jasper_model_definition=model_definition, feat_in=1024, num_classes=len(get_vocab()), transpose_in=args.transpose) - model = model.cuda() - model.eval() - - audio_preprocessor = AudioPreprocessing(**featurizer_config) - audio_preprocessor = audio_preprocessor.cuda() - audio_preprocessor.eval() - - if args.ckpt_path is not None: - if os.path.isdir(args.ckpt_path): - d_checkpoint = torch.load(args.ckpt_path+"/decoder.pt", map_location="cpu") - e_checkpoint = torch.load(args.ckpt_path+"/encoder.pt", map_location="cpu") - model.jasper_encoder.load_state_dict(e_checkpoint, strict=False) - model.jasper_decoder.load_state_dict(d_checkpoint, strict=False) - else: - checkpoint = torch.load(args.ckpt_path, map_location="cpu") - model.load_state_dict(checkpoint['state_dict'], strict=False) - - # if we are to produce engine, not run/create ONNX, postpone AMP initialization - # (ONNX parser cannot handle mixed FP16 ONNX yet) - if args.pyt_fp16 and args.engine_path is None: - amp.initialize(models=model, opt_level='O'+str(optim_level)) - - if args.make_onnx: - if args.onnx_path is None or args.ckpt_path is None: - raise Exception("--ckpt_path, --onnx_path must be provided when using --make_onnx") - onnx_path = get_onnx(args.onnx_path, model, args) - - if args.pyt_fp16 and args.engine_path is not None: - amp.initialize(models=model, opt_level='O'+str(optim_level)) - - return {'data_layer': data_layer, - 'audio_preprocessor': audio_preprocessor, - 'acoustic_model': model, - 'input_wav' : (wav, seq_len) }, onnx_path - -def adjust_shape(am_input, args): - '''Pads or cuts acoustic model input tensor to some fixed_length - - ''' - input = am_input[0] - baked_length = int(args.seq_len) - - if args.transpose: - in_seq_len = input.shape[1] - else: - in_seq_len = input.shape[2] - - if baked_length is None or in_seq_len == baked_length: - return (input, am_input[1]) - - if args.transpose: - return (input.resize_(input.shape[0], baked_length, 64), am_input[1]) - - newSeq=input - if in_seq_len > baked_length: - # Cut extra bits off, no inference done - newSeq = 
input[...,0:baked_length].contiguous() - elif in_seq_len < baked_length: - # Zero-pad to satisfy length - pad_length = baked_length - in_seq_len - newSeq = nn.functional.pad(input, (0, pad_length), 'constant', 0) - return (newSeq, am_input[1]) - -def torchify_trt_out(trt_out, desired_shape): - '''Reshapes flat data to format for greedy+CTC decoding - Used to convert numpy array on host to PyT Tensor - ''' - # Predictions must be reshaped. - ret = torch.from_numpy(trt_out) - return ret.reshape((desired_shape[0], desired_shape[1], desired_shape[2])) - -def do_csv_export(wers, times, batch_size, num_frames): - '''Produces CSV header and data for input data - - wers: dictionary of WER with keys={'trt', 'pyt'} - times: dictionary of execution times - ''' - def take_durations_and_output_percentile(durations, ratios): - from heapq import nlargest, nsmallest - import numpy as np - import math - durations = np.asarray(durations) * 1000 # in ms - latency = durations - # The first few entries may not be representative due to warm-up effects - # The last entry might not be representative if dataset_size % batch_size != 0 - latency = latency[5:-1] - mean_latency = np.mean(latency) - latency_worst = nlargest(math.ceil( (1 - min(ratios))* len(latency)), latency) - latency_ranges=get_percentile(ratios, latency_worst, len(latency)) - latency_ranges["0.5"] = mean_latency - return latency_ranges - def get_percentile(ratios, arr, nsamples): - res = {} - for a in ratios: - idx = max(int(nsamples * (1 - a)), 0) - res[a] = arr[idx] - return res - - ratios = [0.9, 0.95, 0.99, 1.] - header=[] - data=[] - header.append("BatchSize") - header.append("NumFrames") - data.append(f"{batch_size}") - data.append(f"{num_frames}") - for title, wer in wers.items(): - header.append(title) - data.append(f"{wer}") - for title, durations in times.items(): - ratio_latencies_dict = take_durations_and_output_percentile(durations, ratios) - for ratio, latency in ratio_latencies_dict.items(): - header.append(f"{title}_{ratio}") - data.append(f"{latency}") - string_header = ", ".join(header) - string_data = ", ".join(data) - return string_header, string_data - diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/requirements.txt b/PyTorch/SpeechRecognition/Jasper/tensorrt/requirements.txt deleted file mode 100644 index 661ee58f..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/requirements.txt +++ /dev/null @@ -1,4 +0,0 @@ -pycuda -pillow -onnx==1.6.0 -onnxruntime==1.4.0 diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/docker/build.sh b/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/docker/build.sh deleted file mode 100755 index b4790974..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/docker/build.sh +++ /dev/null @@ -1,5 +0,0 @@ -#!/bin/bash - -# Constructs a docker image containing dependencies for execution of JASPER through TensorRT -echo "docker build . -f ./tensorrt/Dockerfile -t jasper:tensorrt" -docker build . -f ./tensorrt/Dockerfile -t jasper:tensorrt diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/docker/launch.sh b/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/docker/launch.sh deleted file mode 100755 index 4eda3f17..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/docker/launch.sh +++ /dev/null @@ -1,43 +0,0 @@ -#!/bin/bash -SCRIPT_DIR=$(cd $(dirname $0); pwd) -JASPER_REPO=${JASPER_REPO:-"${SCRIPT_DIR}/../../.."} - -# Launch TRT JASPER container. 
- -DATA_DIR=$1 -CHECKPOINT_DIR=$2 -RESULT_DIR=$3 -PROGRAM_PATH=${PROGRAM_PATH} - -if [ $# -lt 3 ]; then - echo "Usage: ./launch.sh ()" - echo "All directory paths must be absolute paths and exist" - exit 1 -fi - -for dir in $DATA_DIR $CHECKPOINT_DIR $RESULT_DIR; do - if [[ $dir != /* ]]; then - echo "All directory paths must be absolute paths!" - echo "${dir} is not an absolute path" - exit 1 - fi - - if [ ! -d $dir ]; then - echo "All directory paths must exist!" - echo "${dir} does not exist" - exit 1 - fi -done - - -nvidia-docker run -it --rm \ - --runtime=nvidia \ - --shm-size=4g \ - --ulimit memlock=-1 \ - --ulimit stack=67108864 \ - -v $DATA_DIR:/datasets \ - -v $CHECKPOINT_DIR:/checkpoints/ \ - -v $RESULT_DIR:/results/ \ - -v ${JASPER_REPO}:/jasper \ - ${EXTRA_JASPER_ENV} \ - jasper:tensorrt bash $PROGRAM_PATH diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/inference.sh b/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/inference.sh deleted file mode 100755 index 27c3e81e..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/inference.sh +++ /dev/null @@ -1,67 +0,0 @@ -#!/bin/bash -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -# Performs inference and measures latency and accuracy of TRT and PyTorch implementations of JASPER. 
- -echo "Container nvidia build = " $NVIDIA_BUILD_ID - -# Mandatory Arguments -CHECKPOINT=$CHECKPOINT - -# Arguments with Defaults -DATA_DIR=${DATA_DIR:-"/datasets/LibriSpeech"} -DATASET=${DATASET:-"dev-clean"} -RESULT_DIR=${RESULT_DIR:-"/results"} -CREATE_LOGFILE=${CREATE_LOGFILE:-"true"} -TRT_PRECISION=${TRT_PRECISION:-"fp32"} -PYTORCH_PRECISION=${PYTORCH_PRECISION:-"fp32"} -NUM_STEPS=${NUM_STEPS:-"-1"} -BATCH_SIZE=${BATCH_SIZE:-1} -NUM_FRAMES=${NUM_FRAMES:-3600} -MAX_SEQUENCE_LENGTH_FOR_ENGINE=${MAX_SEQUENCE_LENGTH_FOR_ENGINE:-$NUM_FRAMES} -FORCE_ENGINE_REBUILD=${FORCE_ENGINE_REBUILD:-"true"} -CSV_PATH=${CSV_PATH:-"/results/res.csv"} -TRT_PREDICTION_PATH=${TRT_PREDICTION_PATH:-"/results/trt_predictions.txt"} -PYT_PREDICTION_PATH=${PYT_PREDICTION_PATH:-"/results/pyt_predictions.txt"} -VERBOSE=${VERBOSE:-"false"} - - - -export CHECKPOINT="$CHECKPOINT" -export DATA_DIR="$DATA_DIR" -export DATASET="$DATASET" -export RESULT_DIR="$RESULT_DIR" -export CREATE_LOGFILE="$CREATE_LOGFILE" -export TRT_PRECISION="$TRT_PRECISION" -export PYTORCH_PRECISION="$PYTORCH_PRECISION" -export NUM_STEPS="$NUM_STEPS" -export BATCH_SIZE="$BATCH_SIZE" -export NUM_FRAMES="$NUM_FRAMES" -export MAX_SEQUENCE_LENGTH_FOR_ENGINE="$MAX_SEQUENCE_LENGTH_FOR_ENGINE" -export FORCE_ENGINE_REBUILD="$FORCE_ENGINE_REBUILD" -export CSV_PATH="$CSV_PATH" -export TRT_PREDICTION_PATH="$TRT_PREDICTION_PATH" -export PYT_PREDICTION_PATH="$PYT_PREDICTION_PATH" -export VERBOSE="$VERBOSE" - -bash ./tensorrt/scripts/inference_benchmark.sh $1 $2 $3 $4 $5 $6 $7 - -trt_word_error_rate=`cat "$CSV_PATH" | awk '{print $3}'` -pyt_word_error_rate=`cat "$CSV_PATH" | awk '{print $4}'` - -echo "word error rate for native PyTorch inference: " -echo "${pyt_word_error_rate}" -echo "word error rate for native TRT inference: " -echo "${trt_word_error_rate}" diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/inference_benchmark.sh b/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/inference_benchmark.sh deleted file mode 100755 index d775e8f6..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/inference_benchmark.sh +++ /dev/null @@ -1,173 +0,0 @@ -#!/bin/bash -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# Measures latency and accuracy of TRT and PyTorch implementations of JASPER. 
- -echo "Container nvidia build = " $NVIDIA_BUILD_ID - -trap "exit" INT - - -# Mandatory Arguments -CHECKPOINT=${CHECKPOINT:-"/checkpoints/jasper_fp16.pt"} - -# Arguments with Defaults -DATA_DIR=${DATA_DIR:-"/datasets/LibriSpeech"} -DATASET=${DATASET:-"dev-clean"} -RESULT_DIR=${RESULT_DIR:-"/results"} -LOG_DIR=${RESULT_DIR}/logs -CREATE_LOGFILE=${CREATE_LOGFILE:-"true"} -TRT_PRECISION=${TRT_PRECISION:-"fp16"} -PYTORCH_PRECISION=${PYTORCH_PRECISION:-"fp16"} -NUM_STEPS=${NUM_STEPS:-"100"} -BATCH_SIZE=${BATCH_SIZE:-64} -NUM_FRAMES=${NUM_FRAMES:-512} -FORCE_ENGINE_REBUILD=${FORCE_ENGINE_REBUILD:-"false"} -CSV_PATH=${CSV_PATH:-"/results/res.csv"} -TRT_PREDICTION_PATH=${TRT_PREDICTION_PATH:-"none"} -PYT_PREDICTION_PATH=${PYT_PREDICTION_PATH:-"none"} -VERBOSE=${VERBOSE:-"false"} -USE_DYNAMIC_SHAPE=${USE_DYNAMIC_SHAPE:-"yes"} - - -# Set up flag-based arguments -TRT_PREC="" -if [ "$TRT_PRECISION" = "fp16" ] ; then - TRT_PREC="--trt_fp16" -elif [ "$TRT_PRECISION" = "fp32" ] ; then - TRT_PREC="" -else - echo "Unknown argument" - exit -2 -fi - -PYTORCH_PREC="" -if [ "$PYTORCH_PRECISION" = "fp16" ] ; then - PYTORCH_PREC="--pyt_fp16" -elif [ "$PYTORCH_PRECISION" = "fp32" ] ; then - PYTORCH_PREC="" -else - echo "Unknown argument" - exit -2 -fi - -SHOULD_VERBOSE="" -if [ "$VERBOSE" = "true" ] ; then - SHOULD_VERBOSE="--verbose" -fi - -STEPS="" -if [ "$NUM_STEPS" -gt 0 ] ; then - STEPS=" --num_steps $NUM_STEPS" -fi - -# Making engine and onnx directories in RESULT_DIR if they don't already exist -ONNX_DIR=$RESULT_DIR/onnxs -ENGINE_DIR=$RESULT_DIR/engines -mkdir -p $ONNX_DIR -mkdir -p $ENGINE_DIR -mkdir -p $LOG_DIR - - - -if [ "$USE_DYNAMIC_SHAPE" = "no" ] ; then - PREFIX=BS${BATCH_SIZE}_NF${NUM_FRAMES} - DYNAMIC_PREFIX=" --static_shape " -else - PREFIX=DYNAMIC -fi - -# Currently, TRT parser for ONNX can't parse mixed-precision weights, so ONNX -# export will always be FP32. This is also enforced in perf.py -ONNX_FILE=fp32_${PREFIX}.onnx -ENGINE_FILE=${TRT_PRECISION}_${PREFIX}.engine - - -# If an ONNX with the same precision and number of frames exists, don't recreate it because -# TRT engine construction can be done on an onnx of any batch size -# "%P" only prints filenames (rather than absolute/relative path names) -EXISTING_ONNX=$(find $ONNX_DIR -name ${ONNX_FILE} -printf "%P\n" | head -n 1) -SHOULD_MAKE_ONNX="" -if [ -z "$EXISTING_ONNX" ] ; then - SHOULD_MAKE_ONNX="--make_onnx" -else - ONNX_FILE=${EXISTING_ONNX} -fi - -# Follow FORCE_ENGINE_REBUILD about reusing existing engines. 
-# If false, the existing engine must match precision, batch size, and number of frames -SHOULD_MAKE_ENGINE="" -if [ "$FORCE_ENGINE_REBUILD" != "true" ] ; then - EXISTING_ENGINE=$(find $ENGINE_DIR -name "${ENGINE_FILE}") - if [ -n "$EXISTING_ENGINE" ] ; then - SHOULD_MAKE_ENGINE="--use_existing_engine" - fi -fi - - - -if [ "$TRT_PREDICTION_PATH" = "none" ] ; then - TRT_PREDICTION_PATH="" -else - TRT_PREDICTION_PATH=" --trt_prediction_path=${TRT_PREDICTION_PATH}" -fi - - -if [ "$PYT_PREDICTION_PATH" = "none" ] ; then - PYT_PREDICTION_PATH="" -else - PYT_PREDICTION_PATH=" --pyt_prediction_path=${PYT_PREDICTION_PATH}" -fi - -CMD="python tensorrt/perf.py" -CMD+=" --batch_size $BATCH_SIZE" -CMD+=" --engine_batch_size $BATCH_SIZE" -CMD+=" --model_toml configs/jasper10x5dr_nomask.toml" -CMD+=" --dataset_dir $DATA_DIR" -CMD+=" --val_manifest $DATA_DIR/librispeech-${DATASET}-wav.json " -CMD+=" --ckpt_path $CHECKPOINT" -CMD+=" $SHOULD_VERBOSE" -CMD+=" $TRT_PREC" -CMD+=" $PYTORCH_PREC" -CMD+=" $STEPS" -CMD+=" --engine_path ${RESULT_DIR}/engines/${ENGINE_FILE}" -CMD+=" --onnx_path ${RESULT_DIR}/onnxs/${ONNX_FILE}" -CMD+=" --seq_len $NUM_FRAMES" -CMD+=" $SHOULD_MAKE_ONNX" -CMD+=" $SHOULD_MAKE_ENGINE" -CMD+=" $DYNAMIC_PREFIX" -CMD+=" --csv_path $CSV_PATH" -CMD+=" $1 $2 $3 $4 $5 $6 $7 $8 $9" -CMD+=" $TRT_PREDICTION_PATH" -CMD+=" $PYT_PREDICTION_PATH" - - -if [ "$CREATE_LOGFILE" == "true" ] ; then - export GBS=$(expr $BATCH_SIZE ) - printf -v TAG "jasper_tensorrt_inference_benchmark_%s_gbs%d" "$PYTORCH_PRECISION" $GBS - DATESTAMP=`date +'%y%m%d%H%M%S'` - LOGFILE=$LOG_DIR/$TAG.$DATESTAMP.log - printf "Logs written to %s\n" "$LOGFILE" -fi - -mkdir -p ${RESULT_DIR}/logs - -set -x -if [ -z "$LOGFILE" ] ; then - $CMD -else - $CMD |& tee $LOGFILE - grep 'latency' $LOGFILE -fi -set +x diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/walk_benchmark.sh b/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/walk_benchmark.sh deleted file mode 100755 index e70f9843..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/scripts/walk_benchmark.sh +++ /dev/null @@ -1,43 +0,0 @@ -#!/bin/bash -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# A usage example of inference_benchmark.sh. 
- - -export NUM_STEPS=100 -export FORCE_ENGINE_REBUILD="false" -export CHECKPOINT=${CHECKPOINT:-"/checkpoints/jasper_fp16.pt"} -export CREATE_LOGFILE="true" -prec=fp16 -export TRT_PRECISION=$prec -export PYTORCH_PRECISION=$prec - -trap "exit" INT - -for use_dynamic in yes no; -do - export USE_DYNAMIC_SHAPE=${use_dynamic} - export CSV_PATH="/results/${prec}.csv" - for nf in 208 304 512 704 1008 1680; - do - export NUM_FRAMES=$nf - for bs in 1 2 4 8 16 32 64; - do - export BATCH_SIZE=$bs - - echo "Doing batch size ${bs}, sequence length ${nf}, precision ${prec}" - bash tensorrt/scripts/inference_benchmark.sh $1 $2 $3 $4 $5 $6 - done - done -done diff --git a/PyTorch/SpeechRecognition/Jasper/tensorrt/trtutils.py b/PyTorch/SpeechRecognition/Jasper/tensorrt/trtutils.py deleted file mode 100644 index 74580330..00000000 --- a/PyTorch/SpeechRecognition/Jasper/tensorrt/trtutils.py +++ /dev/null @@ -1,155 +0,0 @@ -# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -'''Contains helper functions for TRT components of JASPER inference -''' -import pycuda.driver as cuda -import tensorrt as trt -import onnxruntime as ort -import numpy as np - -class HostDeviceMem(object): - '''Type for managing host and device buffers - - A simple class which is more explicit that dealing with a 2-tuple. 
- ''' - def __init__(self, host_mem, device_mem): - self.host = host_mem - self.device = device_mem - - def __str__(self): - return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device) - - def __repr__(self): - return self.__str__() - -def build_engine_from_parser(args): - '''Builds TRT engine from an ONNX file - Note that network output 1 is unmarked so that the engine will not use - vestigial length calculations associated with masked_fill - ''' - TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE) if args.verbose else trt.Logger(trt.Logger.WARNING) - builder = trt.Builder(TRT_LOGGER) - builder.max_batch_size = 64 - - if args.trt_fp16: - builder.fp16_mode = True - print("Optimizing for FP16") - config_flags = 1 << int(trt.BuilderFlag.FP16) # | 1 << int(trt.BuilderFlag.STRICT_TYPES) - max_size = 4*1024*1024*1024 - max_len = args.max_seq_len - else: - config_flags = 0 - max_size = 4*1024*1024*1024 - max_len = args.max_seq_len - if args.max_workspace_size > 0: - builder.max_workspace_size = args.max_workspace_size - else: - builder.max_workspace_size = max_size - - config = builder.create_builder_config() - config.flags = config_flags - - if not args.static_shape: - profile = builder.create_optimization_profile() - if args.transpose: - profile.set_shape("FEATURES", min=(1,192,64), opt=(args.engine_batch_size,256,64), max=(builder.max_batch_size, max_len, 64)) - else: - profile.set_shape("FEATURES", min=(1,64,192), opt=(args.engine_batch_size,64,256), max=(builder.max_batch_size, 64, max_len)) - config.add_optimization_profile(profile) - explicit_batch = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH) - network = builder.create_network(explicit_batch) - - with trt.OnnxParser(network, TRT_LOGGER) as parser: - with open(args.onnx_path, 'rb') as model: - parsed = parser.parse(model.read()) - print ("Parsing returned ", parsed, "dynamic_shape= " , not args.static_shape, "\n") - return builder.build_engine(network, config=config) - -def deserialize_engine(engine_path, is_verbose): - '''Deserializes TRT engine at engine_path - ''' - TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE) if is_verbose else trt.Logger(trt.Logger.WARNING) - with open(engine_path, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime: - engine = runtime.deserialize_cuda_engine(f.read()) - return engine - - -def allocate_buffers_with_existing_inputs(context, inp): - ''' - allocate_buffers() (see TRT python samples) but uses an existing inputs on device - - inp: List of pointers to device memory. Pointers are in the same order as - would be produced by allocate_buffers(). That is, inputs are in the - order defined by iterating through `engine` - ''' - # Add input to bindings - bindings = [0,0] - outputs = [] - engine = context.engine - batch_size = inp[0].shape - inp_idx = engine.get_binding_index("FEATURES") - inp_b = inp[0].data_ptr() - assert(inp[0].is_contiguous()) - bindings[inp_idx] = inp_b - sh = inp[0].shape - batch_size = sh[0] - orig_shape = context.get_binding_shape(inp_idx) - if orig_shape[0]==-1: - context.set_binding_shape(inp_idx, trt.Dims([batch_size, sh[1], sh[2]])) - - assert context.all_binding_shapes_specified - - out_idx = engine.get_binding_index("LOGITS") - # Allocate output buffer by querying the size from the context. This may be different for different input shapes. 
- out_shape = context.get_binding_shape(out_idx) - #print ("Out_shape: ", out_shape) - h_output = cuda.pagelocked_empty(tuple(out_shape), dtype=np.float32()) - # print ("Out bytes: " , h_output.nbytes) - d_output = cuda.mem_alloc(h_output.nbytes) - bindings[out_idx] = int(d_output) - hdm = HostDeviceMem(h_output, d_output) - outputs.append(hdm) - return outputs, bindings, out_shape - -def get_engine(args): - '''Get a TRT engine - - If --should_serialize is present, always build from ONNX and store result in --engine_path. - Else If an engine is provided as an argument (--engine_path) use that one. - Otherwise, make one from onnx (--onnx_load_path), but don't serialize it. - ''' - engine = None - - if args.engine_path is not None and args.use_existing_engine: - engine = deserialize_engine(args.engine_path, args.verbose) - elif args.engine_path is not None and args.onnx_path is not None: - # Build a new engine and serialize it. - print("Building TRT engine ....") - engine = build_engine_from_parser(args) - if engine is not None: - with open(args.engine_path, 'wb') as f: - f.write(engine.serialize()) - print("TRT engine saved at " + args.engine_path + " ...") - elif args.onnx_path is not None: - ort_session = ort.InferenceSession(args.onnx_path) - return ort_session - else: - raise Exception("One of the following sets of arguments must be provided:\n"+ - " + --use_existing_engine\n"+ - " + \n"+ - "in order to construct a TRT engine") - if engine is None: - raise Exception("Failed to acquire TRT engine") - - return engine diff --git a/PyTorch/SpeechRecognition/Jasper/train.py b/PyTorch/SpeechRecognition/Jasper/train.py index aed09e8e..117abc64 100644 --- a/PyTorch/SpeechRecognition/Jasper/train.py +++ b/PyTorch/SpeechRecognition/Jasper/train.py @@ -14,449 +14,488 @@ import argparse import copy -import itertools -import math import os import random import time -import toml +import pyprof import torch import numpy as np +import torch.cuda.profiler as profiler +import torch.distributed as dist from apex import amp +from apex.parallel import DistributedDataParallel -from dataset import AudioToTextDataLayer -from helpers import (add_ctc_labels, model_multi_gpu, monitor_asr_train_progress, - print_dict, print_once, process_evaluation_batch, - process_evaluation_epoch) -from model import AudioPreprocessing, CTCLossNM, GreedyCTCDecoder, Jasper -from optimizers import Novograd, AdamW +from common import helpers +from common.dali.data_loader import DaliDataLoader +from common.dataset import AudioDataset, get_data_loader +from common.features import BaseFeatures, FilterbankFeatures +from common.helpers import (Checkpointer, greedy_wer, num_weights, print_once, + process_evaluation_epoch) +from common.optimizers import AdamW, lr_policy, Novograd +from common.tb_dllogger import flush_log, init_log, log +from jasper import config +from jasper.model import CTCLossNM, GreedyCTCDecoder, Jasper -def lr_policy(initial_lr, step, N): - """ - learning rate decay - Args: - initial_lr: base learning rate - step: current iteration number - N: total number of iterations over which learning rate is decayed - """ - min_lr = 0.00001 - res = initial_lr * ((N - step) / N) ** 2 - return max(res, min_lr) +def parse_args(): + parser = argparse.ArgumentParser(description='Jasper') + + training = parser.add_argument_group('training setup') + training.add_argument('--epochs', default=400, type=int, + help='Number of epochs for the entire training; influences the lr schedule') + training.add_argument("--warmup_epochs", default=0, 
type=int, + help='Initial epochs of increasing learning rate') + training.add_argument("--hold_epochs", default=0, type=int, + help='Constant max learning rate epochs after warmup') + training.add_argument('--epochs_this_job', default=0, type=int, + help=('Run for a number of epochs with no effect on the lr schedule.' + 'Useful for re-starting the training.')) + training.add_argument('--cudnn_benchmark', action='store_true', default=True, + help='Enable cudnn benchmark') + training.add_argument('--amp', '--fp16', action='store_true', default=False, + help='Use mixed precision training') + training.add_argument('--seed', default=42, type=int, help='Random seed') + training.add_argument('--local_rank', default=os.getenv('LOCAL_RANK', 0), + type=int, help='GPU id used for distributed training') + training.add_argument('--pre_allocate_range', default=None, type=int, nargs=2, + help='Warmup with batches of length [min, max] before training') + training.add_argument('--pyprof', action='store_true', help='Enable pyprof profiling') + + optim = parser.add_argument_group('optimization setup') + optim.add_argument('--batch_size', default=32, type=int, + help='Global batch size') + optim.add_argument('--lr', default=1e-3, type=float, + help='Peak learning rate') + optim.add_argument("--min_lr", default=1e-5, type=float, + help='minimum learning rate') + optim.add_argument("--lr_policy", default='exponential', type=str, + choices=['exponential', 'legacy'], help='lr scheduler') + optim.add_argument("--lr_exp_gamma", default=0.99, type=float, + help='gamma factor for exponential lr scheduler') + optim.add_argument('--weight_decay', default=1e-3, type=float, + help='Weight decay for the optimizer') + optim.add_argument('--grad_accumulation_steps', default=1, type=int, + help='Number of accumulation steps') + optim.add_argument('--optimizer', default='novograd', type=str, + choices=['novograd', 'adamw'], help='Optimization algorithm') + optim.add_argument('--ema', type=float, default=0.0, + help='Discount factor for exp averaging of model weights') + + io = parser.add_argument_group('feature and checkpointing setup') + io.add_argument('--dali_device', type=str, choices=['none', 'cpu', 'gpu'], + default='gpu', help='Use DALI pipeline for fast data processing') + io.add_argument('--resume', action='store_true', + help='Try to resume from last saved checkpoint.') + io.add_argument('--ckpt', default=None, type=str, + help='Path to a checkpoint for resuming training') + io.add_argument('--save_frequency', default=10, type=int, + help='Checkpoint saving frequency in epochs') + io.add_argument('--keep_milestones', default=[100, 200, 300], type=int, nargs='+', + help='Milestone checkpoints to keep from removing') + io.add_argument('--save_best_from', default=380, type=int, + help='Epoch on which to begin tracking best checkpoint (dev WER)') + io.add_argument('--eval_frequency', default=200, type=int, + help='Number of steps between evaluations on dev set') + io.add_argument('--log_frequency', default=25, type=int, + help='Number of steps between printing training stats') + io.add_argument('--prediction_frequency', default=100, type=int, + help='Number of steps between printing sample decodings') + io.add_argument('--model_config', type=str, required=True, + help='Path of the model configuration file') + io.add_argument('--train_manifests', type=str, required=True, nargs='+', + help='Paths of the training dataset manifest file') + io.add_argument('--val_manifests', type=str, required=True, nargs='+', + help='Paths 
of the evaluation datasets manifest files') + io.add_argument('--max_duration', type=float, + help='Discard samples longer than max_duration') + io.add_argument('--pad_to_max_duration', action='store_true', default=False, + help='Pad training sequences to max_duration') + io.add_argument('--dataset_dir', required=True, type=str, + help='Root dir of dataset') + io.add_argument('--output_dir', type=str, required=True, + help='Directory for logs and checkpoints') + io.add_argument('--log_file', type=str, default=None, + help='Path to save the training logfile.') + return parser.parse_args() -def save(model, ema_model, optimizer, epoch, output_dir, optim_level): - """ - Saves model checkpoint - Args: - model: model - ema_model: model with exponential averages of weights - optimizer: optimizer - epoch: epoch of model training - output_dir: path to save model checkpoint - """ - out_fpath = os.path.join(output_dir, f"Jasper_epoch{epoch}_checkpoint.pt") - print_once(f"Saving {out_fpath}...") - - if torch.distributed.is_initialized(): - torch.distributed.barrier() - rank = torch.distributed.get_rank() - else: - rank = 0 - - if rank == 0: - checkpoint = { - 'epoch': epoch, - 'state_dict': getattr(model, 'module', model).state_dict(), - 'optimizer': optimizer.state_dict(), - 'amp': amp.state_dict() if optim_level > 0 else None, - - } - if ema_model is not None: - checkpoint['ema_state_dict'] = getattr(ema_model, 'module', ema_model).state_dict() - torch.save(checkpoint, out_fpath) - - print_once('Saved.') +def reduce_tensor(tensor, num_gpus): + rt = tensor.clone() + dist.all_reduce(rt, op=dist.ReduceOp.SUM) + return rt.true_divide(num_gpus) def apply_ema(model, ema_model, decay): if not decay: return - st = model.state_dict() - add_module = hasattr(model, 'module') and not hasattr(ema_model, 'module') - for k,v in ema_model.state_dict().items(): - if add_module and not k.startswith('module.'): - k = 'module.' 
+ k - v.copy_(decay * v + (1 - decay) * st[k]) + + sd = getattr(model, 'module', model).state_dict() + for k, v in ema_model.state_dict().items(): + v.copy_(decay * v + (1 - decay) * sd[k]) -def train( - data_layer, - data_layer_eval, - model, - ema_model, - ctc_loss, - greedy_decoder, - optimizer, - optim_level, - labels, - multi_gpu, - args, - fn_lr_policy=None): - """Trains model - Args: - data_layer: training data layer - data_layer_eval: evaluation data layer - model: model ( encapsulates data processing, encoder, decoder) - ctc_loss: loss function - greedy_decoder: greedy ctc decoder - optimizer: optimizer - optim_level: AMP optimization level - labels: list of output labels - multi_gpu: true if multi gpu training - args: script input argument list - fn_lr_policy: learning rate adjustment function - """ - def eval(model, name=''): - """Evaluates model on evaluation dataset - """ - with torch.no_grad(): - _global_var_dict = { - 'EvalLoss': [], - 'predictions': [], - 'transcripts': [], - } - eval_dataloader = data_layer_eval.data_iterator - for data in eval_dataloader: - tensors = [] - for d in data: - if isinstance(d, torch.Tensor): - tensors.append(d.cuda()) - else: - tensors.append(d) - t_audio_signal_e, t_a_sig_length_e, t_transcript_e, t_transcript_len_e = tensors +@torch.no_grad() +def evaluate(epoch, step, val_loader, val_feat_proc, labels, model, + ema_model, ctc_loss, greedy_decoder, use_amp, use_dali=False): - model.eval() - if optim_level == 1: - with amp.disable_casts(): - t_processed_signal_e, t_processed_sig_length_e = audio_preprocessor(t_audio_signal_e, t_a_sig_length_e) - else: - t_processed_signal_e, t_processed_sig_length_e = audio_preprocessor(t_audio_signal_e, t_a_sig_length_e) - if jasper_encoder.use_conv_mask: - t_log_probs_e, t_encoded_len_e = model.forward((t_processed_signal_e, t_processed_sig_length_e)) - else: - t_log_probs_e = model.forward(t_processed_signal_e) - t_loss_e = ctc_loss(log_probs=t_log_probs_e, targets=t_transcript_e, input_length=t_encoded_len_e, target_length=t_transcript_len_e) - t_predictions_e = greedy_decoder(log_probs=t_log_probs_e) + for model, subset in [(model, 'dev'), (ema_model, 'dev_ema')]: + if model is None: + continue - values_dict = dict( - loss=[t_loss_e], - predictions=[t_predictions_e], - transcript=[t_transcript_e], - transcript_length=[t_transcript_len_e] - ) - process_evaluation_batch(values_dict, _global_var_dict, labels=labels) + model.eval() + start_time = time.time() + agg = {'losses': [], 'preds': [], 'txts': []} - # final aggregation across all workers and minibatches) and logging of results - wer, eloss = process_evaluation_epoch(_global_var_dict) - - if name != '': - name = '_' + name - - print_once(f"==========>>>>>>Evaluation{name} Loss: {eloss}\n") - print_once(f"==========>>>>>>Evaluation{name} WER: {wer}\n") - - print_once("Starting .....") - start_time = time.time() - - train_dataloader = data_layer.data_iterator - epoch = args.start_epoch - step = epoch * args.step_per_epoch - - audio_preprocessor = model.module.audio_preprocessor if hasattr(model, 'module') else model.audio_preprocessor - data_spectr_augmentation = model.module.data_spectr_augmentation if hasattr(model, 'module') else model.data_spectr_augmentation - jasper_encoder = model.module.jasper_encoder if hasattr(model, 'module') else model.jasper_encoder - - while True: - if multi_gpu: - data_layer.sampler.set_epoch(epoch) - print_once("Starting epoch {0}, step {1}".format(epoch, step)) - last_epoch_start = time.time() - batch_counter = 0 - 
average_loss = 0 - for data in train_dataloader: - tensors = [] - for d in data: - if isinstance(d, torch.Tensor): - tensors.append(d.cuda()) - else: - tensors.append(d) - - if batch_counter == 0: - - if fn_lr_policy is not None: - adjusted_lr = fn_lr_policy(step) - for param_group in optimizer.param_groups: - param_group['lr'] = adjusted_lr - optimizer.zero_grad() - last_iter_start = time.time() - - t_audio_signal_t, t_a_sig_length_t, t_transcript_t, t_transcript_len_t = tensors - model.train() - if optim_level == 1: - with amp.disable_casts(): - t_processed_signal_t, t_processed_sig_length_t = audio_preprocessor(t_audio_signal_t, t_a_sig_length_t) + for batch in val_loader: + if use_dali: + # with DALI, the data is already on GPU + feat, feat_lens, txt, txt_lens = batch + if val_feat_proc is not None: + feat, feat_lens = val_feat_proc(feat, feat_lens, use_amp) else: - t_processed_signal_t, t_processed_sig_length_t = audio_preprocessor(t_audio_signal_t, t_a_sig_length_t) - t_processed_signal_t = data_spectr_augmentation(t_processed_signal_t) - if jasper_encoder.use_conv_mask: - t_log_probs_t, t_encoded_len_t = model.forward((t_processed_signal_t, t_processed_sig_length_t)) - else: - t_log_probs_t = model.forward(t_processed_signal_t) + batch = [t.cuda(non_blocking=True) for t in batch] + audio, audio_lens, txt, txt_lens = batch + feat, feat_lens = val_feat_proc(audio, audio_lens, use_amp) - t_loss_t = ctc_loss(log_probs=t_log_probs_t, targets=t_transcript_t, input_length=t_encoded_len_t, target_length=t_transcript_len_t) - if args.gradient_accumulation_steps > 1: - t_loss_t = t_loss_t / args.gradient_accumulation_steps + log_probs, enc_lens = model.forward(feat, feat_lens) + loss = ctc_loss(log_probs, txt, enc_lens, txt_lens) + pred = greedy_decoder(log_probs) - if 0 < optim_level <= 3: - with amp.scale_loss(t_loss_t, optimizer) as scaled_loss: - scaled_loss.backward() - else: - t_loss_t.backward() - batch_counter += 1 - average_loss += t_loss_t.item() + agg['losses'] += helpers.gather_losses([loss]) + agg['preds'] += helpers.gather_predictions([pred], labels) + agg['txts'] += helpers.gather_transcripts([txt], [txt_lens], labels) - if batch_counter % args.gradient_accumulation_steps == 0: - optimizer.step() - - if step % args.train_frequency == 0: - t_predictions_t = greedy_decoder(log_probs=t_log_probs_t) - - e_tensors = [t_predictions_t, t_transcript_t, t_transcript_len_t] - train_wer = monitor_asr_train_progress(e_tensors, labels=labels) - print_once("Loss@Step: {0} ::::::: {1}".format(step, str(average_loss))) - print_once("Step time: {0} seconds".format(time.time() - last_iter_start)) - if step > 0 and step % args.eval_frequency == 0: - print_once("Doing Evaluation ....................... ...... ... .. . .") - eval(model) - if args.ema > 0: - eval(ema_model, 'EMA') - - step += 1 - batch_counter = 0 - average_loss = 0 - if args.num_steps is not None and step >= args.num_steps: - break - - if args.num_steps is not None and step >= args.num_steps: - break - print_once("Finished epoch {0} in {1}".format(epoch, time.time() - last_epoch_start)) - epoch += 1 - if epoch % args.save_frequency == 0 and epoch > 0: - save(model, ema_model, optimizer, epoch, args.output_dir, optim_level) - if args.num_steps is None and epoch >= args.num_epochs: - break - print_once("Done in {0}".format(time.time() - start_time)) - print_once("Final Evaluation ....................... ...... ... .. . 
.") - eval(model) - if args.ema > 0: - eval(ema_model, 'EMA') - save(model, ema_model, optimizer, epoch, args.output_dir, optim_level) + wer, loss = process_evaluation_epoch(agg) + log((epoch,), step, subset, {'loss': loss, 'wer': 100.0 * wer, + 'took': time.time() - start_time}) + model.train() + return wer -def main(args): - random.seed(args.seed) - np.random.seed(args.seed) - torch.manual_seed(args.seed) +def main(): + args = parse_args() + assert(torch.cuda.is_available()) - torch.backends.cudnn.benchmark = args.cudnn + assert args.prediction_frequency % args.log_frequency == 0 + + torch.backends.cudnn.benchmark = args.cudnn_benchmark # set up distributed training - if args.local_rank is not None: - torch.cuda.set_device(args.local_rank) - torch.distributed.init_process_group(backend='nccl', init_method='env://') - - - multi_gpu = torch.distributed.is_initialized() + multi_gpu = int(os.environ.get('WORLD_SIZE', 1)) > 1 if multi_gpu: - print_once("DISTRIBUTED TRAINING with {} gpus".format(torch.distributed.get_world_size())) + torch.cuda.set_device(args.local_rank) + dist.init_process_group(backend='nccl', init_method='env://') + world_size = dist.get_world_size() + print_once(f'Distributed training with {world_size} GPUs\n') + else: + world_size = 1 - # define amp optimiation level - optim_level = 1 if args.amp else 0 + torch.manual_seed(args.seed + args.local_rank) + np.random.seed(args.seed + args.local_rank) + random.seed(args.seed + args.local_rank) - jasper_model_definition = toml.load(args.model_toml) - dataset_vocab = jasper_model_definition['labels']['labels'] - ctc_vocab = add_ctc_labels(dataset_vocab) + init_log(args) - train_manifest = args.train_manifest - val_manifest = args.val_manifest - featurizer_config = jasper_model_definition['input'] - featurizer_config_eval = jasper_model_definition['input_eval'] - featurizer_config["optimization_level"] = optim_level - featurizer_config_eval["optimization_level"] = optim_level + cfg = config.load(args.model_config) + config.apply_duration_flags(cfg, args.max_duration, args.pad_to_max_duration) - sampler_type = featurizer_config.get("sampler", 'default') - perturb_config = jasper_model_definition.get('perturb', None) - if args.pad_to_max: - assert(args.max_duration > 0) - featurizer_config['max_duration'] = args.max_duration - featurizer_config_eval['max_duration'] = args.max_duration - featurizer_config['pad_to'] = -1 - featurizer_config_eval['pad_to'] = -1 + symbols = helpers.add_ctc_blank(cfg['labels']) - print_once('model_config') - print_dict(jasper_model_definition) + assert args.grad_accumulation_steps >= 1 + assert args.batch_size % args.grad_accumulation_steps == 0 + batch_size = args.batch_size // args.grad_accumulation_steps - if args.gradient_accumulation_steps < 1: - raise ValueError('Invalid gradient accumulation steps parameter {}'.format(args.gradient_accumulation_steps)) - if args.batch_size % args.gradient_accumulation_steps != 0: - raise ValueError('gradient accumulation step {} is not divisible by batch size {}'.format(args.gradient_accumulation_steps, args.batch_size)) + print_once('Setting up datasets...') + train_dataset_kw, train_features_kw = config.input(cfg, 'train') + val_dataset_kw, val_features_kw = config.input(cfg, 'val') + use_dali = args.dali_device in ('cpu', 'gpu') + if use_dali: + assert train_dataset_kw['ignore_offline_speed_perturbation'], \ + "DALI doesn't support offline speed perturbation" - data_layer = AudioToTextDataLayer( - dataset_dir=args.dataset_dir, - 
featurizer_config=featurizer_config, - perturb_config=perturb_config, - manifest_filepath=train_manifest, - labels=dataset_vocab, - batch_size=args.batch_size // args.gradient_accumulation_steps, - multi_gpu=multi_gpu, - pad_to_max=args.pad_to_max, - sampler=sampler_type) + # pad_to_max_duration is not supported by DALI - have simple padders + if train_features_kw['pad_to_max_duration']: + train_feat_proc = BaseFeatures( + pad_align=train_features_kw['pad_align'], + pad_to_max_duration=True, + max_duration=train_features_kw['max_duration'], + sample_rate=train_features_kw['sample_rate'], + window_size=train_features_kw['window_size'], + window_stride=train_features_kw['window_stride']) + train_features_kw['pad_to_max_duration'] = False + else: + train_feat_proc = None - data_layer_eval = AudioToTextDataLayer( - dataset_dir=args.dataset_dir, - featurizer_config=featurizer_config_eval, - manifest_filepath=val_manifest, - labels=dataset_vocab, - batch_size=args.batch_size, - multi_gpu=multi_gpu, - pad_to_max=args.pad_to_max - ) + if val_features_kw['pad_to_max_duration']: + val_feat_proc = BaseFeatures( + pad_align=val_features_kw['pad_align'], + pad_to_max_duration=True, + max_duration=val_features_kw['max_duration'], + sample_rate=val_features_kw['sample_rate'], + window_size=val_features_kw['window_size'], + window_stride=val_features_kw['window_stride']) + val_features_kw['pad_to_max_duration'] = False + else: + val_feat_proc = None - model = Jasper(feature_config=featurizer_config, jasper_model_definition=jasper_model_definition, feat_in=1024, num_classes=len(ctc_vocab)) + train_loader = DaliDataLoader(gpu_id=args.local_rank, + dataset_path=args.dataset_dir, + config_data=train_dataset_kw, + config_features=train_features_kw, + json_names=args.train_manifests, + batch_size=batch_size, + grad_accumulation_steps=args.grad_accumulation_steps, + pipeline_type="train", + device_type=args.dali_device, + symbols=symbols) - ctc_loss = CTCLossNM( num_classes=len(ctc_vocab)) + val_loader = DaliDataLoader(gpu_id=args.local_rank, + dataset_path=args.dataset_dir, + config_data=val_dataset_kw, + config_features=val_features_kw, + json_names=args.val_manifests, + batch_size=batch_size, + pipeline_type="val", + device_type=args.dali_device, + symbols=symbols) + else: + train_dataset_kw, train_features_kw = config.input(cfg, 'train') + train_dataset = AudioDataset(args.dataset_dir, + args.train_manifests, + symbols, + **train_dataset_kw) + train_loader = get_data_loader(train_dataset, + batch_size, + multi_gpu=multi_gpu, + shuffle=True, + num_workers=4) + train_feat_proc = FilterbankFeatures(**train_features_kw) + + val_dataset_kw, val_features_kw = config.input(cfg, 'val') + val_dataset = AudioDataset(args.dataset_dir, + args.val_manifests, + symbols, + **val_dataset_kw) + val_loader = get_data_loader(val_dataset, + batch_size, + multi_gpu=multi_gpu, + shuffle=False, + num_workers=4, + drop_last=False) + val_feat_proc = FilterbankFeatures(**val_features_kw) + + dur = train_dataset.duration / 3600 + dur_f = train_dataset.duration_filtered / 3600 + nsampl = len(train_dataset) + print_once(f'Training samples: {nsampl} ({dur:.1f}h, ' + f'filtered {dur_f:.1f}h)') + + if train_feat_proc is not None: + train_feat_proc.cuda() + if val_feat_proc is not None: + val_feat_proc.cuda() + + steps_per_epoch = len(train_loader) // args.grad_accumulation_steps + + # set up the model + model = Jasper(encoder_kw=config.encoder(cfg), + decoder_kw=config.decoder(cfg, n_classes=len(symbols))) + model.cuda() + ctc_loss = 
CTCLossNM(n_classes=len(symbols)) greedy_decoder = GreedyCTCDecoder() - print_once("Number of parameters in encoder: {0}".format(model.jasper_encoder.num_weights())) - print_once("Number of parameters in decode: {0}".format(model.jasper_decoder.num_weights())) + print_once(f'Model size: {num_weights(model) / 10**6:.1f}M params\n') - N = len(data_layer) - if sampler_type == 'default': - args.step_per_epoch = math.ceil(N / (args.batch_size * (1 if not torch.distributed.is_initialized() else torch.distributed.get_world_size()))) - elif sampler_type == 'bucket': - args.step_per_epoch = int(len(data_layer.sampler) / args.batch_size ) - - print_once('-----------------') - print_once('Have {0} examples to train on.'.format(N)) - print_once('Have {0} steps / (gpu * epoch).'.format(args.step_per_epoch)) - print_once('-----------------') - - fn_lr_policy = lambda s: lr_policy(args.lr, s, args.num_epochs * args.step_per_epoch) - - - model.cuda() - - if args.optimizer_kind == "novograd": - optimizer = Novograd(model.parameters(), - lr=args.lr, - weight_decay=args.weight_decay) - elif args.optimizer_kind == "adam": - optimizer = AdamW(model.parameters(), - lr=args.lr, - weight_decay=args.weight_decay) + # optimization + kw = {'lr': args.lr, 'weight_decay': args.weight_decay} + if args.optimizer == "novograd": + optimizer = Novograd(model.parameters(), **kw) + elif args.optimizer == "adamw": + optimizer = AdamW(model.parameters(), **kw) else: - raise ValueError("invalid optimizer choice: {}".format(args.optimizer_kind)) + raise ValueError(f'Invalid optimizer "{args.optimizer}"') - if 0 < optim_level <= 3: + adjust_lr = lambda step, epoch, optimizer: lr_policy( + step, epoch, args.lr, optimizer, steps_per_epoch=steps_per_epoch, + warmup_epochs=args.warmup_epochs, hold_epochs=args.hold_epochs, + num_epochs=args.epochs, policy=args.lr_policy, min_lr=args.min_lr, + exp_gamma=args.lr_exp_gamma) + + if args.amp: model, optimizer = amp.initialize( - min_loss_scale=1.0, - models=model, - optimizers=optimizer, - opt_level='O' + str(optim_level)) + min_loss_scale=1.0, models=model, optimizers=optimizer, + opt_level='O1', max_loss_scale=512.0) if args.ema > 0: ema_model = copy.deepcopy(model) else: ema_model = None - model = model_multi_gpu(model, multi_gpu) + if multi_gpu: + model = DistributedDataParallel(model) + + if args.pyprof: + pyprof.init(enable_function_stack=True) + + # load checkpoint + meta = {'best_wer': 10**6, 'start_epoch': 0} + checkpointer = Checkpointer(args.output_dir, 'Jasper', + args.keep_milestones, args.amp) + if args.resume: + args.ckpt = checkpointer.last_checkpoint() or args.ckpt if args.ckpt is not None: - print_once("loading model from {}".format(args.ckpt)) - checkpoint = torch.load(args.ckpt, map_location="cpu") - if hasattr(model, 'module'): - model.module.load_state_dict(checkpoint['state_dict'], strict=True) - else: - model.load_state_dict(checkpoint['state_dict'], strict=True) + checkpointer.load(args.ckpt, model, ema_model, optimizer, meta) - if args.ema > 0: - if 'ema_state_dict' in checkpoint: - if hasattr(ema_model, 'module'): - ema_model.module.load_state_dict(checkpoint['ema_state_dict'], strict=True) - else: - ema_model.load_state_dict(checkpoint['ema_state_dict'], strict=True) + start_epoch = meta['start_epoch'] + best_wer = meta['best_wer'] + epoch = 1 + step = start_epoch * steps_per_epoch + 1 + + if args.pyprof: + torch.autograd.profiler.emit_nvtx().__enter__() + profiler.start() + + # training loop + model.train() + + # pre-allocate + if args.pre_allocate_range is not 
None: + n_feats = train_features_kw['n_filt'] + pad_align = train_features_kw['pad_align'] + a, b = args.pre_allocate_range + for n_frames in range(a, b + pad_align, pad_align): + print_once(f'Pre-allocation ({batch_size}x{n_feats}x{n_frames})...') + + feat = torch.randn(batch_size, n_feats, n_frames, device='cuda') + feat_lens = torch.ones(batch_size, device='cuda').fill_(n_frames) + txt = torch.randint(high=len(symbols)-1, size=(batch_size, 100), + device='cuda') + txt_lens = torch.ones(batch_size, device='cuda').fill_(100) + log_probs, enc_lens = model(feat, feat_lens) + del feat + loss = ctc_loss(log_probs, txt, enc_lens, txt_lens) + loss.backward() + model.zero_grad() + + for epoch in range(start_epoch + 1, args.epochs + 1): + if multi_gpu and not use_dali: + train_loader.sampler.set_epoch(epoch) + + epoch_utts = 0 + accumulated_batches = 0 + epoch_start_time = time.time() + + for batch in train_loader: + + if accumulated_batches == 0: + adjust_lr(step, epoch, optimizer) + optimizer.zero_grad() + step_loss = 0 + step_utts = 0 + step_start_time = time.time() + + if use_dali: + # with DALI, the data is already on GPU + feat, feat_lens, txt, txt_lens = batch + if train_feat_proc is not None: + feat, feat_lens = train_feat_proc(feat, feat_lens, args.amp) else: - print_once('WARNING: ema_state_dict not found in the checkpoint') - print_once('WARNING: initializing EMA model with regular params') - if hasattr(ema_model, 'module'): - ema_model.module.load_state_dict(checkpoint['state_dict'], strict=True) + batch = [t.cuda(non_blocking=True) for t in batch] + audio, audio_lens, txt, txt_lens = batch + feat, feat_lens = train_feat_proc(audio, audio_lens, args.amp) + + log_probs, enc_lens = model(feat, feat_lens) + + loss = ctc_loss(log_probs, txt, enc_lens, txt_lens) + loss /= args.grad_accumulation_steps + + if torch.isnan(loss).any(): + print_once(f'WARNING: loss is NaN; skipping update') + else: + if multi_gpu: + step_loss += reduce_tensor(loss.data, world_size).item() else: - ema_model.load_state_dict(checkpoint['state_dict'], strict=True) + step_loss += loss.item() - optimizer.load_state_dict(checkpoint['optimizer']) + if args.amp: + with amp.scale_loss(loss, optimizer) as scaled_loss: + scaled_loss.backward() + else: + loss.backward() + step_utts += batch[0].size(0) * world_size + epoch_utts += batch[0].size(0) * world_size + accumulated_batches += 1 - if optim_level > 0: - amp.load_state_dict(checkpoint['amp']) + if accumulated_batches % args.grad_accumulation_steps == 0: + optimizer.step() + apply_ema(model, ema_model, args.ema) - args.start_epoch = checkpoint['epoch'] - else: - args.start_epoch = 0 + if step % args.log_frequency == 0: + preds = greedy_decoder(log_probs) + wer, pred_utt, ref = greedy_wer(preds, txt, txt_lens, symbols) - train(data_layer, data_layer_eval, model, ema_model, - ctc_loss=ctc_loss, \ - greedy_decoder=greedy_decoder, \ - optimizer=optimizer, \ - labels=ctc_vocab, \ - optim_level=optim_level, \ - multi_gpu=multi_gpu, \ - fn_lr_policy=fn_lr_policy if args.lr_decay else None, \ - args=args) + if step % args.prediction_frequency == 0: + print_once(f' Decoded: {pred_utt[:90]}') + print_once(f' Reference: {ref[:90]}') + + step_time = time.time() - step_start_time + log((epoch, step % steps_per_epoch or steps_per_epoch, steps_per_epoch), + step, 'train', + {'loss': step_loss, + 'wer': 100.0 * wer, + 'throughput': step_utts / step_time, + 'took': step_time, + 'lrate': optimizer.param_groups[0]['lr']}) + + step_start_time = time.time() + + if step % args.eval_frequency 
== 0: + wer = evaluate(epoch, step, val_loader, val_feat_proc, + symbols, model, ema_model, ctc_loss, + greedy_decoder, args.amp, use_dali) + + if wer < best_wer and epoch >= args.save_best_from: + checkpointer.save(model, ema_model, optimizer, epoch, + step, best_wer, is_best=True) + best_wer = wer + + step += 1 + accumulated_batches = 0 + # end of step + + # DALI iterator need to be exhausted; + # if not using DALI, simulate drop_last=True with grad accumulation + if not use_dali and step > steps_per_epoch * epoch: + break + + epoch_time = time.time() - epoch_start_time + log((epoch,), None, 'train_avg', {'throughput': epoch_utts / epoch_time, + 'took': epoch_time}) + + if epoch % args.save_frequency == 0 or epoch in args.keep_milestones: + checkpointer.save(model, ema_model, optimizer, epoch, step, best_wer) + + if 0 < args.epochs_this_job <= epoch - start_epoch: + print_once(f'Finished after {args.epochs_this_job} epochs.') + break + # end of epoch + + if args.pyprof: + profiler.stop() + torch.autograd.profiler.emit_nvtx().__exit__(None, None, None) + + log((), None, 'train_avg', {'throughput': epoch_utts / epoch_time}) + + if epoch == args.epochs: + evaluate(epoch, step, val_loader, val_feat_proc, symbols, model, + ema_model, ctc_loss, greedy_decoder, args.amp, use_dali) + + checkpointer.save(model, ema_model, optimizer, epoch, step, best_wer) + flush_log() -def parse_args(): - parser = argparse.ArgumentParser(description='Jasper') - parser.add_argument("--local_rank", default=None, type=int) - parser.add_argument("--batch_size", default=16, type=int, help='data batch size') - parser.add_argument("--num_epochs", default=10, type=int, help='number of training epochs. if number of steps if specified will overwrite this') - parser.add_argument("--num_steps", default=None, type=int, help='if specified overwrites num_epochs and will only train for this number of iterations') - parser.add_argument("--save_freq", dest="save_frequency", default=300, type=int, help='number of epochs until saving checkpoint. 
will save at the end of training too.') - parser.add_argument("--eval_freq", dest="eval_frequency", default=200, type=int, help='number of iterations until doing evaluation on full dataset') - parser.add_argument("--train_freq", dest="train_frequency", default=25, type=int, help='number of iterations until printing training statistics on the past iteration') - parser.add_argument("--lr", default=1e-3, type=float, help='learning rate') - parser.add_argument("--weight_decay", default=1e-3, type=float, help='weight decay rate') - parser.add_argument("--train_manifest", type=str, required=True, help='relative path given dataset folder of training manifest file') - parser.add_argument("--model_toml", type=str, required=True, help='relative path given dataset folder of model configuration file') - parser.add_argument("--val_manifest", type=str, required=True, help='relative path given dataset folder of evaluation manifest file') - parser.add_argument("--max_duration", type=float, help='maximum duration of audio samples for training and evaluation') - parser.add_argument("--pad_to_max", action="store_true", default=False, help="pad sequence to max_duration") - parser.add_argument("--gradient_accumulation_steps", default=1, type=int, help='number of accumulation steps') - parser.add_argument("--optimizer", dest="optimizer_kind", default="novograd", type=str, help='optimizer') - parser.add_argument("--dataset_dir", dest="dataset_dir", required=True, type=str, help='root dir of dataset') - parser.add_argument("--lr_decay", action="store_true", default=False, help='use learning rate decay') - parser.add_argument("--cudnn", action="store_true", default=False, help="enable cudnn benchmark") - parser.add_argument("--amp", "--fp16", action="store_true", default=False, help="use mixed precision training") - parser.add_argument("--output_dir", type=str, required=True, help='saves results in this directory') - parser.add_argument("--ckpt", default=None, type=str, help="if specified continues training from given checkpoint. 
Otherwise starts from beginning") - parser.add_argument("--seed", default=42, type=int, help='seed') - parser.add_argument("--ema", type=float, default=0.0, help='discount factor for exponential averaging of model weights during training') - args=parser.parse_args() - return args - - -if __name__=="__main__": - args = parse_args() - print_dict(vars(args)) - main(args) +if __name__ == "__main__": + main() diff --git a/PyTorch/SpeechRecognition/Jasper/triton/Dockerfile b/PyTorch/SpeechRecognition/Jasper/triton/Dockerfile index 01bdfcd7..9fda344c 100644 --- a/PyTorch/SpeechRecognition/Jasper/triton/Dockerfile +++ b/PyTorch/SpeechRecognition/Jasper/triton/Dockerfile @@ -1,38 +1,10 @@ -ARG FROM_IMAGE_NAME=nvcr.io/nvidia/pytorch:19.09-py3 - -FROM tensorrtserver_client as trtis-client +ARG FROM_IMAGE_NAME=nvcr.io/nvidia/tritonserver:20.10-py3-clientsdk FROM ${FROM_IMAGE_NAME} -RUN apt-get update && apt-get install -y python3 -ARG version=6.0.1-1+cuda10.1 -RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.243-1_amd64.deb \ -&& dpkg -i cuda-repo-*.deb \ -&& wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb \ -&& dpkg -i nvidia-machine-learning-repo-*.deb \ -&& apt-get update \ -&& apt-get install -y --no-install-recommends libnvinfer6=${version} libnvonnxparsers6=${version} libnvparsers6=${version} libnvinfer-plugin6=${version} libnvinfer-dev=${version} libnvonnxparsers-dev=${version} libnvparsers-dev=${version} libnvinfer-plugin-dev=${version} python-libnvinfer=${version} python3-libnvinfer=${version} -RUN cp -r /usr/lib/python3.6/dist-packages/tensorrt /opt/conda/lib/python3.6/site-packages/tensorrt +RUN apt update && apt install -y python3-pyaudio libsndfile1 -ENV PATH=$PATH:/usr/src/tensorrt/bin -WORKDIR /tmp/onnx-trt -COPY tensorrt/onnx-trt.patch . -RUN git clone https://github.com/onnx/onnx-tensorrt.git && cd onnx-tensorrt && git checkout b677b9cbf19af803fa6f76d05ce558e657e4d8b6 && git submodule update --init --recursive && \ - patch -f < ../onnx-trt.patch && mkdir build && cd build && cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr -DGPU_ARCHS="60 70 75" && make -j16 && make install && mv -f /usr/lib/libnvonnx* /usr/lib/x86_64-linux-gnu/ && ldconfig - - -# Here's a good place to install pip reqs from JoC repo. -# At the same step, also install TRT pip reqs -WORKDIR /tmp/pipReqs -COPY requirements.txt /tmp/pipReqs/pytRequirements.txt -COPY tensorrt/requirements.txt /tmp/pipReqs/trtRequirements.txt -COPY triton/requirements.txt /tmp/pipReqs/trtisRequirements.txt -RUN apt-get update && apt-get install -y --no-install-recommends portaudio19-dev && pip install -r pytRequirements.txt && pip install -r trtRequirements.txt && pip install -r trtisRequirements.txt - -#Copy the perf_client over -COPY --from=trtis-client /workspace/install/bin/perf_client /workspace/install/bin/perf_client -#Copy the python wheel and install with pip -COPY --from=trtis-client /workspace/install/python/tensorrtserver*.whl /tmp/ -RUN pip install /tmp/tensorrtserver*.whl && rm /tmp/tensorrtserver*.whl +RUN pip3 install -U pip +RUN pip3 install onnxruntime unidecode inflect soundfile WORKDIR /workspace/jasper COPY . . 
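
The rewritten training loop in `train.py` above steps the optimizer only once every `--grad_accumulation_steps` batches and divides each partial loss by that factor. The following is a minimal, self-contained sketch of that accumulation pattern in plain PyTorch; the tiny linear model, loss function, and synthetic batches are hypothetical stand-ins for illustration and are not part of the Jasper code.

```python
import torch

# Hypothetical stand-ins for illustration only.
GRAD_ACCUMULATION_STEPS = 2
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()
batches = [(torch.randn(8, 10), torch.randint(0, 2, (8,))) for _ in range(4)]

accumulated = 0
for feats, targets in batches:
    if accumulated == 0:
        optimizer.zero_grad()

    loss = loss_fn(model(feats), targets)
    # Scale so the gradients summed over the accumulation window
    # match a single update over the full (global) batch.
    loss = loss / GRAD_ACCUMULATION_STEPS
    loss.backward()
    accumulated += 1

    if accumulated % GRAD_ACCUMULATION_STEPS == 0:
        optimizer.step()
        accumulated = 0
```

With AMP enabled, `train.py` additionally wraps the backward pass in `amp.scale_loss`, but the accumulation logic stays the same.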
diff --git a/PyTorch/SpeechRecognition/Jasper/triton/README.md b/PyTorch/SpeechRecognition/Jasper/triton/README.md index 143d06ee..0b185e2f 100644 --- a/PyTorch/SpeechRecognition/Jasper/triton/README.md +++ b/PyTorch/SpeechRecognition/Jasper/triton/README.md @@ -1,55 +1,37 @@ -# Jasper Inference Using Triton Inference Server +# Deploying the Jasper Inference model using Triton Inference Server -This is a subfolder of the Jasper for PyTorch repository that provides scripts to deploy high-performance inference using NVIDIA Triton Inference Server (formerly NVIDIA TensorRT Inference Server). It offers different options for the inference model pipeline. +This subfolder of the Jasper for PyTorch repository contains scripts for deployment of high-performance inference on NVIDIA Triton Inference Server as well as detailed performance analysis. It offers different options for the inference model pipeline. ## Table Of Contents - -- [Model overview](#model-overview) - * [Model architecture](#model-architecture) - * [Triton Inference Server Overview](#triton-inference-server-overview) - * [Inference Pipeline in Triton Inference Server](#inference-pipeline-in-triton-inference-server) +- [Solution overview](#solution-overview) +- [Inference Pipeline in Triton Inference Server](#inference-pipeline-in-triton-inference-server) - [Setup](#setup) - * [Supported Software](#supported-software) - * [Requirements](#requirements) - [Quick Start Guide](#quick-start-guide) - [Advanced](#advanced) - * [Scripts and sample code](#scripts-and-sample-code) + * [Scripts and sample code](#scripts-and-sample-code) - [Performance](#performance) - * [Inference Benchmarking in Triton Inference Server](#inference-benchmarking-in-triton-inference-server) - * [Results](#results) - * [Performance analysis for Triton Inference Server: NVIDIA T4](#performance-analysis-for-triton-inference-server-nvidia-t4) - * [Maximum Batch Size](#maximum-batch-size) - * [Batching techniques: Static versus Dynamic Batching](#batching-techniques-static-versus-dynamic-batching) - * [TensorRT/ONNX/PyTorch JIT comparisons](#tensorrt/onnx/pytorch-jit-comparisons) - * [Throughput Comparison](#throughput-comparison) - * [Latency Comparison](#latency-comparison) - -## Model overview - -### Model architecture + * [Inference Benchmarking in Triton Inference Server](#inference-benchmarking-in-triton-inference-server) + * [Results](#results) + * [Performance Analysis for Triton Inference Server: NVIDIA T4 +](#performance-analysis-for-triton-inference-server-nvidia-t4) + * [Maximum batch size](#maximum-batch-size) + * [Batching techniques: Static versus Dynamic Batching](#batching-techniques-static-versus-dynamic) + * [TensorRT, ONNX, and PyTorch JIT comparisons](#tensorrt-onnx-and-pytorch-jit-comparisons) +- [Release Notes](#release-notes) + * [Changelog](#change-log) + * [Known issues](#known-issues) -Jasper is a neural acoustic model for speech recognition. Its network architecture is designed to facilitate fast GPU inference. More information about Jasper and its training and be found in the [Jasper PyTorch README](../README.md). -By default the model configuration is Jasper 10x5 with dense residuals. A Jasper BxR model has B blocks, each consisting of R repeating sub-blocks. -Each sub-block applies the following operations in sequence: 1D-Convolution, Batch Normalization, ReLU activation, and Dropout. 
- -In the original paper Jasper is trained with masked convolutions, which masks out the padded part of an input sequence in a batch before the 1D-Convolution. -For inference masking is not used. The reason for this is that in inference, the original mask operation does not achieve better accuracy than without the mask operation on the test and development dataset. However, no masking achieves better inference performance especially after TensorRT optimization. - -More information on the Jasper model architecture can be found in the [Jasper PyTorch README](../README.md). - - - - -### Triton Inference Server Overview +## Solution Overview The [NVIDIA Triton Inference Server](https://github.com/NVIDIA/triton-inference-server) provides a datacenter and cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or gRPC endpoint, allowing remote clients to request inferencing for any number of GPU or CPU models being managed by the server. + This folder contains detailed performance analysis as well as scripts to run Jasper inference using Triton Inference Server. A typical Triton Inference Server pipeline can be broken down into the following steps: -1. The client serializes the inference request into a message and sends it to the server (Client Send). +1. The client serializes the inference request into a message and sends it to the server (Client Send). 2. The message travels over the network from the client to the server (Network). 3. The message arrives at the server, and is deserialized (Server Receive). 4. The request is placed on the queue (Server Queue). @@ -58,15 +40,16 @@ A typical Triton Inference Server pipeline can be broken down into the following 7. The completed message then travels over the network from the server to the client (Network). 8. The completed message is deserialized by the client and processed as a completed inference request (Client Receive). -Generally, for local clients, steps 1-4 and 6-8 will only occupy a small fraction of time, compared to steps 5-6. As backend deep learning systems like Jasper are rarely exposed directly to end users, but instead only interfacing with local front-end servers, for the sake of Jasper, we can consider that all clients are local. +Generally, for local clients, steps 1-4 and 6-8 will only occupy a small fraction of time, compared to step 5. As backend deep learning systems like Jasper are rarely exposed directly to end users, but instead only interfacing with local front-end servers, for the sake of Jasper, we can consider that all clients are local. + In this section, we will go over how to launch both the Triton Inference Server and the client and get the best performance solution that fits your specific application needs. -Note: The following instructions are run from outside the container and call `docker run` commands as required. +More information on how to perform inference using NVIDIA Triton Inference Server can be found in [triton/README.md](https://github.com/triton-inference-server/server/blob/master/README.md). ## Inference Pipeline in Triton Inference Server -The Jasper model pipeline consists of 3 components, where each part can be customized to be a different backend: +The Jasper model pipeline consists of 3 components, where each part can be customized to be a different backend: **Data preprocessor** @@ -74,11 +57,11 @@ The data processor transforms an input raw audio file into a spectrogram. 
By def **Acoustic model** -The acoustic model takes in the spectrogram and outputs a probability over a list of characters. This part is the most compute intensive, taking more than 90% of the entire end-to-end pipeline. The acoustic model is the only component with learnable parameters and what differentiates Jasper from other end-to-end neural speech recognition models. In the original paper, the acoustic model contains a masking operation for training (More details in [../README.md]). We do not use masking for inference . +The acoustic model takes in the spectrogram and outputs a probability over a list of characters. This part is the most compute intensive, taking more than 90% of the entire end-to-end pipeline. The acoustic model is the only component with learnable parameters and what differentiates Jasper from other end-to-end neural speech recognition models. In the original paper, the acoustic model contains a masking operation for training (More details in [Jasper PyTorch README](../README.md)). We do not use masking for inference. **Greedy decoder** -The decoder takes the probabilities over the list of characters and outputs the final transcription. Greedy decoding is a fast and simple way of doing this by always choosing the character with the maximum probability. +The decoder takes the probabilities over the list of characters and outputs the final transcription. Greedy decoding is a fast and simple way of doing this by always choosing the character with the maximum probability. To run a model with TensorRT, we first construct the model in PyTorch, which is then exported into a ONNX static graph. Finally, a TensorRT engine is constructed from the ONNX file and can be launched to do inference. The following table shows which backends are supported for each part along the model pipeline. @@ -93,37 +76,24 @@ In order to run inference with TensorRT outside of the inference server, refer t ## Setup -### Supported Software +The repository contains a folder `./triton` with a `Dockerfile` which extends the PyTorch 20.10-py3 NGC container and encapsulates some dependencies. Ensure you have the following components: -The following software version configuration is supported has been tested. +- [NVIDIA Docker](https://github.com/NVIDIA/nvidia-docker) +- [PyTorch 20.10-py3 NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:pytorch) +- [Triton Inference Server 20.10 NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:tritonserver) +- Access to [NVIDIA machine learning repository](https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb) and [NVIDIA CUDA repository](https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.243-1_amd64.deb) for NVIDIA TensorRT 6 +- Supported GPUs: + - [NVIDIA Volta architecture](https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/) + - [NVIDIA Turing architecture](https://www.nvidia.com/en-us/geforce/turing/) + - [NVIDIA Ampere architecture](https://www.nvidia.com/en-us/data-center/nvidia-ampere-gpu-architecture/) +- [Pretrained Jasper Model Checkpoint](https://ngc.nvidia.com/catalog/models/nvidia:jasper_pyt_ckpt_amp) -|Software|Version| -|--------|-------| -|Python|3.6.9| -|PyTorch|1.2.0| -|TensorRT|6.0.1.5| -|CUDA|10.1.243| - - -The following section lists the requirements in order to start inference with Jasper in Triton Inference Server. 
-
-### Requirements
-
-The repository contains a folder `./trtis/` with a `Dockerfile` which extends the PyTorch 19.09-py3 NGC container and encapsulates some dependencies. Ensure you have the following components:
-
-* [NVIDIA Docker](https://github.com/NVIDIA/nvidia-docker)
-* [PyTorch 20.06-py3 NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:pytorch)
-* [Triton Inference Server 20.06 NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:tritonserver)
-* Access to [NVIDIA machine learning repository](https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb) and [NVIDIA cuda repository](https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.243-1_amd64.deb) for NVIDIA TensorRT 6
-* [NVIDIA Volta](https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/) or [Turing](https://www.nvidia.com/en-us/geforce/turing/) based GPU
-* [Pretrained Jasper Model Checkpoint](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16)
-
-Required Python packages are listed in `requirements.txt`, `trt/requirements.txt` and `trtis/requirements.txt`. These packages are automatically installed when the Docker container is built.
+Required Python packages are listed in `requirements.txt`. These packages are automatically installed when the Docker container is built.

 ## Quick Start Guide

-Running the following scripts will build and launch the container containing all required dependencies for both TensorRT 6 as well as native PyTorch. This is necessary for using inference with TensorRT and can also be used for data download, processing and training of the model.
+Running the following scripts will build and launch the container containing all required dependencies for native PyTorch as well as Triton. This is necessary for using inference and can also be used for data download, processing, and training of the model.

 For more information on the scripts and arguments, refer to the [Advanced](#advanced) section.

 1. Clone the repository.
@@ -132,114 +102,103 @@ Running the following scripts will build and launch the container containing all
    cd DeepLearningExamples/PyTorch/SpeechRecognition/Jasper
    ```

-2. Build a container that extends NGC PyTorch 19.09, TensorRT, Triton Inference Server, and Triton Inference Client:
+2. Build the Jasper PyTorch container.
+
+   Running the following scripts will build the container which contains all the required dependencies for data download and processing as well as converting the model.

    ```bash
-   bash trtis/scripts/docker/build.sh
+   bash scripts/docker/build.sh
    ```

 3. Start an interactive session in the Docker container:

    ```bash
-   export DATA_DIR=
-   export CHECKPOINT_DIR=
-   export RESULT_DIR=
-   bash trtis/scripts/docker/launch.sh
+   bash scripts/docker/launch.sh
    ```

-   Where , and can be either empty or absolute directory paths to dataset, existing checkpoints or potential output files.
-
-   Alternatively, to start a script `foo.sh` in the Docker container without an interactive session, run:
+   Where <DATA_DIR>, <CHECKPOINT_DIR> and <RESULT_DIR> can be either empty or absolute directory paths to dataset, existing checkpoints or potential output files. When left empty, they default to `datasets/`, `/checkpoints`, and `results/`, respectively. The `/datasets`, `/checkpoints`, `/results` directories will be mounted as volumes and mapped to the corresponding directories `<DATA_DIR>`, `<CHECKPOINT_DIR>`, `<RESULT_DIR>` on the host.
+
+   Note that `<DATA_DIR>`, `<CHECKPOINT_DIR>`, and `<RESULT_DIR>` directly correspond to the same arguments in `scripts/docker/launch.sh` and `trt/scripts/docker/launch.sh` mentioned in the [Jasper PyTorch README](../README.md) and [Jasper TensorRT README](../tensorrt/README.md).
+
+   Briefly, `<DATA_DIR>` should contain, or be prepared to contain a `LibriSpeech` sub-directory (created in [Acquiring Dataset](../trt/README.md)), `<CHECKPOINT_DIR>` should contain a PyTorch model checkpoint (`*.pt`) file obtained through training described in [Jasper PyTorch README](../README.md), and `<RESULT_DIR>` should be prepared to contain the converted models and logs.
+
+4. Downloading the `test-clean` part of `LibriSpeech` is required for model conversion, but it is not required for inference on Triton Inference Server, which can use a single .wav audio file. To download and preprocess LibriSpeech, run the following inside the container:
+
+   ```bash
+   bash triton/scripts/download_triton_librispeech.sh
+   bash triton/scripts/preprocess_triton_librispeech.sh
+   ```
+
+5. (Option 1) Convert pretrained PyTorch model checkpoint into Triton Inference Server compatible model backends.
+
+   Inside the container, run:

    ```bash
-   export DATA_DIR=
-   export CHECKPOINT_DIR=
-   export RESULT_DIR=
-   export PROGRAM_PATH=foo.sh
-   bash trtis/scripts/docker/trtis.sh
+   export CHECKPOINT_PATH=
+   export CONVERT_PRECISIONS=
+   export CONVERTS=
+   bash triton/scripts/export_model.sh
    ```

-   The `/datasets`, `/checkpoints`, `/results` directories will be mounted as volumes and mapped to the corresponding directories ``, ``, `` on the host. Note that ``, ``, and `` directly correspond to the same arguments in `scripts/docker/launch.sh` and `trt/scripts/docker/launch.sh` mentioned in the [Jasper PyTorch README](../README.md) and [Jasper TensorRT README](../tensorrt/README.md).
+   Where `<CHECKPOINT_PATH>` (`"/checkpoints/jasper_fp16.pt"`) is the absolute file path of the pretrained checkpoint, `<CONVERT_PRECISIONS>` (`"fp16" "fp32"`) is the list of precisions used for conversion, and `<CONVERTS>` (`"feature-extractor" "decoder" "ts-trace" "onnx" "tensorrt"`) is the list of conversions to be applied. The feature extractor converts only to TorchScript trace module (`feature-extractor`), the decoder only to TorchScript script module (`decoder`), and the Jasper model can convert to TorchScript trace module (`ts-trace`), ONNX (`onnx`), or TensorRT (`tensorrt`).

-   Briefly, `` should contain, or be prepared to contain a `LibriSpeech` sub-directory (created in [Acquiring Dataset](../trt/README.md)), `` should contain a PyTorch model checkpoint (`*.pt`) file obtained through training described in [Jasper PyTorch README](../README.md), and `` should be prepared to contain timing results and logs. Downloading `LibriSpeech` is not required for Inference in Triton Inference Server on a single .wav audio file. To do inference and evaluation on LibriSpeech, download the dataset following the instructions in the [Jasper TensorRT README](../tensorrt/README.md)
+   A pretrained PyTorch model checkpoint for model conversion can be downloaded from the [NGC model repository](https://ngc.nvidia.com/catalog/models/nvidia:jasper_pyt_ckpt_amp).

-4. Convert pretrained PyTorch model checkpoint into Triton Inference Server compatible model backends.
+   More details can be found in the [Advanced](#advanced) section under [Scripts and sample code](#scripts-and-sample-code).
+
+6. (Option 2) Download pre-exported inference checkpoints from NGC.
+
+   Alternatively, you can skip the manual model export and download already generated model backends for every version of the model pipeline.
+
+   * [Jasper_ONNX](https://ngc.nvidia.com/catalog/models/nvidia:jasper_pyt_onnx_fp16_amp/version),
+   * [Jasper_TorchScript](https://ngc.nvidia.com/catalog/models/nvidia:jasper_pyt_torchscript_fp16_amp/version),
+   * [Jasper_TensorRT_Turing](https://ngc.nvidia.com/catalog/models/nvidia:jasper_pyt_trt_fp16_amp_turing/version),
+   * [Jasper_TensorRT_Volta](https://ngc.nvidia.com/catalog/models/nvidia:jasper_pyt_trt_fp16_amp_volta/version).
+
+   If you wish to use the TensorRT pipeline, make sure to download the correct version for your hardware. The extracted model folder should contain 3 subfolders: `feature-extractor-ts-trace`, `decoder-ts-script`, and `jasper-x`, where `x` can be `ts-trace`, `onnx`, or `tensorrt`, depending on the model backend. Copy the 3 model folders to the directory `./triton/model_repo/fp16` in your Jasper project.
+
+7. Build a container that extends the Triton Inference Client:

   From outside the container, run:

   ```bash
-   export ARCH=
-   export CHECKPOINT_DIR=
-   export CHECKPOINT=
-   export PRECISION=
-   export MAX_SEQUENCE_LENGTH_FOR_ENGINE=
-   bash trtis/scripts/export_model.sh
-   bash trtis/scripts/prepare_model_repository.sh
+   bash triton/scripts/docker/build_triton_client.sh
   ```

-   Where `` is either 70(Volta) or 75(Turing), `` is the absolute path that contains the pretrained checkpoint ``, and `` is either `fp16` or `fp32`. `` defines the maximum feasible audio length, where 100 corresponds to 1 second.
-   The exported models for deployment will be generated at `./trtis/deploy/`.
+Once the above steps are completed, you can either run inference benchmarks or perform inference on real data.

-   A pretrained PyTorch model checkpoint for model conversion can be downloaded from the [NGC model repository](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16).
-
-   More details can be found in the [Advanced](#advanced) section under [Scripts and sample code](#scripts-and-sample-code).
-
-5. Download Pre-exported Inference Checkpoints from NGC
-
-   If you would like to skip the manual model export, you can find already generated model backends in [https://ngc.nvidia.com/models/nvidian:swdl:jasperpyt_jit_fp16](https://ngc.nvidia.com/models/nvidian:swdl:jasperpyt_jit_fp16), [https://ngc.nvidia.com/models/nvidian:swdl:jasperpyt_onnx_fp16](https://ngc.nvidia.com/models/nvidian:swdl:jasperpyt_onnx_fp16), [https://ngc.nvidia.com/models/nvidian:swdl:jasperpyt_trt_turing_fp16](https://ngc.nvidia.com/models/nvidian:swdl:jasperpyt_trt_turing_fp16), [https://ngc.nvidia.com/models/nvidian:swdl:jasperpyt_trt_volta_fp16](https://ngc.nvidia.com/models/nvidian:swdl:jasperpyt_trt_volta_fp16). for every version of the model pipeline. If you wish to use TensorRT pipeline, make sure to download the correct version for your hardware. The extracted model folder should contain 3 subfolders `jasper-feature-extractor`, `jasper-decoder` and `jasper-x` where x can be pyt, onnx, trt depending on the model backend. You will find folders with the same name in your local Jasper repository under `trtis/model_repo/’. Copy the content of each of the 3 model folders to the according directory in your Jasper project, replace files with the same name.
-
-   Then run:
-   ```bash
-   bash trtis/scripts/prepare_model_repository.sh
-   ```
-
-6. Launch Triton Inference Server.
-
-   Start the server:
-   ```bash
-   bash trtis/scripts/run_server.sh
-   ```
-
-7. Run all inference benchmarks.
+8. (Option 1) Run all inference benchmarks.
   From outside the container, run:

   ```bash
-   export ARCH=
-   export CHECKPOINT_DIR=
   export RESULT_DIR=<RESULT_DIR>
-   export CHECKPOINT=
-   bash trtis/scripts/execute_all_perf_runs.sh
+   export PRECISION_TESTS=<PRECISION_TESTS>
+   export BATCH_SIZES=<BATCH_SIZES>
+   export SEQ_LENS=<SEQ_LENS>
+   bash triton/scripts/execute_all_perf_runs.sh
   ```

-   Where `` is either 70(Volta) or 75(Turing), `` is the absolute path that contains the pretrained checkpoint ``, and `` is the absolute path to potential output files.
+   Where `<RESULT_DIR>` is the absolute path to potential output files (`./results`), `<PRECISION_TESTS>` is a list of precisions to be tested (`"fp16" "fp32"`), `<BATCH_SIZES>` is a list of tested batch sizes (`"1" "2" "4" "8"`), and `<SEQ_LENS>` is a list of tested sequence lengths (`"32000" "112000" "267200"`).

   Note: This can take several hours to complete due to the extensiveness of the benchmark. More details about the benchmark are found in the [Advanced](#advanced) section under [Performance](#performance).

-8. Run inference using the Client and Triton Inference Server.
+9. (Option 2) Run inference on real data using the Client and Triton Inference Server.

-   8.1 From outside the container, restart the server:
+   9.1 From outside the container, restart the server:
+
   ```bash
-   bash trtis/scripts/run_server.sh
-   ```
+   bash triton/scripts/run_server.sh
+   ```

-   8.2 From outside the container, submit the client request using:
+   9.2 From outside the container, submit the client request using:

   ```bash
-   bash trtis/scripts/run_client.sh
+   bash triton/scripts/run_client.sh
   ```

-   Where `` can be either “pyt” (default), “trt” or “onnx”. `` is an absolute local path to the directory of files. is the relative path to to either an audio file in .wav format or a manifest file in .json format.
+   Where `<MODEL_TYPE>` can be either "ts-trace", "tensorrt", or "onnx", and `<PRECISION>` is either "fp32" or "fp16". `<DATA_DIR>` is an absolute local path to the directory of files, and `<FILE>` is the path, relative to `<DATA_DIR>`, to either an audio file in .wav format or a manifest file in .json format.

-   Note: If is *.json should be the path to the LibriSpeech dataset. In this case this script will do both inference and evaluation on the accoring LibriSpeech dataset.
-
-9. Start Jupyter Notebook to run inference interactively.
-
-   Run:
-   ```bash
-   jupyter notebook -- notebooks/JasperTRTIS.ipynb
-   ```
-
-   A pretrained model checkpoint necessary for using the jupyter notebook to be able to run inference can be downloaded from [NGC model repository](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16).
+   Note: If `<FILE>` is a *.json manifest, `<DATA_DIR>` should be the path to the LibriSpeech dataset. In this case, the script will do both inference and evaluation on the corresponding LibriSpeech dataset.

 ## Advanced

@@ -249,130 +208,177 @@ The following sections provide greater details about the Triton Inference Server

 ### Scripts and sample code

-The `trtis/` directory contains the following files:
+The `triton/` directory contains the following files:
 * `jasper-client.py`: Python client script that takes an audio file and a specific model pipeline type and submits a client request to the server to run inference with the model on the given audio file.
-* `speech-utils.py`: helper functions for `jasper-client.py`
+* `speech_utils.py`: helper functions for `jasper-client.py`.
+* `converter.py`: Python script for model conversion to different backends.
+* `jasper_module.py`: helper functions for `converter.py`.
+* `model_repo_configs/`: directory with Triton model config files for different backend and precision configurations.
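+
+For example, to list the ready-made backend/precision configurations, you could run the following (this assumes the standard Triton `config.pbtxt` naming for model configuration files; adjust if the repository uses a different name):
+
+```bash
+# Print every Triton model configuration file shipped with the repository
+find triton/model_repo_configs -name config.pbtxt
+```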
-The `trtis/scripts/` directory has easy to use scripts to run supported functionalities, such as:
-* `./docker/build.sh`: builds container
-* `./docker/launch.sh`: launches container
-* `execute_all_perf_runs.sh`: runs all benchmarks using TRTIS perfclient calls `generate_perf_results.sh`
-* `export_model.sh`: from pretrained PyTorch checkpoint generates backends for every version of the model inference pipeline, calls `export_model_helper.sh`
-* `prepare_model_repository.sh`: copies model config files from `./model_repo/` to `./deploy/model_rep`o and creates links to generated model backends, setting up the model repository for Triton Inference Server
-* `generate_perf_results.sh`: runs benchmark with perf-client for specific configuration and calls `run_perf_client.sh`
+The `triton/scripts/` directory has easy-to-use scripts to run supported functionalities, such as:
+* `./docker/build_triton_client.sh`: builds the Triton client container
+* `execute_all_perf_runs.sh`: runs all benchmarks using the Triton Inference Server performance client; calls `generate_perf_results.sh`
+* `export_model.sh`: generates backends for every version of the model inference pipeline from a pretrained PyTorch checkpoint
+* `prepare_model_repository.sh`: copies model config files from `./model_repo_configs/` to `./deploy/model_repo` and creates links to the generated model backends, setting up the model repository for Triton Inference Server
+* `generate_perf_results.sh`: runs the benchmark with `perf-client` for a specific configuration and calls `run_perf_client.sh`
 * `run_server.sh`: launches Triton Inference Server
 * `run_client.sh`: launches client by using `jasper-client.py` to submit inference requests to server

+### Running the Triton Inference Server
+
+By default, the Triton Inference Server is launched in detached mode and runs in the background:
+
+```bash
+bash triton/scripts/run_server.sh
+```
+
+To run it in the foreground interactively, for example for debugging purposes, run:
+
+```bash
+DAEMON="--detach=false" bash triton/scripts/run_server.sh
+```
+
+The script mounts the model repository at `$PWD/triton/deploy/model_repo` and loads its models into the server, which uses all visible GPUs. To select specific devices, set `NVIDIA_VISIBLE_DEVICES`.
+
+
+### Running the Triton Inference Client
+
+*Real data*
+
+In order to run the client with real data, run:
+
+```bash
+bash triton/scripts/run_client.sh