# NVIDIA Deep Learning Examples for Tensor Cores
## Introduction
This repository provides state-of-the-art deep learning examples that are easy to train and deploy, achieving the best reproducible accuracy and performance with the NVIDIA CUDA-X software stack running on NVIDIA Volta, Turing, and Ampere GPUs.
## NVIDIA GPU Cloud (NGC) Container Registry
These examples, along with our NVIDIA deep learning software stack, are provided in a monthly updated Docker container on the NGC container registry (https://ngc.nvidia.com). These containers include:
- The latest NVIDIA examples from this repository
- The latest NVIDIA contributions shared upstream to the respective framework
- The latest NVIDIA Deep Learning software libraries, such as cuDNN, NCCL, and cuBLAS, which have all been through a rigorous monthly quality assurance process to ensure that they provide the best possible performance
- [Monthly release notes](https://docs.nvidia.com/deeplearning/dgx/index.html#nvidia-optimized-frameworks-release-notes) for each of the NVIDIA optimized containers
## Computer Vision
| Models | Framework | A100 | AMP | Multi-GPU | Multi-Node | TRT | ONNX | Triton | DLC | NB |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |------------- |------------- |------------- |------------- |------------- |
| [ResNet-50 ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/resnet50v1.5 ) |PyTorch | Yes | Yes | Yes | - | Yes | - | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/triton/resnet50 ) | Yes | - |
| [ResNeXt-101 ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/resnext101-32x4d ) |PyTorch | Yes | Yes | Yes | - | Yes | - | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/triton/resnext101-32x4d ) | Yes | - |
| [SE-ResNeXt-101 ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/se-resnext101-32x4d ) |PyTorch | Yes | Yes | Yes | - | Yes | - | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/triton/se-resnext101-32x4d ) | Yes | - |
| [EfficientNet-B0 ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/efficientnet ) |PyTorch | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [EfficientNet-B4 ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/efficientnet ) |PyTorch | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [EfficientNet-WideSE-B0 ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/efficientnet ) |PyTorch | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [EfficientNet-WideSE-B4 ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/efficientnet ) |PyTorch | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [Mask R-CNN ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Segmentation/MaskRCNN ) |PyTorch | Yes | Yes | Yes | - | - | - | - | - | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/Segmentation/MaskRCNN/pytorch/notebooks/pytorch_MaskRCNN_pyt_train_and_inference.ipynb ) |
| [nnUNet ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Segmentation/nnUNet ) |PyTorch | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [SSD ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Detection/SSD ) |PyTorch | Yes | Yes | Yes | - | - | - | - | - | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/Detection/SSD/examples/inference.ipynb ) |
| [ResNet-50 ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Classification/ConvNets/resnet50v1.5 ) |TensorFlow | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [ResNeXt-101 ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Classification/ConvNets/resnext101-32x4d ) |TensorFlow | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [SE-ResNeXt-101 ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Classification/ConvNets/se-resnext101-32x4d ) |TensorFlow | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [Mask R-CNN ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/Segmentation/MaskRCNN ) |TensorFlow | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [SSD ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Detection/SSD ) | TensorFlow | Yes | Yes | Yes | - | - | - | - | Yes | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/blob/master/TensorFlow/Detection/SSD/models/research/object_detection/object_detection_tutorial.ipynb ) |
| [U-Net Ind ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Segmentation/UNet_Industrial ) |TensorFlow | Yes | Yes | Yes | - | - | - | - | Yes | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Segmentation/UNet_Industrial/notebooks ) |
| [U-Net Med ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Segmentation/UNet_Medical ) | TensorFlow | Yes | Yes | Yes | - | - |- | - | Yes | - |
| [U-Net 3D ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Segmentation/UNet_3D_Medical ) | TensorFlow | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [V-Net Med ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Segmentation/VNet ) | TensorFlow | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [U-Net Med ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/Segmentation/UNet_Medical ) | TensorFlow2 | Yes | Yes | Yes | - | - |- | - | Yes | - |
| [Mask R-CNN ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/Segmentation/MaskRCNN ) |TensorFlow2 | Yes | Yes | Yes | - | - |- | - | Yes | - |
| [EfficientNet ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/Classification/ConvNets/efficientnet ) |TensorFlow2 | Yes | Yes | Yes | Yes | - |- | - | Yes | - |
| [ResNet-50 ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/MxNet/Classification/RN50v1.5 ) | MXNet | - | Yes | Yes | - | - | - | - | - | - |
## Natural Language Processing
| Models | Framework | A100 | AMP | Multi-GPU | Multi-Node | TRT | ONNX | Triton | DLC | NB |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |------------- |------------- |------------- |------------- |------------- |
| [BERT ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/LanguageModeling/BERT ) |PyTorch | Yes | Yes | Yes | Yes | - | - | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/LanguageModeling/BERT/triton ) | Yes | - |
| [TransformerXL ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/LanguageModeling/Transformer-XL ) |PyTorch | Yes | Yes | Yes | Yes | - | - | - | Yes | - |
| [GNMT ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Translation/GNMT ) |PyTorch | Yes | Yes | Yes | - | - | - | - | - | - |
| [Transformer ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Translation/Transformer ) |PyTorch | Yes | Yes | Yes | - | - | - | - | - | - |
| [ELECTRA ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/LanguageModeling/ELECTRA ) | TensorFlow2 | Yes | Yes | Yes | Yes | - | - | - | Yes | - |
| [BERT ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT ) |TensorFlow | Yes | Yes | Yes | Yes | Yes | - | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT/triton ) | Yes | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT/notebooks ) |
| [BERT ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/LanguageModeling/BERT ) |TensorFlow2 | Yes | Yes | Yes | Yes | - | - | - | Yes | - |
| [BioBert ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT/biobert ) | TensorFlow | Yes | Yes | Yes | - | - | - | - | Yes | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/blob/master/TensorFlow/LanguageModeling/BERT/notebooks/biobert_ner_tf_inference.ipynb ) |
| [TransformerXL ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/Transformer-XL ) |TensorFlow | Yes | Yes | Yes | - | - | - | - | - | - |
| [GNMT ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Translation/GNMT ) | TensorFlow | Yes | Yes | Yes | - | - | - | - | - | - |
| [Faster Transformer ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/FasterTransformer ) | TensorFlow | - | - | - | - | Yes | - | - | - | - |
## Recommender Systems
| Models | Framework | A100 | AMP | Multi-GPU | Multi-Node | TRT | ONNX | Triton | DLC | NB |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |------------- |------------- |------------- |------------- |------------- |
| [DLRM ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Recommendation/DLRM ) |PyTorch | Yes | Yes | Yes | - | - | Yes | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Recommendation/DLRM/triton ) | Yes | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Recommendation/DLRM/notebooks ) |
| [DLRM ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/Recommendation/DLRM ) | TensorFlow2 | Yes | Yes | Yes | Yes | - | - | - | Yes | - |
| [NCF ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Recommendation/NCF ) |PyTorch | Yes | Yes | Yes | - | - |- | - | - | - |
| [Wide&Deep ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Recommendation/WideAndDeep ) | TensorFlow | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [Wide&Deep ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/Recommendation/WideAndDeep ) | TensorFlow2 | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [NCF ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Recommendation/NCF ) |TensorFlow | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [VAE-CF ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Recommendation/VAE-CF ) |TensorFlow | Yes | Yes | Yes | - | - | - | - | - | - |
## Speech to Text
| Models | Framework | A100 | AMP | Multi-GPU | Multi-Node | TRT | ONNX | Triton | DLC | NB |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |------------- |------------- |------------- |------------- |------------- |
| [Jasper ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechRecognition/Jasper ) |PyTorch | Yes | Yes | Yes | - | Yes | Yes | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechRecognition/Jasper/trtis ) | Yes | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechRecognition/Jasper/notebooks ) |
| [Hidden Markov Model ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/Kaldi/SpeechRecognition ) | Kaldi | - | - | Yes | - | - | - | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/Kaldi/SpeechRecognition ) | - | - |
## Text to Speech
| Models | Framework | A100 | AMP | Multi-GPU | Multi-Node | TRT | ONNX | Triton | DLC | NB |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |------------- |------------- |------------- |------------- |------------- |
| [FastPitch ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/FastPitch ) | PyTorch | Yes | Yes | Yes | - | - | - | - | Yes | - |
| [FastSpeech ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/CUDA-Optimized/FastSpeech ) | PyTorch | - | Yes | Yes | - | Yes | - | - | - | - |
| [Tacotron 2 and WaveGlow ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2 ) | PyTorch | Yes | Yes | Yes | - | Yes | Yes | [Yes ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2/trtis_cpp ) | Yes | - |
## Graph Neural Networks
| Models | Framework | A100 | AMP | Multi-GPU | Multi-Node | TRT | ONNX | Triton | DLC | NB |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |------------- |------------- |------------- |------------- |------------- |
| [SE(3)-Transformer ](https://github.com/NVIDIA/DeepLearningExamples/tree/master/DGLPyTorch/DrugDiscovery/SE3Transformer ) | PyTorch | Yes | Yes | Yes | - | - | - | - | - | - |
## NVIDIA support
In each of the network READMEs, we indicate the level of support that will be provided. Support ranges from ongoing updates and improvements to a point-in-time release for thought leadership.
## Glossary
**Multinode Training**
Supported on a pyxis/enroot Slurm cluster.
**Deep Learning Compiler (DLC)**
TensorFlow XLA and PyTorch JIT and/or TorchScript
**Accelerated Linear Algebra (XLA)**
XLA is a domain-specific compiler for linear algebra that can accelerate TensorFlow models with potentially no source code changes. The results are improvements in speed and memory usage.
**PyTorch JIT and/or TorchScript**
TorchScript is a way to create serializable and optimizable models from PyTorch code. It is an intermediate representation of a PyTorch model (a subclass of `nn.Module`) that can then be run in a high-performance environment such as C++.
**Automatic Mixed Precision (AMP)**
Automatic Mixed Precision (AMP) enables mixed precision training on Volta, Turing, and NVIDIA Ampere GPU architectures automatically.
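As a rough illustration of why mixed precision training needs care, the sketch below uses only the Python standard library to show a small value underflowing in half precision (FP16) and surviving once it is scaled up first. Loss scaling is background knowledge about mixed precision training rather than something this README describes, and the names `to_fp16`, `grad`, and `loss_scale` as well as the numeric values are hypothetical. It does not use any framework's AMP API.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical values chosen only for illustration.
grad = 1e-8          # a small gradient-like value
loss_scale = 1024.0  # a power-of-two scale factor

# Stored directly in FP16, the value is below the smallest subnormal and becomes zero.
unscaled = to_fp16(grad)

# Scaling before the FP16 round trip keeps the value representable.
scaled = to_fp16(grad * loss_scale)

# Unscaling happens in higher precision (Python floats are FP64 here).
recovered = scaled / loss_scale

print(unscaled, recovered)
```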
**TensorFloat-32 (TF32)**
TensorFloat-32 (TF32) is the new math mode in [NVIDIA A100](https://www.nvidia.com/en-us/data-center/a100/) GPUs for handling matrix math, also called tensor operations. TF32 running on Tensor Cores in A100 GPUs can provide up to 10x speedups compared to single-precision floating-point math (FP32) on Volta GPUs. TF32 is supported in the NVIDIA Ampere GPU architecture and is enabled by default.
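To give a feel for the precision involved, the following sketch emulates the TF32 number format in plain Python. It relies on the TF32 layout of 1 sign bit, 8 exponent bits, and 10 mantissa bits, which is not stated in this README, and it simply truncates the extra FP32 mantissa bits; actual Tensor Core rounding behavior may differ. The function name `truncate_to_tf32` is hypothetical.

```python
import struct


def truncate_to_tf32(x: float) -> float:
    """Emulate TF32 precision by keeping 10 of FP32's 23 mantissa bits.

    This is a simplification: the low 13 mantissa bits are truncated,
    not rounded, and no GPU is involved.
    """
    # Reinterpret the FP32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Clear the 13 least significant mantissa bits.
    bits &= 0xFFFFE000
    # Reinterpret the masked bits as an FP32 value again.
    return struct.unpack("<f", struct.pack("<I", bits))[0]


exact = 1.0 + 2.0 ** -10  # representable with a 10-bit mantissa
lost = 1.0 + 2.0 ** -11   # needs an 11th mantissa bit

print(truncate_to_tf32(exact), truncate_to_tf32(lost))
```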
**Jupyter Notebooks (NB)**
The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
## Feedback / Contributions
We're posting these examples on GitHub to better support the community, facilitate feedback, and collect and implement contributions through GitHub Issues and pull requests. We welcome all contributions!
## Known issues
In each of the network READMEs, we indicate any known issues and encourage the community to provide feedback.