Adding links to performance benchmark page

parent 3d8d878489
commit 49e23b4597
@@ -315,6 +315,8 @@ Sample result waveforms are [FP32](fastspeech/trt/samples) and [FP16](fastspeech

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -192,6 +192,8 @@ you can set `count` to `1` in the [`instance_group` section](https://docs.nvidia

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Metrics
@@ -552,6 +552,8 @@ By default:

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

To benchmark training and inference, run:
@@ -492,6 +492,8 @@ Quantized models could also be used to classify new images using the `classify.p

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -498,6 +498,8 @@ To run inference on JPEG image using pretrained weights:

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -481,6 +481,8 @@ To run inference on JPEG image using pretrained weights:

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -483,6 +483,8 @@ To run inference on JPEG image using pretrained weights:

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -325,6 +325,8 @@ we can consider that all clients are local.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Offline scenario

This table lists the common variable parameters for all performance measurements:
@@ -194,6 +194,8 @@ To process static configuration logs, `triton/scripts/process_output.sh` script

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Dynamic batching performance

The Triton Inference Server has a built-in dynamic batching mechanism that can be enabled. When enabled, the server builds inference batches from multiple incoming requests, which yields better performance than running inference on each request individually. Each request is assumed to contain a single image to be inferred; with dynamic batching enabled, the server concatenates such single-image requests into one inference batch. The upper bound on the inference batch size is set to 64. All of these parameters are configurable.
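For illustration, a minimal sketch of what such a setup might look like in a model's `config.pbtxt` (the field names follow Triton's model configuration schema; the preferred batch sizes and queue delay below are assumptions for illustration, not values taken from this repository):

```
# Cap inference batches at 64 requests.
max_batch_size: 64

# Let the server coalesce single-image requests into larger batches.
dynamic_batching {
  preferred_batch_size: [ 16, 32, 64 ]
  max_queue_delay_microseconds: 100
}
```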
@@ -195,6 +195,8 @@ To process static configuration logs, `triton/scripts/process_output.sh` script

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Dynamic batching performance

The Triton Inference Server has a built-in dynamic batching mechanism that can be enabled. When enabled, the server builds inference batches from multiple incoming requests, which yields better performance than running inference on each request individually. Each request is assumed to contain a single image to be inferred; with dynamic batching enabled, the server concatenates such single-image requests into one inference batch. The upper bound on the inference batch size is set to 64. All of these parameters are configurable.
@@ -565,6 +565,8 @@ To use the inference example script in your own code, you can call the `main` fu

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -692,6 +692,8 @@ For SQuAD, to run inference interactively on question-context pairs, use the scr

The [NVIDIA Triton Inference Server](https://github.com/NVIDIA/triton-inference-server) provides a cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or GRPC endpoint, allowing remote clients to request inferencing for any model being managed by the server. More information on how to perform inference using NVIDIA Triton Inference Server can be found in [triton/README.md](./triton/README.md).

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking
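To illustrate the client side of that HTTP endpoint, here is a minimal Python sketch using the `tritonclient` package; the model name, input name, shape, datatype, and output name below are placeholders, not the actual values from this model's Triton deployment:

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a locally running Triton server (default HTTP port 8000).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Placeholder input; real deployments use the names/shapes from the model's config.pbtxt.
data = np.zeros((1, 384), dtype=np.int32)
infer_input = httpclient.InferInput("input_ids", list(data.shape), "INT32")
infer_input.set_data_from_numpy(data)

# Request inference for a model managed by the server.
result = client.infer(model_name="bert", inputs=[infer_input])
print(result.as_numpy("logits").shape)  # output name is also a placeholder
```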
@@ -102,6 +102,8 @@ To make the machine wait until the server is initialized, and the model is ready

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

The numbers below are averages, measured on Triton on V100 32G GPU, with [static batching](https://docs.nvidia.com/deeplearning/sdk/tensorrt-inference-server-guide/docs/model_configuration.html#scheduling-and-batching).

| Format | GPUs | Batch size | Sequence length | Throughput - FP32 (sequences/sec) | Throughput - mixed precision (sequences/sec) | Throughput speedup (mixed precision/FP32) |
@@ -1113,6 +1113,8 @@ perplexity on the test dataset.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -574,6 +574,8 @@ The NVIDIA Triton Inference Server provides a cloud inferencing solution optimiz

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -192,6 +192,8 @@ For more information about `perf_client` please refer to [official documentation

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Throughput/Latency results

Throughput is measured in recommendations/second, and latency in milliseconds.
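As a hedged illustration of how such throughput/latency numbers are typically collected with `perf_client`, an invocation might look like the following; the model name, port, batch size, and concurrency range are assumptions for illustration, not the exact parameters used in this repository:

```bash
# Sweep client concurrency 1..4 over gRPC, batch size 4096,
# with a 5-second measurement window per step (values are illustrative).
perf_client -m dlrm -u localhost:8001 -i gRPC -b 4096 --concurrency-range 1:4 -p 5000
```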
@@ -379,6 +379,8 @@ The script will then:

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

#### Training performance benchmark
@@ -484,6 +484,8 @@ __Note__: The score is always the Average Precision(AP) at

- maxDets = 100

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

Benchmarking can be performed for both training and inference. Both scripts run the Mask R-CNN model using the parameters defined in `configs/e2e_mask_rcnn_R_50_FPN_1x.yaml`. You can choose whether benchmarking is performed in FP16, TF32, or FP32 by passing the precision as an argument to the benchmarking scripts.
@@ -454,6 +454,8 @@ The script will then:

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks to measure the model performance in training and inference modes.
@@ -344,6 +344,8 @@ we can consider that all clients are local.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Offline scenario

This table lists the common variable parameters for all performance measurements:
@@ -567,6 +567,8 @@ More information on how to perform inference using Triton Inference Server with

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -274,6 +274,8 @@ For more information about `perf_client`, refer to the [official documentation](

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Inference Benchmarking in Triton Inference Server

To benchmark the inference performance on a Volta, Turing, or Ampere GPU, run `bash triton/scripts/execute_all_perf_runs.sh` according to [Quick-Start-Guide](#quick-start-guide) Step 7.
@@ -532,6 +532,8 @@ More examples are presented on the website with [samples](https://fastpitch.gith

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -342,6 +342,8 @@ we can consider that all clients are local.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Offline scenario
@@ -524,6 +524,8 @@ python inference.py --tacotron2 <Tacotron2_checkpoint> --waveglow <WaveGlow_chec

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -160,6 +160,8 @@ By default the `./build_trtis.sh` script builds the TensorRT engines with FP16 m

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

The following tables show inference statistics for the Tacotron2 and WaveGlow text-to-speech system. The tables include average latency, latency standard deviation,
@@ -932,6 +932,8 @@ To view all available options for inference, run `python3 translate.py --help`.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -364,6 +364,8 @@ sacrebleu -t wmt14/full -l en-de --echo src | python inference.py --buffer-size

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -451,6 +451,8 @@ The optional `--xla` and `--amp` flags control XLA and AMP during inference.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -420,6 +420,8 @@ The optional `--xla` and `--amp` flags control XLA and AMP during inference.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -415,6 +415,8 @@ The optional `--xla` and `--amp` flags control XLA and AMP during inference.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -345,6 +345,8 @@ we can consider that all clients are local.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Offline scenario

This table lists the common variable parameters for all performance measurements:
@@ -394,6 +394,8 @@ For information about:

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -679,6 +679,8 @@ More information on how to download a biomedical corpus and pre-train as well as

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -396,6 +396,8 @@ This script computes F1, Precision and Recall scores. Mount point of `/results`

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -180,6 +180,8 @@ For more information about `perf_client`, refer to the [official documentation](

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Latency vs Throughput for TensorRT Engine

Performance numbers for BERT Large, sequence length=384 are obtained from [experiments](https://github.com/NVIDIA/TensorRT/tree/release/7.1/demo/BERT#inference-performance-nvidia-a100-40gb) on NVIDIA A100 with 1x A100 40G GPUs. Throughput is measured in samples/second, and latency in milliseconds.
@@ -756,6 +756,8 @@ which contains information about loss, perplexity and execution performance on t

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -379,6 +379,8 @@ checkpoint specified by the `--checkpoint-dir` file.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -336,6 +336,8 @@ This will generate a user with a collection of random items that they interacted

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -373,6 +373,8 @@ Checkpoints are stored with every evaluation at the `--model_dir` location.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training mode.
@@ -423,6 +423,8 @@ The script will then:

* Save the resulting masks in the `numpy` format in the `--model_dir` directory.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking
@@ -373,6 +373,8 @@ but rather in an approximate fashion.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -422,6 +422,8 @@ The script will then:

* Save the resulting binary masks in a `TIF` format.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking
@@ -382,6 +382,8 @@ This script should produce the prediction results over a set of masks which will

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

Starting with cuDNN 7.6.2, our containers include enhanced support for 3D convolutions in mixed precision. This further accelerates both training and inference, while preserving the reduced memory footprint characteristic of mixed-precision training.
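As an illustration of the kind of operation this cuDNN enhancement accelerates, here is a minimal PyTorch sketch of a 3D convolution under mixed precision (the framework choice and layer sizes are assumptions for illustration, not this repository's training code):

```python
import torch

# Channel counts that are multiples of 8 map well onto Tensor Cores.
conv = torch.nn.Conv3d(in_channels=8, out_channels=32,
                       kernel_size=3, padding=1).cuda()
x = torch.randn(2, 8, 64, 64, 64, device="cuda")

# Under autocast, the convolution runs in FP16 via cuDNN Tensor Core kernels.
with torch.cuda.amp.autocast():
    y = conv(x)

print(y.dtype)  # torch.float16
```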
@@ -497,6 +497,8 @@ To view all available options for translation, run `python nmt.py --help`.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -427,6 +427,8 @@ Run:

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -626,6 +626,8 @@ I0424 23:59:50.031537 139905798453056 run_squad.py:302] ------------------------

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking
@@ -733,6 +733,8 @@ To run inference interactively on question-context pairs, use the script `run_in

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking
@@ -481,6 +481,8 @@ of samples processed per second. We use mixed precision training with static los

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -460,6 +460,8 @@ After the whole evaluation, the total evaluation metrics are logged, loss, bina

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training mode.
@@ -405,6 +405,8 @@ The results are displayed in the console and are saved in `./mrcnn-dll.json` (ca

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking

The following section shows how to run benchmarks measuring the model performance in training and inference modes.
@@ -435,6 +435,8 @@ The script will then:

* Save the resulting binary masks in a `TIF` format.

## Performance

The performance measurements in this document were conducted at the time of publication and may not reflect the performance achieved from NVIDIA’s latest software release. For the most up-to-date performance measurements, go to [NVIDIA Data Center Deep Learning Product Performance](https://developer.nvidia.com/deep-learning-performance-training-inference).

### Benchmarking