switch ordering in readme
parent 890fc1c143
commit 71fea240de
1 changed file with 26 additions and 34 deletions
@@ -1081,7 +1081,6 @@ BERT BASE FP16
 | 384 | 4 | 318.45 | 12.56 | 12.65 | 12.76 | 13.36 |
 | 384 | 8 | 380.14 | 21.05 | 21.1 | 21.25 | 21.83 |
-
 
 BERT BASE FP32
 
 | Sequence Length | Batch Size | Throughput-Average(sent/sec) | Latency-Average(ms) | Latency-90%(ms) | Latency-95%(ms) | Latency-99%(ms) |
@@ -1096,7 +1095,6 @@ BERT BASE FP32
 | 384 | 8 | 139.75 | 57.25 | 57.74 | 58.08 | 59.53 |
-
 
 
 To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above.
 
 ##### Inference performance: NVIDIA DGX-2 (1x V100 32G)
@@ -1126,7 +1124,6 @@ BERT LARGE FP16
 | 384 | 4 | 121.04 | 33.05 | 33.08 | 33.31 | 34.97 |
 | 384 | 8 | 142.03 | 56.33 | 56.46 | 57.49 | 59.85 |
-
 
 BERT LARGE FP32
 
 | Sequence Length | Batch Size | Throughput-Average(sent/sec) | Latency-Average(ms) | Latency-90%(ms) | Latency-95%(ms) | Latency-99%(ms) |
@@ -1140,7 +1137,6 @@ BERT LARGE FP32
 | 384 | 4 | 42.79 | 93.48 | 94.73 | 96.52 | 104.37 |
 | 384 | 8 | 45.91 | 174.24 | 175.34 | 176.59 | 183.76 |
-
 
 BERT BASE FP16
 
 | Sequence Length | Batch Size | Throughput-Average(sent/sec) | Latency-Average(ms) | Latency-90%(ms) | Latency-95%(ms) | Latency-99%(ms) |
@@ -1154,8 +1150,6 @@ BERT BASE FP16
 | 384 | 4 | 318.45 | 12.56 | 12.65 | 12.76 | 13.36 |
 | 384 | 8 | 380.14 | 21.05 | 21.1 | 21.25 | 21.83 |
-
-
 
 BERT BASE FP32
 
 | Sequence Length | Batch Size | Throughput-Average(sent/sec) | Latency-Average(ms) | Latency-90%(ms) | Latency-95%(ms) | Latency-99%(ms) |
@@ -1176,6 +1170,32 @@ BERT BASE FP32
 
 Our results were obtained by running the `scripts/finetune_inference_benchmark.sh` script in the TensorFlow 19.06-py3 NGC container on an NVIDIA Tesla T4 (1x T4 16G GPU). Performance numbers (throughput in sentences per second and latency in milliseconds) were averaged over 1024 iterations. Latency is the time taken to process one batch when batches are fed to the model one after another, i.e., with no pipelining.
 
+BERT LARGE FP16
+
+| Sequence Length | Batch Size | Throughput-Average(sent/sec) | Latency-Average(ms) | Latency-90%(ms) | Latency-95%(ms) | Latency-99%(ms) |
+|-----------------|------------|------------------------------|---------------------|-----------------|-----------------|-----------------|
+| 128 | 1 | 53.56 | 18.67 | 20.22 | 20.31 | 20.49 |
+| 128 | 2 | 95.39 | 20.97 | 22.86 | 23.15 | 23.73 |
+| 128 | 4 | 137.44 | 29.1 | 30.34 | 30.62 | 31.5 |
+| 128 | 8 | 166.19 | 48.14 | 49.38 | 49.73 | 50.86 |
+| 384 | 1 | 34.28 | 29.17 | 30.58 | 30.77 | 31.28 |
+| 384 | 2 | 41.89 | 47.74 | 49.05 | 49.34 | 50 |
+| 384 | 4 | 47.15 | 84.83 | 86.79 | 87.41 | 88.73 |
+| 384 | 8 | 50.28 | 159.11 | 161.75 | 162.85 | 165.72 |
+
+BERT LARGE FP32
+
+| Sequence Length | Batch Size | Throughput-Average(sent/sec) | Latency-Average(ms) | Latency-90%(ms) | Latency-95%(ms) | Latency-99%(ms) |
+|-----------------|------------|------------------------------|---------------------|-----------------|-----------------|-----------------|
+| 128 | 1 | 40.34 | 24.79 | 26.97 | 27.38 | 28.21 |
+| 128 | 2 | 45.17 | 44.27 | 46.01 | 46.6 | 47.68 |
+| 128 | 4 | 47.39 | 84.41 | 86.31 | 86.92 | 88.14 |
+| 128 | 8 | 46.98 | 170.29 | 173.35 | 174.15 | 175.48 |
+| 384 | 1 | 14.07 | 71.06 | 73 | 73.42 | 73.99 |
+| 384 | 2 | 14.91 | 134.17 | 136.72 | 137.51 | 138.66 |
+| 384 | 4 | 14.44 | 277.03 | 281.89 | 282.63 | 284.41 |
+| 384 | 8 | 14.95 | 534.94 | 540.45 | 542.32 | 544.75 |
+
 BERT BASE FP16
 
 | Sequence Length | Batch Size | Throughput-Average(sent/sec) | Latency-Average(ms) | Latency-90%(ms) | Latency-95%(ms) | Latency-99%(ms) |
@@ -1203,34 +1223,6 @@ BERT BASE FP32
 | 384 | 8 | 48.04 | 166.51 | 169.9 | 170.84 | 172.6 |
 
 
-BERT LARGE FP16
-
-| Sequence Length | Batch Size | Throughput-Average(sent/sec) | Latency-Average(ms) | Latency-90%(ms) | Latency-95%(ms) | Latency-99%(ms) |
-|-----------------|------------|------------------------------|---------------------|-----------------|-----------------|-----------------|
-| 128 | 1 | 53.56 | 18.67 | 20.22 | 20.31 | 20.49 |
-| 128 | 2 | 95.39 | 20.97 | 22.86 | 23.15 | 23.73 |
-| 128 | 4 | 137.44 | 29.1 | 30.34 | 30.62 | 31.5 |
-| 128 | 8 | 166.19 | 48.14 | 49.38 | 49.73 | 50.86 |
-| 384 | 1 | 34.28 | 29.17 | 30.58 | 30.77 | 31.28 |
-| 384 | 2 | 41.89 | 47.74 | 49.05 | 49.34 | 50 |
-| 384 | 4 | 47.15 | 84.83 | 86.79 | 87.41 | 88.73 |
-| 384 | 8 | 50.28 | 159.11 | 161.75 | 162.85 | 165.72 |
-
-BERT LARGE FP32
-
-| Sequence Length | Batch Size | Throughput-Average(sent/sec) | Latency-Average(ms) | Latency-90%(ms) | Latency-95%(ms) | Latency-99%(ms) |
-|-----------------|------------|------------------------------|---------------------|-----------------|-----------------|-----------------|
-| 128 | 1 | 40.34 | 24.79 | 26.97 | 27.38 | 28.21 |
-| 128 | 2 | 45.17 | 44.27 | 46.01 | 46.6 | 47.68 |
-| 128 | 4 | 47.39 | 84.41 | 86.31 | 86.92 | 88.14 |
-| 128 | 8 | 46.98 | 170.29 | 173.35 | 174.15 | 175.48 |
-| 384 | 1 | 14.07 | 71.06 | 73 | 73.42 | 73.99 |
-| 384 | 2 | 14.91 | 134.17 | 136.72 | 137.51 | 138.66 |
-| 384 | 4 | 14.44 | 277.03 | 281.89 | 282.63 | 284.41 |
-| 384 | 8 | 14.95 | 534.94 | 540.45 | 542.32 | 544.75 |
-
-
-
 To achieve these same results, follow the [Quick Start Guide](#quick-start-guide) outlined above.
 
 ## Release notes
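The T4 paragraph in this diff describes the measurement only in words: batches are fed to the model one at a time with no pipelining, per-batch latency is recorded, and the numbers are averaged over 1024 iterations. The sketch below shows one way such columns could be derived from per-batch timings. It is not the code in `scripts/finetune_inference_benchmark.sh`; the function names, the `run_batch` callable, the nearest-rank percentile method, and the assumption that throughput equals batch size divided by average latency are all illustrative.

```python
import math
import time
from typing import Callable, Dict, List


def percentile(sorted_ms: List[float], pct: float) -> float:
    """Nearest-rank percentile of an ascending list (assumed method)."""
    rank = math.ceil(pct * len(sorted_ms) / 100.0)
    return sorted_ms[max(rank, 1) - 1]


def summarize(latencies_ms: List[float], batch_size: int) -> Dict[str, float]:
    """Turn per-batch latencies into the columns used in the tables."""
    ordered = sorted(latencies_ms)
    avg_ms = sum(ordered) / len(ordered)
    return {
        # Assumption: with no pipelining, sentences/sec = batch size / average latency.
        "throughput_avg": batch_size * 1000.0 / avg_ms,
        "latency_avg_ms": avg_ms,
        "latency_p90_ms": percentile(ordered, 90),
        "latency_p95_ms": percentile(ordered, 95),
        "latency_p99_ms": percentile(ordered, 99),
    }


def benchmark(run_batch: Callable[[], None], batch_size: int,
              iterations: int = 1024) -> Dict[str, float]:
    """Time `run_batch` (hypothetical: runs one batch to completion) sequentially."""
    latencies_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_batch()  # each batch finishes before the next one starts
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return summarize(latencies_ms, batch_size)
```

Under that throughput assumption, the published rows are consistent to within rounding: for BERT LARGE FP16 on T4 at sequence length 384 and batch size 8, 8 × 1000 / 159.11 ≈ 50.28 sentences per second, which matches the table.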