# Convolutional Networks for Image Classification in PyTorch

In this repository you will find implementations of various image classification models.

## Models

The following table provides links to where you can find additional information on each model:

| Model | Link |
| --- | --- |
| resnet50 | README |
| resnext101-32x4d | README |
| se-resnext101-32x4d | README |

## Validation accuracy results

Our results were obtained by running the applicable training scripts in the pytorch-20.06 NGC container on an NVIDIA DGX-1 with 8x V100 16GB GPUs. The specific training script that was run is documented in the corresponding model's README.

The following table shows the validation accuracy results of the three classification models side by side.

| arch | AMP Top1 | AMP Top5 | FP32 Top1 | FP32 Top5 |
| --- | --- | --- | --- | --- |
| resnet50 | 78.46 | 94.15 | 78.50 | 94.11 |
| resnext101-32x4d | 80.08 | 94.89 | 80.14 | 95.02 |
| se-resnext101-32x4d | 81.01 | 95.52 | 81.12 | 95.54 |
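
The Top1/Top5 columns count a prediction as correct when the true label is the highest-scoring class (Top1) or among the five highest-scoring classes (Top5). A minimal framework-agnostic sketch of the metric (the helper name is illustrative, not from this repository):

```python
def topk_accuracy(scores, labels, ks=(1, 5)):
    """Fraction of samples whose true label is among the top-k scored classes.

    scores: list of per-class score lists (one list per sample)
    labels: list of true class indices
    """
    hits = {k: 0 for k in ks}
    for row, label in zip(scores, labels):
        # rank class indices by descending score
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        for k in ks:
            if label in ranked[:k]:
                hits[k] += 1
    n = len(labels)
    return {k: hits[k] / n for k in ks}


# toy example: 2 samples, 3 classes
print(topk_accuracy([[0.1, 0.5, 0.4], [0.7, 0.2, 0.1]], [1, 2]))
# -> {1: 0.5, 5: 1.0}
```

The repository's own evaluation uses the equivalent `torch.topk`-based computation on GPU tensors; the logic is the same.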

## Training performance results

### Training performance: NVIDIA DGX A100 (8x A100 40GB)

Our results were obtained by running the applicable training scripts in the pytorch-20.06 NGC container on an NVIDIA DGX A100 with 8x A100 40GB GPUs. Performance numbers (in images per second) were averaged over an entire training epoch. The specific training script that was run is documented in the corresponding model's README.

The following table shows the training performance results of the three classification models side by side.

| arch | Mixed Precision | TF32 | Mixed Precision Speedup |
| --- | --- | --- | --- |
| resnet50 | 9488.39 img/s | 5322.10 img/s | 1.78x |
| resnext101-32x4d | 6758.98 img/s | 2353.25 img/s | 2.87x |
| se-resnext101-32x4d | 4670.72 img/s | 2011.21 img/s | 2.32x |

ResNeXt and SE-ResNeXt use the NHWC data layout when training with mixed precision, which improves performance. We are currently working on adding NHWC support for ResNet.
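
The NHWC note above can be sketched in a few lines of PyTorch. This is an illustrative minimal example, not the repository's training script (which additionally uses CUDA autocast with a `GradScaler` on GPU; here CPU autocast with bfloat16 stands in so the snippet runs anywhere):

```python
import torch

# NHWC (channels-last) memory format: convert both the model's weights
# and the input batch, so convolutions use the channels-last kernels
model = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
model = model.to(memory_format=torch.channels_last)

x = torch.randn(4, 3, 32, 32).to(memory_format=torch.channels_last)

# autocast runs eligible ops in reduced precision; on Ampere GPUs this
# would be torch.autocast("cuda") wrapped around forward + loss
with torch.autocast("cpu", dtype=torch.bfloat16):
    y = model(x)

print(y.shape)  # (4, 8, 32, 32): same NCHW shape, NHWC layout underneath
```

Note that `channels_last` changes only the memory layout, not the logical tensor shape, so the rest of the training code is unchanged.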

### Training performance: NVIDIA DGX-1 16G (8x V100 16GB)

Our results were obtained by running the applicable training scripts in the pytorch-20.06 NGC container on an NVIDIA DGX-1 with 8x V100 16GB GPUs. Performance numbers (in images per second) were averaged over an entire training epoch. The specific training script that was run is documented in the corresponding model's README.

The following table shows the training performance results of the three classification models side by side.

| arch | Mixed Precision | FP32 | Mixed Precision Speedup |
| --- | --- | --- | --- |
| resnet50 | 6565.61 img/s | 2869.19 img/s | 2.29x |
| resnext101-32x4d | 3922.74 img/s | 1136.30 img/s | 3.45x |
| se-resnext101-32x4d | 2651.13 img/s | 982.78 img/s | 2.70x |

ResNeXt and SE-ResNeXt use the NHWC data layout when training with mixed precision, which improves performance. We are currently working on adding NHWC support for ResNet.

## Model Comparison

### Accuracy vs FLOPS

[Figure: Accuracy vs FLOPS]

The plot shows the relationship between the number of floating-point operations needed to compute a forward pass on a 224x224 image and the validation accuracy of the implemented models. Dot size indicates the number of trainable parameters.

### Latency vs Throughput on different batch sizes

[Figure: Latency vs Throughput]

The plot shows the relationship between inference latency, throughput, and batch size for the implemented models.
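
Latency and throughput trade off against each other through the batch size: throughput is roughly batch size divided by per-batch latency, so larger batches raise throughput at the cost of latency. A rough, hypothetical measurement harness illustrating how such a sweep can be collected (with a real model on GPU you would also synchronize the device before reading the clock):

```python
import time

def latency_throughput_sweep(run_batch, batch_sizes, iters=10):
    """Measure average per-batch latency and derived throughput.

    run_batch(batch_size) should perform one inference pass; timings are
    averaged over `iters` runs per batch size, after one warm-up run.
    """
    results = {}
    for bs in batch_sizes:
        run_batch(bs)  # warm-up (allocations, autotuning, caches)
        start = time.perf_counter()
        for _ in range(iters):
            run_batch(bs)
        latency = (time.perf_counter() - start) / iters  # seconds per batch
        results[bs] = {
            "latency_s": latency,
            "throughput_img_s": bs / latency,
        }
    return results


# stand-in workload; replace with a forward pass through a real model
def fake_inference(batch_size):
    return sum(range(batch_size * 1000))

for bs, r in latency_throughput_sweep(fake_inference, [1, 8, 32]).items():
    print(bs, r)
```

Plotting `latency_s` against `throughput_img_s` across batch sizes reproduces the shape of the curve above: throughput saturates while latency keeps growing.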