DeepLearningExamples/TensorFlow/docs/amp/notebook_v1.14/auto_mixed_precision_demo_cifar10.ipynb

{"nbformat":4,"nbformat_minor":0,"metadata":{"accelerator":"GPU","colab":{"name":"auto_mixed_precision_demo_cifar10.ipynb","version":"0.3.2","provenance":[],"collapsed_sections":[]},"jupytext":{"formats":"ipynb,py"},"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.6.8"}},"cells":[{"cell_type":"code","metadata":{"colab_type":"code","id":"tuOe1ymfHZPu","colab":{}},"source":["# Copyright 2019 NVIDIA Corporation. All Rights Reserved.\n","#\n","# Licensed under the Apache License, Version 2.0 (the \"License\");\n","# you may not use this file except in compliance with the License.\n","# You may obtain a copy of the License at\n","#\n","# http://www.apache.org/licenses/LICENSE-2.0\n","#\n","# Unless required by applicable law or agreed to in writing, software\n","# distributed under the License is distributed on an \"AS IS\" BASIS,\n","# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n","# See the License for the specific language governing permissions and\n","# limitations under the License.\n","# =============================================================================="],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"colab_type":"text","id":"MfBg1C5NB3X0"},"source":["<img src=\"https://upload.wikimedia.org/wikipedia/en/thumb/6/6d/Nvidia_image_logo.svg/200px-Nvidia_image_logo.svg.png\" width=\"90px\" align=\"right\" style=\"margin-right: 0px;\">\n","\n","# Mixed Precision Training of CNN"]},{"cell_type":"markdown","metadata":{"colab_type":"text","id":"xHxb-dlhMIzW"},"source":["## Overview\n","\n","In this example, we will speed-up the training of a simple CNN with mixed precision to perform image classification on the CIFAR10 dataset.\n","\n","By using mixed precision, we can reduce the training time without a significant impact on classification accuracy. For example, using the NVIDIA Tesla T4 GPU on Google Colab, we can reduce the training time (using the same model and batch size) over 10 epochs from about 900 seconds (FP32) to less than 600 seconds with mixed precision, without sacrificing classification accuracy.\n","\n","### How mixed precision works\n","\n","**Mixed precision** is the use of both float16 and float32 data types when training a model.\n","\n","Performing arithmetic operations in float16 takes advantage of the performance gains of using specialized processing units such as the Tensor cores. 
Due to the smaller representable range of float16, performing the entire training with float16 data type can result in underflow of the gradients, leading to convergence or model quality issues.\n","\n","However, *performing only select arithmetic operations* in float16 results in performance gains when using compatible hardware accelerators, decreasing training time and reducing memory usage, typically without sacrificing model performance.\n","\n","To learn more about mixed precision and how it works:\n","\n","* [Overview of Automatic Mixed Precision for Deep Learning](https://developer.nvidia.com/automatic-mixed-precision)\n","* [NVIDIA Mixed Precision Training Documentation](https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html)\n","* [NVIDIA Deep Learning Performance Guide](https://docs.nvidia.com/deeplearning/sdk/dl-performance-guide/index.html)\n","* [Information about NVIDIA Tensor Cores](https://developer.nvidia.com/tensor-cores)\n","* [Post on TensorFlow blog explaining Automatic Mixed Precision](https://medium.com/tensorflow/automatic-mixed-precision-in-tensorflow-for-faster-ai-training-on-nvidia-gpus-6033234b2540)\n","\n","### TensorFlow Automatic Mixed Precision API\n","\n","The method presented in this notebook is the API used in TensorFlow 1.14 and newer: `tf.train.experimental.enable_mixed_precision_graph_rewrite()`. \n","\n","This allows you to switch to using mixed precision by simply wrappin
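
Below is a minimal sketch of how this graph rewrite is typically enabled in TensorFlow 1.14; the optimizer type and hyperparameters are illustrative placeholders, not the settings used later in this notebook.

```python
import tensorflow as tf

# Build an optimizer as usual (optimizer choice and hyperparameters here are
# illustrative placeholders, not the notebook's actual training settings).
opt = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9)

# Wrap the optimizer to enable the automatic mixed precision graph rewrite:
# eligible ops are cast to float16, and dynamic loss scaling is applied to
# the gradients to avoid underflow.
opt = tf.train.experimental.enable_mixed_precision_graph_rewrite(opt)

# The returned optimizer is used exactly like the original one, e.g.:
# train_op = opt.minimize(loss)
```

The rest of the training code stays unchanged, which is what makes this approach "automatic": the rewrite decides per-op which precision to use, so no manual casts are needed in the model definition.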