# Licensed under the Apache License, Version 2.0 (the "License")
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and limitations under the License.
BERT Question Answering Inference/Fine-Tuning with Mixed Precision
1. Overview
Bidirectional Encoder Representations from Transformers (BERT) is a method of pre-training language representations that obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks.
The original paper can be found here: https://arxiv.org/abs/1810.04805.
NVIDIA's BERT 19.10 is an optimized version of Google's official implementation, leveraging mixed precision arithmetic and Tensor Cores on V100 GPUs for faster training times while maintaining target accuracy.
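To make the motivation for mixed precision concrete, here is a minimal sketch of why half precision (FP16) is usually combined with a higher-precision copy of the weights and with loss scaling. It is an illustration only, not the code used in this repository: the helper `to_fp16` and all numeric values are assumptions chosen for the example, and Python's double-precision float stands in for the FP32 master copy.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Illustrative values: a weight near 1.0 and a small update.
weight, update = 1.0, 1e-4

# Pure FP16: representable values near 1.0 are 2**-10 apart,
# so adding 1e-4 rounds straight back to 1.0 and the update is lost.
fp16_result = to_fp16(to_fp16(weight) + to_fp16(update))

# Mixed precision: the update is produced in FP16 but accumulated into a
# higher-precision master copy, so it is preserved.
master_result = weight + to_fp16(update)

# Loss scaling: a gradient of 1e-8 underflows to zero in FP16, but it
# survives if it is scaled up first and unscaled in higher precision.
tiny_grad = 1e-8
loss_scale = 1024.0  # illustrative scale factor
unscaled = to_fp16(tiny_grad)
rescued = to_fp16(tiny_grad * loss_scale) / loss_scale

print(fp16_result, master_result, unscaled, rescued)
```

In a real training run the framework handles both steps; the sketch only shows the rounding behaviour that makes them necessary.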
1.a Learning objectives
This repository contains multiple notebooks which demonstrate:
- Inference on a QA task with the BERT Large model
- Downloading and using pretrained NVIDIA BERT models
- Fine-tuning on the SQuAD 2.0 dataset
- Use of mixed precision for inference and fine-tuning
Here is a short description of each relevant file:
- bert_squad_tf_inference.ipynb : BERT Q&A Inference with TF Checkpoint model
- bert_squad_tf_finetuning.ipynb : BERT Fine-Tuning on SQuAD dataset
2. Quick Start Guide
2.a Build the BERT TensorFlow NGC container:
To run the notebooks, first build the BERT TensorFlow container by running the following command from the main directory of this repository:
docker build . --rm -t bert
2.b Dataset
We need to download the vocabulary and the bert_config files:
python3 /workspace/bert/data/bertPrep.py --action download --dataset google_pretrained_weights # Includes vocab
The SQuAD dataset is needed only for fine-tuning. To download it:
python3 /workspace/bert/data/bertPrep.py --action download --dataset squad
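For orientation, the following sketch shows the JSON layout used by SQuAD 2.0 files and how to walk through their questions. The sample record, its IDs, and the helper `iter_questions` are hypothetical and written for this example; they are not taken from the downloaded dataset or from this repository's code.

```python
import json

# Hypothetical one-paragraph example in the SQuAD 2.0 JSON layout.
squad_like = {
    "version": "v2.0",
    "data": [{
        "title": "BERT",
        "paragraphs": [{
            "context": "BERT is a method of pre-training language representations.",
            "qas": [
                {"id": "q1", "question": "What is BERT?", "is_impossible": False,
                 "answers": [{"text": "a method of pre-training language representations",
                              "answer_start": 8}]},
                # SQuAD 2.0 adds unanswerable questions, flagged with is_impossible.
                {"id": "q2", "question": "Who wrote Hamlet?", "is_impossible": True,
                 "answers": []},
            ],
        }],
    }],
}

def iter_questions(dataset):
    """Yield (id, question, context, answers, is_impossible) for every question."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                yield (qa["id"], qa["question"], context,
                       qa.get("answers", []), qa.get("is_impossible", False))

# Round-trip through JSON to mimic reading the data from a file.
loaded = json.loads(json.dumps(squad_like))
for qid, question, context, answers, impossible in iter_questions(loaded):
    print(qid, question, "unanswerable" if impossible else answers[0]["text"])
```

Each answer's `answer_start` is a character offset into `context`, so `context[start:start + len(text)]` recovers the answer text.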
2.c Start the NGC container to run inference:
Once the image is built, run the container with the --publish 0.0.0.0:8888:8888 option to publish Jupyter's port 8888 to the host machine at port 8888 over all network interfaces (0.0.0.0):
nvidia-docker run \
-v $PWD:/workspace/bert \
-v $PWD/results:/results \
--shm-size=1g \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
--publish 0.0.0.0:8888:8888 \
-it bert:latest bash
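If port 8888 is already in use on the host, only the host side of the --publish mapping needs to change. As a rough sketch, the launch command above can be assembled programmatically with the host port as a parameter. The function `build_run_command` and its arguments are hypothetical helpers for this example; the flags themselves are copied from the command above.

```python
import os
import shlex

def build_run_command(host_port=8888, image="bert:latest", workdir=None):
    """Return the container launch command as an argument list."""
    workdir = workdir or os.getcwd()  # plays the role of $PWD
    return [
        "nvidia-docker", "run",
        "-v", f"{workdir}:/workspace/bert",
        "-v", f"{workdir}/results:/results",
        "--shm-size=1g",
        "--ulimit", "memlock=-1",
        "--ulimit", "stack=67108864",
        # host_ip:host_port:container_port; Jupyter listens on 8888 inside.
        "--publish", f"0.0.0.0:{host_port}:8888",
        "-it", image, "bash",
    ]

command = build_run_command()
print(shlex.join(command))  # paste into a shell, or pass the list to subprocess.run
```

With a different host port, for example `build_run_command(host_port=9999)`, the notebook would be reached at port 9999 on the host while Jupyter still listens on 8888 inside the container.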
Then start Jupyter with the following command from within the BERT TensorFlow container, under /workspace/bert:
jupyter notebook --ip=0.0.0.0 --allow-root
Then point a web browser at the IP address or hostname of the host machine at port 8888:
http://[host machine]:8888
Use the token listed in the output of the jupyter command to log in, for example:
http://[host machine]:8888/?token=aae96ae9387cd28151868fee318c3b3581a2d794f3b25c6b
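When Jupyter runs inside the container, the URL it prints typically refers to an address local to the container, so the token has to be combined with the host machine's name. The sketch below pulls the token out of a line of Jupyter output and rebuilds the URL. The helpers `extract_token` and `notebook_url`, the host name `gpu-host`, and the sample log line are assumptions made for this example.

```python
import re
from urllib.parse import urlencode

def extract_token(log_text):
    """Return the first token=... value found in Jupyter's output, or None."""
    match = re.search(r"[?&]token=([0-9a-f]+)", log_text)
    return match.group(1) if match else None

def notebook_url(host, token, port=8888):
    """Build the login URL for a notebook server published on host:port."""
    return f"http://{host}:{port}/?{urlencode({'token': token})}"

# Sample line in the style of Jupyter's startup output (illustrative).
sample_log = "    http://127.0.0.1:8888/?token=aae96ae9387cd28151868fee318c3b3581a2d794f3b25c6b"

token = extract_token(sample_log)
print(notebook_url("gpu-host", token))
```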