1. Install OpenNMT-tf with pip instead of cloning the OpenNMT-tf repo.

This commit is contained in:
bhsueh 2020-03-06 01:41:43 +00:00
parent bd89bca344
commit 9560a304fa
4 changed files with 29 additions and 22 deletions

View file

@@ -2,6 +2,3 @@
path = sample/tensorflow_bert/bert
url = https://github.com/google-research/bert.git
[submodule "OpenNMT-tf"]
path = OpenNMT-tf
url = https://github.com/OpenNMT/OpenNMT-tf
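With the OpenNMT-tf submodule removed, the dependency is installed from PyPI instead of being cloned, as the updated `download_model_data.sh` below now does. A minimal sketch of the replacement step:

```bash
# The OpenNMT-tf repo is no longer tracked as a submodule;
# install the v1 package from PyPI instead (same pin as in download_model_data.sh).
pip install opennmt-tf==1.25.1
```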

View file

@@ -53,12 +53,21 @@ set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -gencode=arch=compute_${SM},code=\\\"s
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DWMMA")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -DWMMA")
endif()
message("-- Assign GPU architecture (sm=${SM})")
else()
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -gencode=arch=compute_60,code=\\\"sm_60,compute_60\\\" -rdc=true")
message("-- Unknown or unsupported GPU architecture (set sm=60)")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} \
-gencode=arch=compute_60,code=\\\"sm_60,compute_60\\\" \
-gencode=arch=compute_61,code=\\\"sm_61,compute_61\\\" \
-gencode=arch=compute_70,code=\\\"sm_70,compute_70\\\" \
-gencode=arch=compute_75,code=\\\"sm_75,compute_75\\\" \
-rdc=true")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -DWMMA")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DWMMA")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -DWMMA")
message("-- Assign GPU architecture (sm=60,61,70,75)")
endif()
set(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} -Wall -O0")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -Wall -O0")
set(CMAKE_CUDA_FLAGS_DEBUG "${CMAKE_CUDA_FLAGS_DEBUG} -O0 -G -Xcompiler -Wall")
@@ -118,7 +127,7 @@ if(BUILD_TF)
COMMAND cp ${PROJECT_SOURCE_DIR}/sample/tensorflow/*.py ${PROJECT_SOURCE_DIR}/build/
COMMAND cp ${PROJECT_SOURCE_DIR}/sample/tensorflow/utils ${PROJECT_SOURCE_DIR}/build/ -r
COMMAND cp ${PROJECT_SOURCE_DIR}/sample/tensorflow/scripts ${PROJECT_SOURCE_DIR}/build/ -r
COMMAND cp ${PROJECT_SOURCE_DIR}/sample/tensorflow_bert ${PROJECT_SOURCE_DIR}/build/ -r
)
# COMMAND cp ${PROJECT_SOURCE_DIR}/sample/tensorflow_bert ${PROJECT_SOURCE_DIR}/build/ -r
)
endif()
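The changes above make a build configured without an explicit `SM` target sm 60, 61, 70 and 75 instead of only sm 60. A minimal sketch of configuring the build either way, assuming an out-of-source build and using only the `SM` and `BUILD_TF` options visible in these hunks:

```bash
mkdir -p build && cd build

# Pin a single GPU architecture, e.g. sm 70:
cmake -DSM=70 -DBUILD_TF=ON ..

# Or omit SM to fall back to the new multi-architecture default (sm=60,61,70,75):
# cmake -DBUILD_TF=ON ..

make -j
```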

View file

@@ -221,7 +221,7 @@ python encoder_sample.py \
--data_type fp16 \
--test_time 1
```
<!--
3. Run the FasterTransformer in BERT.
The following script demonstrates how to integrate the FasterTransformer into a BERT model.
@@ -246,7 +246,7 @@ python tensorflow_bert/profile_transformer_inference.py \
--xla=false \
--floatx=float32
```
-->
### Execute the decoding demos
1. Generate the `decoding_gemm_config.in` file.
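The command for this step lies outside the hunk shown here; as a hedged sketch, the `decoding_gemm` binary invoked later in the translation step writes `decoding_gemm_config.in`, so the step would look roughly like the following (argument values are illustrative, copied from that later step):

```bash
# Same invocation as used later in the translation step; the argument values
# are illustrative and should match your batch and model configuration.
./bin/decoding_gemm 1 4 8 64 32001 100 512 0
```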
@@ -639,18 +639,19 @@ python encoder_decoding_sample.py \
#### Translation process
For translation, we need to use some tools and libraries of OpenNMT-tf to preprocess the source sentence and build the encoder.
Because the encoder of FasterTransformer is based on BERT, it cannot restore the pretrained model. So, it is necessary to use the encoder of OpenNMT-tf.
This subsection demonstrates how to use FasterTransformer decoding to translate a sentence. We use the pretrained model and testing data from [OpenNMT-tf](https://opennmt.net/Models-tf/), which translates from English to German.
1. Prepare the pretrained model and the data for translation.
Because the FasterTransformer Encoder is based on BERT, we cannot restore the OpenNMT encoder model into the FasterTransformer Encoder. Therefore, we use OpenNMT-tf to build the encoder and preprocess the source sentence.
Another problem is that the implementations of the FasterTransformer Decoder and the OpenNMT-tf decoder differ slightly. For example, the OpenNMT-tf decoder uses one convolution to compute the query, key and value in the masked multi-head attention, while the FasterTransformer Decoder splits them into three GEMMs. The tool `utils/dump_model.py` converts the pretrained model to fit the model structure of the FasterTransformer Decoder.
`download_model_data.sh` installs OpenNMT-tf v1, downloads the pretrained model into the `translation` folder, and converts the model.
```bash
bash utils/translation/download_model_data.sh
```
`download_model_data.sh` prepares the `opennmt` folder, which contains the input embedding and the encoder, downloads the pretrained model, and downloads the test data into the `translation` folder. This is because the encoder of FasterTransformer is based on BERT rather than OpenNMT-tf, so we cannot restore the pretrained model of OpenNMT-tf for the encoder. Therefore, translation requires the encoder of OpenNMT-tf.
Another problem is that the implementations of our tf_decoding and OpenNMT-tf decoding differ slightly. For example, OpenNMT-tf uses one GEMM to compute the query, key and values at once, while tf_decoding splits them into three GEMMs. So, the tool `utils/dump_model.py` converts the pretrained model to fit the model structure of the FasterTransformer decoder.
Then run the translation sample with the following script:
```bash
./bin/decoding_gemm 1 4 8 64 32001 100 512 0
@@ -784,8 +785,9 @@ bash scripts/profile_decoding_op_performance.sh
March 2020
- Add feature in FasterTransformer 2.0
  - Fix the bug that the maximum sequence length of the decoder cannot be larger than 128.
  - Add `translate_sample.py` to demonstrate how to translate a sentence by restoring the pretrained model of OpenNMT-tf.
- Fix bugs of FasterTransformer 2.0
  - Fix the bug that the maximum sequence length of the decoder cannot be larger than 128.
  - Fix the bug that decoding does not check whether it has finished after each step.
  - Fix the bug of the decoder related to max_seq_len.
  - Modify the decoding model structure to fit the OpenNMT-tf decoding model.

View file

@@ -1,7 +1,5 @@
# Clone the OpenNMT-tf repo
git clone https://github.com/OpenNMT/OpenNMT-tf -b r1
cp OpenNMT-tf/opennmt/ . -r
rm OpenNMT-tf -r
# Install the OpenNMT-tf v1
pip install opennmt-tf==1.25.1
# Download the vocabulary and test data
# wget https://s3.amazonaws.com/opennmt-trainingdata/wmt_ende_sp.tar.gz
@@ -14,7 +12,8 @@ mkdir translation/ckpt
# mkdir translation/data
# tar xf wmt_ende_sp.tar.gz -C translation/data
tar xf averaged-ende-ckpt500k.tar.gz -C translation/ckpt
rm wmt_ende_sp.tar.gz averaged-ende-ckpt500k.tar.gz
# rm wmt_ende_sp.tar.gz
rm averaged-ende-ckpt500k.tar.gz
# head -n 5 translation/data/test.en > test.en
# head -n 5 translation/data/test.de > test.de