[BERT/PyT] remove redundant section (#690)

Author: Sharath T S, 2020-09-16 17:06:29 -07:00 (committed by GitHub)
parent aacbda693a
commit a74236afd4

@@ -22,7 +22,6 @@ This repository provides a script and recipe to train the BERT model for PyTorch
 * [Pre-training parameters](#pre-training-parameters)
 * [Fine tuning parameters](#fine-tuning-parameters)
 * [Multi-node](#multi-node)
-* [Fine-tuning parameters](#fine-tuning-parameters)
 * [Command-line options](#command-line-options)
 * [Getting the data](#getting-the-data)
 * [Dataset guidelines](#dataset-guidelines)
@@ -472,7 +471,7 @@ Default arguments are listed below in the order `scripts/run_glue.sh` expects:
 - Initial checkpoint - The default is `/workspace/bert/checkpoints/bert_uncased.pt`.
 - Data directory - The default is `/workspace/bert/data/download/glue/MRPC/`.
-- Vocabulary file (token to ID mapping) - The default is `/workspace/bert/data/download/google_pretrained_weights/uncased_L-24_H-1024_A-16/vocab.txt`.
+- Vocabulary file (token to ID mapping) - The default is `/workspace/bert/vocab/vocab`.
 - Config file for the BERT model (It should be the same as the pretrained model) - The default is `/workspace/bert/bert_config.json`.
 - Output directory for result - The default is `/workspace/bert/results/MRPC`.
 - The name of the GLUE task (`mrpc` or `sst-2`) - The default is `mrpc`
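Spelled out as a command using the post-change defaults, that positional list corresponds to a call along these lines. This is only a sketch: any positional arguments the script expects beyond the six shown in the hunk above are omitted here.

```
# Sketch only: positional order taken from the list above; any further
# positional arguments of scripts/run_glue.sh fall outside the excerpt
# and are omitted.
bash scripts/run_glue.sh \
  /workspace/bert/checkpoints/bert_uncased.pt \
  /workspace/bert/data/download/glue/MRPC/ \
  /workspace/bert/vocab/vocab \
  /workspace/bert/bert_config.json \
  /workspace/bert/results/MRPC \
  mrpc
```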
@@ -506,139 +505,6 @@ Note that the `run.sub` script is a starting point that has to be adapted depending on the environment.
Refer to the file's contents to see the full list of variables to adjust for your system.
#### Fine-tuning parameters
* SQuAD
The `run_squad.py` script contains many of the same arguments as `run_pretraining.py`.
The main script-specific parameters are:
```
--bert_model BERT_MODEL         - Specifies the type of BERT model to use;
                                  should be one of the following:
                                    bert-base-uncased
                                    bert-large-uncased
                                    bert-base-cased
                                    bert-base-multilingual
                                    bert-base-chinese
--train_file TRAIN_FILE         - Path to the SQuAD json for training.
                                  For example, train-v1.1.json.
--predict_file PREDICT_FILE     - Path to the SQuAD json for predictions.
                                  For example, dev-v1.1.json or test-v1.1.json.
--max_seq_length MAX_SEQ_LENGTH - The maximum total input sequence length
                                  after WordPiece tokenization.
                                  Sequences longer than this will be truncated,
                                  and sequences shorter than this will be padded.
--doc_stride DOC_STRIDE         - When splitting up a long document into chunks,
                                  this parameter sets how much stride to take
                                  between chunks of tokens.
--max_query_length MAX_QUERY_LENGTH
                                - The maximum number of tokens for the question.
                                  Questions longer than <max_query_length>
                                  will be truncated to the value specified.
--n_best_size N_BEST_SIZE       - The total number of n-best predictions to
                                  generate in the nbest_predictions.json
                                  output file.
--max_answer_length MAX_ANSWER_LENGTH
                                - The maximum length of an answer that can be
                                  generated. This is needed because the start
                                  and end predictions are not conditioned on
                                  one another.
--verbose_logging               - If true, all the warnings related to data
                                  processing will be printed. A number of
                                  warnings are expected for a normal SQuAD
                                  evaluation.
--do_lower_case                 - Whether to lower case the input text. Set to
                                  true for uncased models and false for cased
                                  models.
--version_2_with_negative       - If true, the SQuAD examples contain questions
                                  that do not have an answer.
--null_score_diff_threshold NULL_SCORE_DIFF_THRESHOLD
                                - A null answer will be predicted if
                                  null_score - best_non_null is greater than
                                  NULL_SCORE_DIFF_THRESHOLD.
```
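For orientation, the flags above might combine into an invocation like the sketch below. The values are illustrative placeholders, not repository defaults, and flags shared with `run_pretraining.py` (checkpoint, output directory, precision) are omitted.

```
# Illustrative sketch only: file names and values are placeholders,
# and flags shared with run_pretraining.py are omitted.
python run_squad.py \
  --bert_model bert-large-uncased \
  --train_file train-v1.1.json \
  --predict_file dev-v1.1.json \
  --max_seq_length 384 \
  --doc_stride 128 \
  --max_query_length 64 \
  --do_lower_case
```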
* GLUE
The `run_glue.py` script contains many of the same arguments as `run_pretraining.py`.
The main script-specific parameters are:
```
--data_dir DATA_DIR   The input data dir. Should contain the .tsv files (or
                      other data files) for the task.
--bert_model BERT_MODEL
                      Bert pre-trained model selected in the list:
                      bert-base-uncased, bert-large-uncased, bert-base-cased,
                      bert-large-cased, bert-base-multilingual-uncased,
                      bert-base-multilingual-cased, bert-base-chinese.
--task_name {cola,mnli,mrpc,sst-2}
                      The name of the task to train.
--output_dir OUTPUT_DIR
                      The output directory where the model predictions and
                      checkpoints will be written.
--init_checkpoint INIT_CHECKPOINT
                      The checkpoint file from pretraining.
--max_seq_length MAX_SEQ_LENGTH
                      The maximum total input sequence length after
                      WordPiece tokenization. Sequences longer than this
                      will be truncated, and sequences shorter than this
                      will be padded.
--do_train            Whether to run training.
--do_eval             Whether to get model-task performance on the dev set
                      by running eval.
--do_predict          Whether to output prediction results on the dev set
                      by running eval.
--do_lower_case       Set this flag if you are using an uncased model.
--train_batch_size TRAIN_BATCH_SIZE
                      Batch size per GPU for training.
--eval_batch_size EVAL_BATCH_SIZE
                      Batch size per GPU for eval.
--learning_rate LEARNING_RATE
                      The initial learning rate for Adam.
--num_train_epochs NUM_TRAIN_EPOCHS
                      Total number of training epochs to perform.
--max_steps MAX_STEPS
                      Total number of training steps to perform.
--warmup_proportion WARMUP_PROPORTION
                      Proportion of training to perform linear learning rate
                      warmup for. E.g., 0.1 = 10% of training.
--no_cuda             Whether to disable CUDA even when it is available.
--local_rank LOCAL_RANK
                      Local rank for distributed training on GPUs.
--seed SEED           Random seed for initialization.
--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS
                      Number of update steps to accumulate before
                      performing a backward/update pass.
--fp16                Mixed precision training.
--amp                 Mixed precision training.
--loss_scale LOSS_SCALE
                      Loss scaling to improve fp16 numeric stability. Only
                      used when fp16 is set to True. 0 (default value):
                      dynamic loss scaling. Positive power of 2: static loss
                      scaling value.
--server_ip SERVER_IP
                      Can be used for distant debugging.
--server_port SERVER_PORT
                      Can be used for distant debugging.
--vocab_file VOCAB_FILE
                      Vocabulary mapping/file BERT was pretrained on.
--config_file CONFIG_FILE
                      The BERT model config.
--skip_checkpoint     Whether to skip saving checkpoints.
```
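As with SQuAD, a hypothetical `run_glue.py` invocation built only from the flags above might look like the following. The paths echo the `scripts/run_glue.sh` defaults listed earlier, and the hyperparameter values are placeholders rather than recommendations.

```
# Illustrative sketch only: paths mirror the documented run_glue.sh
# defaults and hyperparameter values are placeholders.
python run_glue.py \
  --task_name mrpc \
  --data_dir /workspace/bert/data/download/glue/MRPC/ \
  --bert_model bert-large-uncased \
  --init_checkpoint /workspace/bert/checkpoints/bert_uncased.pt \
  --vocab_file /workspace/bert/vocab/vocab \
  --config_file /workspace/bert/bert_config.json \
  --output_dir /workspace/bert/results/MRPC \
  --max_seq_length 128 \
  --train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --do_train --do_eval --do_lower_case
```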
### Command-line options
To see the full list of available options and their descriptions, use the `-h` or `--help` command line option, for example: