[BERT/PyT] remove redundant section (#690)

Sharath T S 2020-09-16 17:06:29 -07:00 committed by GitHub
parent aacbda693a
commit a74236afd4


@@ -22,7 +22,6 @@ This repository provides a script and recipe to train the BERT model for PyTorch
* [Pre-training parameters](#pre-training-parameters)
* [Fine tuning parameters](#fine-tuning-parameters)
* [Multi-node](#multi-node)
* [Fine-tuning parameters](#fine-tuning-parameters)
* [Command-line options](#command-line-options)
* [Getting the data](#getting-the-data)
* [Dataset guidelines](#dataset-guidelines)
@@ -472,7 +471,7 @@ Default arguments are listed below in the order `scripts/run_glue.sh` expects:
- Initial checkpoint - The default is `/workspace/bert/checkpoints/bert_uncased.pt`.
- Data directory - The default is `/workspace/bert/data/download/glue/MRPC/`.
- Vocabulary file (token to ID mapping) - The default is `/workspace/bert/data/download/google_pretrained_weights/uncased_L-24_H-1024_A-16/vocab.txt`.
- Vocabulary file (token to ID mapping) - The default is `/workspace/bert/vocab/vocab`.
- Config file for the BERT model (It should be the same as the pretrained model) - The default is `/workspace/bert/bert_config.json`.
- Output directory for result - The default is `/workspace/bert/results/MRPC`.
- The name of the GLUE task (`mrpc` or `sst-2`) - The default is `mrpc`.
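As an illustration only, a call that spells out these defaults might look like the sketch below; the positional order follows the list above, and the script may accept further trailing arguments not shown in this excerpt.
```
# Hypothetical invocation passing the defaults listed above in the order shown;
# later positional arguments (if any) keep their script defaults. Paths are
# illustrative and should match your own setup.
bash scripts/run_glue.sh \
    /workspace/bert/checkpoints/bert_uncased.pt \
    /workspace/bert/data/download/glue/MRPC/ \
    /workspace/bert/vocab/vocab \
    /workspace/bert/bert_config.json \
    /workspace/bert/results/MRPC \
    mrpc
```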
@@ -506,139 +505,6 @@ Note that the `run.sub` script is a starting point that has to be adapted depend
Refer to the file's contents to see the full list of variables to adjust for your system.
#### Fine-tuning parameters
* SQuAD
The `run_squad.py` script contains many of the same arguments as `run_pretraining.py`.
The main script specific parameters are:
```
--bert_model BERT_MODEL - Specifies the type of BERT model to use;
should be one of the following:
bert-base-uncased
bert-large-uncased
bert-base-cased
bert-base-multilingual
bert-base-chinese
--train_file TRAIN_FILE - Path to the SQuAD json for training.
For example, train-v1.1.json.
--predict_file PREDICT_FILE - Path to the SQuAD json for predictions.
For example, dev-v1.1.json or test-v1.1.json.
--max_seq_length MAX_SEQ_LENGTH
- The maximum total input sequence length
after WordPiece tokenization.
Sequences longer than this will be truncated,
and sequences shorter than this will be padded.
--doc_stride DOC_STRIDE - When splitting up a long document into chunks,
this parameter sets how much stride to take
between chunks of tokens.
--max_query_length MAX_QUERY_LENGTH
- The maximum number of tokens for the question.
Questions longer than <max_query_length>
will be truncated to the value specified.
--n_best_size N_BEST_SIZE - The total number of n-best predictions to
generate in the nbest_predictions.json
output file.
--max_answer_length MAX_ANSWER_LENGTH
- The maximum length of an answer that can be
generated. This is needed because the start and
end predictions are not conditioned on one another.
--verbose_logging - If true, all the warnings related to data
processing will be printed. A number of warnings
are expected for a normal SQuAD evaluation.
--do_lower_case - Whether to lower case the input text. Set to
true for uncased models and false for cased models.
--version_2_with_negative - If true, the SQuAD examples contain questions
that do not have an answer.
--null_score_diff_threshold NULL_SCORE_DIFF_THRESHOLD
- A null answer will be predicted if null_score - best_non_null
is greater than NULL_SCORE_DIFF_THRESHOLD.
```
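To make the flag descriptions concrete, a hypothetical fine-tuning command using several of them is sketched below; the file paths and hyperparameter values are placeholders, and `--init_checkpoint`, `--config_file`, `--output_dir`, `--do_train`, and `--do_predict` are assumed to be among the arguments shared with `run_pretraining.py`.
```
# Hypothetical SQuAD v1.1 fine-tuning command; paths and values are
# illustrative, not verified defaults. --init_checkpoint, --config_file,
# --output_dir, --do_train and --do_predict are assumed shared arguments.
python run_squad.py \
    --bert_model bert-large-uncased \
    --init_checkpoint /path/to/bert_uncased.pt \
    --config_file /path/to/bert_config.json \
    --train_file /path/to/train-v1.1.json \
    --predict_file /path/to/dev-v1.1.json \
    --do_train --do_predict --do_lower_case \
    --max_seq_length 384 --doc_stride 128 --max_query_length 64 \
    --n_best_size 20 --max_answer_length 30 \
    --output_dir /path/to/results/SQuAD
```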
* GLUE
The `run_glue.py` script contains many of the same arguments as `run_pretraining.py`.
The main script specific parameters are:
```
--data_dir DATA_DIR The input data dir. Should contain the .tsv files (or
other data files) for the task.
--bert_model BERT_MODEL
Bert pre-trained model selected in the list: bert-
base-uncased, bert-large-uncased, bert-base-cased,
bert-large-cased, bert-base-multilingual-uncased,
bert-base-multilingual-cased, bert-base-chinese.
--task_name {cola,mnli,mrpc,sst-2}
The name of the task to train.
--output_dir OUTPUT_DIR
The output directory where the model predictions and
checkpoints will be written.
--init_checkpoint INIT_CHECKPOINT
The checkpoint file from pretraining
--max_seq_length MAX_SEQ_LENGTH
The maximum total input sequence length after
WordPiece tokenization. Sequences longer than this
will be truncated, and sequences shorter than this
will be padded.
--do_train Whether to run training.
--do_eval Whether to get model-task performance on the dev set
by running eval.
--do_predict Whether to output prediction results on the dev set by
running eval.
--do_lower_case Set this flag if you are using an uncased model.
--train_batch_size TRAIN_BATCH_SIZE
Batch size per GPU for training.
--eval_batch_size EVAL_BATCH_SIZE
Batch size per GPU for eval.
--learning_rate LEARNING_RATE
The initial learning rate for Adam.
--num_train_epochs NUM_TRAIN_EPOCHS
Total number of training epochs to perform.
--max_steps MAX_STEPS
Total number of training steps to perform.
--warmup_proportion WARMUP_PROPORTION
Proportion of training to perform linear learning rate
warmup for. E.g., 0.1 = 10% of training.
--no_cuda Whether to disable CUDA even when it is available
--local_rank LOCAL_RANK
local_rank for distributed training on gpus
--seed SEED random seed for initialization
--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS
Number of update steps to accumulate before
performing a backward/update pass.
--fp16 Mixed precision training
--amp Mixed precision training
--loss_scale LOSS_SCALE
Loss scaling to improve fp16 numeric stability. Only
used when fp16 set to True. 0 (default value): dynamic
loss scaling. Positive power of 2: static loss scaling
value.
--server_ip SERVER_IP
Can be used for remote debugging.
--server_port SERVER_PORT
Can be used for remote debugging.
--vocab_file VOCAB_FILE
Vocabulary mapping/file BERT was pretrained on
--config_file CONFIG_FILE
The BERT model config
--skip_checkpoint Whether to skip saving checkpoints
```
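For reference, a hypothetical MRPC fine-tuning command combining the flags above is sketched below; all paths and hyperparameter values are illustrative, and the paths simply mirror the `scripts/run_glue.sh` defaults quoted earlier.
```
# Hypothetical MRPC fine-tuning command; every path and hyperparameter
# value here is illustrative only.
python run_glue.py \
    --task_name mrpc \
    --do_train --do_eval --do_lower_case \
    --bert_model bert-large-uncased \
    --init_checkpoint /workspace/bert/checkpoints/bert_uncased.pt \
    --data_dir /workspace/bert/data/download/glue/MRPC/ \
    --vocab_file /workspace/bert/vocab/vocab \
    --config_file /workspace/bert/bert_config.json \
    --output_dir /workspace/bert/results/MRPC \
    --max_seq_length 128 \
    --train_batch_size 32 --eval_batch_size 32 \
    --learning_rate 2e-5 --num_train_epochs 3 \
    --fp16
```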
### Command-line options
To see the full list of available options and their descriptions, use the `-h` or `--help` command line option, for example:
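For instance, for the SQuAD fine-tuning script discussed above (the other Python entry points accept the same option):
```
python run_squad.py --help
```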