
GLUE Benchmark for English Tasks · dbiir/UER


The following is a brief introduction to our solutions for the GLUE tasks. This section focuses on single models. The pre-trained models used below can be found here.

CoLA

An example of fine-tuning and prediction on the CoLA dataset with English BERT-Base-uncased:

python3 finetune/run_classifier.py --pretrained_model_path models/bert_base_en_uncased_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --train_path datasets/CoLA/train.tsv \
    --dev_path datasets/CoLA/dev.tsv \
    --output_model_path models/cola_classifier_model.bin \
    --epochs_num 5 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/cola_classifier_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --test_path datasets/CoLA/test_nolabel.tsv \
    --prediction_path datasets/CoLA/prediction.tsv \
    --seq_length 128 --labels_num 2
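The GLUE data files are tab-separated with a header row. For single-sentence tasks such as CoLA the columns are label and text_a, while test_nolabel.tsv, as the name suggests, carries no label column. A rough sketch of the layout with two illustrative CoLA-style rows (the exact header names follow the files shipped with the datasets):

label	text_a
1	Our friends won't buy this analysis, let alone the next one we propose.
0	They drank the pub.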

An example of fine-tuning and prediction on the CoLA dataset with English RoBERTa-Large. Because the RoBERTa-Large pre-trained weights use special tokens different from those of BERT-Base-uncased, we need to change the special tokens map path in uer/utils/constants.py from models/special_tokens_map.json to models/xlmroberta_special_tokens_map.json, as sketched below.
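One way to apply this change is a simple in-place substitution; a minimal sketch, assuming the path string appears verbatim in uer/utils/constants.py:

sed -i 's|models/special_tokens_map.json|models/xlmroberta_special_tokens_map.json|' uer/utils/constants.py

Remember to revert this change before fine-tuning BERT models again, since those use the original map.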

python3 finetune/run_classifier.py --pretrained_model_path models/roberta_large_en_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --train_path datasets/CoLA/train.tsv \
    --dev_path datasets/CoLA/dev.tsv \
    --output_model_path models/cola_classifier_model.bin \
    --epochs_num 5 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/cola_classifier_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --test_path datasets/CoLA/test_nolabel.tsv \
    --prediction_path datasets/CoLA/prediction.tsv \
    --seq_length 128 --labels_num 2

SST-2

An example of fine-tuning and prediction on the SST-2 dataset with English BERT-Base-uncased:

python3 finetune/run_classifier.py --pretrained_model_path models/bert_base_en_uncased_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --train_path datasets/SST-2/train.tsv \
    --dev_path datasets/SST-2/dev.tsv \
    --output_model_path models/sst-2_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/sst-2_classifier_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --test_path datasets/SST-2/test_nolabel.tsv \
    --prediction_path datasets/SST-2/prediction.tsv \
    --seq_length 128 --labels_num 2

An example of fine-tuning and prediction on the SST-2 dataset with English RoBERTa-Large:

python3 finetune/run_classifier.py --pretrained_model_path models/roberta_large_en_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --train_path datasets/SST-2/train.tsv \
    --dev_path datasets/SST-2/dev.tsv \
    --output_model_path models/sst-2_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/sst-2_classifier_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --test_path datasets/SST-2/test_nolabel.tsv \
    --prediction_path datasets/SST-2/prediction.tsv \
    --seq_length 128 --labels_num 2

QQP

An example of fine-tuning and prediction on the QQP dataset with English BERT-Base-uncased:

python3 finetune/run_classifier.py --pretrained_model_path models/bert_base_en_uncased_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --train_path datasets/QQP/train.tsv \
    --dev_path datasets/QQP/dev.tsv \
    --output_model_path models/qqp_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/qqp_classifier_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --test_path datasets/QQP/test_nolabel.tsv \
    --prediction_path datasets/QQP/prediction.tsv \
    --seq_length 128 --labels_num 2
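Sentence-pair tasks such as QQP add a second text column. A rough sketch of the assumed layout, with made-up rows for illustration (label 1 marks duplicate questions, 0 non-duplicates):

label	text_a	text_b
1	How can I learn to cook?	What is the best way to learn cooking?
0	How can I learn to cook?	Why is the sky blue?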

An example of fine-tuning and prediction on the QQP dataset with English RoBERTa-Large:

python3 finetune/run_classifier.py --pretrained_model_path models/roberta_large_en_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --train_path datasets/QQP/train.tsv \
    --dev_path datasets/QQP/dev.tsv \
    --output_model_path models/qqp_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/qqp_classifier_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --test_path datasets/QQP/test_nolabel.tsv \
    --prediction_path datasets/QQP/prediction.tsv \
    --seq_length 128 --labels_num 2

QNLI

An example of fine-tuning and prediction on the QNLI dataset with English BERT-Base-uncased:

python3 finetune/run_classifier.py --pretrained_model_path models/bert_base_en_uncased_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --train_path datasets/QNLI/train.tsv \
    --dev_path datasets/QNLI/dev.tsv \
    --output_model_path models/qnli_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/qnli_classifier_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --test_path datasets/QNLI/test_nolabel.tsv \
    --prediction_path datasets/QNLI/prediction.tsv \
    --seq_length 128 --labels_num 2

An example of fine-tuning and prediction on the QNLI dataset with English RoBERTa-Large:

python3 finetune/run_classifier.py --pretrained_model_path models/roberta_large_en_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --train_path datasets/QNLI/train.tsv \
    --dev_path datasets/QNLI/dev.tsv \
    --output_model_path models/qnli_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/qnli_classifier_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --test_path datasets/QNLI/test_nolabel.tsv \
    --prediction_path datasets/QNLI/prediction.tsv \
    --seq_length 128 --labels_num 2

WNLI

An example of fine-tuning and prediction on the WNLI dataset with English BERT-Base-uncased:

python3 finetune/run_classifier.py --pretrained_model_path models/bert_base_en_uncased_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --train_path datasets/WNLI/train.tsv \
    --dev_path datasets/WNLI/dev.tsv \
    --output_model_path models/wnli_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/wnli_classifier_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --test_path datasets/WNLI/test_nolabel.tsv \
    --prediction_path datasets/WNLI/prediction.tsv \
    --seq_length 128 --labels_num 2

An example of fine-tuning and prediction on the WNLI dataset with English RoBERTa-Large:

python3 finetune/run_classifier.py --pretrained_model_path models/roberta_large_en_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --train_path datasets/WNLI/train.tsv \
    --dev_path datasets/WNLI/dev.tsv \
    --output_model_path models/wnli_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/wnli_classifier_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --test_path datasets/WNLI/test_nolabel.tsv \
    --prediction_path datasets/WNLI/prediction.tsv \
    --seq_length 128 --labels_num 2

MNLI

Examples of fine-tuning and prediction on the MNLI dataset. MNLI has two development and test sets, matched (MNLI-m) and mismatched (MNLI-mm). An example of fine-tuning and prediction on MNLI-m and MNLI-mm with English BERT-Base-uncased (MNLI is a three-way classification task, hence --labels_num 3):

python3 finetune/run_classifier.py --pretrained_model_path models/bert_base_en_uncased_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --train_path datasets/MNLI/train.tsv \
    --dev_path datasets/MNLI/dev_matched.tsv \
    --output_model_path models/mnli_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/mnli_classifier_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --test_path datasets/MNLI/test_nolabel_matched.tsv \
    --prediction_path datasets/MNLI/prediction_matched.tsv \
    --seq_length 128 --labels_num 3

python3 finetune/run_classifier.py --pretrained_model_path models/bert_base_en_uncased_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --train_path datasets/MNLI/train.tsv \
    --dev_path datasets/MNLI/dev_mismatched.tsv \
    --output_model_path models/mnli_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/mnli_classifier_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --test_path datasets/MNLI/test_nolabel_mismatched.tsv \
    --prediction_path datasets/MNLI/prediction_mismatched.tsv \
    --seq_length 128 --labels_num 3

An example of fine-tuning and prediction on MNLI-m and MNLI-mm with English RoBERTa-Large:

python3 finetune/run_classifier.py --pretrained_model_path models/roberta_large_en_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --train_path datasets/MNLI/train.tsv \
    --dev_path datasets/MNLI/dev_matched.tsv \
    --output_model_path models/mnli_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/mnli_classifier_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --test_path datasets/MNLI/test_nolabel_matched.tsv \
    --prediction_path datasets/MNLI/prediction_matched.tsv \
    --seq_length 128 --labels_num 3

python3 finetune/run_classifier.py --pretrained_model_path models/roberta_large_en_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --train_path datasets/MNLI/train.tsv \
    --dev_path datasets/MNLI/dev_mismatched.tsv \
    --output_model_path models/mnli_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/mnli_classifier_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --test_path datasets/MNLI/test_nolabel_mismatched.tsv \
    --prediction_path datasets/MNLI/prediction_mismatched.tsv \
    --seq_length 128 --labels_num 3

The RoBERTa paper reports that MRPC, RTE, and STS-B achieve better results when fine-tuned from a model that has already been fine-tuned on MNLI-m, and we follow this setting. Since MNLI is a three-way classification task while MRPC and RTE are binary classification and STS-B is regression, the prediction layer of the MNLI-m fine-tuned model must be removed before the model can be reused:

import torch

# Load the MNLI-m fine-tuned checkpoint on CPU.
input_model = torch.load("models/mnli_classifier_model.bin", map_location="cpu")

# Remove the task-specific prediction layer (the three-way MNLI classifier).
for key in list(input_model.keys()):
    if "output_layer_2" in key:
        del input_model[key]

# Save the stripped checkpoint for reuse as a pre-trained model.
torch.save(input_model, "models/mnli_classifier_delete_target_model.bin")

The stripped checkpoint models/mnli_classifier_delete_target_model.bin is then passed to --pretrained_model_path in the RoBERTa-Large examples for MRPC, RTE, and STS-B below.

MRPC

An example of fine-tuning and prediction on the MRPC dataset with English BERT-Base-uncased:

python3 finetune/run_classifier.py --pretrained_model_path models/bert_base_en_uncased_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --train_path datasets/MRPC/train.tsv \
    --dev_path datasets/MRPC/dev.tsv \
    --output_model_path models/mrpc_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/mrpc_classifier_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --test_path datasets/MRPC/test_nolabel.tsv \
    --prediction_path datasets/MRPC/prediction.tsv \
    --seq_length 128 --labels_num 2

An example of fine-tuning and prediction on the MRPC dataset with the English RoBERTa-Large model fine-tuned on MNLI-m:

python3 finetune/run_classifier.py --pretrained_model_path models/mnli_classifier_delete_target_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --train_path datasets/MRPC/train.tsv \
    --dev_path datasets/MRPC/dev.tsv \
    --output_model_path models/mrpc_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/mrpc_classifier_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --test_path datasets/MRPC/test_nolabel.tsv \
    --prediction_path datasets/MRPC/prediction.tsv \
    --seq_length 128 --labels_num 2

RTE

An example of fine-tuning and prediction on the RTE dataset with English BERT-Base-uncased:

python3 finetune/run_classifier.py --pretrained_model_path models/bert_base_en_uncased_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --train_path datasets/RTE/train.tsv \
    --dev_path datasets/RTE/dev.tsv \
    --output_model_path models/rte_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/rte_classifier_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --test_path datasets/RTE/test_nolabel.tsv \
    --prediction_path datasets/RTE/prediction.tsv \
    --seq_length 128 --labels_num 2

An example of fine-tuning and prediction on the RTE dataset with the English RoBERTa-Large model fine-tuned on MNLI-m:

python3 finetune/run_classifier.py --pretrained_model_path models/mnli_classifier_delete_target_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --train_path datasets/RTE/train.tsv \
    --dev_path datasets/RTE/dev.tsv \
    --output_model_path models/rte_classifier_model.bin \
    --epochs_num 3 --batch_size 32

python3 inference/run_classifier_infer.py --load_model_path models/rte_classifier_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --test_path datasets/RTE/test_nolabel.tsv \
    --prediction_path datasets/RTE/prediction.tsv \
    --seq_length 128 --labels_num 2

STS-B

An example of fine-tuning and prediction on the STS-B dataset with English BERT-Base-uncased (STS-B is a regression task, so finetune/run_regression.py is used and the inference command takes no --labels_num flag):

python3 finetune/run_regression.py --pretrained_model_path models/bert_base_en_uncased_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --train_path datasets/STS-B/train.tsv \
    --dev_path datasets/STS-B/dev.tsv \
    --output_model_path models/sts-b_regression_model.bin \
    --epochs_num 5 --batch_size 32

python3 inference/run_regression_infer.py --load_model_path models/sts-b_regression_model.bin \
    --vocab_path models/google_uncased_en_vocab.txt \
    --config_path models/bert/base_config.json \
    --test_path datasets/STS-B/test_nolabel.tsv \
    --prediction_path datasets/STS-B/prediction.tsv \
    --seq_length 128
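For regression the label column holds a float similarity score between 0 and 5 rather than a class id. A rough sketch of the assumed layout, with a made-up row (check the shipped dataset files for the exact header names):

label	text_a	text_b
3.8	A man is playing a guitar.	A person plays the guitar.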

An example of fine-tuning and prediction on the STS-B dataset with the English RoBERTa-Large model fine-tuned on MNLI-m:

python3 finetune/run_regression.py --pretrained_model_path models/mnli_classifier_delete_target_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --train_path datasets/STS-B/train.tsv \
    --dev_path datasets/STS-B/dev.tsv \
    --output_model_path models/sts-b_regression_model.bin \
    --epochs_num 5 --batch_size 32

python3 inference/run_regression_infer.py --load_model_path models/sts-b_regression_model.bin \
    --vocab_path models/huggingface_gpt2_vocab.txt \
    --merges_path models/huggingface_gpt2_merges.txt \
    --tokenizer bpe \
    --config_path models/xlm-roberta/large_config.json \
    --test_path datasets/STS-B/test_nolabel.tsv \
    --prediction_path datasets/STS-B/prediction.tsv \
    --seq_length 128
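After all tasks have been predicted, the per-task prediction files can be collected for a GLUE leaderboard submission. The script below is a hypothetical convenience sketch, not part of UER: it merely copies the files produced above to the filenames the leaderboard expects and zips them. Note that the leaderboard additionally expects a diagnostic AX.tsv, and some tasks require string labels rather than numeric ids, so further conversion of the prediction files may be needed.

# Hypothetical helper: collect predictions under GLUE submission filenames.
mkdir -p submission
cp datasets/CoLA/prediction.tsv submission/CoLA.tsv
cp datasets/SST-2/prediction.tsv submission/SST-2.tsv
cp datasets/MRPC/prediction.tsv submission/MRPC.tsv
cp datasets/STS-B/prediction.tsv submission/STS-B.tsv
cp datasets/QQP/prediction.tsv submission/QQP.tsv
cp datasets/MNLI/prediction_matched.tsv submission/MNLI-m.tsv
cp datasets/MNLI/prediction_mismatched.tsv submission/MNLI-mm.tsv
cp datasets/QNLI/prediction.tsv submission/QNLI.tsv
cp datasets/RTE/prediction.tsv submission/RTE.tsv
cp datasets/WNLI/prediction.tsv submission/WNLI.tsv
(cd submission && zip ../submission.zip *.tsv)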

