Fine-tuning a BERT model on Google's free GPU

LeoWood · Published 2018-11-21 16:30:54

This post shows how to fine-tune BERT on the free GPU that comes with Google Colab.

Preparation

First, you need a Google account.

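Open Google Drive and create a new folder, for example BERT. Upload the code and data into that folder. The code should already be modified as described in the previous blog post. As noted at the end of that post, Google Colab lets you set parameters at run time, so the parameters in the code can stay at their defaults and be overridden on each run.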
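
To illustrate why the parameters in the code can stay at their defaults, here is a minimal sketch of defaults being overridden from the command line. It uses Python's standard argparse as a stand-in (run_classifier.py itself defines its flags through TensorFlow's flags module), and the default values shown are assumptions chosen for illustration.

```python
import argparse


def parse_flags(argv):
    """Parse run-time flags; any flag not passed keeps its in-code default."""
    parser = argparse.ArgumentParser()
    # Hypothetical defaults, used here only for illustration.
    parser.add_argument("--max_seq_length", type=int, default=128)
    parser.add_argument("--train_batch_size", type=int, default=32)
    parser.add_argument("--learning_rate", type=float, default=5e-5)
    return parser.parse_args(argv)


# No flags given: the defaults written in the code apply.
defaults = parse_flags([])
# Flags given at run time override the defaults without editing the file.
overridden = parse_flags(["--train_batch_size=4", "--learning_rate=2e-5"])
print(defaults.train_batch_size, overridden.train_batch_size)
```

Because the overrides live in the notebook cell rather than in the script, the uploaded code does not need to be edited and re-uploaded between runs.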

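Anywhere in Drive, click New > More and choose Colaboratory.

If Colaboratory does not appear under More, choose Connect more apps, search for Colaboratory, and connect it.

Once it is created, you get a Jupyter notebook.

Choose Edit > Notebook settings and set the hardware accelerator to GPU.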
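
After switching the accelerator, it can be worth confirming that the runtime really has a GPU. The following is a minimal sketch that relies only on the standard library; it assumes that a GPU runtime exposes the nvidia-smi tool on the PATH, which is an assumption about the environment rather than something this post verifies. The helper name gpu_visible is made up for this example.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if the nvidia-smi tool exists and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No NVIDIA driver tools found, so most likely no GPU runtime.
        return False
    try:
        result = subprocess.run(
            [exe],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("GPU visible:", gpu_visible())
```

If this prints False, recheck the notebook settings before starting a long training run.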

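Running the program

Each time you reopen the notebook to run the program, go through the following steps:

Install the required libraries and authorize: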

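# Install google-drive-ocamlfuse, a FUSE client for Google Drive, from its PPA
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
# Authenticate this Colab session with your Google account
from google.colab import auth
auth.authenticate_user()
# Load the application-default credentials used for the Drive authorization
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
# Print the authorization URL
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
# Prompt for the verification code obtained from that URL
vcode = getpass.getpass()
# Pass the code to google-drive-ocamlfuse to finish the authorization
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}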

After you run this cell, a link appears. Open the link, choose your Google account, and paste the code you receive into the input box below the link.

Mounting Google Drive

# Create the mount point, then mount Google Drive onto it
!mkdir -p drive
!google-drive-ocamlfuse drive -o nonempty

It is best to rerun this cell every time you change files in Drive. If it runs without printing anything, the mount succeeded.
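
Since a successful mount prints nothing, a quick check can make the result explicit. This is a minimal sketch, assuming the Drive contains at least one item (for example the BERT folder uploaded earlier), so that a mounted drive directory is never empty. The helper name looks_mounted is made up for this example.

```python
import os


def looks_mounted(path):
    """Return True if path is a directory that contains at least one entry."""
    return os.path.isdir(path) and len(os.listdir(path)) > 0


print("drive mounted:", looks_mounted("drive"))
```

An empty drive directory suggests that the mount command did not take effect and the authorization step should be repeated.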

Setting the working directory

import os
os.chdir('drive/BERT')

This sets the notebook's working directory to the folder that contains the BERT code.

You can run the ls command to check that it worked.
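
The same check can be done in Python instead of reading the ls output by eye. This is a minimal sketch, assuming that run_classifier.py and a data folder sit directly in the working directory (this matches the --data_dir=data flag used in the next step, but your layout may differ). The helper name missing_entries is made up for this example.

```python
import os


def missing_entries(folder, required):
    """Return the names in `required` that are not present in `folder`."""
    present = set(os.listdir(folder))
    return [name for name in required if name not in present]


# Assumed layout: the script and the data folder are in the working directory.
required = ["run_classifier.py", "data"]
print("missing:", missing_entries(os.getcwd(), required))
```

An empty list means the working directory is set correctly and the files are visible through the mount.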

Running run_classifier.py

!python run_classifier.py \
  --task_name=bert_move \
  --do_train=true \
  --do_eval=true \
  --data_dir=data \
  --vocab_file=gs://cloud-tpu-checkpoints/bert/uncased_L-24_H-1024_A-16/vocab.txt \
  --bert_config_file=gs://cloud-tpu-checkpoints/bert/uncased_L-24_H-1024_A-16/bert_config.json \
  --init_checkpoint=gs://cloud-tpu-checkpoints/bert/uncased_L-24_H-1024_A-16/bert_model.ckpt \
  --max_seq_length=128 \
  --train_batch_size=4 \
  --learning_rate=2e-5 \
  --num_train_epochs=3.0 \
  --output_dir=output \