【Posted】: 2020-11-30 12:43:28
【Problem description】:
I am using this code to train a BERT model for Turkish text classification with 2 labels. But when I run the following code:
import numpy as np
import pandas as pd
df = pd.read_excel(r'preparedDataNoId.xlsx')
df = df.sample(frac=1)
from sklearn.model_selection import train_test_split
train_df, test_df = train_test_split(df, test_size=0.10)
print('train shape: ',train_df.shape)
print('test shape: ',test_df.shape)
train_df["text"]=train_df["text"].apply(lambda r: str(r))
train_df['label']=train_df['label'].astype(int)
from simpletransformers.classification import ClassificationModel
model = ClassificationModel(
    'bert', 'dbmdz/bert-base-turkish-uncased', use_cuda=False, num_labels=2,
    args={'reprocess_input_data': True, 'overwrite_output_dir': True,
          'num_train_epochs': 3, 'train_batch_size': 64, 'fp16': False,
          'output_dir': 'bert_model'})
model.train_model(train_df)
it takes a very long time, never stops, and the screen keeps showing:
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
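This message comes from Python's multiprocessing module: on platforms that start child processes with "spawn" instead of "fork" (Windows, and macOS by default on recent Python versions), the script is re-imported in every worker, so any code at module top level runs again in each child. The usual remedy, which the error message itself hints at, is to move the code that launches workers under an `if __name__ == '__main__':` guard. A minimal sketch of the idiom, using only the standard library (the worker function and values here are illustrative, not from the question):

```python
import multiprocessing as mp

def square(x):
    # Worker function executed inside child processes.
    return x * x

if __name__ == '__main__':
    # With "spawn", each child re-imports this module. Without the
    # __main__ guard, every child would try to create its own Pool,
    # which triggers the freeze_support() error shown above.
    with mp.Pool(2) as pool:
        results = pool.map(square, [1, 2, 3])
    print(results)  # [1, 4, 9]
```

Applied to the question's script, this means wrapping everything from loading the Excel file through `model.train_model(train_df)` in the same `if __name__ == '__main__':` block, since simpletransformers uses multiprocessing workers for data preparation.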
【Discussion】:
Tags: python machine-learning bert-language-model