[Posted]: 2022-01-05 08:20:05
[Question]:
I built a BERT model (bert-base-multilingual-cased) from Hugging Face and want to evaluate its precision, recall, and F1 score in addition to accuracy, since accuracy is not always the best evaluation metric.
Here is the example notebook that I adapted for my use case.
Creating the train/test data:
from transformers import BertTokenizer, TFBertModel, TFBertForSequenceClassification

TEST_SPLIT = 0.1
BATCH_SIZE = 2

# `x` holds the tokenized inputs and `tfdataset` the corresponding
# tf.data.Dataset (built earlier in the notebook, not shown here).
train_size = int(len(x) * (1 - TEST_SPLIT))

tfdataset = tfdataset.shuffle(len(x))
tfdataset_train = tfdataset.take(train_size)
tfdataset_test = tfdataset.skip(train_size)
tfdataset_train = tfdataset_train.batch(BATCH_SIZE)
tfdataset_test = tfdataset_test.batch(BATCH_SIZE)
Building the model:
from tensorflow.keras import losses, optimizers

MODEL_NAME = 'bert-base-multilingual-cased'
N_EPOCHS = 2

model = TFBertForSequenceClassification.from_pretrained(MODEL_NAME)
optimizer = optimizers.Adam(learning_rate=3e-5)
loss = losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])
model.fit(tfdataset_train, batch_size=BATCH_SIZE, epochs=N_EPOCHS)
Sample output:
All model checkpoint layers were used when initializing TFBertForSequenceClassification.
Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-multilingual-cased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Epoch 1/2
415/415 [==============================] - 741s 2s/step - loss: 0.6652 - accuracy: 0.6321
Epoch 2/2
415/415 [==============================] - 717s 2s/step - loss: 0.6619 - accuracy: 0.6429
<keras.callbacks.History at 0x7fc970d72750>
Evaluation:
benchmarks = model.evaluate(tfdataset_test, return_dict=True, batch_size=BATCH_SIZE)
print(benchmarks)
Sample output:
93/93 [==============================] - 42s 404ms/step - loss: 0.6536 - accuracy: 0.6108
{'loss': 0.6535539627075195, 'accuracy': 0.6108108162879944}
With this I get the accuracy score. However, I would like a classification report that covers all of the metrics mentioned above.
Does anyone know how to do this with such `tfdataset`s?
Thanks in advance!
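One way to get per-class precision, recall, and F1 from a `tf.data` dataset is to collect the labels and predictions yourself and compute the report offline. A minimal sketch follows; the metric math is written out by hand so it is self-contained (the same two arrays could equally be passed to `sklearn.metrics.classification_report`), and the commented `model`/`tfdataset_test` usage lines are assumptions based on the code above, not tested here:

```python
import numpy as np

def classification_report_dict(y_true, y_pred, n_classes):
    """Per-class precision, recall and F1 computed from integer label arrays."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    report = {}
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # true positives for class c
        fp = np.sum((y_pred == c) & (y_true != c))  # false positives
        fn = np.sum((y_pred != c) & (y_true == c))  # false negatives
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return report

# Hypothetical usage with the model and dataset from the question:
# y_true = np.concatenate([y.numpy() for _, y in tfdataset_test])
# logits = model.predict(tfdataset_test).logits
# y_pred = np.argmax(logits, axis=-1)
# print(classification_report_dict(y_true, y_pred, n_classes=2))
```

Iterating the batched dataset once to pull out the labels keeps them aligned with `model.predict`, as long as the test dataset is not reshuffled between the two passes.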
[Discussion]:
- Since you are using the Keras API, you can add metrics in the `metrics` argument of `compile`; see keras.io/api/metrics
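One caveat with that suggestion: the built-in `tf.keras.metrics.Precision` and `Recall` accumulate binary true/false positives, so they cannot be dropped directly into the `compile(...)` call above, which feeds them sparse integer labels and raw logits; predictions would need to be thresholded or argmax-ed first. A small standalone sketch of how these metric objects behave on binary labels:

```python
import tensorflow as tf

# Precision/Recall metric objects accumulate TP/FP/FN across update_state calls.
precision = tf.keras.metrics.Precision()
recall = tf.keras.metrics.Recall()

y_true = [0, 1, 1, 1]  # binary ground truth
y_pred = [1, 1, 1, 0]  # binary predictions (already thresholded, not logits)

precision.update_state(y_true, y_pred)
recall.update_state(y_true, y_pred)
# Here TP=2, FP=1, FN=1, so precision = recall = 2/3.
```

Calling `result()` on each metric returns the accumulated value as a scalar tensor.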
Tags: python tensorflow machine-learning huggingface-transformers bert-language-model