Trouble loading a fine-tuned model to call predict: answers

【Question title】: Trouble loading a fine-tuned model to call predict
【Posted】: 2022-06-22 16:08:14
【Question】:

I am new to TensorFlow and BERT. Following some online tutorials, I fine-tuned DistilBERT on my own dataset, mainly using this one: https://medium.com/geekculture/hugging-face-distilbert-tensorflow-for-custom-text-classification-1ad4a49e26a7

My dataset has only two columns, "message" and "label", and looks like this: pic1

I trained the model successfully, and the predict_proba function also works well. But when I save the model, I get warnings such as:

WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x000001B910694D88>, because it is not built.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x000001B97BBC58C8>, because it is not built.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x000001B97BCF0E48>, because it is not built.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x000001B91071AB08>, because it is not built.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x000001B91072E388>, because it is not built.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x000001B91073FC48>, because it is not built.
WARNING:absl:Found untraced functions such as embeddings_layer_call_fn, embeddings_layer_call_and_return_conditional_losses, transformer_layer_call_fn, transformer_layer_call_and_return_conditional_losses, LayerNorm_layer_call_fn while saving (showing 5 of 164). These functions will not be directly callable after loading.

When I load the saved model and call the prediction function again, I get an error. I tried keras.models.load_model(), tf.saved_model.load(), and tf.keras.models.load_model(), but all of them produce a similar error:

ValueError: Exception encountered when calling layer "tf_distil_bert_for_sequence_classification" (type TFDistilBertForSequenceClassification).
Could not find matching concrete function to call loaded from the SavedModel.Got:
  Positional arguments (9 total):
    * {'input_ids': <tf.Tensor 'input_ids_1:0' shape=(None, 100) dtype=int32>, 'attention_mask': <tf.Tensor 'input_ids:0' shape=(None, 100) dtype=int32>}
    * None
    * None
    * None
    * None
    * None
    * None
    * None
    * False
  Keyword arguments: {}

 Expected these arguments to match one of the following 2 option(s):

Option 1:
  Positional arguments (9 total):
    * {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids/input_ids')}
    * None
    * None
    * None
    * None
    * None
    * None
    * None
    * False
  Keyword arguments: {}

Option 2:
  Positional arguments (9 total):
    * {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids/input_ids')}
    * None
    * None
    * None
    * None
    * None
    * None
    * None
    * True
  Keyword arguments: {}

Call arguments received:
  • args=({'input_ids': 'tf.Tensor(shape=(None, 100), dtype=int32)', 'attention_mask': 'tf.Tensor(shape=(None, 100), dtype=int32)'},)
  • kwargs={'training': 'False'}

I am confused about why the saved model does not work the way the original did. Any suggestions?

Here is my full code:

import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
import transformers
from transformers import DistilBertTokenizer
from transformers import TFDistilBertForSequenceClassification
from transformers import TFTrainer, TFTrainingArguments

pd.set_option('display.max_colwidth', None)
BATCH_SIZE = 16
N_EPOCHS = 3

df = pd.read_csv('twitter.csv', names=["message", "label"], encoding='cp949')

X = list(df['message'])
y = list(df['label'])
y = list(pd.get_dummies(y,drop_first=True)[True])

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20, random_state = 0)

tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')

train_encodings = tokenizer(X_train, truncation=True, padding=True)
test_encodings = tokenizer(X_test, truncation=True, padding=True)

train_dataset = tf.data.Dataset.from_tensor_slices((
    dict(train_encodings),
    y_train
))

test_dataset = tf.data.Dataset.from_tensor_slices((
    dict(test_encodings),
    y_test
))

model = TFDistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")

# choose the optimizer
optimizerr = tf.keras.optimizers.Adam(learning_rate=5e-5)

# define the loss function (the model outputs raw logits, hence from_logits=True)
losss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# compile the model
model.compile(optimizer=optimizerr,
              loss=losss,
              metrics=['accuracy'])

history = model.fit(train_dataset.shuffle(len(X_train)).batch(BATCH_SIZE),
          epochs=N_EPOCHS,
          batch_size=BATCH_SIZE)

# model evaluation on the test set
model.evaluate(test_dataset.shuffle(len(X_test)).batch(BATCH_SIZE), 
               return_dict=True, 
               batch_size=BATCH_SIZE)

# tests
def predict_proba(text_list, model, tokenizer):  
    #tokenize the text
    encodings = tokenizer(text_list, 
                          max_length=1000, 
                          truncation=True, 
                          padding=True)
    #transform to tf.Dataset
    dataset = tf.data.Dataset.from_tensor_slices((dict(encodings)))
    #predict
    preds = model.predict(dataset.batch(1)).logits  
    
    #transform to array with probabilities
    res = tf.nn.softmax(preds, axis=1).numpy()      
    
    return res

examples = [
    'In 2008, several failing banks were bailed out partially using taxpayer money. Putting all money at bank provide risk, risk of devaluation, risk of inflation, risk of aggressive centralise policy. Decentralized system like bitcoin working on blockchain provide relief.',
    'Bitcoin is counterfeit. Disagree? Look again.',
    'Did I make a bad GPU purchase before the end of Ethereum mining?',
    '@Mamooetz Help. I created this bot to reply to ETH, BITCOIN, and NFT but I dont know how to shut it off.',
    '@WaldorickWilson Cryptocurrency doesnt have to be cryptic. Luno takes the complexity out of #Bitcoin and lets you buy, store, learn and earn all in one place',
]

result = predict_proba(examples, model, tokenizer)
print(result)

# save model
dataset_name = 'adv'
saved_model_path = './{}_bert'.format(dataset_name.replace('/', '_'))

# model.save(saved_model_path)
tf.saved_model.save(model, saved_model_path)

# load model
loaded_model = tf.saved_model.load(saved_model_path)
inference_function = loaded_model.signatures['serving_default']

reloaded = tf.keras.models.load_model(saved_model_path)
predict_proba(examples, reloaded, tokenizer)

【Question discussion】:

Please trim your code to make it easier to find your problem. Follow the guidelines for creating a minimal reproducible example.
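
Reading the trace in the question: both saved signatures (Option 1 and Option 2) accept only `input_ids` with shape (None, 5), while the call passes `input_ids` and `attention_mask`, each with shape (None, 100). The sketch below mimics that comparison in plain Python. It is a simplified illustration, not TensorFlow's actual matching logic, and the helper names `shapes_compatible` and `explain_mismatch` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Shape = Tuple[Optional[int], ...]


def shapes_compatible(expected: Shape, actual: Shape) -> bool:
    """A None in the expected shape accepts any size in that dimension."""
    if len(expected) != len(actual):
        return False
    return all(e is None or e == a for e, a in zip(expected, actual))


def explain_mismatch(expected: Dict[str, Shape], actual: Dict[str, Shape]) -> List[str]:
    """List the reasons a call does not fit a traced signature."""
    problems = []
    for name in sorted(set(actual) - set(expected)):
        problems.append(f"unexpected input '{name}'")
    for name in sorted(set(expected) - set(actual)):
        problems.append(f"missing input '{name}'")
    for name in sorted(set(expected) & set(actual)):
        if not shapes_compatible(expected[name], actual[name]):
            problems.append(
                f"'{name}': expected shape {expected[name]}, got {actual[name]}"
            )
    return problems


# Values copied from the error message in the question.
traced = {"input_ids": (None, 5)}
called = {"input_ids": (None, 100), "attention_mask": (None, 100)}

problems = explain_mismatch(traced, called)
for line in problems:
    print(line)
```

Both differences appear here: an extra `attention_mask` key and a sequence length of 100 where the trace fixed 5. That suggests the SavedModel was traced on a small placeholder input and not on the tokenized training data.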

Tags: python tensorflow keras

【Solution 1】:

I ran into the same problem. Have you solved it? @pcw

【Discussion】:
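
A possible workaround, offered as a sketch and not confirmed by anyone in this thread: Hugging Face models provide `save_pretrained` and `from_pretrained`, which store and restore the weights and config without going through SavedModel tracing. The helper names `save_finetuned` and `load_finetuned` and the directory `./adv_bert_hf` are assumptions for illustration. `predict_proba` is adapted from the question so that it calls the model directly instead of using `model.predict`.

```python
def save_finetuned(model, tokenizer, export_dir):
    """Write model weights/config and tokenizer files into export_dir."""
    model.save_pretrained(export_dir)
    tokenizer.save_pretrained(export_dir)


def load_finetuned(export_dir):
    """Rebuild the model and tokenizer from a save_pretrained directory."""
    # Imported here so the helpers can be defined without loading TensorFlow.
    from transformers import (
        DistilBertTokenizer,
        TFDistilBertForSequenceClassification,
    )

    model = TFDistilBertForSequenceClassification.from_pretrained(export_dir)
    tokenizer = DistilBertTokenizer.from_pretrained(export_dir)
    return model, tokenizer


def predict_proba(text_list, model, tokenizer, max_length=512):
    """Return class probabilities, one row per input text."""
    import tensorflow as tf

    # 512 is DistilBERT's maximum sequence length.
    encodings = tokenizer(
        text_list,
        max_length=max_length,
        truncation=True,
        padding=True,
        return_tensors="tf",
    )
    logits = model(dict(encodings), training=False).logits
    return tf.nn.softmax(logits, axis=1).numpy()


# Usage, assuming `model`, `tokenizer`, and `examples` from the question:
#   save_finetuned(model, tokenizer, "./adv_bert_hf")
#   reloaded, reloaded_tokenizer = load_finetuned("./adv_bert_hf")
#   print(predict_proba(examples, reloaded, reloaded_tokenizer))
```

Because the reloaded object is a regular `TFDistilBertForSequenceClassification` instance and not a set of traced functions, it should accept `attention_mask` and any sequence length up to the model limit.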