【Question title】: Generating confusion matrix for keras model - Sentiment analysis
【Posted】: 2020-05-08 17:22:52
【Question description】:

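I am testing a sentiment analysis model that uses an LSTM. I need to add a confusion matrix to the classifier results and, if possible, precision, recall and F-measure values as well. So far I only have accuracy. The movie_reviews data has pos and neg labels.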

import tensorflow as tf
from tensorflow import keras
import numpy as np
from keras.layers import LSTM,Embedding,Dense
from keras.models import Sequential
from keras.preprocessing.sequence import pad_sequences

data = keras.datasets.imdb
(train_data, train_labels), (test_data, test_labels) = data.load_data(num_words=88000)
word_index = data.get_word_index()
word_index = {k:(v+3) for k, v in word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2
word_index["<UNUSED>"] = 3
reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
train_data = keras.preprocessing.sequence.pad_sequences(train_data, value=word_index["<PAD>"], padding="post", maxlen=250)
maxlen = 250
X_train_pad = pad_sequences(train_data,maxlen=maxlen)
X_test_pad = pad_sequences(test_data,maxlen=maxlen)
max_features = max([max(x) for x in X_train_pad] + 
               [max(x) for x in X_test_pad]) + 1
max_features
model = Sequential()
model.add(Embedding(max_features, 128))
model.add(LSTM(64, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
          optimizer='adam',
          metrics=['accuracy'])
model.summary()
x_val = train_data[:10000]
x_train = train_data[10000:]
y_val = train_labels[:10000]
y_train = train_labels[10000:]
fitModel = model.fit(x_train, y_train, epochs=1, batch_size=512, validation_data=(x_val,y_val),verbose=1)
from sklearn.metrics import confusion_matrix
y_pred = model.predict(test_data)
confusion_matrix = confusion_matrix(test_data, np.rint(test_labels))

When I generate the confusion matrix with the code above, I get the following error:

    confusion_matrix = confusion_matrix(test_data, np.rint(test_labels))
  File "/usr/lib/python2.7/dist-packages/sklearn/metrics/classification.py", line 250, in confusion_matrix
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/usr/lib/python2.7/dist-packages/sklearn/metrics/classification.py", line 81, in _check_targets
    "and {1} targets".format(type_true, type_pred))
ValueError: Classification metrics can't handle a mix of multiclass-multioutput and binary targets

How can we get the confusion matrix correctly?

【Question comments】:

    Tags: python tensorflow keras sentiment-analysis confusion-matrix


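    【Solution 1】:

    The failing call passes the input sequences (test_data) to confusion_matrix instead of the model's predictions, which is why scikit-learn reports a mix of multiclass-multioutput and binary targets. You only need to turn the predicted probabilities into class labels and compare them with test_labels: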

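    # Predict probabilities, flatten to 1-D, then threshold at 0.5 to get the class (0 if pred <= 0.5 else 1).
    # test_data is never padded in the question's code, so the padded X_test_pad is used here.
    y_pred = (model.predict(X_test_pad).ravel() > 0.5).astype(int)
    cm = confusion_matrix(test_labels, y_pred)  # true labels first, predictions second; cm avoids shadowing the function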
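
The question also asks for precision, recall and F-measure, which the snippet above does not cover. A minimal sketch of one way to get them with scikit-learn follows. The arrays `probs` and `y_true` are made-up stand-ins for the output of `model.predict(...)` and for `test_labels`, so the numbers say nothing about the real model.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support

# Hypothetical stand-ins: probs mimics the (n, 1) sigmoid output of model.predict,
# y_true mimics test_labels (0 = neg, 1 = pos).
probs = np.array([[0.9], [0.2], [0.7], [0.4], [0.6], [0.1]])
y_true = np.array([1, 0, 1, 1, 0, 0])

# Flatten to 1-D and threshold at 0.5 to obtain hard class labels.
y_pred = (probs.ravel() > 0.5).astype(int)

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()

# average="binary" reports the scores for the positive class (label 1).
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)

print(cm)
print("precision=%.3f recall=%.3f f1=%.3f" % (precision, recall, f1))
```

With the real model, `probs` would be replaced by the prediction on the padded test set and `y_true` by `test_labels`. `sklearn.metrics.classification_report(y_true, y_pred)` prints the same scores per class as a text table if a quick summary is enough.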
    

    【Discussion】:
