Answers: Wrong dimensions when trying to use model.predict() from Keras

【Question title】: Wrong dimensions when trying to use model.predict() from Keras
【Posted】: 2017-12-18 22:49:21
【Question】:

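I think the code speaks for itself: I trained a model, and now I want to use it to predict on some new input data. However, the new input data seems to have the wrong dimensions. Below you can see the code for the model and for the prediction attempt, along with the error message.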

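import numpy
import pandas as pd
from keras.layers import Dense, Dropout, Embedding, LSTM
from keras.models import Sequential
from keras.preprocessing.sequence import pad_sequences
from keras.preprocessing.text import Tokenizer
from sklearn.model_selection import train_test_split

tokenizer = Tokenizer(num_words=10000)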

df = pd.read_csv('/home/paperspace/Sentiment Analysis Dataset.csv', index_col = 0,
                 error_bad_lines = False)

y = list(df['Sentiment'])

tokenizer.fit_on_texts(list(df['SentimentText']))
X = tokenizer.texts_to_sequences(list(df['SentimentText']))
X = pad_sequences(X)

print("Done, fitting on texts.")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, shuffle = True)

model = Sequential()
# Create the word embeddings.
embedding_vector_dim = 32
model.add(Embedding(10000, embedding_vector_dim, input_length=X.shape[1]))
model.add(Dropout(0.2))
model.add(LSTM(128))
model.add(Dropout(0.2))         
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.summary()


model.fit(numpy.array(X_train), numpy.array(y_train),
          batch_size=128,
          epochs=1,
          validation_data=(numpy.array(X_test), numpy.array(y_test)))
score, acc = model.evaluate(numpy.array(X_test),numpy.array(y_test),
                            batch_size=128)

model.save('./sentiment_seq.h5')

print('Test score:', score)
print('Test accuracy:', acc)

Now the prediction attempt and the error message.

text = "this is actually a very bad movie."
tokenizer = Tokenizer()

tokenizer.fit_on_texts(list(text))
X = tokenizer.texts_to_sequences(list(text))
X = pad_sequences(X)
X_flat = np.array([X.flatten()])


model = load_model('sentiment_test.h5')
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
print(model.predict(X, batch_size = 1, verbose = 1))

ValueError: Error when checking : expected embedding_1_input to have shape (None, 116) but got array with shape (1, 38)

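So basically, why am I getting this error when the preprocessing is the same for training and prediction, and how could I have known what the expected input shape should be before seeing the error message?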

【Comments】:

Tags: python-3.x machine-learning tensorflow deep-learning keras

【Solution 1】:

If you are not using a fixed input length, you should not define input_length in the Embedding layer.
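
To illustrate why the embedding step itself does not need a fixed length, here is a minimal NumPy-only sketch of an embedding lookup. It is not Keras' implementation; the names embedding_matrix and embed are hypothetical, and the sizes simply mirror the numbers in the question (10000 words, 32 dimensions, lengths 116 and 38).

```python
import numpy as np

# Hypothetical sizes mirroring the question: 10000-word vocabulary, 32-dim vectors.
vocab_size, embedding_dim = 10000, 32
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(vocab_size, embedding_dim))


def embed(token_ids):
    """Look up one vector per token id; this works for any sequence length."""
    return embedding_matrix[np.asarray(token_ids)]


long_batch = np.zeros((1, 116), dtype=int)   # length seen during training
short_batch = np.zeros((1, 38), dtype=int)   # length seen at prediction time

print(embed(long_batch).shape)   # (1, 116, 32)
print(embed(short_batch).shape)  # (1, 38, 32)
```

The lookup is the same for both lengths; the shape check in the error message comes from having declared input_length=X.shape[1] on the Embedding layer, which pins the second dimension to 116.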

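【Discussion】:

I seem to have figured it out. I had assumed I was using a fixed input length because I was padding the sentences, but I clearly had to find out what X.shape[1] was and also pad all new input to that length when predicting.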
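
The idea in this comment can be sketched without Keras. The helpers below (fit_word_index, to_sequences, pad_to_length) are hypothetical stand-ins for Tokenizer.fit_on_texts, texts_to_sequences and pad_sequences, and the two training sentences are made up for illustration. The point is only that the word index and the padded length are both taken from training and reused at prediction time.

```python
import numpy as np


def _words(text):
    """Lowercase, split on whitespace, and strip basic punctuation (simplified)."""
    stripped = (w.strip(".,!?") for w in text.lower().split())
    return [w for w in stripped if w]


def fit_word_index(texts):
    """Stand-in for fit_on_texts: map each word to an id; 0 is reserved for padding."""
    word_index = {}
    for text in texts:
        for word in _words(text):
            word_index.setdefault(word, len(word_index) + 1)
    return word_index


def to_sequences(texts, word_index):
    """Stand-in for texts_to_sequences: words unknown to the index are dropped."""
    return [[word_index[w] for w in _words(t) if w in word_index] for t in texts]


def pad_to_length(sequences, maxlen):
    """Left-pad with zeros (and left-truncate) so every row has exactly maxlen ids."""
    padded = np.zeros((len(sequences), maxlen), dtype=int)
    for row, seq in enumerate(sequences):
        seq = seq[-maxlen:]
        if seq:
            padded[row, -len(seq):] = seq
    return padded


# "Training": build the word index and remember the padded length.
train_texts = ["this is a very good movie indeed", "what a bad film"]  # made-up data
word_index = fit_word_index(train_texts)
train_seqs = to_sequences(train_texts, word_index)
train_len = max(len(s) for s in train_seqs)  # plays the role of X.shape[1]
X_train = pad_to_length(train_seqs, train_len)

# "Prediction": reuse the SAME word index and the SAME length.
text = "this is actually a very bad movie."
X_new = pad_to_length(to_sequences([text], word_index), train_len)

print(X_train.shape)  # (2, 7)
print(X_new.shape)    # (1, 7)
```

Note that the new text is wrapped as [text]; list(text) would split the string into individual characters. With Keras itself, the corresponding step would presumably be to keep the tokenizer fitted during training (rather than fitting a new one) and call pad_sequences with maxlen set to the training length, which model.input_shape also reports.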