Keras LSTM with variable sentence length: the list of Numpy arrays passed to the model is not the size the model expected

[Question title]: Keras LSTM with Variable Sentences Length -- the list of Numpy arrays passing to model not size model expected
[Posted]: 2019-04-27 20:35:09
[Question description]:

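I have successfully worked through an LSTM tutorial for generating music. However, I am struggling to build one for language, which is my main interest. I have a word index, and below are two example sentences from my data.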

Sample predictors (x):

[[1],
 [1, 6],
 [1, 6, 241],
 [1, 6, 241, 252],
 [1, 6, 241, 252, 11],
 [1, 6, 241, 252, 11, 59],
 [1, 6, 241, 252, 11, 59, 2],
 [1, 6, 241, 252, 11, 59, 2, 62],
 [1, 6, 241, 252, 11, 59, 2, 62, 663],
 [1, 6, 241, 252, 11, 59, 2, 62, 663, 41],
 [1],
 [1, 3],
 [1, 3, 216],
 [1, 3, 216, 227],
 [1, 3, 216, 227, 26],
 [1, 3, 216, 227, 26, 30],
 [1, 3, 216, 227, 26, 30, 5]]

Sample labels (y):

[[6],
[241],
[252],
[11],
[59],
[2],
[62],
[663],
[41],
[1],
[3],
[216],
[227],
[26],
[30],
[5],
[1]]
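
For reference, predictor/label pairs of this shape can be produced from tokenized sentences with a sketch like the one below. The helper name `make_prefix_pairs` and its `boundary_token` parameter are hypothetical. The sketch assumes that the full sentence is paired with the start token `1`, which is what the sample above appears to do.

```python
def make_prefix_pairs(sentences, boundary_token=None):
    """Turn each sentence into (prefix, next_token) training pairs."""
    predictors, labels = [], []
    for sentence in sentences:
        # Prefix sentence[:i] predicts the token at position i.
        for i in range(1, len(sentence)):
            predictors.append(list(sentence[:i]))
            labels.append([sentence[i]])
        # Assumption: the complete sentence predicts a boundary token.
        if boundary_token is not None:
            predictors.append(list(sentence))
            labels.append([boundary_token])
    return predictors, labels


sentences = [
    [1, 6, 241, 252, 11, 59, 2, 62, 663, 41],
    [1, 3, 216, 227, 26, 30, 5],
]
x, y = make_prefix_pairs(sentences, boundary_token=1)
print(len(x), len(y))  # 17 17
print(x[:3])           # [[1], [1, 6], [1, 6, 241]]
print(y[:3])           # [[6], [241], [252]]
```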

This is how the loss should be computed; it is the quantity I want to minimize.

My LSTM code is:

from keras.models import Model
from keras import layers
from keras import Input

vocabulary_size = len(word_index)
dimensions = 200


text_input = Input(shape=(None,))
embedded = layers.Embedding(vocabulary_size, dimensions)(text_input)
encoded = layers.LSTM(vocabulary_size)(embedded)
output = layers.Dense(vocabulary_size, activation='softmax')(encoded)
model = Model(text_input, output)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['acc'])
model.summary()
model.fit(x, y, epochs=10, batch_size=1)

To accommodate variable sentence lengths, I set:

batch_size = 1, because the sentence lengths vary
the shape of Input to (None,)

However, I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-64-95228d843a72> in <module>()
     25               metrics=['acc'])
     26 model.summary()
---> 27 model.fit(x, y, epochs=1, batch_size=1)

C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
    950             sample_weight=sample_weight,
    951             class_weight=class_weight,
--> 952             batch_size=batch_size)
    953         # Prepare validation data.
    954         do_validation = False

C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
    749             feed_input_shapes,
    750             check_batch_axis=False,  # Don't enforce the batch size.
--> 751             exception_prefix='input')
    752 
    753         if y is not None:

C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    100                 'Expected to see ' + str(len(names)) + ' array(s), '
    101                 'but instead got the following list of ' +
--> 102                 str(len(data)) + ' arrays: ' + str(data)[:200] + '...')
    103         elif len(names) > 1:
    104             raise ValueError(

ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 17 arrays: [array([[1]]), array([[1],
       [6]]), array([[  1],
       [  6],
       [241]]), array([[  1],
       [  6],
       [241],
       [252]]), array([[  1],
       [  6],
       [241],
       [252],
 ...

Instead of a list of lists, I tried converting the data to a list of numpy arrays, but that did not change the error. This was suggested here: keras list of Numpy arrays not the size model expected

x = [np.array(i) for i in x]
y = [np.array(i) for i in y]

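Why does this error occur even though I deliberately built the model to handle arrays of different lengths?

The error seems to be caused by the format of my predictors (x). At least, that is what I think the message indicates.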
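
The error text is consistent with that reading: Keras treats a Python list passed as x as one array per model input, so 17 sequences look like 17 inputs to a model that has only one. The sequences also cannot simply be stacked into a single array, because their lengths differ. A minimal, library-free sketch of that check follows; the helper name `is_rectangular` is hypothetical.

```python
def is_rectangular(sequences):
    """Return True when every sequence has the same length."""
    return len({len(seq) for seq in sequences}) <= 1


ragged = [[1], [1, 6], [1, 6, 241]]
padded = [[0, 0, 1], [0, 1, 6], [1, 6, 241]]

print(is_rectangular(ragged))  # False: cannot form one (samples, timesteps) array
print(is_rectangular(padded))  # True: can be passed as a single 2D array
```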

[Comments]:

How did you solve this problem?

Tags: python keras lstm

[Solution 1]:

I may be misremembering, but I believe you should do

y = np.array(y).reshape(-1)  # y is a Python list in the question, so convert it before reshaping

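to get the flat list of labels that Keras seems to want here. Training with a batch_size of 1 will also hurt your performance. I would recommend pre-padding your data with 0s to a fixed length, enabling masking (mask_zero=True on the Embedding layer, since the LSTM layer itself has no mask=True argument), and using a standard batch size.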
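
A minimal sketch of that recommendation, under several assumptions that are not in the original post: the helper names (`pre_pad`, `flatten_labels`, `build_model`, `train`) are hypothetical, the LSTM width of 128 and the batch size of 32 are arbitrary, word indices are assumed to start at 1 so that 0 is free for padding, and the loss is swapped to `sparse_categorical_crossentropy` because the labels are integer class ids rather than one-hot vectors.

```python
def pre_pad(sequences, pad_value=0):
    """Left-pad every sequence with pad_value up to the longest length."""
    max_len = max(len(seq) for seq in sequences)
    return [[pad_value] * (max_len - len(seq)) + list(seq) for seq in sequences]


def flatten_labels(labels):
    """[[6], [241]] -> [6, 241], the same result reshape(-1) gives on an array."""
    return [label[0] for label in labels]


def build_model(vocabulary_size, dimensions=200, units=128):
    # Imported lazily so the helpers above work without Keras installed.
    from keras import Input, layers
    from keras.models import Model

    text_input = Input(shape=(None,))
    # Index 0 is reserved for padding, hence vocabulary_size + 1 entries.
    embedded = layers.Embedding(vocabulary_size + 1, dimensions,
                                mask_zero=True)(text_input)
    encoded = layers.LSTM(units)(embedded)
    output = layers.Dense(vocabulary_size + 1, activation='softmax')(encoded)
    model = Model(text_input, output)
    # Integer labels pair with the sparse variant of the loss.
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['acc'])
    return model


def train(x, y, vocabulary_size, epochs=10, batch_size=32):
    import numpy as np

    x_arr = np.array(pre_pad(x))         # one array, shape (samples, max_len)
    y_arr = np.array(flatten_labels(y))  # shape (samples,)
    model = build_model(vocabulary_size)
    model.fit(x_arr, y_arr, epochs=epochs, batch_size=batch_size)
    return model
```

With the variables from the question, the call would look like `train(x, y, len(word_index))`. Because x becomes a single 2D array, Keras no longer sees a list of 17 separate inputs.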

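[Discussion]:

I believe the error occurs on the input, though.
What is the input shape of x? For Keras it should be something like (samples, timesteps, 1). If it is 2D, you should expand it with x.reshape(x.shape + (1,)). Judging by the error, it looks correct.
The input shape here is variable. I originally tried preprocessing with padding. This is my first time building an LSTM, and my model overfit badly (roughly 30% training accuracy versus 6% validation accuracy), so I switched to this approach with arrays of different lengths. Is padding the standard preprocessing? I am confused as to why PhD-level people cannot agree on the best approach.
My x list contains about 58,635 lists.
@StanShunpike You should definitely pre-pad, so that your gradients are more consistent with batches larger than 1. I suspect it is hard to learn otherwise. All you need to do is set mask_zero=True in the embedding layer and pre-pad the inputs.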
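
To make the effect of mask_zero=True concrete: the Embedding layer marks every position equal to 0 as padding, and downstream layers that support masking, such as LSTM, skip those timesteps. The sketch below reproduces that mask in plain Python on a hand-padded batch. `padding_mask` is a hypothetical helper for illustration, not a Keras function.

```python
def padding_mask(padded_batch, pad_value=0):
    """True marks a real token, False marks a padded position."""
    return [[token != pad_value for token in row] for row in padded_batch]


# Three prefixes from the question, pre-padded with 0 to length 3.
padded = [
    [0, 0, 1],
    [0, 1, 6],
    [1, 6, 241],
]
mask = padding_mask(padded)
print(mask)
# [[False, False, True], [False, True, True], [True, True, True]]

# Number of timesteps the LSTM would actually process per sample.
print([sum(row) for row in mask])  # [1, 2, 3]
```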