【Question title】: ValueError: logits and labels must have the same shape ((None, 10, 82) vs (None, 1))
【Posted】: 2021-01-21 19:09:08
【Question】:

I am training an LSTM network, but I get the following error:

ValueError: logits and labels must have the same shape ((None, 10, 82) vs (None, 1))

I can't tell where the error in the input shape comes from. Any help would be appreciated. Thanks!

# The next step is to split training and testing data. For this we will use sklearn function train_test_split().
features_train, features_test, labels_train, labels_test = train_test_split(features, labels, test_size=0.2)

# features and labels shape
features_train = features_train.reshape(len(features_train), 1, features_train.shape[1])

features_train.shape


(180568, 1, 82)

model = Sequential()
model.add(LSTM(10, input_shape=(features_train.shape[1:])))
model.add(Embedding(180568, 82))
model.add(Dense(67, activation='softmax'))
model.add(Dropout(0.2))
model.add(Activation('sigmoid'))
model.build()

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_3 (LSTM)                (None, 10)                3720      
_________________________________________________________________
embedding_3 (Embedding)      (None, 10, 82)            14806576  
_________________________________________________________________
dropout_3 (Dropout)          (None, 10, 82)            0         
_________________________________________________________________
activation_3 (Activation)    (None, 10, 82)            0         
=================================================================
Total params: 14,810,296
Trainable params: 14,810,296
Non-trainable params: 0
history = model.fit(features_train,
                    labels_train,
                    epochs=15,
                    batch_size=128,
                    validation_data=(features_test, labels_test))
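For reference, the summary above already shows where the mismatch comes from: nothing after the LSTM reduces the time dimension, so the model's final output has shape (None, 10, 82) while labels_train has shape (None, 1). A minimal sketch that reproduces the same output shape (the vocabulary size 1000 is made up, since the real data isn't shown):

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Embedding

# Same layer order as in the question: LSTM first, then Embedding.
# Embedding appends a feature axis, so the output rank grows from 2 to 3.
model = Sequential([
    LSTM(10, input_shape=(1, 82)),  # -> (None, 10)
    Embedding(1000, 82),            # -> (None, 10, 82); 1000 is a made-up vocab size
])
print(model.output_shape)  # (None, 10, 82) -- cannot be matched against labels of shape (None, 1)
```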

【Question comments】:

    Tags: python tensorflow keras


    【Solution 1】:
    1. Remove the reshape operation.
    2. Put the Embedding layer before the LSTM layer.
    3. Remove the Dropout after the output layer (it makes no sense there).
    4. Remove the sigmoid activation that follows the softmax activation.
    import tensorflow as tf
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import *
    
    model = Sequential()
    model.add(Embedding(180568, 82))
    model.add(LSTM(10))
    model.add(Dense(67, activation='softmax'))
    # model.add(Dropout(0.2))
    # model.add(Activation('sigmoid'))
    model.build()
    
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.summary()
    
    model(tf.random.uniform((10, 82))).shape
    
    TensorShape([10, 67])
    

    Change your loss function to "sparse_categorical_crossentropy", since your labels appear to be integers.
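    To illustrate the difference (a sketch with made-up values): categorical_crossentropy expects one-hot targets of shape (batch, num_classes), while sparse_categorical_crossentropy takes plain integer class ids of shape (batch,). On equivalent targets the two compute the same loss:

```python
import numpy as np
import tensorflow as tf

num_classes = 67
probs = tf.nn.softmax(tf.random.uniform((4, num_classes)))  # fake predictions
labels_int = np.array([3, 10, 0, 66])                       # integer ids -> sparse loss
labels_onehot = tf.one_hot(labels_int, num_classes)         # one-hot -> categorical loss

sparse_loss = tf.keras.losses.SparseCategoricalCrossentropy()(labels_int, probs)
dense_loss = tf.keras.losses.CategoricalCrossentropy()(labels_onehot, probs)
print(float(sparse_loss), float(dense_loss))  # identical up to float rounding
```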

    【Comments】:

    • I tried the implementation as you suggested, but it leads me to another error: InvalidArgumentError: indices[121,51] = -1 is not in [0, 180568) [[node sequential/embedding/embedding_lookup (defined at <ipython-input-55-315ed48de10a>:5) ]] [Op:__inference_train_function_5301] Errors may have originated from an input operation. Input Source operations connected to node sequential/embedding/embedding_lookup: sequential/embedding/embedding_lookup/4127 (defined at /home/jpandeinge/anaconda3/lib/python3.7/contextlib.py:112) Function call stack: train_function
    • You should look at what kind of input the Embedding layer expects. I can't really help you further here, since that is beyond the scope of this question. Feel free to ask a new question, but I would start by reading the Embedding layer's documentation.
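    • For the follow-up error: InvalidArgumentError: indices[121,51] = -1 is not in [0, 180568) means the features contain a value (-1) that is not a valid row index into the embedding table. A quick way to detect, and one possible way to repair, out-of-range ids (a sketch on made-up data; whether -1 marks padding or unknown tokens depends on how the features were encoded):

```python
import numpy as np

vocab_size = 180568
features = np.array([[1, 5, -1], [0, 3, 2]])  # made-up token ids containing a -1

out_of_range = (features < 0) | (features >= vocab_size)
print(out_of_range.any())  # True -> an Embedding lookup would fail exactly as in the traceback

# One option: map every invalid id to a dedicated index 0
# (assumption: index 0 can serve as a padding/OOV slot in this encoding)
features_fixed = np.where(out_of_range, 0, features)
print(features_fixed.min() >= 0)  # True
```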