【Question Title】: Tensorflow / Keras: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=2
【Posted】: 2021-05-14 10:45:35
【Question Description】:

I am trying to implement a federated-training Keras/TensorFlow model to detect fake news in text articles, but I have run into a problem with the model. When I try to run the code, I get the following error:

 ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 50]

And the following warning:

WARNING:tensorflow:Model was constructed with shape (None, 400) for input Tensor("embedding_input:0", shape=(None, 400), dtype=float32), but it was called on an input with incompatible shape (None,).

Intuitively, I understand that the embedding layer's output should have shape (None, 400, 50), but for some reason it is only being fed a 2-D input; in other words, the layer expects a 3-D tensor but only receives a 2-D one. However, I don't know how to fix this, or how to change the input/output shapes so that they match. I have been stuck on this for several days, and I am still new to ML and neural networks. Any advice is greatly appreciated; thanks in advance.

The model used:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Activation, Dropout

max_words = 2000
max_len = 400
embed_dim = 50
lstm_out = 64
batch_size = 32

def getTextModel():
    model = Sequential()
    model.add(Embedding(max_words, embed_dim, input_length=max_len, input_shape=preprocessed_sample_dataset.element_spec))
    model.add(LSTM(lstm_out))
    model.add(Dense(256))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, name='out_layer'))
    model.add(Activation('sigmoid'))
    return model

Model summary:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 400, 50)           100000    
_________________________________________________________________
lstm (LSTM)                  (None, 64)                29440     
_________________________________________________________________
dense (Dense)                (None, 256)               16640     
_________________________________________________________________
activation (Activation)      (None, 256)               0         
_________________________________________________________________
dropout (Dropout)            (None, 256)               0         
_________________________________________________________________
out_layer (Dense)            (None, 1)                 257       
_________________________________________________________________
activation_1 (Activation)    (None, 1)                 0         
=================================================================
Total params: 146,337
Trainable params: 146,337
Non-trainable params: 0

Additional information:

Data preprocessing:

import collections

def preprocess(dataset):

  def batch_format_fn(element):
    """Return the text features and the label as an `OrderedDict`."""
    print(element['features'])
    return collections.OrderedDict(
        x=element['features'],
        y=tf.reshape(element['label'], [-1, 1])
    )
  return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER).batch(
      BATCH_SIZE).map(batch_format_fn).prefetch(PREFETCH_BUFFER)

preprocessed_sample_dataset = preprocess(sample_dataset)


def make_federated_data(client_data, client_ids):
    return [preprocess(client_data.create_tf_dataset_for_client(x)) for x in client_ids]

federated_train_data = make_federated_data(train_dataset, train_dataset.client_ids)

print('Number of client datasets: {l}'.format(l=len(federated_train_data)))
print('First dataset: {d}'.format(d=federated_train_data[0]))

Dataset format:

Number of client datasets: 4
First dataset: <PrefetchDataset shapes: OrderedDict([(x, (None,)), (y, (None, 1))]), types: OrderedDict([(x, tf.string), (y, tf.int64)])>

Code that calls the function:

def model_fn():

  keras_model = getTextModel() #create_keras_model()
  input_spec_aux = preprocessed_sample_dataset.element_spec
  return tff.learning.from_keras_model(
      keras_model,
      input_spec= input_spec_aux,
      loss=tf.keras.losses.SparseCategoricalCrossentropy(),
      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])

#Error occurs in iterative_process
iterative_process = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.Adam(learning_rate=client_lr),
    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=server_lr))

print(str(iterative_process.initialize.type_signature))

state = iterative_process.initialize()

【Question Discussion】:

    Tags: python tensorflow machine-learning keras tensorflow-federated


    【Solution 1】:

    The dataset format says that the input x has shape (None,) (ndim/rank = 1) and dtype tf.string. The None comes from the fact that the dataset may yield batches that are not "full", so in practice the first dimension is in the range [1, BATCH_SIZE]. This shape means we have batches of single scalar strings. That is likely the problem: typically with an LSTM we want batches of sequences of strings, e.g. a shape like (None, SEQUENCE_LENGTH).
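
    This rank-1 string shape can be confirmed directly on the dataset's element spec (a quick check, assuming the preprocessed_sample_dataset from the question is in scope):

    print(preprocessed_sample_dataset.element_spec)
    # OrderedDict([('x', TensorSpec(shape=(None,), dtype=tf.string, name=None)),
    #              ('y', TensorSpec(shape=(None, 1), dtype=tf.int64, name=None))])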

    The embedding layer projects the last dimension into the embedding dimension z, e.g. taking a shape (x, y) and producing a shape (x, y, z). So here the input after the embedding layer will be (None, 50) (i.e. ndim/rank = 2). Recalling that the LSTM wants sequences and Keras wants batches, the error message is saying that the desired shape is (None, SEQUENCE_LENGTH, 50) (ndim/rank = 3).
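
    A minimal sketch of this shape behavior, reusing the vocabulary size and embedding dimension from the question's model:

    import tensorflow as tf

    emb = tf.keras.layers.Embedding(input_dim=2000, output_dim=50)

    # A batch of integer-id sequences -> rank 3, which the LSTM accepts.
    ids_2d = tf.zeros((32, 400), dtype=tf.int32)   # shape (batch, sequence_length)
    print(emb(ids_2d).shape)                       # (32, 400, 50)

    # A batch of scalar ids -> rank 2, which reproduces the ndim error downstream.
    ids_1d = tf.zeros((32,), dtype=tf.int32)       # shape (batch,)
    print(emb(ids_1d).shape)                       # (32, 50)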

    I would suggest going back to the dataset and determining what the format of element['features'] is. In this case it is probably a full sentence that needs to be tokenized into a sequence of words (e.g. by splitting on whitespace for English).
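
    As a sketch of that tokenization step (the sample sentences here are made up):

    import tensorflow as tf

    # Split whitespace-delimited sentences into word tokens.
    sentences = tf.constant(["some fake news article", "another headline"])
    tokens = tf.strings.split(sentences)           # RaggedTensor of shape (2, None)
    # Pad to a dense rank-2 string tensor so every row has the same length.
    print(tokens.to_tensor(default_value=""))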

    One word of warning: even after fixing the shapes, I suspect Keras will next complain that the tf.string dtype cannot be used in the embedding layer. The sequences first need to be converted to integer ids, perhaps using something from tf.lookup or from tf_text.
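
    One possible sketch of that conversion using tf.lookup, as suggested above (the vocabulary here is a placeholder; a real one would be built from the training corpus):

    import tensorflow as tf

    # Map word tokens to integer ids with a static vocabulary table.
    vocab = tf.constant(["fake", "news", "article", "headline"])
    init = tf.lookup.KeyValueTensorInitializer(
        keys=vocab,
        values=tf.range(tf.size(vocab, out_type=tf.int64)))
    table = tf.lookup.StaticVocabularyTable(init, num_oov_buckets=1)

    tokens = tf.strings.split(tf.constant(["some fake news article"]))
    # Look up each token's id; out-of-vocabulary words land in the OOV bucket.
    ids = tf.ragged.map_flat_values(table.lookup, tokens)
    print(ids.to_tensor())                         # dense int64 ids, ready for Embedding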


    【Discussion】:
