【Question title】: CNN RNN integration for images
【Posted】: 2023-03-28 17:12:01
【Question】:

I am trying to combine a CNN and an LSTM for MNIST images with the following code:

from __future__ import division, print_function, absolute_import
import tensorflow as tf
import tflearn
import numpy as np
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.normalization import local_response_normalization
from tflearn.layers.estimator import regression

import tflearn.datasets.mnist as mnist
height = 128
width = 128
X, Y, testX, testY = mnist.load_data(one_hot=True)
X = X.reshape([-1, 28, 28, 1])
testX = testX.reshape([-1, 28, 28, 1])

# Building convolutional network
network = tflearn.input_data(shape=[None, 28, 28,1], name='input')
network = tflearn.conv_2d(network, 32, 3, activation='relu',regularizer="L2")
network = tflearn.max_pool_2d(network, 2)
network = tflearn.local_response_normalization(network)
network = tflearn.conv_2d(network, 64, 3, activation='relu',regularizer="L2")
network = tflearn.max_pool_2d(network, 2)
network = tflearn.local_response_normalization(network)
network = fully_connected(network, 128, activation='tanh')
network = dropout(network, 0.8)
network = fully_connected(network, 256, activation='tanh')
network = dropout(network, 0.8)
network = tflearn.reshape(network, [-1, 1, 28*28])
#lstm
network = tflearn.lstm(network, 128, return_seq=True)
network = tflearn.lstm(network, 128)
network = tflearn.fully_connected(network, 10, activation='softmax')
network = tflearn.regression(network, optimizer='adam',
                     loss='categorical_crossentropy', name='target')

#train
model = tflearn.DNN(network, tensorboard_verbose=0)
model.fit(X, Y, n_epoch=1, validation_set=0.1, show_metric=True,snapshot_step=100)

The CNN takes a 4D tensor and the LSTM takes a 3D one, so I reshaped the network like this: network = tflearn.reshape(network, [-1, 1, 28*28])

But running it produces this error:

InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 16384 values, but the requested shape requires a multiple of 784 [[Node: Reshape/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](Dropout_1/cond/Merge, Reshape/Reshape/shape)]]

I don't understand why it expects a tensor of size 16384, and even if I hard-code 128*128 it still does not work! I am completely stuck.

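【Comments】:

  • Why are you reshaping the output of the last dropout layer to [-1, 1, 28*28]?
  • The LSTM needs a 3D tensor as input.
  • You are misreading the error message. The requested output shape is incompatible with the input shape. Note that 28*28 = 784.
  • So what should the correct shape be? I'm new to this.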

Tags: python-3.x tensorflow conv-neural-network lstm tflearn


【Answer 1】:

The error is in this line:

network = tflearn.reshape(network, [-1, 1, 28*28])

The preceding FC layer has n_units=256, so each sample carries 256 values and cannot be reshaped to 28*28 = 784. Change this line to:

network = tflearn.reshape(network, [-1, 1, 256])
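
The 16384 in the error message is most likely just the batch size times the FC width. The sketch below checks that arithmetic in plain Python. It assumes a batch size of 64 (which, as far as I know, is the tflearn default when none is passed), and `infer_reshape` is a hypothetical helper written for illustration, not a tflearn or TensorFlow function; it only mimics how a reshape resolves a single -1 dimension.

```python
from math import prod


def infer_reshape(num_values, target_shape):
    """Resolve a single -1 in target_shape, mimicking the reshape rule.

    Hypothetical helper for illustration only.
    """
    if target_shape.count(-1) != 1:
        raise ValueError("expected exactly one -1 in the target shape")
    known = prod(d for d in target_shape if d != -1)
    if num_values % known != 0:
        raise ValueError(f"{num_values} values is not a multiple of {known}")
    return [num_values // known if d == -1 else d for d in target_shape]


batch_size = 64   # assumption: default batch size, not stated in the question
fc_units = 256    # n_units of the last fully_connected layer
num_values = batch_size * fc_units  # 64 * 256 = 16384

# The corrected shape resolves cleanly.
print(infer_reshape(num_values, [-1, 1, 256]))  # [64, 1, 256]

# The original shape fails, because 16384 is not divisible by 784.
try:
    infer_reshape(num_values, [-1, 1, 28 * 28])
except ValueError as err:
    print(err)  # 16384 values is not a multiple of 784
```

Hard-coding 128*128 fails for a similar reason: the reshape would then yield one row of 16384 features per batch rather than one row per image, so the batch dimension no longer lines up with the labels.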

Note that you are feeding the LSTM the features produced by the CNN, not the input MNIST images.
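
To see what the LSTM actually receives, it can help to trace the per-sample shape through the layers. This is a rough sketch in plain Python, not tflearn code: `trace_feature_shapes` is a hypothetical helper, and it assumes 'same' padding with stride 1 for conv_2d and a 2x2 pool with stride 2 and 'same' padding for max_pool_2d, which I believe match the tflearn defaults used in the question.

```python
def trace_feature_shapes(height=28, width=28, channels=1):
    """Return (layer, per-sample shape) pairs up to the LSTM input.

    Hypothetical helper. local_response_normalization and dropout are
    omitted because they do not change the shape.
    """
    h, w, c = height, width, channels
    trace = [("input_data", (h, w, c))]
    for filters in (32, 64):
        # Assumed 'same' padding, stride 1: height and width are unchanged.
        c = filters
        trace.append((f"conv_2d({filters})", (h, w, c)))
        # Assumed 2x2 pool, stride 2, 'same' padding: ceil(n / 2).
        h, w = -(-h // 2), -(-w // 2)
        trace.append(("max_pool_2d(2)", (h, w, c)))
    # A fully connected layer flattens its input and outputs n_units values.
    trace.append(("fully_connected(128)", (128,)))
    trace.append(("fully_connected(256)", (256,)))
    # One timestep of 256 features per sample.
    trace.append(("reshape([-1, 1, 256])", (1, 256)))
    return trace


for layer, shape in trace_feature_shapes():
    print(f"{layer:24s} {shape}")
```

Under these assumptions the LSTM sees a sequence of length 1 with 256 features per image, so the pixel layout of the original 28x28 image is already gone by the time the data reaches it.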
