【Question title】: How to pass a `None` batch size to TensorFlow's dynamic RNN?
【Posted】: 2017-12-14 05:12:47
【Question】:

I am trying to build a CNN + LSTM + CTC model for word recognition.
Starting from an image, I extract features from the word image with a CNN and build a sequence of feature vectors, which I pass to the RNN as sequential data.

This is how I convert the features into sequential data:
[[a1,b1,c1],[a2,b2,c2],[a3,b3,c3]] -> [[a1,a2,a3],[b1,b2,b3],[c1,c2,c3]]
where a, b, and c are 3 features extracted with the CNN.
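This column-to-time-step mapping can be sketched with NumPy (the shapes here are illustrative, not taken from the question); it mirrors the `tf.transpose(model, [3, 0, 1, 2])` followed by the reshape in the code below:

```python
import numpy as np

# Fake CNN output: batch=2 images, height=1, width=3, channels=4.
features = np.arange(2 * 1 * 3 * 4, dtype=np.float32).reshape(2, 1, 3, 4)

# Move the channel axis to the front (as tf.transpose(model, [3, 0, 1, 2])),
# then flatten the remaining spatial dims per step (as the tf.reshape).
seq = np.transpose(features, (3, 0, 1, 2)).reshape(4, -1, 1 * 3)
print(seq.shape)  # (4, 2, 3): time-major [channels, batch, height*width]
```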
Currently I can pass a constant batch size to the model as common.BATCH_SIZE, but what I want is to be able to pass a variable batch size to the model.
How can this be done?

inputs = tf.placeholder(tf.float32, [common.BATCH_SIZE, common.OUTPUT_SHAPE[1], common.OUTPUT_SHAPE[0], 1])
# Here we use sparse_placeholder that will generate a
# SparseTensor required by ctc_loss op.
targets = tf.sparse_placeholder(tf.int32)

# 1d array of size [batch_size]
seq_len = tf.placeholder(tf.int32, [common.BATCH_SIZE])

model = tf.layers.conv2d(inputs, 64, (3, 3), strides=(1, 1), padding='same', name='c1')
model = tf.layers.max_pooling2d(model, (3, 3), strides=(2, 2), padding='same', name='m1')
model = tf.layers.conv2d(model, 128, (3, 3), strides=(1, 1), padding='same', name='c2')
model = tf.layers.max_pooling2d(model, (3, 3), strides=(2, 2), padding='same', name='m2')
model = tf.transpose(model, [3, 0, 1, 2])
shape = model.get_shape().as_list()
model = tf.reshape(model, [shape[0], -1, shape[2] * shape[3]])

cell = tf.nn.rnn_cell.LSTMCell(common.num_hidden, state_is_tuple=True)
cell = tf.nn.rnn_cell.DropoutWrapper(cell, input_keep_prob=0.5, output_keep_prob=0.5)
stack = tf.nn.rnn_cell.MultiRNNCell([cell] * common.num_layers, state_is_tuple=True)

outputs, _ = tf.nn.dynamic_rnn(cell, model, seq_len, dtype=tf.float32, time_major=True)



Update:

batch_size = tf.placeholder(tf.int32, None, name='batch_size')

inputs = tf.placeholder(tf.float32, [batch_size, common.OUTPUT_SHAPE[1], common.OUTPUT_SHAPE[0], 1])
# Here we use sparse_placeholder that will generate a
# SparseTensor required by ctc_loss op.
targets = tf.sparse_placeholder(tf.int32)

# 1d array of size [batch_size]
seq_len = tf.placeholder(tf.int32, [batch_size])

model = tf.layers.conv2d(inputs, 64, (3, 3), strides=(1, 1), padding='same', name='c1')
model = tf.layers.max_pooling2d(model, (3, 3), strides=(2, 2), padding='same', name='m1')
model = tf.layers.conv2d(model, 128, (3, 3), strides=(1, 1), padding='same', name='c2')
model = tf.layers.max_pooling2d(model, (3, 3), strides=(2, 2), padding='same', name='m2')
model = tf.transpose(model, [3, 0, 1, 2])
shape = model.get_shape().as_list()
model = tf.reshape(model, [shape[0], -1, shape[2] * shape[3]])

cell = tf.nn.rnn_cell.LSTMCell(common.num_hidden, state_is_tuple=True)
cell = tf.nn.rnn_cell.DropoutWrapper(cell, input_keep_prob=0.5, output_keep_prob=0.5)
stack = tf.nn.rnn_cell.MultiRNNCell([cell] * common.num_layers, state_is_tuple=True)

outputs, _ = tf.nn.dynamic_rnn(cell, model, seq_len, dtype=tf.float32, time_major=True)


I get an error like this:

    Traceback (most recent call last):
      File "lstm_and_ctc_ocr_train.py", line 203, in <module>
        train()
      File "lstm_and_ctc_ocr_train.py", line 77, in train
        logits, inputs, targets, seq_len, batch_size = model.get_train_model()
      File "/home/himanshu/learning-tf/tf/code/tensorflow_lstm_ctc_ocr/model.py", line 20, in get_train_model
        inputs = tf.placeholder(tf.float32, [batch_size, common.OUTPUT_SHAPE[1], common.OUTPUT_SHAPE[0], 1])
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1530, in placeholder
        return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 1954, in _placeholder
        name=name)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 705, in apply_op
        attr_value.shape.CopyFrom(_MakeShape(value, key))
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 198, in _MakeShape
        return tensor_shape.as_shape(v).as_proto()
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 798, in as_shape
        return TensorShape(shape)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 434, in __init__
        self._dims = [as_dimension(d) for d in dims_iter]
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 376, in as_dimension
        return Dimension(value)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 32, in __init__
        self._value = int(value)
    TypeError: int() argument must be a string or a number, not 'Tensor'

【Question comments】:

    Tags: tensorflow deep-learning conv-neural-network rnn


    【Solution 1】:

    You should be able to pass batch_size to the dynamic RNN as a placeholder. In my experience, the only headache you may run into is not specifying its shape in advance: you should pass [] to make things work, like this:

    batchsize = tf.placeholder(tf.int32, [], name='batchsize')

    Then feed its value during sess.run() in the usual way. This worked well for me, training with a large batch size and then generating with a batch size of 1.

    But strictly speaking, you don't even need to specify the batch size for dynamic_rnn, right? You need it if you use MultiRNNCell to get the zero state, but I don't see you doing that in your code...
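    For the zero-state case mentioned above, a scalar placeholder of shape [] works as the batch size. A minimal sketch (the hidden size and layer count are made up, and it uses the TF1-style API via tf.compat.v1 so it also runs on newer TensorFlow installs):

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

num_hidden, num_layers = 8, 2  # made-up sizes, not from the question

# Scalar placeholder: shape [] rather than None or [1].
batchsize = tf.placeholder(tf.int32, [], name='batchsize')

cells = [tf.nn.rnn_cell.LSTMCell(num_hidden, state_is_tuple=True)
         for _ in range(num_layers)]
stack = tf.nn.rnn_cell.MultiRNNCell(cells, state_is_tuple=True)

# zero_state accepts a scalar Tensor, so the batch size can vary per run.
init_state = stack.zero_state(batchsize, tf.float32)

with tf.Session() as sess:
    state = sess.run(init_state, feed_dict={batchsize: 4})
    print(state[0].c.shape)  # (4, 8)
```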

    *** Update:

    As discussed in the comments, your problem seems to have nothing to do with dynamic_rnn; rather, you are using one placeholder (batch_size) inside the shape of another placeholder (seq_len). Here is code that reproduces the same error:

    import tensorflow as tf
    
    a = tf.placeholder(tf.int32, None, name='a')
    b = tf.placeholder(tf.int32, [a, 5], name='b')
    c = b * 5
    
    with tf.Session() as sess:
        C = sess.run(c, feed_dict={a:1, b:[[1,2,3,4,5]]})
    

    Here is the error:

    TypeError: int() argument must be a string, a bytes-like object or a number, not 'Tensor'
    

    Before going further with dynamic_rnn, I'd suggest finding a way around this problem, either by changing your code or by asking a separate question about how to fudge it with placeholders.
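    One common way around it (a sketch of an assumed workaround, not part of the original answer): leave the batch dimension as None in the placeholder shapes and recover the runtime value with tf.shape(), so no placeholder ever appears inside another placeholder's shape. Written in the TF1-style API via tf.compat.v1; H and W are stand-ins for common.OUTPUT_SHAPE[1] and common.OUTPUT_SHAPE[0]:

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

H, W = 32, 100  # illustrative stand-ins, not the question's actual values

# None, not a Tensor, marks the variable batch dimension.
inputs = tf.placeholder(tf.float32, [None, H, W, 1])
seq_len = tf.placeholder(tf.int32, [None])  # its length tracks the batch

# tf.shape() yields the runtime shape, usable in ops such as tf.reshape.
dynamic_batch = tf.shape(inputs)[0]

with tf.Session() as sess:
    b = sess.run(dynamic_batch,
                 feed_dict={inputs: np.zeros((3, H, W, 1), np.float32)})
    print(b)  # 3
```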

    【Comments】:

    • I tried your suggestion and I still get the error mentioned in the updated question.
    • The reason I have to use a fixed batch_size is that I am converting image data into sequential data with model = tf.reshape(model, [shape[0], -1, shape[2]*shape[3]]), so if I put None in the shape I get an error, because you can't multiply an integer by None.
    • Looking at your updated post and the error stack, I don't think the problem is related to dynamic_rnn. I think the problem is that you are feeding one placeholder (batch_size) into the shape specification of another placeholder (seq_len). I'll update my answer with minimal code that reproduces this error. I'm not sure how to fix it; perhaps you can change the logic of your code to supply that value some other way. It would be best to ask a separate question specifically about this; you'll be much more likely to get an answer. Check my updated answer for the code.