TypeError: Inconsistency in the inner graph of scan 'scan_fn' .... 'TensorType(float64, col)' and 'TensorType(float64, matrix)'

【Question title】: TypeError: Inconsistency in the inner graph of scan 'scan_fn' .... 'TensorType(float64, col)' and 'TensorType(float64, matrix)'
【Posted】: 2016-08-05 04:35:00
【Question】:

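When I try to run my LSTM program (for variable-length inputs), I get the following error.

TypeError: Inconsistency in the inner graph of scan 'scan_fn' : an input and an output are associated with the same recurrent state and should have the same type but have type 'TensorType(float64, col)' and 'TensorType(float64, matrix)' respectively.

My program is based on the LSTM example for the IMDB sentiment analysis problem given here: http://deeplearning.net/tutorial/lstm.html. My data is not IMDB data but sensor data.

I have shared my source code (lstm_var_length.py) and my data (data.npz).

From the error above and some Googling, I understand that there is a problem with the vector/matrix dimensions in my functions. Here are the function definitions where the problem arises: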

def lstm_layer(shared_params, input_ex, options):
    """
    LSTM Layer implementation. (Variable Length inputs)

    Parameters
    ----------
    shared_params: shared model parameters W, U, b etc
    input_ex: input example (say dimension: 36 x 100 i.e 36 features and 100 time units)
    options: Neural Network model options

    Output / returns
    ----------------
    output of each lstm cell [h_0, h_1, ..... , h_t]
    """

    def slice(param, slice_no, height):
        return param[slice_no*height : (slice_no+1)*height, :]

    def cell(wxb, ht_1, ct_1):
        pre_activation = tensor.dot(shared_params['U'], ht_1)
        pre_activation += wxb

        height = options['hidden_dim']
        ft = tensor.nnet.sigmoid(slice(pre_activation, 0, height))
        it = tensor.nnet.sigmoid(slice(pre_activation, 1, height))
        c_t = tensor.tanh(slice(pre_activation, 2, height))
        ot = tensor.nnet.sigmoid(slice(pre_activation, 3, height))

        ct = ft * ct_1 + it * c_t
        ht = ot * tensor.tanh(ct)

        return ht, ct

    wxb = tensor.dot(shared_params['W'], input_ex) + shared_params['b']
    num_frames = input_ex.shape[1]
    result, updates = theano.scan(cell,
                                  sequences=[wxb.transpose()],
                                  outputs_info=[tensor.alloc(numpy.asarray(0., dtype=floatX),
                                                             options['hidden_dim'], 1),
                                                tensor.alloc(numpy.asarray(0., dtype=floatX),
                                                             options['hidden_dim'], 1)],
                                  n_steps=num_frames)

    return result[0]  # only ht is needed


def build_model(shared_params, options):
    """
    Build the complete neural network model and return the symbolic variables

    Parameters
    ----------
    shared_params: shared, model parameters W, U, b etc
    options: Neural Network model options

    return
    ------
    x, y, f_pred_prob, f_pred, cost
    """

    x = tensor.matrix(name='x', dtype=floatX)
    y = tensor.iscalar(name='y')  # tensor.vector(name='y', dtype=floatX)

    num_frames = x.shape[1]
    # lstm outputs from each cell
    lstm_result = lstm_layer(shared_params, x, options)
    # mean pool from the lstm cell outputs
    pool_result = lstm_result.sum(axis=1)/(1. * num_frames)
    # Softmax / Logistic Regression
    pred = tensor.nnet.softmax(tensor.dot(shared_params['softmax_W'], pool_result) +
                               shared_params['softmax_b'])
    # predicted probability function
    theano.printing.debugprint(pred)
    f_pred_prob = theano.function([x], pred, name='f_pred_prob', mode='DebugMode')  # 'DebugMode' <-- Problem seems to occur at this point
    # predicted class
    f_pred = theano.function([x], pred.argmax(axis=0), name='f_pred')
    # cost of the model: -ve log likelihood
    offset = 1e-8   # an offset to prevent log(0)
    cost = -tensor.log(pred[y-1, 0] + offset)    # y = 1,2,...n but indexing is 0,1,..(n-1)

    return x, y, f_pred_prob, f_pred, cost

The error above is raised when trying to compile the f_pred_prob Theano function.

The exception and call stack are as follows:

  File "/home/inblueswithu/Documents/Theano_Trails/lstm_var_length.py", line 450, in <module>
    main()
  File "/home/inblueswithu/Documents/Theano_Trails/lstm_var_length.py", line 447, in main
    train_lstm(model_options, train, valid)
  File "/home/inblueswithu/Documents/Theano_Trails/lstm_var_length.py", line 314, in train_lstm
    (x, y, f_pred_prob, f_pred, cost) = build_model(shared_params, options)
  File "/home/inblueswithu/Documents/Theano_Trails/lstm_var_length.py", line 95, in build_model
    f_pred_prob = theano.function([x], pred, name='f_pred_prob', mode='DebugMode') # 'DebugMode'
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function.py", line 320, in function
    output_keys=output_keys)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 479, in pfunc
    output_keys=output_keys)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 1777, in orig_function
    defaults)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/debugmode.py", line 2571, in create
    storage_map=storage_map)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 690, in make_thunk
    storage_map=storage_map)[:3]
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/debugmode.py", line 1809, in make_all
    no_recycling)
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 730, in make_thunk
    self.validate_inner_graph()
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 249, in validate_inner_graph
    (self.name, type_input, type_output))
TypeError: Inconsistency in the inner graph of scan 'scan_fn' : an input and an output are associated with the same recurrent state and should have the same type but have type 'TensorType(float64, col)' and 'TensorType(float64, matrix)' respectively.

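I have been debugging this for a week but cannot find the problem. I suspect that the initialization of outputs_info in theano.scan is the cause, but when I remove the second dimension (1), the program fails even earlier, before it reaches the f_pred_prob function (near lstm_result). I am not sure where the problem lies.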

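The problem can be reproduced by placing the data file in the same directory as the Python source file and running the program.

Please help me.

Thanks and regards.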

【Comments】:

Tags: python machine-learning theano deep-learning theano.scan

【Solution 1】:

Use

outputs_info=[tensor.unbroadcast(tensor.alloc(numpy.asarray(0., dtype=floatX),
                                              options['hidden_dim'], 1),1),
              tensor.unbroadcast(tensor.alloc(numpy.asarray(0., dtype=floatX),
                                              options['hidden_dim'], 1),1)]

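instead of the original outputs_info.

This is because the second dimension of tensor.alloc(numpy.asarray(0., dtype=floatX), options['hidden_dim'], 1) is 1, so Theano automatically marks that dimension as broadcastable and types the tensor variable as a col rather than a matrix. That is the 'TensorType(float64, col)' in the error message: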

TypeError: Inconsistency in the inner graph of scan 'scan_fn' : an input and an output are associated with the same recurrent state and should have the same type but have type 'TensorType(float64, col)' and 'TensorType(float64, matrix)' respectively.

Wrapping the initial states in tensor.unbroadcast, as shown above, avoids this problem.
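To make the col/matrix distinction concrete, here is a minimal sketch in plain Python. It is a toy model, not Theano itself: it assumes only what the answer states, namely that a dimension allocated with the constant size 1 is flagged as broadcastable and that scan requires the initial state and the step output to have the same type. The names alloc_pattern, unbroadcast and check_recurrent_state are hypothetical helpers invented for this illustration, and hidden_dim = 4 is an arbitrary value.

```python
# Toy model (not Theano itself) of how a 2-D tensor type is named from its
# broadcastable pattern. All function names here are hypothetical.

TYPE_NAMES = {
    (False, False): "matrix",
    (False, True): "col",
    (True, False): "row",
}


def alloc_pattern(*shape):
    """Pattern for an alloc-style call: a dimension given as the
    constant 1 is flagged as broadcastable (True)."""
    return tuple(dim == 1 for dim in shape)


def unbroadcast(pattern, *axes):
    """Clear the broadcastable flag on the given axes."""
    return tuple(False if i in axes else flag
                 for i, flag in enumerate(pattern))


def check_recurrent_state(initial, step_output):
    """Mimic the scan check: the initial state and the step output
    must have exactly the same type."""
    if initial != step_output:
        raise TypeError("type mismatch: %r vs %r"
                        % (TYPE_NAMES.get(initial),
                           TYPE_NAMES.get(step_output)))


hidden_dim = 4                # assumed value for options['hidden_dim']
step_output = (False, False)  # the 'matrix' type named in the error message

initial = alloc_pattern(hidden_dim, 1)
print(TYPE_NAMES[initial])    # col

fixed = unbroadcast(initial, 1)
print(TYPE_NAMES[fixed])      # matrix
check_recurrent_state(fixed, step_output)  # passes: both are 'matrix'
```

Calling check_recurrent_state(initial, step_output) with the unmodified pattern raises a TypeError in this model, which corresponds to the mismatch reported in the question.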

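【Comments】:

【Solution 2】:

I think I have found the problem. I had to recheck the dimensions of all the matrices. I still need to go through my code carefully. Once that is done, I will post the new code.

Thanks.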
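This answer does not say which dimensions were wrong, so the following is only a sketch of how such a recheck could be done by hand for the cell function in the question. It assumes NumPy-style broadcasting rules, assumes that U has shape (4 * hidden_dim, hidden_dim), and relies on scan passing one slice along the first axis of wxb.transpose() to each step, which is a 1-D vector. broadcast_shape is a hypothetical helper, not a Theano function, and hidden_dim = 3 is an arbitrary value.

```python
def broadcast_shape(a, b):
    """Return the shape produced by NumPy-style broadcasting of two shapes."""
    # Pad the shorter shape with leading 1s so both have the same rank.
    ndim = max(len(a), len(b))
    a = (1,) * (ndim - len(a)) + tuple(a)
    b = (1,) * (ndim - len(b)) + tuple(b)
    out = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            out.append(da)
        elif da == 1:
            out.append(db)
        else:
            raise ValueError("shapes %r and %r do not broadcast" % (a, b))
    return tuple(out)


hidden_dim = 3  # assumed value for options['hidden_dim']

# dot(U, ht_1): U assumed (4h, h), ht_1 is (h, 1), so the product is (4h, 1).
u_dot_h = (4 * hidden_dim, 1)

# One step of scan over wxb.transpose() is a 1-D vector of length 4h.
wxb_step = (4 * hidden_dim,)
print(broadcast_shape(u_dot_h, wxb_step))  # (12, 12)

# If the per-step input were a column instead, the sum would stay a column.
wxb_col = (4 * hidden_dim, 1)
print(broadcast_shape(u_dot_h, wxb_col))   # (12, 1)
```

Under these assumptions, adding a 1-D wxb to the (4h, 1) product yields a full (4h, 4h) matrix rather than a column, so the line pre_activation += wxb may be a good place to start checking.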

【Comments】: