【发布时间】:2017-08-16 10:01:29
【问题描述】:
这里我想生成一个 LSTM 在 MxNet 中的使用教程,以 Tensorflow 为例。 (位置https://github.com/mouradmourafiq/tensorflow-lstm-regression/blob/master/lstm_sin.ipynb" 这是我的主要代码
import mxnet as mx
import numpy as np
import pandas as pd
import argparse
import os
import sys
from data_processing import generate_data
import logging
head = '%(asctime)-15s %(message)s'
logging.basicConfig(level=logging.DEBUG, format=head)
TIMESTEPS = 3
BATCH_SIZE = 100
X, y = generate_data(np.sin, np.linspace(0, 100, 10000), TIMESTEPS, seperate=False)
train_iter = mx.io.NDArrayIter(X['train'], y['train'], batch_size=BATCH_SIZE, shuffle=True, label_name='lro_label')
eval_iter = mx.io.NDArrayIter(X['val'], y['val'], batch_size=BATCH_SIZE, shuffle=False)
test_iter = mx.io.NDArrayIter(X['test'], batch_size=BATCH_SIZE, shuffle=False)
num_layers = 3
num_hidden = 50
data = mx.sym.Variable('data')
label = mx.sym.Variable('lro_label')
stack = mx.rnn.SequentialRNNCell()
for i in range(num_layers):
stack.add(mx.rnn.LSTMCell(num_hidden=num_hidden, prefix='lstm_l%d_'%i))
#stack.reset()
outputs, states = stack.unroll(length=TIMESTEPS,
inputs=data,
layout='NTC',
merge_outputs=True)
outputs = mx.sym.reshape(outputs, shape=(BATCH_SIZE, -1))
# purpose of fc1 was to make shape change to (batch_size, *), or label shape won't match LSTM unrolled output shape.
outputs = mx.sym.FullyConnected(data=outputs, num_hidden=1, name='fc1')
label = mx.sym.reshape(label, shape=(-1,))
outputs = mx.sym.LinearRegressionOutput(data=outputs,
label=label,
name='lro')
contexts = mx.cpu(0)
model = mx.mod.Module(symbol = outputs,
data_names = ['data'],
label_names = ['lro_label'])
model.fit(train_iter, eval_iter,
optimizer_params = {'learning_rate':0.005},
num_epoch=4,
batch_end_callback=mx.callback.Speedometer(BATCH_SIZE, 2))
此代码运行,但 train_accuracy 为 Nan。 问题是如何使它正确? 并且由于展开的形状有sequence_length,它如何与标签形状匹配?我的 FC1 网络有意义吗?
【问题讨论】:
-
代码一般没有问题,我终于让它工作了。它的运行速度比 TF 慢。不知道为什么。我可能会在 MXNet 教程中发布一个示例作为 LSTM 的开始示例,因为我发现 MXNet 示例通常非常复杂。
-
这对我来说运行良好并且收敛速度超快。
标签: mxnet