【Question title】: LSTM to predict sine wave
【Posted】: 2017-08-16 10:01:29
【Question description】:

I want to write a tutorial on using an LSTM in MXNet, based on an existing TensorFlow example (located at https://github.com/mouradmourafiq/tensorflow-lstm-regression/blob/master/lstm_sin.ipynb). This is my main code:

import mxnet as mx
import numpy as np
import pandas as pd
import argparse
import os
import sys
from data_processing import generate_data
import logging
head = '%(asctime)-15s %(message)s'
logging.basicConfig(level=logging.DEBUG, format=head)
TIMESTEPS = 3
BATCH_SIZE = 100
X, y = generate_data(np.sin, np.linspace(0, 100, 10000), TIMESTEPS, seperate=False)
train_iter = mx.io.NDArrayIter(X['train'], y['train'], batch_size=BATCH_SIZE, shuffle=True, label_name='lro_label')
eval_iter = mx.io.NDArrayIter(X['val'], y['val'], batch_size=BATCH_SIZE, shuffle=False)
test_iter = mx.io.NDArrayIter(X['test'], batch_size=BATCH_SIZE, shuffle=False)
num_layers = 3
num_hidden = 50

data = mx.sym.Variable('data')
label = mx.sym.Variable('lro_label')

stack = mx.rnn.SequentialRNNCell()
for i in range(num_layers):
    stack.add(mx.rnn.LSTMCell(num_hidden=num_hidden, prefix='lstm_l%d_'%i))
#stack.reset()
outputs, states = stack.unroll(length=TIMESTEPS,
                               inputs=data,
                               layout='NTC',
                               merge_outputs=True)

outputs = mx.sym.reshape(outputs, shape=(BATCH_SIZE, -1))
# fc1 maps the flattened LSTM output to shape (batch_size, 1); without it the label shape would not match the unrolled LSTM output shape.
outputs = mx.sym.FullyConnected(data=outputs, num_hidden=1, name='fc1')
label = mx.sym.reshape(label, shape=(-1,))
outputs = mx.sym.LinearRegressionOutput(data=outputs, 
                               label=label,
                               name='lro')
contexts = mx.cpu(0)
model = mx.mod.Module(symbol = outputs,
                     data_names = ['data'],
                     label_names = ['lro_label'])
model.fit(train_iter, eval_iter,
         optimizer_params = {'learning_rate':0.005},
         num_epoch=4,
         batch_end_callback=mx.callback.Speedometer(BATCH_SIZE, 2))

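This code runs, but train_accuracy is NaN. How can I make it correct? Also, since the unrolled output's shape includes sequence_length, how can it match the label shape? Does my fc1 layer make sense?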
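
The shape question can be illustrated without MXNet. The following is a minimal NumPy sketch, assuming the merged unrolled output uses layout NTC with shape (batch, TIMESTEPS, num_hidden). The random arrays and the weight names `w` and `b` are placeholders for illustration, not values from the model in the question. Flattening the time and hidden axes and applying one dense layer with a single output unit yields one prediction per sample, which is what a label of shape (batch,) requires.

```python
import numpy as np

BATCH_SIZE, TIMESTEPS, NUM_HIDDEN = 100, 3, 50

rng = np.random.default_rng(0)

# Stand-in for the merged output of the unrolled LSTM stack (layout NTC).
lstm_out = rng.standard_normal((BATCH_SIZE, TIMESTEPS, NUM_HIDDEN))

# Flatten the time and hidden axes: (100, 3, 50) -> (100, 150).
flat = lstm_out.reshape(BATCH_SIZE, -1)

# Placeholder dense layer with one output unit, playing the role of fc1.
w = rng.standard_normal((TIMESTEPS * NUM_HIDDEN, 1))
b = np.zeros(1)
pred = flat @ w + b  # shape (100, 1)

# Labels with one target per sample, flattened to shape (100,).
label = rng.standard_normal((BATCH_SIZE, 1)).reshape(-1)

print(flat.shape, pred.shape, label.shape)
```

Under the same assumption, another option is to keep only the last time step (`lstm_out[:, -1, :]`, shape (100, 50)) before the dense layer instead of flattening all time steps.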

【Comments】:

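  • The code is basically fine, and I eventually got it working. It runs slower than TF, though I don't know why. I may post an example in the MXNet tutorials as an LSTM starter example, because I find the MXNet examples are usually very complicated.
  • This runs fine for me and converges very quickly.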

Tags: mxnet


【Solution 1】:

Passing auto_reset=False to the Speedometer callback, e.g. batch_end_callback=mx.callback.Speedometer(BATCH_SIZE, 2, auto_reset=False), should fix the NaN train-acc.
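
Why the reset can matter is sketched below with a toy metric. The class `ToyAccuracy` and the function `run_epoch` are hypothetical and only mimic the general pattern of a running metric that reports NaN when it holds no samples; this is not MXNet's implementation. The assumption is that a callback which resets the metric right after logging can leave it empty at the moment the end-of-epoch value is read.

```python
import math


class ToyAccuracy:
    """Hypothetical running metric; returns NaN when no samples were seen."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.num_correct = 0
        self.num_seen = 0

    def update(self, correct, seen):
        self.num_correct += correct
        self.num_seen += seen

    def get(self):
        if self.num_seen == 0:
            return float("nan")
        return self.num_correct / self.num_seen


def run_epoch(num_batches, frequent, auto_reset):
    """Simulate one epoch with a logging callback every `frequent` batches."""
    metric = ToyAccuracy()
    for nbatch in range(1, num_batches + 1):
        metric.update(correct=80, seen=100)
        if nbatch % frequent == 0:
            metric.get()        # the callback logs the running value here
            if auto_reset:
                metric.reset()  # and then clears the metric
    return metric.get()         # end-of-epoch readout


with_reset = run_epoch(num_batches=4, frequent=2, auto_reset=True)
without_reset = run_epoch(num_batches=4, frequent=2, auto_reset=False)
print(with_reset, without_reset)
```

In this toy run the last batch triggers a reset, so the end-of-epoch readout is NaN with `auto_reset=True` and 0.8 with `auto_reset=False`.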

【Discussion】:
