【Title】: Keras LSTM + TensorFlow and a number sequence (improve loss)
【Posted】: 2018-06-05 19:21:17
【Description】:

First of all, I am running with the following setup:

  • Running on Windows 10
  • Python 3.6.2
  • TensorFlow 1.8.0
  • Keras 2.1.6

I am trying to predict, or at least guess, the following number sequence: https://codepen.io/anon/pen/RJRPPx (limited to 20,000 test entries); the full sequence contains about 1 million records.

Here is the code (run.py):

import lstm
import time
import matplotlib.pyplot as plt

def plot_results(predicted_data, true_data):
    fig = plt.figure(facecolor='white')
    ax = fig.add_subplot(111)
    ax.plot(true_data, label='True Data')
    plt.plot(predicted_data, label='Prediction')
    plt.legend()
    plt.show()

def plot_results_multiple(predicted_data, true_data, prediction_len):
    fig = plt.figure(facecolor='white')
    ax = fig.add_subplot(111)
    ax.plot(true_data, label='True Data')
    #Pad the list of predictions to shift it in the graph to its correct start
    for i, data in enumerate(predicted_data):
        padding = [None for p in range(i * prediction_len)]
        plt.plot(padding + data, label='Prediction')
        plt.legend()
    plt.show()

#Main Run Thread
if __name__=='__main__':
    global_start_time = time.time()
    epochs  = 10
    seq_len = 50

    print('> Loading data... ')

    X_train, y_train, X_test, y_test = lstm.load_data('dice_amplified/primeros_20_mil.csv', seq_len, True)

    print('> Data Loaded. Compiling...')

    model = lstm.build_model([1, 50, 100, 1])

    model.fit(
        X_train,
        y_train,
        batch_size = 512,
        epochs=epochs,  # 'nb_epoch' is the deprecated Keras 1 name for this argument
        validation_split=0.05)

    predictions = lstm.predict_sequences_multiple(model, X_test, seq_len, 50)
    #predicted = lstm.predict_sequence_full(model, X_test, seq_len)
    #predicted = lstm.predict_point_by_point(model, X_test)        

    print('Training duration (s) : ', time.time() - global_start_time)
    plot_results_multiple(predictions, y_test, 50)

I have tried:

  • Increasing and decreasing the number of epochs.
  • Increasing and decreasing the batch size.
  • Amplifying the data.

The training log below is from a run with:

  • epochs = 10
  • batch_size = 512
  • validation_split = 0.05

Also, as far as I understand, the loss should decrease as the number of epochs increases? That does not seem to be happening!

Using TensorFlow backend.
> Loading data...
> Data Loaded. Compiling...
> Compilation Time :  0.03000473976135254
Train on 17056 samples, validate on 898 samples
Epoch 1/10
17056/17056 [==============================] - 31s 2ms/step - loss: 29927.0164 - val_loss: 289.8873
Epoch 2/10
17056/17056 [==============================] - 29s 2ms/step - loss: 29920.3513 - val_loss: 290.1069
Epoch 3/10
17056/17056 [==============================] - 29s 2ms/step - loss: 29920.4602 - val_loss: 292.7868
Epoch 4/10
17056/17056 [==============================] - 27s 2ms/step - loss: 29915.0955 - val_loss: 286.7317
Epoch 5/10
17056/17056 [==============================] - 26s 2ms/step - loss: 29913.6961 - val_loss: 298.7889
Epoch 6/10
17056/17056 [==============================] - 26s 2ms/step - loss: 29920.2068 - val_loss: 287.5138
Epoch 7/10
17056/17056 [==============================] - 28s 2ms/step - loss: 29914.0650 - val_loss: 295.2230
Epoch 8/10
17056/17056 [==============================] - 25s 1ms/step - loss: 29912.8860 - val_loss: 295.0592
Epoch 9/10
17056/17056 [==============================] - 28s 2ms/step - loss: 29907.4067 - val_loss: 286.9338
Epoch 10/10
17056/17056 [==============================] - 46s 3ms/step - loss: 29914.6869 - val_loss: 289.3236

Any suggestions? How can I improve this? Thanks!

Contents of lstm.py:

import os
import time
import warnings
import numpy as np
from numpy import newaxis
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
from keras.models import Sequential

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' #Hide messy TensorFlow warnings
warnings.filterwarnings("ignore") #Hide messy Numpy warnings

def load_data(filename, seq_len, normalise_window):
    f = open(filename, 'rb').read()
    data = f.decode().split('\n')

    sequence_length = seq_len + 1
    result = []
    for index in range(len(data) - sequence_length):
        result.append(data[index: index + sequence_length])

    if normalise_window:
        result = normalise_windows(result)

    result = np.array(result, dtype=float) #cast to float: windows are still strings when normalise_window is False

    row = round(0.9 * result.shape[0])
    train = result[:int(row), :]
    np.random.shuffle(train)
    x_train = train[:, :-1]
    y_train = train[:, -1]
    x_test = result[int(row):, :-1]
    y_test = result[int(row):, -1]

    x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
    x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))  

    return [x_train, y_train, x_test, y_test]

def normalise_windows(window_data):
    normalised_data = []
    for window in window_data:
        normalised_window = [((float(p) / float(window[0])) - 1) for p in window]
        normalised_data.append(normalised_window)
    return normalised_data

def build_model(layers):
    model = Sequential()

    model.add(LSTM(
        layers[1],  # 'units'; replaces the deprecated 'output_dim' argument
        input_shape=(layers[1], layers[0]),
        return_sequences=True))
    model.add(Dropout(0.2))

    model.add(LSTM(
        layers[2],
        return_sequences=False))
    model.add(Dropout(0.2))

    model.add(Dense(layers[3]))  # 'output_dim' is deprecated in Keras 2
    model.add(Activation("linear"))

    start = time.time()
    model.compile(loss="mse", optimizer="rmsprop")
    print("> Compilation Time : ", time.time() - start)
    return model

def predict_point_by_point(model, data):
    #Predict each timestep given the last sequence of true data, in effect only predicting 1 step ahead each time
    predicted = model.predict(data)
    predicted = np.reshape(predicted, (predicted.size,))
    return predicted

def predict_sequence_full(model, data, window_size):
    #Shift the window by 1 new prediction each time, re-run predictions on new window
    curr_frame = data[0]
    predicted = []
    for i in range(len(data)):
        predicted.append(model.predict(curr_frame[newaxis,:,:])[0,0])
        curr_frame = curr_frame[1:]
        curr_frame = np.insert(curr_frame, [window_size-1], predicted[-1], axis=0)
    return predicted

def predict_sequences_multiple(model, data, window_size, prediction_len):
    #Predict sequence of 50 steps before shifting prediction run forward by 50 steps
    prediction_seqs = []
    for i in range(int(len(data)/prediction_len)):
        curr_frame = data[i*prediction_len]
        predicted = []
        for j in range(prediction_len):
            predicted.append(model.predict(curr_frame[newaxis,:,:])[0,0])
            curr_frame = curr_frame[1:]
            curr_frame = np.insert(curr_frame, [window_size-1], predicted[-1], axis=0)
        prediction_seqs.append(predicted)
    return prediction_seqs
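The rolling-window feedback used by predict_sequence_full and predict_sequences_multiple can be sanity-checked without Keras. A minimal sketch (the helper name roll_forward and the mean-returning stub predictor are illustrative assumptions, not part of the original code) of how each prediction is pushed back into the input window:

```python
import numpy as np

def roll_forward(predict_fn, first_frame, window_size, prediction_len):
    """Mimics the inner loop of predict_sequences_multiple for one window:
    each new prediction is appended and the oldest timestep is dropped."""
    curr_frame = first_frame  # shape (window_size, 1)
    predicted = []
    for _ in range(prediction_len):
        predicted.append(predict_fn(curr_frame[np.newaxis, :, :])[0, 0])
        curr_frame = curr_frame[1:]
        curr_frame = np.insert(curr_frame, [window_size - 1], predicted[-1], axis=0)
    return predicted

# Stub predictor (an assumption standing in for model.predict): returns the window mean.
stub = lambda batch: np.array([[batch[0].mean()]])

frame = np.arange(5, dtype=float).reshape(5, 1)  # window [0, 1, 2, 3, 4]
preds = roll_forward(stub, frame, window_size=5, prediction_len=3)
print(preds)  # with the mean stub, approximately [2.0, 2.4, 2.68]
```

This makes it easy to see that any bias in a single-step prediction compounds over the 50-step rollout, which is why the flat predictions in the plots track the training problem above.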

Appendix:

Following nuric's suggestion, I modified the model as follows:

def build_model(layers):
    model = Sequential()
    model.add(LSTM(layers[1], input_shape=(layers[1], layers[0]), return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(layers[2], return_sequences=False))
    model.add(Dropout(0.2))
    model.add(Dense(layers[3]))
    model.add(Activation("linear"))
    model.add(Dense(64, input_dim=50, activation='relu'))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(1))
    start = time.time()
    model.compile(loss="mse", optimizer="rmsprop")
    print("> Compilation Time : ", time.time() - start)
    return model

Still a bit lost on this one…

【Discussion】:

    Tags: python tensorflow keras sequence lstm


    【Solution 1】:

    Even though you normalise the input, you don't normalise the output. The LSTM has a tanh output by default, which means you will have a limited feature space, i.e. the dense layer won't be able to regress to large numbers.
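    To make this concrete: the question's normalise_windows scales each input window relative to its first value, and the same scaling can be applied to the target, with an inverse transform applied to predictions afterwards. A minimal numpy sketch of that idea (the helper names are illustrative assumptions, not from the original code):

```python
import numpy as np

def normalise_pair(window, target):
    """Scale a window and its target relative to the window's first value,
    mirroring normalise_windows from the question but covering the output too."""
    base = float(window[0])
    norm_window = [float(p) / base - 1 for p in window]
    norm_target = float(target) / base - 1
    return norm_window, norm_target, base

def denormalise(prediction, base):
    """Invert the scaling to recover a value in the original range."""
    return (prediction + 1) * base

w, t, base = normalise_pair([200.0, 210.0, 190.0], 220.0)
# w is approximately [0.0, 0.05, -0.05]; t is approximately 0.1,
# so the target now lives in the same small range as the inputs.
print(denormalise(t, base))  # recovers roughly 220.0
```

    With the target in the same bounded range as the inputs, the tanh-limited feature space stops being a bottleneck, and the loss values become comparable across windows.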

    You have a fixed-length numeric input of shape (50,). You can pass that directly to Dense layers with relu activation, which will perform much better on this regression task, something like:

    model = Sequential()
    model.add(Dense(64, input_dim=50, activation='relu'))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(1))
    

    For regression it is better to use l2 regularisers instead of Dropout, since you are not really doing feature extraction as you would for classification.
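    What the l2 suggestion amounts to: instead of randomly dropping units, a penalty lambda * sum(w**2) is added to the loss, shrinking the weights towards zero. A minimal numpy sketch of the regularised objective (the lambda and weight values are made-up numbers for illustration):

```python
import numpy as np

def mse_with_l2(y_true, y_pred, weights, lam=0.01):
    """Mean squared error plus an L2 (ridge) weight penalty --
    the regularised loss suggested for regression instead of Dropout."""
    mse = np.mean((y_true - y_pred) ** 2)
    penalty = lam * np.sum(weights ** 2)
    return mse + penalty

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])
weights = np.array([0.5, -1.0, 2.0])

# mse ~ 0.02, penalty = 0.01 * 5.25 = 0.0525, total ~ 0.0725
print(mse_with_l2(y_true, y_pred, weights))
```

    In Keras this corresponds to passing kernel_regularizer=regularizers.l2(0.01) to a Dense layer rather than following it with a Dropout layer.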

    【Comments】:

    • Thanks a lot @nuric, but I'm a bit new to this… could you explain further with a code example? I get the idea, but I don't even know where to start. Thanks again!
    • I suggest looking into deep learning regression models and how they differ from classification problems. You may not even need a deep learning approach; a random-forest-style method such as xgboost might work better.
    • Thanks @nuric; the problem is that I know even less about random forests than about deep learning. From what I understood, I tried modifying the model by adding the 3 Dense layers you mentioned; I should probably remove some of the earlier layers… As for xgboost, I will start reading about it right away.
    • By any chance, do you know of any good tutorial I could use as a reference? The only one I found is here: quantinsti.com/blog/…, which uses a single sequence but is written in R… something similar in Python would be ideal.
    • This post might help.