【Question Title】: Time taken for training data with LSTM
【Posted】: 2019-08-15 10:33:12
【Question】:

My dataset contains 10 days of data for 1,000 users. I train and test a separate model per user to get better prediction accuracy. The problem is that the first user takes about 5 seconds to train for 100 epochs, while the 100th user takes more than five minutes for the same 100 epochs; the training time grows with every user. Since the location points are categorical, they are one-hot encoded. How can I reduce the training time?

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

users = list_users[:100]  # avoid shadowing the built-in `list`
with open("accuracy_Lstm.csv", "w") as f:
    f.write('user,LSTM\n')
    for user in users:
        user_data = newdataframe[newdataframe.user == user]
        encoded = encoding(user_data)
        # build (previous step -> next step) training pairs,
        # holding back the last 192 rows for test input / ground truth
        X_train, y_train = [], []
        for i in range(1, len(encoded) - 96):
            X_train.append(encoded[i - 1])
            y_train.append(encoded[i])
        X_train, y_train = np.array(X_train), np.array(y_train)
        X_test = encoded[-192:-96, :]
        X_true = encoded[-96:, :]
        time_steps = 1
        X_trainL = X_train.reshape(X_train.shape[0], time_steps, X_train.shape[1])

        # LSTM
        model = Sequential()
        model.add(LSTM(X_train.shape[1], input_shape=(time_steps, X_train.shape[1]), activation='relu'))
        model.add(Dense(X_train.shape[1]))
        model.compile(loss='mse', optimizer='adam')
        model.fit(X_trainL, y_train, batch_size=96, epochs=100, verbose=1)
        model.summary()

        X_testL = X_test.reshape(X_test.shape[0], time_steps, X_test.shape[1])

        predL = one_hot_decode(model.predict(X_testL))
        true = one_hot_decode(X_true)
        try:
            accuracy = (sum(x == y for x, y in zip(predL, true)) / len(predL)) * 100
        except ZeroDivisionError:
            accuracy = 0
        f.write('%d,%f\n' % (user, accuracy))

How can I reduce the per-user training time?
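The `encoding` and `one_hot_decode` helpers used above are not shown in the question. A minimal numpy sketch of what they might look like (hypothetical names and shapes, assuming each input is a 1-D sequence of integer location labels) could be:

```python
import numpy as np

def encoding(labels, n_classes=None):
    """One-hot encode a 1-D sequence of integer category labels.

    Hypothetical stand-in for the `encoding` helper in the question;
    the real one presumably maps a user's location column the same way.
    """
    labels = np.asarray(labels)
    if n_classes is None:
        n_classes = labels.max() + 1
    onehot = np.zeros((len(labels), n_classes))
    onehot[np.arange(len(labels)), labels] = 1.0
    return onehot

def one_hot_decode(onehot):
    """Invert one-hot (or softmax-like) rows back to integer labels."""
    return list(np.argmax(onehot, axis=1))

encoded = encoding([2, 0, 1])
print(one_hot_decode(encoded))  # [2, 0, 1]
```

Note that `one_hot_decode` via `argmax` also works directly on the model's real-valued predictions, which is why the MSE-trained output layer can still be scored for accuracy.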

【Question Discussion】:

    Tags: python tensorflow machine-learning deep-learning lstm


    【Solution 1】:

    The problem is that you re-create a new model in every iteration of the for loop. Each model takes up a lot of memory, and this should be avoided; that is why the first model trains very fast and each subsequent model trains slower. Even a `del model` at each iteration does not help, because the memory leak is inside TensorFlow. You could clear the session at the start of each iteration, but that is itself quite slow.

    I suggest you create the model outside the loop and then, at each iteration, reset the model's weights to their initial values (since the model architecture does not change between iterations).
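    The snapshot-once / reset-each-iteration pattern can be illustrated without TensorFlow. Below is a toy numpy stand-in (the class `TinyModel` and its `fit_step` method are invented for illustration) that mimics the Keras `model.get_weights()` / `model.set_weights(...)` calls, showing that every loop iteration starts training from the same initial weights:

    ```python
    import numpy as np

    class TinyModel:
        """Toy stand-in for a Keras model, just to illustrate
        the get_weights()/set_weights() reset pattern."""
        def __init__(self, n):
            rng = np.random.default_rng(0)
            self.w = rng.normal(size=n)

        def get_weights(self):
            return self.w.copy()

        def set_weights(self, w):
            self.w = w.copy()

        def fit_step(self):
            self.w += 1.0  # pretend training moved the weights

    model = TinyModel(4)
    initial_weights = model.get_weights()   # snapshot once, outside the loop

    for user in range(3):
        model.set_weights(initial_weights)  # every user starts fresh
        model.fit_step()
        # after "training", weights are identical for every user
        assert np.allclose(model.get_weights(), initial_weights + 1.0)
    ```

    With a real Keras model the calls are the same (`initial_weights = model.get_weights()` once, then `model.set_weights(initial_weights)` per iteration), which avoids both the rebuild cost and the fresh graph nodes each rebuild would leak.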


    EDIT:

    As suggested in the comments, here is sample code implementing this idea. Since the question itself is not working code, I have not tested and verified that the code below runs without memory leaks, but from my previous experience I believe it should.

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    users = list_users[:100]

    def get_model():
        model = Sequential()
        model.add(LSTM(X_train.shape[1], input_shape=(time_steps, X_train.shape[1]), activation='relu'))
        model.add(Dense(X_train.shape[1]))
        model.compile(loss='mse', optimizer='adam')
        model.summary()
        return model, model.get_weights()

    with open("accuracy_Lstm.csv", "w") as f:
        f.write('user,LSTM\n')
        # build the model once, outside the loop, and snapshot its weights
        model, initial_weights = get_model()
        for user in users:
            user_data = newdataframe[newdataframe.user == user]
            encoded = encoding(user_data)
            X_train, y_train = [], []
            for i in range(1, len(encoded) - 96):
                X_train.append(encoded[i - 1])
                y_train.append(encoded[i])
            X_train, y_train = np.array(X_train), np.array(y_train)
            X_test = encoded[-192:-96, :]
            X_true = encoded[-96:, :]
            time_steps = 1
            X_trainL = X_train.reshape(X_train.shape[0], time_steps, X_train.shape[1])

            # LSTM: reset to the initial weights instead of building a new model
            model.set_weights(initial_weights)
            model.fit(X_trainL, y_train, batch_size=96, epochs=100, verbose=1)

            X_testL = X_test.reshape(X_test.shape[0], time_steps, X_test.shape[1])

            predL = one_hot_decode(model.predict(X_testL))
            true = one_hot_decode(X_true)
            try:
                accuracy = (sum(x == y for x, y in zip(predL, true)) / len(predL)) * 100
            except ZeroDivisionError:
                accuracy = 0
            f.write('%d,%f\n' % (user, accuracy))

    EDIT2:

    Since the model's shape changes for each user, you do need to create the model at every iteration, so the solution above will not work. You can clear TensorFlow's memory with `tf.keras.backend.clear_session()`, but it is slow (which is why I tried to avoid it in the solution above). With the following solution, each user may take more than 5 seconds (because of the added clearing time), but the time should stay constant per user, regardless of how many users you have.

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    users = list_users[:100]
    time_steps = 1

    def get_model(input_size):
        # clear TensorFlow's memory before creating a new model
        tf.keras.backend.clear_session()
        # create the new model
        model = Sequential()
        model.add(LSTM(input_size, input_shape=(time_steps, input_size), activation='relu'))
        model.add(Dense(input_size))
        model.compile(loss='mse', optimizer='adam')
        model.summary()
        return model

    with open("accuracy_Lstm.csv", "w") as f:
        f.write('user,LSTM\n')
        for user in users:
            user_data = newdataframe[newdataframe.user == user]
            encoded = encoding(user_data)
            X_train, y_train = [], []
            for i in range(1, len(encoded) - 96):
                X_train.append(encoded[i - 1])
                y_train.append(encoded[i])
            X_train, y_train = np.array(X_train), np.array(y_train)
            X_test = encoded[-192:-96, :]
            X_true = encoded[-96:, :]
            X_trainL = X_train.reshape(X_train.shape[0], time_steps, X_train.shape[1])

            # LSTM: rebuild the model each time (its size depends on this
            # user's encoding); get_model clears the session first so that
            # memory does not accumulate across iterations
            model = get_model(X_train.shape[1])
            model.fit(X_trainL, y_train, batch_size=96, epochs=100, verbose=1)

            X_testL = X_test.reshape(X_test.shape[0], time_steps, X_test.shape[1])

            predL = one_hot_decode(model.predict(X_testL))
            true = one_hot_decode(X_true)
            try:
                accuracy = (sum(x == y for x, y in zip(predL, true)) / len(predL)) * 100
            except ZeroDivisionError:
                accuracy = 0
            f.write('%d,%f\n' % (user, accuracy))

    【Comments】:

    • I'm new to machine learning. Could you help me modify the code? That would be very helpful. Thanks in advance.
    • @Krush23 I hope the code above helps you understand the idea of resetting the weights, so you can tailor the solution to your case.
    • I get an error in `get_model()`: the line `model.add(LSTM(X_train.shape[1], input_shape=(1, X_train.shape[1]), activation='relu'))` raises `NameError: name 'X_train' is not defined`.
    • Oops, I forgot about that! Is `X_train.shape[1]` the same for every user, or does it change between them? And if it is static, what is its value?
    • It changes for each user after encoding.