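【Posted】:2020-01-27 23:09:21
【Problem description】:
I am new to Keras, TensorFlow, and Python, and I am trying to build a model for personal use and future learning. I have only just started with Python, and I put this code together with the help of videos and tutorials. My problem is that Python's memory usage creeps up slowly with every epoch, and keeps rising even after a new model is built. Once memory reaches 100%, training simply stops, with no error or warning. I don't know much yet, but the problem should be somewhere in the loop (if I'm not mistaken). I know about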
k.clear.session()
but either it does not remove the problem or I don't know how to integrate it into my code. I have: Python v 3.6.4, Tensorflow 2.0.0rc1 (CPU version), Keras 2.3.0
Here is my code:
import pandas as pd
import os
import time
import tensorflow as tf
import numpy as np
import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM, BatchNormalization
from tensorflow.keras.callbacks import TensorBoard, ModelCheckpoint
EPOCHS = 25
BATCH_SIZE = 32
df = pd.read_csv("EntryData.csv", names=['1SH5', '1SHA', '1SA5', '1SAA', '1WH5', '1WHA',
                                         '2SA5', '2SAA', '2SH5', '2SHA', '2WA5', '2WAA',
                                         '3R1', '3R2', '3R3', '3R4', '3R5', '3R6',
                                         'Target'])
df_val = 14554
validation_df = df[df.index > df_val]
df = df[df.index <= df_val]
train_x = df.drop(columns=['Target'])
train_y = df[['Target']]
validation_x = validation_df.drop(columns=['Target'])
validation_y = validation_df[['Target']]
train_x = np.asarray(train_x)
train_y = np.asarray(train_y)
validation_x = np.asarray(validation_x)
validation_y = np.asarray(validation_y)
train_x = train_x.reshape(train_x.shape[0], 1, train_x.shape[1])
validation_x = validation_x.reshape(validation_x.shape[0], 1, validation_x.shape[1])
dense_layers = [0, 1, 2]
layer_sizes = [32, 64, 128]
conv_layers = [1, 2, 3]
for dense_layer in dense_layers:
    for layer_size in layer_sizes:
        for conv_layer in conv_layers:
            NAME = "{}-conv-{}-nodes-{}-dense-{}".format(conv_layer, layer_size,
                                                         dense_layer, int(time.time()))
            tensorboard = TensorBoard(log_dir="logs\{}".format(NAME))
            print(NAME)
            model = Sequential()
            model.add(LSTM(layer_size, input_shape=(train_x.shape[1:]),
                           return_sequences=True))
            model.add(Dropout(0.2))
            model.add(BatchNormalization())
            for l in range(conv_layer-1):
                model.add(LSTM(layer_size, return_sequences=True))
                model.add(Dropout(0.1))
                model.add(BatchNormalization())
            for l in range(dense_layer):
                model.add(Dense(layer_size, activation='relu'))
                model.add(Dropout(0.2))
            model.add(Dense(2, activation='softmax'))
            opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)
            # Compile model
            model.compile(loss='sparse_categorical_crossentropy',
                          optimizer=opt,
                          metrics=['accuracy'])
            # unique file name that will include the epoch
            # and the validation acc for that epoch
            filepath = "RNN_Final.{epoch:02d}-{val_accuracy:.3f}"
            checkpoint = ModelCheckpoint("models\{}.model".format(filepath,
                                         monitor='val_acc', verbose=0, save_best_only=True,
                                         mode='max'))  # saves only the best ones
            # Train model
            history = model.fit(
                train_x, train_y,
                batch_size=BATCH_SIZE,
                epochs=EPOCHS,
                validation_data=(validation_x, validation_y),
                callbacks=[tensorboard, checkpoint])
            # Score model
            score = model.evaluate(validation_x, validation_y, verbose=2)
            print('Test loss:', score[0])
            print('Test accuracy:', score[1])
            # Save model
            model.save("models\{}".format(NAME))
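I also don't know whether it is OK to ask two questions in one post (I don't want to spam the site with problems that anyone with some Python experience could solve in a minute), but I also have a problem with checkpoint saving. I want to save only the best-performing model (one model per network specification, i.e. per combination of node and layer counts), but currently it saves after every epoch. If this is not appropriate to ask here, I can create a separate question for it.
Thank you very much for your help.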
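On the checkpoint part of the question, one detail in the posted code may be worth checking: the keyword arguments `monitor`, `verbose`, `save_best_only` and `mode` sit inside the parentheses of `.format(...)`, not of `ModelCheckpoint(...)`. The sketch below uses plain Python only (no Keras) to show that `str.format` silently ignores unused keyword arguments, so the callback would never receive them. The forward slash in the path and the variable names `as_posted` and `plain` are illustrative choices, not part of the original code.

```python
# File name template taken from the question.
filepath = "RNN_Final.{epoch:02d}-{val_accuracy:.3f}"

# As posted: the keyword arguments are passed to str.format, which
# ignores keyword arguments that the template does not reference.
as_posted = "models/{}.model".format(
    filepath, monitor="val_acc", verbose=0, save_best_only=True, mode="max"
)

# The same call without any keyword arguments.
plain = "models/{}.model".format(filepath)

# Both calls produce the same string, so the extra arguments had no effect.
print(as_posted == plain)
print(as_posted)
```

If this is indeed the cause, closing `.format(filepath)` first and passing `monitor`, `save_best_only=True` and `mode='max'` to `ModelCheckpoint` itself would be the natural thing to try. Monitoring `'val_accuracy'` rather than `'val_acc'` is an assumption based on the key that the posted `filepath` template already uses. Because the template also contains `{epoch}`, each improvement would still be written to a new file, so a fixed file name per `NAME` may be needed to end up with one model per configuration.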
【Discussion】:
-
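My answer is a best guess at the root cause based on the code you provided; there may be other causes. Let me know whether it resolves the memory problem.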
-
I ran into a similar problem when training different models in the same script. I have collected some possible fixes and workarounds here: memory leak with Keras
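One workaround often suggested when many models are built in a single process is to drop each model and reset Keras' global state before building the next one. The following is a minimal sketch of that pattern, assuming TensorFlow 2.x. `build_model`, the random data and the reduced grid are hypothetical stand-ins for the code in the question, and whether this resolves the leak in the poster's setup is not confirmed.

```python
import gc

import numpy as np
import tensorflow as tf


def build_model(layer_size, n_features):
    # Hypothetical, simplified stand-in for the model built in the question.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(1, n_features)),
        tf.keras.layers.LSTM(layer_size),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])
    model.compile(loss="sparse_categorical_crossentropy",
                  optimizer="adam", metrics=["accuracy"])
    return model


# Random placeholder data: (samples, 1 timestep, 18 features), integer labels.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 1, 18)).astype("float32")
y = rng.integers(0, 2, size=64)

scores = {}
for layer_size in [8, 16]:
    model = build_model(layer_size, x.shape[2])
    model.fit(x, y, batch_size=32, epochs=1, verbose=0)
    loss, acc = model.evaluate(x, y, verbose=0)
    scores[layer_size] = (float(loss), float(acc))

    # Release this configuration before the next one is built:
    # drop the Python reference, reset Keras' global state, then
    # ask the garbage collector to reclaim what is now unreachable.
    del model
    tf.keras.backend.clear_session()
    gc.collect()
```

In the posted script, the three cleanup lines would go at the end of the innermost loop body, after `model.save(...)`.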
Tags: python tensorflow memory keras checkpoint