【Posted】:2020-04-27 08:53:30
【Problem description】:
I am running Keras on a Windows 10 machine with a GPU. I have moved from Tensorflow 1 to Tensorflow 2, and fitting now feels much slower; I would appreciate your advice.
I test whether Tensorflow sees the GPU with the following statements:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
K._get_available_gpus()
which gives this response:
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 17171012743200670970
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 6682068255
locality {
bus_id: 1
links {
}
}
incarnation: 5711519511292622685
physical_device_desc: "device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1"
So this seems to indicate that the GPU is working?
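As an aside, the `device_lib` / `K._get_available_gpus()` calls above are private TF1-era APIs. A sketch of the equivalent check using the public TF2 API (assuming TensorFlow 2.x is installed):

```python
# Sketch: TF2-native GPU visibility check.
import tensorflow as tf

# Public API replacement for device_lib.list_local_devices():
gpus = tf.config.list_physical_devices('GPU')
print("GPUs visible to TensorFlow:", gpus)

# Optionally, log each op's device to confirm work actually lands on the GPU:
# tf.debugging.set_log_device_placement(True)
```

A non-empty list here means TensorFlow can see the GPU; it does not by itself mean the GPU is being kept busy during training.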
I am training a modified version of ResNet50 with inputs of up to 10 images (257x257x2). It has 4.3 million trainable parameters. Training is very slow (it may take days). Part of the code is shown below:
import os,cv2,sys
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import scipy.io
import h5py
import time
from tensorflow.python.keras import backend as K
from tensorflow.python.keras.models import load_model
from tensorflow.python.keras import optimizers
from buildModelReduced_test import buildModelReduced
from tensorflow.keras.utils import plot_model
K.set_image_data_format('channels_last') #set_image_dim_ordering('tf')
sys.setrecursionlimit(10000)
# Check that gpu is running
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
K._get_available_gpus()
# Generator to read one batch at a time for large datasets
def imageLoaderLargeFiles(data_path, batch_size, nStars, nDatasets=0):
    # --- generator body elided ---
    yield (train_in, train_target)
# Repository for parameters
nStars = 10
img_rows = 257
img_cols = 257
bit_depth = 16
channels = 2
num_epochs = 1
batch_size = 8
data_path_train = 'E:/TomoA/large/train2'
data_path_validate = 'E:/TomoA/large/validate2'
nDatasets_train = 33000
nDatasets_validate = 8000
nBatches_train = nDatasets_train//(batch_size)
validation_steps = nDatasets_validate//(batch_size)
output_width = 35
runSize = 'large'
restartFile = ''
#%% Train model
if restartFile == '':
    model = buildModelReduced(nStars, img_rows, img_cols, output_width,
                              batch_size=batch_size, channels=channels,
                              use_l2_regularizer=True)
model.summary()
plot_model(model, to_file='model.png', show_shapes=True)
all_mae = []
adam = optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None,
                       decay=0.0, amsgrad=False)
# Pass the configured instance, not the string 'adam' (which ignores it):
model.compile(optimizer=adam, loss='MSE', metrics=['mae'])
history = model.fit_generator(imageLoaderLargeFiles(data_path_train,batch_size,nStars,nDatasets_train),
steps_per_epoch=nBatches_train,epochs=num_epochs,
validation_data=imageLoaderLargeFiles(data_path_validate,batch_size,nStars,nDatasets_validate),
validation_steps=validation_steps,verbose=1,workers=0,
use_multiprocessing=False, shuffle=False)
print('\nSaving model...\n')
if runSize == 'large':
model.save(runID + '_' + runSize + '.h5')
When I open the Windows Task Manager and look at the GPU, I see 6.5 GB of memory allocated, copy activity under 1%, and CUDA at about 4%. Disk activity is low; I read a cache of 1000 datasets at a time from the SSD. See the screen clip below. I believe this indicates the GPU is not being used well. CPU load is 19%. I use a batch size of 8; anything larger gives a resource-exhausted error.
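Rather than inferring utilization from Task Manager alone, the TensorFlow profiler can show directly whether time goes to the input pipeline or to GPU compute. A minimal sketch (assuming TensorFlow 2.x; `'logs'` is an arbitrary directory name):

```python
# Sketch: profile a few training steps via the TensorBoard callback.
import tensorflow as tf

tb_callback = tf.keras.callbacks.TensorBoard(
    log_dir='logs',
    profile_batch=2)  # capture a profile of batch 2 of the first epoch

# Pass callbacks=[tb_callback] to model.fit(...), then open TensorBoard's
# Profile tab to see the input-pipeline vs. compute time breakdown.
```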
Any ideas on how to proceed, or on where I can find more information? Any rules of thumb for tuning the run to make full use of the GPU?
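One common rule of thumb: very low CUDA utilization with a Python generator often means the GPU is waiting on the input pipeline, since `fit_generator` with `workers=0` loads each batch serially on the main thread. A `tf.data` pipeline can overlap loading with compute. A sketch under assumptions (TensorFlow 2.x; `load_example` is a hypothetical stand-in for the real per-item disk read; shapes follow the 257x257x2 inputs and `output_width = 35` above):

```python
# Sketch: tf.data input pipeline that prepares batches in parallel with GPU work.
import numpy as np
import tensorflow as tf

def load_example(i):
    # Hypothetical stand-in for reading one (input, target) pair from disk.
    x = np.zeros((257, 257, 2), dtype=np.float32)
    y = np.zeros((35,), dtype=np.float32)
    return x, y

def make_dataset(n_items, batch_size):
    ds = tf.data.Dataset.range(n_items)
    # Run the Python loader in parallel worker threads:
    ds = ds.map(lambda i: tf.numpy_function(load_example, [i],
                                            (tf.float32, tf.float32)),
                num_parallel_calls=tf.data.experimental.AUTOTUNE)
    ds = ds.batch(batch_size)
    # Prefetch so the next batch is ready while the GPU consumes the current one.
    return ds.prefetch(tf.data.experimental.AUTOTUNE)

ds = make_dataset(32, 8)
```

A dataset built this way can be passed directly to `model.fit(ds, ...)` in place of the generator, which also sidesteps the now-deprecated `fit_generator`.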
【Discussion】:
Tags: tensorflow keras windows-10 gpu