【Posted】: 2021-03-13 04:54:30
【Problem description】:
I am training a U-net on TensorFlow 2. When I load the model, it takes up almost all of the GPU's memory (22 GB out of 26 GB), even though my model, with about 190 million parameters, should need at most roughly 1.5 GB. To investigate, I tried loading a model with no layers at all, and to my surprise it still consumed the same amount of memory. The code for my model is attached below:
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, UpSampling2D, Activation, Add

x = tf.keras.layers.Input(shape=(256, 256, 1))
model = Sequential(
[
Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
MaxPooling2D(pool_size=(2, 2)),
Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
MaxPooling2D(pool_size=(2, 2)),
Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
MaxPooling2D(pool_size=(2, 2)),
Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
MaxPooling2D(pool_size=(2, 2)),
Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
# NOTE: a residual add cannot appear inside a Sequential model, and
# conv5_0 / conv5_2 are not defined here; this skip connection would
# need the functional API instead:
# Activation('relu')(Add()([conv5_0, conv5_2])),
MaxPooling2D(pool_size=(2, 2)),
Conv2D(2048, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(2048, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(2048, 3, padding = 'same', kernel_initializer = 'he_normal'),
UpSampling2D(size = (2,2)),
Conv2D(1024, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
UpSampling2D(size = (2,2)),
Conv2D(512, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
UpSampling2D(size = (2,2)),
Conv2D(256, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
UpSampling2D(size = (2,2)),
Conv2D(128, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
UpSampling2D(size = (2,2)),
Conv2D(64, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal'),
Conv2D(1, 3, activation = 'linear', padding = 'same', kernel_initializer = 'he_normal')
])
y = model(x)
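For scale, the weights themselves are far smaller than 22 GB. A back-of-envelope check (assuming float32 weights at 4 bytes each, and the ~190 million parameter count mentioned above):

```python
# Rough estimate of weight memory, assuming float32 parameters (4 bytes each).
params = 190_000_000
bytes_per_param = 4
weight_mem_gb = params * bytes_per_param / 1024**3
print(f"~{weight_mem_gb:.2f} GB for the weights alone")  # ~0.71 GB
```

Even allowing for optimizer state and activations, that is nowhere near 22 GB, which points to the allocation behavior of the runtime rather than the model itself.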
I commented out all the layers and it still took up 22 GB. I am running the code in a jupyter-notebook. I thought adding tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=x) at the top of my notebook would solve the problem, but it did not. My goal is to run several scripts on the GPU at the same time to make better use of my time. Any help would be much appreciated. Thanks.
Note: I just noticed this happens not only with this code but with any other TensorFlow module. For example, at one point in my code I used tf.signal.ifft2d before loading the model, and it also consumed almost the same amount of memory as the model. How can this be fixed?
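By default, TensorFlow 2 maps nearly all of the visible GPU memory as soon as the first op touches the GPU, which would explain why even an empty model (or a lone tf.signal call) grabs ~22 GB. A minimal sketch of the two standard TF2 workarounds (this is an assumption about the cause based on the symptoms described; both calls must run before any GPU is initialized, i.e. at the very top of the notebook):

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    # Option 1: allocate GPU memory on demand instead of mapping it all upfront.
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)

    # Option 2 (alternative to option 1, not combined with it): cap this
    # process at a fixed amount, e.g. 4 GB, so that several notebooks or
    # scripts can share one GPU.
    # tf.config.set_logical_device_configuration(
    #     gpus[0],
    #     [tf.config.LogicalDeviceConfiguration(memory_limit=4096)])
```

With option 2, each Jupyter kernel gets its own fixed slice of the GPU, which matches the stated goal of running multiple scripts simultaneously; the TF1-style per_process_gpu_memory_fraction option has no effect unless it is wired into a v1 session.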
【Discussion】:
Tags: python deep-learning tensorflow2.0