Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED. Error when trying to save the feature maps

【Question title】: Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED. Error when trying to save the feature maps
【Posted】: 2020-05-03 14:02:57
【Question description】:

I want to save the feature maps that are generated when I feed an image into VGG16, but I get the error below. Please help me :)

2020-05-03 19:31:42.361061: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-03 19:31:43.634465: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-05-03 19:31:43.638077: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-05-03 19:31:43.641090: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
         [[{{node block1_conv1/convolution}}]]
Traceback (most recent call last):
  File "C:\Users\sreec\OneDrive\Desktop\8th Sem\Project\Test\Test.py", line 44, in <module>
    feature_maps = model.predict(img)
  File "D:\Anaconda\lib\site-packages\keras\engine\training.py", line 1462, in predict
    callbacks=callbacks)
  File "D:\Anaconda\lib\site-packages\keras\engine\training_arrays.py", line 324, in predict_loop
    batch_outs = f(ins_batch)
  File "D:\Anaconda\lib\site-packages\tensorflow_core\python\keras\backend.py", line 3727, in __call__
    outputs = self._graph_fn(*converted_inputs)
  File "D:\Anaconda\lib\site-packages\tensorflow_core\python\eager\function.py", line 1551, in __call__
    return self._call_impl(args, kwargs)
  File "D:\Anaconda\lib\site-packages\tensorflow_core\python\eager\function.py", line 1591, in _call_impl
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "D:\Anaconda\lib\site-packages\tensorflow_core\python\eager\function.py", line 1692, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "D:\Anaconda\lib\site-packages\tensorflow_core\python\eager\function.py", line 545, in call
    ctx=ctx)
  File "D:\Anaconda\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
         [[node block1_conv1/convolution (defined at D:\Anaconda\lib\site-packages\keras\backend\tensorflow_backend.py:3009) ]] [Op:__inference_keras_scratch_graph_458]

Function call stack:
keras_scratch_graph

Press any key to continue . . .

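【Comments】:

What is raising this error? We need more information than this. Please see How to Ask and the help center.
Never mind. I actually found out that my VRAM is not enough to train my model.

Tags: python python-3.x tensorflow keras anaconda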

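【Solution 1】:

You can try managing GPU memory resources by limiting GPU memory growth.

This is done by calling tf.config.experimental.set_memory_growth, which makes TensorFlow allocate only as much GPU memory as its runtime allocations need: it starts out allocating very little memory, and as the program runs and needs more, the GPU memory region assigned to the TensorFlow process is extended.

To turn on memory growth for a specific GPU, run the following code before allocating any tensors or executing any ops.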

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    # Currently, memory growth needs to be the same across GPUs
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)
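
The same memory-growth behavior can also be requested through the TF_FORCE_GPU_ALLOW_GROWTH environment variable, which may be convenient when another package (such as the standalone keras in the traceback above) imports TensorFlow for you. The following is only a minimal sketch, not part of the original answer: the helper name visible_gpu_count is hypothetical, and the guarded import is an assumption made so the snippet also runs on a machine without TensorFlow.

```python
import os

# The variable must be set before TensorFlow initializes the GPU,
# so set it before importing tensorflow (or keras).
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed; nothing to configure


def visible_gpu_count():
    """Hypothetical helper: number of GPUs TensorFlow can see (0 without TensorFlow)."""
    if tf is None:
        return 0
    return len(tf.config.experimental.list_physical_devices("GPU"))


print("GPUs visible to TensorFlow:", visible_gpu_count())
```

Note that memory growth only changes how memory is reserved. If the model really needs more VRAM than the card has, as the asker found, a smaller batch or input size is still required.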

【Discussion】: