【Question title】: UnknownError: Failed to get convolution algorithm
【Posted】: 2020-05-09 10:20:05
【Question description】:

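Full error:

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]

Command used to install the packages: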

conda install -c anaconda keras-gpu

Installed:

  • tensorboard 2.0.0 pyhb38c66f_1
  • tensorflow 2.0.0 gpu_py37h57d29ca_0
  • tensorflow-base 2.0.0 gpu_py37h390e234_0
  • tensorflow-estimator 2.0.0 pyh2649769_0
  • tensorflow-gpu 2.0.0 h0d30ee6_0 anaconda
  • cudatoolkit 10.0.130 0
  • cudnn 7.6.5 cuda10.0_0
  • keras-applications 1.0.8 py_0
  • keras-base 2.2.4 py37_0
  • keras-gpu 2.2.4 0 anaconda
  • keras-preprocessing 1.1.0 py_1

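I have tried installing the CUDA toolkit from the NVIDIA website, but that did not resolve the issue, so please suggest fixes related to the conda commands.

Some blogs suggest installing Visual Studio, but is it needed if I already have the Spyder IDE?

Code: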

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Convolution2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense

classifier = Sequential()

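# In tf.keras the positional arguments after the filter count are kernel_size
# and strides, so the 3x3 kernel is passed as a tuple.
classifier.add(Convolution2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))

classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Convolution2D(32, (3, 3), activation = 'relu'))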
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Flatten())

classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 1, activation = 'sigmoid'))

classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 4,
                                                 class_mode = 'binary')

test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 4,
                                            class_mode = 'binary')

classifier.fit_generator(training_set,
                         steps_per_epoch = 8000,
                         epochs = 25,
                         validation_data = test_set,
                         validation_steps = 2000)

The error appears after executing the following code:

classifier.fit_generator(training_set,
                         steps_per_epoch = 8000,
                         epochs = 25,
                         validation_data = test_set,
                         validation_steps = 2000)

Edit 1: Traceback

Traceback (most recent call last):

  File "D:\Machine Learning\Machine Learning A-Z Template Folder\Part 8 - Deep Learning\Section 40 - Convolutional Neural Networks (CNN)\cnn.py", line 70, in <module>
    validation_steps = 2000)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 1297, in fit_generator
    steps_name='steps_per_epoch')

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training_generator.py", line 265, in model_iteration
    batch_outs = batch_function(*batch_data)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 973, in train_on_batch
    class_weight=class_weight, reset_metrics=reset_metrics)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 264, in train_on_batch
    output_loss_metrics=model._output_loss_metrics)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 311, in train_on_batch
    output_loss_metrics=output_loss_metrics))

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 252, in _process_single_batch
    training=training))

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 127, in _model_loss
    outs = model(inputs, **kwargs)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 891, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\sequential.py", line 256, in call
    return super(Sequential, self).call(inputs, training=training, mask=mask)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\network.py", line 708, in call
    convert_kwargs_to_constants=base_layer_utils.call_context().saving)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\network.py", line 860, in _run_internal_graph
    output_tensors = layer(computed_tensors, **kwargs)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 891, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\keras\layers\convolutional.py", line 197, in call
    outputs = self._convolution_op(inputs, self.kernel)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 1134, in __call__
    return self.conv_op(inp, filter)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 639, in __call__
    return self.call(inp, filter)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 238, in __call__
    name=self.name)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\nn_ops.py", line 2010, in conv2d
    name=name)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\gen_nn_ops.py", line 1031, in conv2d
    data_format=data_format, dilations=dilations, name=name, ctx=_ctx)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\ops\gen_nn_ops.py", line 1130, in conv2d_eager_fallback
    ctx=_ctx, name=name)

  File "C:\Anaconda\envs\ML\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)

  File "<string>", line 3, in raise_from

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]

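【Comments】:

  • Please add the full error traceback.
  • Also check whether tf.test.is_gpu_available() returns True or False.
  • @Vivek Mehta Sure. Is Visual Studio mandatory here?
  • No, it is not required.
  • @Vivek Mehta I have added the traceback; please check 'Edit 1'.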
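
The GPU check suggested in the comments can be sketched as a small helper. This is a minimal sketch, not part of the original thread: the function name gpu_report is hypothetical, and it falls back to an empty report when TensorFlow cannot be imported. It uses tf.config.experimental.list_physical_devices, which is where the call lives in TF 2.0, and the non-experimental name where newer versions provide it.

```python
def gpu_report():
    """Return a small dict describing whether TensorFlow can see a GPU.

    gpu_report is a hypothetical helper name used only for this sketch.
    """
    report = {"tensorflow_installed": False, "version": None, "gpus": []}
    try:
        import tensorflow as tf
    except ImportError:
        return report  # TensorFlow is not importable in this environment

    report["tensorflow_installed"] = True
    report["version"] = tf.__version__
    # TF 2.0 exposes list_physical_devices under tf.config.experimental;
    # later releases also expose it directly under tf.config.
    list_devices = getattr(tf.config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = tf.config.experimental.list_physical_devices
    report["gpus"] = [device.name for device in list_devices("GPU")]
    return report


if __name__ == "__main__":
    print(gpu_report())
```

An empty "gpus" list means TensorFlow does not see the GPU at all, which points to a driver or installation problem rather than to cuDNN initialization.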

Tags: python tensorflow keras conv-neural-network tf.keras


【Solution 1】:

The following code solved the problem:

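import tensorflow as tf

# Memory growth must be configured before the GPUs are initialized.
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Allocate GPU memory on demand instead of reserving it all up front.
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        # Raised if the GPUs have already been initialized.
        print(e)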

【Comments】:

  • Hey, I was just wondering why this solves the problem? Thanks
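
A commonly given explanation, which the thread itself does not confirm, is that TensorFlow by default reserves nearly all GPU memory when it initializes the device, which can leave cuDNN without enough memory to initialize. With memory growth enabled, TensorFlow allocates memory on demand instead. The same behavior can be requested through the TF_FORCE_GPU_ALLOW_GROWTH environment variable; the sketch below assumes it runs before TensorFlow is imported, and the helper name request_memory_growth is hypothetical.

```python
import os


def request_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand.

    Hypothetical helper for this sketch. The variable has to be set before
    TensorFlow initializes the GPU, so setting it before the import is the
    safest place.
    """
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return os.environ["TF_FORCE_GPU_ALLOW_GROWTH"]


request_memory_growth()
# import tensorflow as tf  # import only after the variable has been set
```

This variant is convenient when the training script itself should stay unchanged, for example when the variable is set once in the IDE or shell configuration.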
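【Solution 2】:

The error comes from an incompatibility between the following:

  • CUDA version
  • cuDNN version
  • TensorFlow version

In the answer linked below I provide working combinations of TensorFlow, CUDA, and cuDNN. Please take a look at this question, which is similar to yours: Tensorflow 2.0 can't use GPU, something wrong in cuDNN? :Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Example 1: CUDA 10.0 + cuDNN 7.6.3 + TensorFlow 1.13/1.14 or TensorFlow 2.0.

Example 2: CUDA 9 + cuDNN 7.0.5 + TensorFlow 1.10 works.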
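
The two combinations above can be turned into a quick lookup. This is only a sketch: the table holds just the combinations quoted in this answer, comparing on major.minor alone is an assumption made here, and the names WORKING_COMBINATIONS, major_minor, and is_listed_combination are hypothetical.

```python
# Combinations quoted in the answer, keyed by (CUDA, cuDNN) major.minor.
# Treating only major.minor as significant is an assumption of this sketch.
WORKING_COMBINATIONS = {
    ("10.0", "7.6"): {"1.13", "1.14", "2.0"},
    ("9.0", "7.0"): {"1.10"},
}


def major_minor(version):
    """Reduce a version string such as '10.0.130' to '10.0'."""
    parts = version.split(".")
    # Pad with '0' so that a bare major version like '9' becomes '9.0'.
    return ".".join((parts + ["0"])[:2])


def is_listed_combination(cuda, cudnn, tensorflow):
    """Return True if the versions match a combination in the table above."""
    key = (major_minor(cuda), major_minor(cudnn))
    return major_minor(tensorflow) in WORKING_COMBINATIONS.get(key, set())


if __name__ == "__main__":
    # Versions reported in the question: cudatoolkit 10.0.130, cudnn 7.6.5,
    # tensorflow 2.0.0.
    print(is_listed_combination("10.0.130", "7.6.5", "2.0.0"))
```

Under that major.minor assumption, the versions listed in the question fall within the first combination, so a version mismatch may not be the whole story for this particular setup.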

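【Comments】:

  • Thanks for the reply, but are my versions mismatched? I ran the conda command and it downloaded the remaining dependencies itself, so I would expect them to be compatible. You can still verify once, since I have already tried a lot of permutations.
  • Please use a plain pip install rather than conda. Install from pip, not conda, and do it in a separate working environment so that you do not pollute the global one.
  • I have tried pip as well, but why pip over conda? Fine, I will do as you say; tell me the exact commands that will install tf, cuda and cudnn so that a mismatch is avoided.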