【Question title】: Why is my VSCode trying to use CUDA even though I installed DirectML (I'm on AMD)?
【Posted】: 2021-09-16 18:07:30
【Question】:

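I have a TensorFlow object detection project that I want to build, and I read that it would be slow on a CPU. I was then told to use DirectML, because I have an AMD GPU rather than an NVIDIA one.

I created an Anaconda environment named "directml" and installed tensorflow and directml in it (see picture). If I now try to run the test application I found in this tutorial (https://docs.microsoft.com/en-us/windows/ai/directml/gpu-tensorflow-windows):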

import tensorflow.compat.v1 as tf 

tf.enable_eager_execution(tf.ConfigProto(log_device_placement=True)) 

print(tf.add([1.0, 2.0], [3.0, 4.0]))

I don't get the expected output:


2020-06-15 11:27:18.240065: I tensorflow/core/common_runtime/dml/dml_device_factory.cc:32] DirectML: creating device on adapter 0 (AMD Radeon VII) 

2020-06-15 11:27:18.323949: I tensorflow/stream_executor/platform/default/dso_loader.cc:60] Successfully opened dynamic library DirectMLba106a7c621ea741d2159d8708ee581c11918380.dll 

2020-06-15 11:27:18.337830: I tensorflow/core/common_runtime/eager/execute.cc:571] Executing op Add in device /job:localhost/replica:0/task:0/device:DML:0 

tf.Tensor([4. 6.], shape=(2,), dtype=float32)

Instead, I get this:

2021-09-16 17:15:03.700209: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_100.dll'; dlerror: cudart64_100.dll not found
2021-09-16 17:15:03.700418: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-09-16 17:15:05.192685: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2021-09-16 17:15:05.192902: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
2021-09-16 17:15:05.197503: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-N3L36AL
2021-09-16 17:15:05.197857: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-N3L36AL
2021-09-16 17:15:05.198376: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-09-16 17:15:05.202832: I tensorflow/core/common_runtime/eager/execute.cc:571] Executing op Add in device /job:localhost/replica:0/task:0/device:CPU:0
tf.Tensor([4. 6.], shape=(2,), dtype=float32)

It looks to me as if TensorFlow is trying to use CUDA instead of DirectML, but I don't know why. Both Windows and my AMD drivers are up to date.
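The two logs differ in the device named on the "Executing op" line (`DML:0` in the expected output, `CPU:0` in the actual one). A minimal sketch for pulling that device out of a log, assuming the line format shown above; the helper name `placements` is made up for illustration:

```python
import re

# Matches placement lines such as:
#   ... Executing op Add in device /job:localhost/replica:0/task:0/device:CPU:0
_PLACEMENT = re.compile(
    r"Executing op (\S+) in device \S*/device:([A-Za-z_]+):(\d+)"
)


def placements(log_text):
    """Return (op, device_type, device_index) tuples found in the log text."""
    return [(op, dev, int(idx)) for op, dev, idx in _PLACEMENT.findall(log_text)]


sample = (
    "2021-09-16 17:15:05.202832: I tensorflow/core/common_runtime/eager/"
    "execute.cc:571] Executing op Add in device "
    "/job:localhost/replica:0/task:0/device:CPU:0"
)
print(placements(sample))  # [('Add', 'CPU', 0)]
```

If the device type comes back as `CPU` rather than `DML`, the op did not run through DirectML, whatever the CUDA warnings earlier in the log say.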

【Comments】:

    Tags: python python-3.x tensorflow


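    【Solution 1】:

    You should install only tensorflow-directml, not tensorflow alongside it. Right now Python is importing the regular tensorflow package instead of tensorflow-directml. Uninstall tensorflow and that should fix the import.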
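To see which of the two distributions are actually present in the active environment, something like the following can help. It is a sketch, not an official diagnostic: the function name `installed_versions` is invented here, and the fallback import assumes the `importlib_metadata` backport is installed on interpreters older than Python 3.8.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Assumption: the importlib_metadata backport is available on older Pythons.
    import importlib_metadata as metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            result[name] = None
    return result


print(installed_versions(["tensorflow", "tensorflow-directml"]))
```

If both names report a version, the environment has the conflict described above. As far as I understand, both distributions provide the same `tensorflow` import package, so uninstalling one can remove files the other needs; reinstalling tensorflow-directml afterwards (or starting from a fresh environment) is the safer path.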

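    【Discussion】:

    • I tried that, but now the import no longer works: ` import tensorflow.compat.v1 as tf ModuleNotFoundError: No module named 'tensorflow' `
    • Did you try reinstalling tensorflow-directml? That said, I'd recommend creating a new conda environment and installing tensorflow-directml again.
    • This looks good, doesn't it? hastebin.com/suqunatehi.apache
    • Looks fine. You can also check whether the GPU is detected: print(tf.config.list_physical_devices('GPU'))
    • That didn't work, but I then used "device_lib.list_local_devices()" and it printed this: hastebin.com/huzetakaka.less Why does it say cpu?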
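The `device_lib.list_local_devices()` check from the last comment can be wrapped so that it answers the DirectML question directly. This is a rough sketch under two assumptions: that a DirectML adapter is reported with device type `DML` (inferred from the `/device:DML:0` line in the expected output, not from the linked hastebin paste), and that the helper names `device_types` and `has_directml` are purely illustrative. The list normally includes a CPU entry as well, so seeing "CPU" does not by itself mean DirectML is missing.

```python
from types import SimpleNamespace


def device_types(devices):
    """Return the sorted, de-duplicated device_type values of the given devices."""
    return sorted({d.device_type for d in devices})


def has_directml(devices):
    """True if any device reports the (assumed) DirectML device type 'DML'."""
    return "DML" in device_types(devices)


if __name__ == "__main__":
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        # TensorFlow is not importable here; use a stand-in so the sketch still runs.
        devices = [SimpleNamespace(device_type="CPU")]
    else:
        devices = device_lib.list_local_devices()
    print(device_types(devices), has_directml(devices))
```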