【Question title】: Could not load dynamic library 'cudart64_101.dll'; dlerror
【Posted】: 2020-12-22 00:07:44
【Question description】:

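I am currently running my LSTM training script on a CPU-only setup (yes, no GPU), and I get a large number of log lines between each training step. How should I deal with these "dlerror" messages, and how can I resolve them?

Does this affect performance? If not, how can I hide them?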

Epoch 1/20
2020-09-02 20:49:06.592450: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-09-02 20:49:06.599065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 20:49:12.746036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-09-02 20:49:12.751769: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 20:49:12.761444: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-6NCF44I
2020-09-02 20:49:12.763713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-6NCF44I
2020-09-02 20:49:12.800139: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x229365115b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 20:49:12.802774: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-02 20:49:16.012887: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
  1/369 [..............................] - ETA: 0s - loss: 115175354838024192.0000 - mse: 116338741219426304.0000 - mae: 174372896.0000 - mape: 65089616.0000 - cosine_similarity: -0.0534
2020-09-02 20:49:06.592450: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-09-02 20:49:06.599065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 20:49:12.746036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-09-02 20:49:12.751769: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 20:49:12.761444: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-6NCF44I
2020-09-02 20:49:12.763713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-6NCF44I
2020-09-02 20:49:12.800139: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x229365115b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 20:49:12.802774: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-02 20:49:16.012887: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:50:39.226962: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.

2020-09-02 20:49:06.592450: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-09-02 20:49:06.599065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 20:49:12.746036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-09-02 20:49:12.751769: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 20:49:12.761444: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-6NCF44I
2020-09-02 20:49:12.763713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-6NCF44I
2020-09-02 20:49:12.800139: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x229365115b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 20:49:12.802774: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-02 20:49:16.012887: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:50:39.226962: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:51:55.732652: I tensorflow/core/profiler/rpc/client/save_profile.cc:168] Creating directory: C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53

2020-09-02 20:49:06.592450: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-09-02 20:49:06.599065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 20:49:12.746036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-09-02 20:49:12.751769: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 20:49:12.761444: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-6NCF44I
2020-09-02 20:49:12.763713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-6NCF44I
2020-09-02 20:49:12.800139: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x229365115b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 20:49:12.802774: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-02 20:49:16.012887: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:50:39.226962: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:51:55.732652: I tensorflow/core/profiler/rpc/client/save_profile.cc:168] Creating directory: C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53
2020-09-02 20:51:56.905231: I tensorflow/core/profiler/rpc/client/save_profile.cc:174] Dumped gzipped tool data for trace.json.gz to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.trace.json.gz
  2/369 [..............................] - ETA: 4:00:34 - loss: 108831026716868608.0000 - mse: 109930332186214400.0000 - mae: 169439408.0000 - mape: 116326512.0000 - cosine_similarity: -0.2260
2020-09-02 20:49:06.592450: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-09-02 20:49:06.599065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 20:49:12.746036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-09-02 20:49:12.751769: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 20:49:12.761444: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-6NCF44I
2020-09-02 20:49:12.763713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-6NCF44I
2020-09-02 20:49:12.800139: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x229365115b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 20:49:12.802774: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-02 20:49:16.012887: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:50:39.226962: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:51:55.732652: I tensorflow/core/profiler/rpc/client/save_profile.cc:168] Creating directory: C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53
2020-09-02 20:51:56.905231: I tensorflow/core/profiler/rpc/client/save_profile.cc:174] Dumped gzipped tool data for trace.json.gz to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.trace.json.gz
2020-09-02 20:51:57.700621: I tensorflow/core/profiler/utils/event_span.cc:288] Generation of step-events took 54.008 ms

2020-09-02 20:51:57.763778: I tensorflow/python/profiler/internal/profiler_wrapper.cc:87] Creating directory: C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53
Dumped tool data for overview_page.pb to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.overview_page.pb
Dumped tool data for input_pipeline.pb to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.input_pipeline.pb
Dumped tool data for tensorflow_stats.pb to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.tensorflow_stats.pb
Dumped tool data for kernel_stats.pb to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.kernel_stats.pb

  4/369 [..............................] - ETA: 5:43:42 - loss: 99759720780267520.0000 - mse: 100767406007255040.0000 - mae: 139372400.0000 - mape: 186370256.0000 - cosine_similarity: -0.2431 

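I have looked at other questions but could not find a practical solution.

【Question discussion】:

    Tags: python tensorflow keras lstm


    【Solution 1】:

    The line below tells you that it could not load the library 'cudart64_101.dll'. That library is the CUDA runtime; CUDA is an API for parallel computing on Nvidia GPUs. The other messages mention 'nvcuda.dll', which is also CUDA-related.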

    2020-09-02 20:49:06.592450: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
    

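    If you look at the log you pasted, the very next line tells you not to worry about this if you do not have a GPU set up. Since you are running on CPU only, you can ignore it.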

    2020-09-02 20:49:06.599065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    

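    If you still want to hide them, I think changing the logging level could help. However, that can also hide other important messages... Personally, I would just ignore them, as the log itself suggests.

    【Discussion】:

    • The main problem is that it creates a new progress bar at every step (by progress bar I mean 1/369 [.......] 2/369 [.......] 3/369 [.......]) instead of completing a single bar up to 369. Is there a way to prevent just this?
    • That is certainly annoying. However, other than changing the logging level to hide the warnings, I can't think of any way to do it.
    【Solution 2】:

    Technically, this is not an error but a warning. It is mostly just verbose, so you can safely ignore it. To silence it, do the following in the right order: set the environment variable first, then import TensorFlow. For now, you will also need to restart your kernel.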

    import os
    # Set this before importing TensorFlow; '3' hides INFO, WARNING, and ERROR messages.
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
    
    import tensorflow as tf
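
    The ordering requirement above can be made explicit with a small check. This is only a sketch: the helper name `set_tf_cpp_log_level` is hypothetical (it is not part of TensorFlow), and it assumes the variable only takes effect if it is set before TensorFlow is first imported in the process. The level values are `"0"` (show everything), `"1"` (hide INFO), `"2"` (hide INFO and WARNING), and `"3"` (hide INFO, WARNING, and ERROR). Because the pasted log also contains an `E` line (the failed `cuInit` call), `"3"` is the level that would hide all of it.

```python
import os
import sys


def set_tf_cpp_log_level(level: str = "3") -> bool:
    """Set TF_CPP_MIN_LOG_LEVEL (hypothetical helper, not a TensorFlow API).

    Returns True if TensorFlow had not been imported yet, i.e. the setting
    was applied early enough to be picked up on import.
    """
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = level
    return "tensorflow" not in sys.modules


set_early = set_tf_cpp_log_level("3")
if not set_early:
    # The variable is set, but too late for this process.
    print("TensorFlow was already imported; restart the interpreter or kernel.")
```

    In a notebook, a `False` return value is the case where restarting the kernel is needed, as mentioned above.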
    

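    【Discussion】:

    • The main problem is that it creates a new progress bar at every step (by progress bar I mean 1/369 [.......] 2/369 [.......] 3/369 [.......]) instead of completing a single bar up to 369. Is there a way to prevent just this?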
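
    One option that may help with the repeated progress bars, assuming they come from Keras's per-batch progress display, is the `verbose` argument of `model.fit`: `0` is silent, `1` draws the progress bar, and `2` prints one summary line per epoch. The wrapper below is a minimal sketch. The name `fit_with_epoch_summaries` is hypothetical, and `model` is assumed to be any compiled Keras model.

```python
def fit_with_epoch_summaries(model, x, y, **fit_kwargs):
    """Call model.fit with verbose=2 unless the caller chose a verbosity.

    verbose=2 asks Keras for one line per epoch instead of a progress bar
    that is redrawn on every batch.
    """
    fit_kwargs.setdefault("verbose", 2)
    return model.fit(x, y, **fit_kwargs)


# Hypothetical usage with an existing compiled model and training arrays:
# history = fit_with_epoch_summaries(model, x_train, y_train, epochs=20)
```

    This does not remove the C++ log lines themselves; it only changes how training progress is reported.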