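[Posted]: 2020-12-22 00:07:44
[Question]:
I am currently running my LSTM training script on a CPU-only setup (yes, no GPU), and I get a large number of log lines between every training step. What do these "dlerror" messages mean, and how can I resolve them?
Does this affect performance? If not, how can I hide them?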
Epoch 1/20
2020-09-02 20:49:06.592450: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-09-02 20:49:06.599065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 20:49:12.746036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-09-02 20:49:12.751769: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 20:49:12.761444: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-6NCF44I
2020-09-02 20:49:12.763713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-6NCF44I
2020-09-02 20:49:12.800139: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x229365115b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 20:49:12.802774: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-02 20:49:16.012887: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
1/369 [..............................] - ETA: 0s - loss: 115175354838024192.0000 - mse: 116338741219426304.0000 - mae: 174372896.0000 - mape: 65089616.0000 - cosine_similarity: -0.0534
2020-09-02 20:49:06.592450: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-09-02 20:49:06.599065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 20:49:12.746036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-09-02 20:49:12.751769: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 20:49:12.761444: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-6NCF44I
2020-09-02 20:49:12.763713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-6NCF44I
2020-09-02 20:49:12.800139: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x229365115b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 20:49:12.802774: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-02 20:49:16.012887: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:50:39.226962: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:49:06.592450: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-09-02 20:49:06.599065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 20:49:12.746036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-09-02 20:49:12.751769: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 20:49:12.761444: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-6NCF44I
2020-09-02 20:49:12.763713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-6NCF44I
2020-09-02 20:49:12.800139: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x229365115b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 20:49:12.802774: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-02 20:49:16.012887: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:50:39.226962: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:51:55.732652: I tensorflow/core/profiler/rpc/client/save_profile.cc:168] Creating directory: C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53
2020-09-02 20:49:06.592450: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-09-02 20:49:06.599065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 20:49:12.746036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-09-02 20:49:12.751769: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 20:49:12.761444: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-6NCF44I
2020-09-02 20:49:12.763713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-6NCF44I
2020-09-02 20:49:12.800139: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x229365115b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 20:49:12.802774: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-02 20:49:16.012887: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:50:39.226962: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:51:55.732652: I tensorflow/core/profiler/rpc/client/save_profile.cc:168] Creating directory: C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53
2020-09-02 20:51:56.905231: I tensorflow/core/profiler/rpc/client/save_profile.cc:174] Dumped gzipped tool data for trace.json.gz to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.trace.json.gz
2/369 [..............................] - ETA: 4:00:34 - loss: 108831026716868608.0000 - mse: 109930332186214400.0000 - mae: 169439408.0000 - mape: 116326512.0000 - cosine_similarity: -0.2260
2020-09-02 20:49:06.592450: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-09-02 20:49:06.599065: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 20:49:12.746036: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-09-02 20:49:12.751769: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 20:49:12.761444: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-6NCF44I
2020-09-02 20:49:12.763713: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-6NCF44I
2020-09-02 20:49:12.800139: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x229365115b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 20:49:12.802774: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-02 20:49:16.012887: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:50:39.226962: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-09-02 20:51:55.732652: I tensorflow/core/profiler/rpc/client/save_profile.cc:168] Creating directory: C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53
2020-09-02 20:51:56.905231: I tensorflow/core/profiler/rpc/client/save_profile.cc:174] Dumped gzipped tool data for trace.json.gz to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.trace.json.gz
2020-09-02 20:51:57.700621: I tensorflow/core/profiler/utils/event_span.cc:288] Generation of step-events took 54.008 ms
2020-09-02 20:51:57.763778: I tensorflow/python/profiler/internal/profiler_wrapper.cc:87] Creating directory: C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53
Dumped tool data for overview_page.pb to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.overview_page.pb
Dumped tool data for input_pipeline.pb to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.input_pipeline.pb
Dumped tool data for tensorflow_stats.pb to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.tensorflow_stats.pb
Dumped tool data for kernel_stats.pb to C:\Users\Ben\Desktop\colab_comparison\logs.txt\train\plugins\profile\2020_09_02_18_51_53\DESKTOP-6NCF44I.kernel_stats.pb
4/369 [..............................] - ETA: 5:43:42 - loss: 99759720780267520.0000 - mse: 100767406007255040.0000 - mae: 139372400.0000 - mape: 186370256.0000 - cosine_similarity: -0.2431
I have looked through other questions but could not find a practical solution.
[Discussion]:
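The log itself says the cudart dlerror can be ignored on a machine without a GPU, so the `W` lines should not change how the CPU training runs. One common way to hide them is to raise TensorFlow's C++ log threshold through the `TF_CPP_MIN_LOG_LEVEL` environment variable before TensorFlow is imported. The sketch below is a minimal illustration, not the asker's code: the helper name `quiet_tensorflow_logs` is hypothetical, and the commented-out `TensorBoard` lines assume the script uses that callback, which the `plugins\profile` paths suggest but the post does not show.

```python
import os


def quiet_tensorflow_logs(min_level: str = "2") -> None:
    """Raise TensorFlow's C++ log threshold (hypothetical helper).

    "0" shows everything, "1" hides INFO, "2" hides INFO and WARNING,
    "3" also hides ERROR. It must run before TensorFlow is imported.
    """
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = min_level


quiet_tensorflow_logs("2")

# Import TensorFlow only after the variable is set:
# import tensorflow as tf
#
# Assumption: if the training script passes a TensorBoard callback,
# profile_batch=0 turns off the profiler that writes the trace files:
# tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs", profile_batch=0)
```

Level "2" leaves the `E` line about `cuInit` visible; level "3" would hide it too, at the cost of also hiding other error-level messages.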
Tags: python tensorflow keras lstm