【Posted】: 2019-11-27 00:25:18
【Question】:
I recently decided to migrate from TensorFlow 1.14 (GPU variant) to the current 2.0 release.
My current setup is:
- TensorFlow (GPU variant) 2.0
- cuDNN 7.6.4
- CUDA 10
- Python 3.6
- IDE: Visual Studio 2019
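I did expect some pain, but this caught me off guard.
When I try to run one of my (now adjusted) 1.14 projects, the model builds without problems and the training process starts fine, only to come to a complete stop after the third step. The same project runs fine on the CPU variant of TensorFlow 2.0, but training all the models takes orders of magnitude longer.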
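One way to see where a stalled run is stuck is Python's standard `faulthandler` module, which can dump the stack of every thread once a timeout expires. The sketch below is not part of the original project: `hang_watchdog` and the timeout value are hypothetical names chosen for illustration, and the training call is only indicated in a comment.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def hang_watchdog(timeout_s, file=sys.stderr):
    """Dump all thread stacks to `file` if the block runs longer than timeout_s."""
    # `file` must be backed by a real file descriptor; faulthandler writes to it directly.
    faulthandler.dump_traceback_later(timeout_s, repeat=False, file=file, exit=False)
    try:
        yield
    finally:
        # Disarm the watchdog whether the block finished normally or raised.
        faulthandler.cancel_dump_traceback_later()


# Usage (assumption: `model` and `dataset` come from the project being debugged):
# with hang_watchdog(120):
#     model.fit(dataset, epochs=40)
```

If the dump shows the main thread waiting inside a native TensorFlow call, the stall is probably below the Python layer. If it shows a callback or the input pipeline, that is the first place to look.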
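Here is what I have tried so far:
- Changing the hyperparameters
- Reinstalling CUDA
- Reinstalling TensorFlow
- Reinstalling cuDNN
- Disabling validation
- Checking the path variables
None of these helped. My only clue is the warning message: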
Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
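I never got this warning with TF 1.14 and am somewhat stumped by it. I know CUDA works, because I compiled and ran several of the Nvidia samples. So the only real option left is that the problem is related to TensorFlow or to how it handles GPUs.
But I don't know how to move forward.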
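Checking the path variables by hand is easy to get wrong, so a small script can help. The sketch below looks in each `PATH` directory for the DLL names that appear in the session log below. `find_dlls_on_path` is a hypothetical helper, and a hit only shows that the file exists, not that TensorFlow loads that particular copy.

```python
import os

# DLL names taken from the "Successfully opened dynamic library" log lines.
CUDA_DLLS = ["cudart64_100.dll", "cublas64_100.dll", "cudnn64_7.dll", "cupti64_100.dll"]


def find_dlls_on_path(names, path=None):
    """Return {dll_name: [directories on PATH that contain it]}."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = {name: [] for name in names}
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for name in names:
            if os.path.isfile(os.path.join(directory, name)):
                hits[name].append(directory)
    return hits


if __name__ == "__main__":
    for name, dirs in find_dlls_on_path(CUDA_DLLS).items():
        # More than one hit can mean several CUDA installs compete on PATH.
        print(name, "->", dirs if dirs else "NOT FOUND")
```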
The session log follows:
2019-11-27 01:03:57.910895: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\pandas\core\frame.py:4117: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
errors=errors,
2019-11-27 01:04:02.247959: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-11-27 01:04:02.277414: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.835
pciBusID: 0000:0a:00.0
2019-11-27 01:04:02.282378: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-11-27 01:04:02.286653: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-27 01:04:02.289629: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-11-27 01:04:02.295084: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.835
pciBusID: 0000:0a:00.0
2019-11-27 01:04:02.299843: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-11-27 01:04:02.303965: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-27 01:04:03.043700: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-27 01:04:03.047132: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2019-11-27 01:04:03.049453: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2019-11-27 01:04:03.052642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6382 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:0a:00.0, compute capability: 6.1)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 154, 64) 896000
_________________________________________________________________
conv1d (Conv1D) (None, 150, 64) 20544
_________________________________________________________________
flatten (Flatten) (None, 9600) 0
_________________________________________________________________
dense (Dense) (None, 300) 2880300
_________________________________________________________________
dense_1 (Dense) (None, 150) 45150
_________________________________________________________________
dense_2 (Dense) (None, 70) 10570
_________________________________________________________________
dense_3 (Dense) (None, 10) 710
_________________________________________________________________
dense_4 (Dense) (None, 2) 22
=================================================================
Total params: 3,853,296
Trainable params: 3,853,296
Non-trainable params: 0
_________________________________________________________________
Train for 10 steps, validate for 50 steps
Epoch 1/40
2019-11-27 01:04:06.199581: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2019-11-27 01:04:06.430358: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2019-11-27 01:04:07.180709: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
2019-11-27 01:04:07.425377: I tensorflow/core/profiler/lib/profiler_session.cc:184] Profiler session started.
2019-11-27 01:04:07.431736: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cupti64_100.dll
1/10 [==>...........................] - ETA: 32s - loss: 0.6933 - accuracy: 0.4375 - categorical_accuracy: 0.4375 - precision: 0.4375 - recall: 0.4375
2019-11-27 01:04:07.655586: I tensorflow/core/platform/default/device_tracer.cc:588] Collecting 148 kernel records, 21 memcpy records.
WARNING: Logging before flag parsing goes to stderr.
W1127 01:04:07.730274 5696 callbacks.py:244] Method (on_train_batch_end) is slow compared to the batch update (0.138531). Check your callbacks.
3/10 [========>.....................] - ETA: 9s - loss: 0.6167 - accuracy: 0.7000 - categorical_accuracy: 0.7000 - precision: 0.7000 - recall: 0.7000
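With this much output it is easy to miss the few lines that are not informational. As a rough aid, the sketch below pulls warning and error lines out of a saved log, assuming the `timestamp: LEVEL file:line] message` layout seen above. `parse_tf_log` is a hypothetical helper, not a TensorFlow API, and lines in other layouts (such as the `W1127 ...` one) are skipped.

```python
import re

# Layout assumed from the log above: "<date> <time>: <LEVEL> <file>:<line>] <message>"
LOG_LINE = re.compile(
    r"(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<level>[IWEF]) (?P<source>\S+:\d+)\] (?P<message>.*)"
)


def parse_tf_log(lines, levels=("W", "E", "F")):
    """Return (time, level, source, message) tuples for the requested levels."""
    records = []
    for line in lines:
        # search() rather than match(): progress-bar text may precede the timestamp.
        m = LOG_LINE.search(line)
        if m and m.group("level") in levels:
            records.append(
                (m.group("time"), m.group("level"), m.group("source"), m.group("message"))
            )
    return records


# Two lines copied from the session log above.
sample = [
    "2019-11-27 01:04:06.430358: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll",
    "2019-11-27 01:04:07.180709: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows",
]

for record in parse_tf_log(sample):
    print(record)
```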
【Discussion】:
- It seems the best support is on Linux. There is no solution so far.
Tags: python-3.x tensorflow tensorflow2.0