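[Posted]: 2022-10-15 15:48:24
[Question]:
I recently bought an RTX 3060 and have been trying to get it working with TensorFlow, but it doesn't seem to work. The GPU is detected, but whenever I train with mask_rcnn_coco.h5 it takes an extremely long time: I left it running for about 30 minutes and it did not finish even one epoch. Any ideas on how to fix this?
I installed these libraries: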
pip install tensorflow==2.3
pip install tensorflow-gpu==2.3
pip install imgaug
pip install pixellib==0.5.2
pip install labelme2coco==0.1.0
pip install Pillow==8.0
I have CUDA 10.1 and cuDNN 7.6 installed.
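Since both `tensorflow` and `tensorflow-gpu` appear in the list, it may be worth confirming which versions actually ended up in the environment. A minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`); the helper name `installed_versions` is made up for illustration:

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the pip commands in the question.
    wanted = ["tensorflow", "tensorflow-gpu", "imgaug",
              "pixellib", "labelme2coco", "Pillow"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

Run inside the same environment (or notebook kernel) that is used for training, since a different interpreter may see a different set of packages.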
Session log:
[I 20:24:21.746 NotebookApp] Kernel started: 0b6d1f66-f4ff-442f-bf6f-59bb5fe2ff03, name: python3
[IPKernelApp] ERROR | No such comm target registered: jupyter.widget.control
[IPKernelApp] WARNING | No such comm: 5db9fb8e-9956-4081-9c1d-c8e445ca997f
2022-10-12 20:24:40.214889: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
[W 20:24:43.199 NotebookApp] 404 GET /api/kernels/8eba5c9e-587f-4cd0-86db-7d5987a61f9b/channels?session_id=010d8cfef1df42cd835e128121663487 (::1): Kernel does not exist: 8eba5c9e-587f-4cd0-86db-7d5987a61f9b
[W 20:24:43.200 NotebookApp] 404 GET /api/kernels/8eba5c9e-587f-4cd0-86db-7d5987a61f9b/channels?session_id=010d8cfef1df42cd835e128121663487 (::1) 3.000000ms referer=None
2022-10-12 20:24:48.841665: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2022-10-12 20:24:57.703980: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 computeCapability: 8.6
coreClock: 1.777GHz coreCount: 28 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 335.32GiB/s
2022-10-12 20:24:57.704187: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2022-10-12 20:24:57.713341: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2022-10-12 20:24:57.718274: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2022-10-12 20:24:57.720302: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2022-10-12 20:24:57.726087: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2022-10-12 20:24:57.729356: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2022-10-12 20:24:58.054469: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2022-10-12 20:24:58.054702: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2022-10-12 20:25:01.424735: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-12 20:25:01.432727: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1fcea173490 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-10-12 20:25:01.432877: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2022-10-12 20:25:01.433675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 computeCapability: 8.6
coreClock: 1.777GHz coreCount: 28 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 335.32GiB/s
import tensorflow as tf
tf.config.list_physical_devices('GPU')
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
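One thing the session log hints at: the card reports `computeCapability: 8.6` (NVIDIA's Ampere generation), while every CUDA library that gets loaded is a 10.x build (`cudart64_101.dll`, `cublas64_10.dll`) alongside cuDNN 7. Ampere support arrived with CUDA 11, so with a CUDA 10.1 build the GPU kernels may have to be JIT-compiled from PTX at startup, which could explain the long wait. This is a possible explanation, not a confirmed diagnosis. A minimal sketch that pulls both facts out of log text; the function names and the "compute capability 8.x needs CUDA 11 or newer" threshold are assumptions made for illustration:

```python
import re

# Assumption for this sketch: compute capability 8.x and up needs CUDA 11+.
MIN_CUDA_MAJOR_FOR_CC8 = 11


def parse_compute_capability(log_text):
    """Return (major, minor) from a 'computeCapability: X.Y' entry, or None."""
    match = re.search(r"computeCapability:\s*(\d+)\.(\d+)", log_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_cudart_version(log_text):
    """Return (major, minor) from a 'cudart64_NNN.dll' name, or None.

    Only handles the three-digit Windows naming seen in the log,
    where '101' stands for CUDA 10.1.
    """
    match = re.search(r"cudart64_(\d{2})(\d)\.dll", log_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def looks_mismatched(log_text):
    """True if the GPU is 8.x or newer but the CUDA runtime predates 11.

    Returns None when the log does not contain both pieces of information.
    """
    cc = parse_compute_capability(log_text)
    cuda = parse_cudart_version(log_text)
    if cc is None or cuda is None:
        return None
    return cc[0] >= 8 and cuda[0] < MIN_CUDA_MAJOR_FOR_CC8


if __name__ == "__main__":
    # Two lines copied from the session log in the question.
    sample = (
        "Successfully opened dynamic library cudart64_101.dll\n"
        "pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 "
        "computeCapability: 8.6\n"
    )
    print(parse_compute_capability(sample))
    print(parse_cudart_version(sample))
    print(looks_mismatched(sample))
```

If the check flags a mismatch, a TensorFlow release built against CUDA 11 (2.4 or later, going by TensorFlow's tested build configurations) would be the natural thing to try; whether the other pinned packages work with it would need checking separately.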
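[Comments]:
- Have you tried setting tf.debugging.set_log_device_placement(True) to see where the operations are actually being placed? There is a simple example of logging device placement, and then controlling it manually if necessary, under "Logging device placement" and "Manual device placement" at tensorflow.org/guide/gpu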
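The suggestion in this comment can be sketched roughly as follows. `tf.debugging.set_log_device_placement` and `tf.device` are the calls described in the linked guide; the function name `run_placement_demo` and the matrix values are made up for illustration. TensorFlow is imported inside the function so the snippet can be loaded even where it is not installed.

```python
def run_placement_demo(device="/GPU:0"):
    """Log where ops run, then pin a small matmul to the given device."""
    import tensorflow as tf  # imported lazily; assumes TensorFlow 2.x

    # Best enabled before any ops are created, so every op gets logged.
    tf.debugging.set_log_device_placement(True)

    # Manual placement: ops created in this block are assigned to `device`.
    # If that device does not exist, TensorFlow may raise an error.
    with tf.device(device):
        a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
        b = tf.constant([[1.0, 0.0], [0.0, 1.0]])
        c = tf.matmul(a, b)

    # In eager mode the result tensor records where it was computed.
    return c.device


if __name__ == "__main__":
    print(run_placement_demo())
```

If the logged placements show the training ops on the CPU rather than on `GPU:0`, that would point to a placement problem; if they are on the GPU and training is still slow, the cause is more likely elsewhere.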
Tags: python tensorflow tensorflow2.0