CUDA can't find GPU on mac: answers

【Question title】: CUDA can't find GPU on mac
【Posted】: 2017-05-27 17:03:00
【Question】:

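A few days ago I managed to get CUDA working with tensorflow on my Mac, which has a GeForce GTX 780M. Today I noticed that it no longer works. I'm not sure what changed, but I have verified that the libraries (especially cuDNN) are still installed correctly.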

Neither rebooting nor reinstalling tensorflow helped (I installed tensorflow from https://storage.googleapis.com/tensorflow/mac/gpu/tensorflow_gpu-0.12.1-py3-none-any.whl). This is the output of running the MNIST example from the tensorflow website:

I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.dylib locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.dylib locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.dylib locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.1.dylib locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.dylib locally
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
E tensorflow/stream_executor/cuda/cuda_driver.cc:509] failed call to cuInit: CUDA_ERROR_NO_DEVICE
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: Net-iMac-3.local
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: Net-iMac-3.local
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 310.42.25
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: Invalid argument: expected %d.%d or %d.%d.%d form for driver version; got ""
step 0, training accuracy 0.06

This is the output of nvcc -V:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Oct_30_22:18:43_CDT_2016
Cuda compilation tools, release 8.0, V8.0.54

Output of ls -l /usr/local/cuda/lib/libcud*:

lrwxr-xr-x  1 mik   admin    33B Jan  7 16:29 /usr/local/cuda/lib/libcuda.1.dylib -> /usr/local/cuda/lib/libcuda.dylib
-rwxr-xr-x@ 1 root  wheel    13K Nov  3 19:39 /usr/local/cuda/lib/libcuda.dylib
lrwxr-xr-x@ 1 root  wheel    45B Nov  3 19:40 /usr/local/cuda/lib/libcudadevrt.a -> /Developer/NVIDIA/CUDA-8.0/lib/libcudadevrt.a
lrwxr-xr-x@ 1 root  wheel    50B Nov  3 19:40 /usr/local/cuda/lib/libcudart.8.0.dylib -> /Developer/NVIDIA/CUDA-8.0/lib/libcudart.8.0.dylib
lrwxr-xr-x@ 1 root  wheel    46B Nov  3 19:40 /usr/local/cuda/lib/libcudart.dylib -> /Developer/NVIDIA/CUDA-8.0/lib/libcudart.dylib
lrwxr-xr-x@ 1 root  wheel    49B Nov  3 19:40 /usr/local/cuda/lib/libcudart_static.a -> /Developer/NVIDIA/CUDA-8.0/lib/libcudart_static.a
-rwxr-xr-x@ 1 mik   staff    74M Jul 27 09:18 /usr/local/cuda/lib/libcudnn.5.dylib
lrwxr-xr-x@ 1 mik   staff    16B Jul 27 09:21 /usr/local/cuda/lib/libcudnn.dylib -> libcudnn.5.dylib
-rw-r--r--@ 1 mik   staff    63M Jul 27 09:18 /usr/local/cuda/lib/libcudnn_static.a

I have tried reinstalling the driver and installing an older driver, but neither helped.

Following https://github.com/aymericdamien/TensorFlow-Examples/issues/38, I ran export CUDA_VISIBLE_DEVICES=1 to prevent memory problems when running tensorflow. If I then run ./deviceQuery, it fails to find the GPU:

/Developer/NVIDIA/CUDA-8.0/samples/bin/x86_64/darwin/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
Result = FAIL

However, if I run export CUDA_VISIBLE_DEVICES=0, then ./deviceQuery gives:

/Developer/NVIDIA/CUDA-8.0/samples/bin/x86_64/darwin/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 780M"
  CUDA Driver Version / Runtime Version          8.0 / 8.0
  CUDA Capability Major/Minor version number:    3.0
  Total amount of global memory:                 4096 MBytes (4294508544 bytes)
  ( 8) Multiprocessors, (192) CUDA Cores/MP:     1536 CUDA Cores
  GPU Max Clock rate:                            784 MHz (0.78 GHz)
  Memory Clock rate:                             2500 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 524288 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA

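【Comments】:

Verify your CUDA installation using the procedure given in the CUDA macOS installation guide.
Running ./deviceQuery can't find the GPU.
Could this be because my GPU is relatively old compared to CUDA 8.0?
The GTX 780M is fully supported by CUDA 8.0. It looks like your CUDA installation is broken. Perhaps you should reinstall CUDA, or keep working on your system until CUDA passes the simple verification tests, rather than wondering why tensorflow doesn't work.
In the failing case, please run echo $CUDA_VISIBLE_DEVICES. Never mind. Yes, if you use export CUDA_VISIBLE_DEVICES=1 and there is only 1 GPU in your system, that will prevent CUDA from working. I'm not sure why you would do that. That environment variable is documented in the CUDA Programming Guide.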
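
The effect described in the last comment can be sketched with a small model of how CUDA_VISIBLE_DEVICES filters devices. This is only an illustration in plain Python, not CUDA's actual implementation: the function name visible_devices is hypothetical, and it handles integer indices only (real CUDA also accepts GPU UUIDs). It assumes the documented behavior that enumeration stops at the first invalid index.

```python
def visible_devices(env_value, physical_count):
    """Simplified model of CUDA_VISIBLE_DEVICES filtering.

    env_value: the variable's value as a string, or None if it is unset.
    physical_count: number of GPUs physically present.
    Returns the physical indices an application would be able to see.
    """
    if env_value is None:
        # Variable unset: every physical device is visible.
        return list(range(physical_count))

    visible = []
    for token in env_value.split(","):
        token = token.strip()
        # Stop at the first entry that is not a valid device index;
        # entries listed after it are ignored as well.
        if not token.isdigit() or int(token) >= physical_count:
            break
        visible.append(int(token))
    return visible


# One physical GPU, as on the iMac in the question:
print(visible_devices("1", 1))   # [] : index 1 does not exist, so no device
print(visible_devices("0", 1))   # [0]: the only GPU is visible
print(visible_devices(None, 1))  # [0]: unset means no filtering
```

With a single GPU, the only valid index is 0, which matches the two deviceQuery runs in the question.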

Tags: macos python-3.x cuda tensorflow gpu

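【Solution 1】:

"CUDA_VISIBLE_DEVICES=1" means CUDA can only see the GPU with index 1 on your machine. Device indices start at 0, so index 1 is the second GPU. Do you have two GPUs? What does "nvidia-smi" show?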

In general, if you want to use "CUDA_VISIBLE_DEVICES", make sure it points to the GPU you want to use.
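
As a rough sketch of that advice in Python, the variable can also be set from inside the script instead of the shell. The helper name select_gpu is made up for illustration. The assumption here is that the variable has to be in place before CUDA is initialized, which in practice means before tensorflow is imported.

```python
import os


def select_gpu(index=0):
    """Hypothetical helper: expose a single GPU to CUDA by index.

    Must run before any library initializes CUDA, so call it
    before importing tensorflow.
    """
    if index < 0:
        raise ValueError("GPU index must be non-negative")
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# On a machine with one GPU the only valid index is 0.
select_gpu(0)

# import tensorflow as tf  # import only after the variable is set
```

Running unset CUDA_VISIBLE_DEVICES in the shell removes the filter entirely, which is the simpler choice when the machine has only one GPU.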
