【Question title】: Tensorflow: Cuda compute capability 3.0. The minimum required Cuda capability is 3.5
【Posted】: 2016-12-25 16:51:11
【Question】:

I am installing TensorFlow from source (documentation).

CUDA toolkit version (nvcc):

nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 7.5, V7.5.17

When I run the following command:

bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu

it gives me the following error:

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:118] Found device 0 with properties: 
name: GeForce GT 640
major: 3 minor: 0 memoryClockRate (GHz) 0.9015
pciBusID 0000:05:00.0
Total memory: 2.00GiB
Free memory: 1.98GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:138] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:148] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
     [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
     [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
     [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
     [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
Aborted (core dumped)

Do I need a different GPU to run this?
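The two decisive pieces of the log above are the "major: 3 minor: 0" device line and the "minimum required Cuda capability is 3.5" message. As a rough sketch (the function name and the regular expressions are my own, written against the log format shown here and not taken from TensorFlow), they can be pulled out and compared like this:

```python
import re

# Patterns written against the log format shown in this question.
DEVICE_RE = re.compile(r"major:\s*(\d+)\s+minor:\s*(\d+)")
MINIMUM_RE = re.compile(r"minimum required Cuda capability is (\d+)\.(\d+)")


def capability_check(log_text):
    """Return (device_capability, minimum_required, supported).

    Each capability is a (major, minor) tuple, or None when the
    corresponding line is missing; supported is None in that case too.
    """
    device = DEVICE_RE.search(log_text)
    minimum = MINIMUM_RE.search(log_text)
    dev = tuple(int(g) for g in device.groups()) if device else None
    req = tuple(int(g) for g in minimum.groups()) if minimum else None
    supported = None if dev is None or req is None else dev >= req
    return dev, req, supported


sample = (
    "name: GeForce GT 640\n"
    "major: 3 minor: 0 memoryClockRate (GHz) 0.9015\n"
    "Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: "
    "0000:05:00.0) with Cuda compute capability 3.0. "
    "The minimum required Cuda capability is 3.5.\n"
)
print(capability_check(sample))
```

A result of False in the last position means the binary was built with a minimum capability above what the card offers, which is a build configuration matter and not necessarily a reason to buy new hardware.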

【Comments】:

You need to specify compute capability 3.0 support when configuring TensorFlow. See tensorflow.org/versions/r0.10/get_started/os_setup.html and github.com/tensorflow/tensorflow/issues/25
I configured with TF_UNOFFICIAL_SETTING=1 ./configure, then ran bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer followed by bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu. It still gives me the same error.
Did you explicitly ask for compute capability 3.0 support when running ./configure?
It runs beautifully now. Thank you very much!
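The exchange above boils down to: the capability list has to be given to ./configure explicitly. A minimal sketch of preparing that environment from Python follows. TF_UNOFFICIAL_SETTING comes from the comment above; TF_NEED_CUDA and TF_CUDA_COMPUTE_CAPABILITIES are variable names I am assuming the configure script reads, so check the script in your own checkout before relying on them.

```python
import os


def configure_env(capabilities=("3.0",), base_env=None):
    """Build an environment dict for running TensorFlow's ./configure.

    TF_UNOFFICIAL_SETTING=1 is taken from the comment thread.
    TF_NEED_CUDA and TF_CUDA_COMPUTE_CAPABILITIES are assumed names;
    verify them against the configure script of your checkout.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TF_UNOFFICIAL_SETTING"] = "1"
    env["TF_NEED_CUDA"] = "1"
    # The configure prompt expects a comma-separated list, e.g. "3.0,3.5".
    env["TF_CUDA_COMPUTE_CAPABILITIES"] = ",".join(capabilities)
    return env


env = configure_env(("3.0",), base_env={})
for key in sorted(env):
    print(f"{key}={env[key]}")
```

To actually use it, one would pass the dict to something like subprocess.run(["./configure"], env=configure_env(), check=True) from inside the TensorFlow source directory.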

Tags: python tensorflow gpu bazel

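【Solution 1】:

For TensorFlow 2.1.0

I managed to get this working on Windows by compiling TF 2.1.0 from source. The TF 2.2.0 build failed because of XLA, even with all XLA flags disabled for bazel. Also be careful with newer Python versions: I hit some strange errors with the prebuilt pip package under Python 3.8, so I used Python 3.6 to work around them.

One caveat: once the build finished, several hours later, I started using the library. Training a simple model that lasted only a few seconds worked fine, but training a basic convolutional network failed with CUDA errors after 0 or 1 epochs. Your mileage may vary.

【Comments】: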

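【Solution 2】:

Thank you for providing the WHL! After struggling for days to compile it (without success), I can now finally use TF, since my laptop only supports Compute 3.0. I could not get it to compile by following your instructions on a fresh install of Ubuntu 18.04, and I would like to point out a few things:

In your "Dependencies" section, libjasper is no longer available on its own, ffmpeg is no longer available from the repository you listed, and libtiff5-dev is no longer available (I think there is a newer version). I understand this is mostly for OpenCV, which I also use. You also repeat several packages, such as git and unzip.
In your "Nvidia driver" section, I don't think that driver is in the repository. At least I couldn't pull it. With the WHL file you built, I am using the 418 driver from the Nvidia website, which seems to work fine.
In the "Install cudnn 7.1.4 for CUDA 9.0" section, you "cd /usr/lib/x86_64-linux-gnu", but the files are in /usr/local/cuda. Is this correct? I am guessing the links at least have to be told to point to the cuda folder.
In the "Install NCCL 2.2.12 for CUDA 9.0" section, you are using 2.2.12, but your command lines all reference 2.1.15.
In your Bazel installation section, you say to use the Bazel Darwin installer, but I believe that one is for Mac. I think you need the Bazel Linux installer.

Thanks again for all the work you put into this!

P.S. I was able to build it by following these instructions with a git checkout of TensorFlow 1.12, installing keras_applications and keras_preprocessing, and using Bazel 0.15.0 with CUDA 9.2, cuDNN 7.1.4 and NCCL 2.2.13. Some people have pointed out that CUDA 9.0 does not compile with gcc6/g++6. Apparently 9.2 does.

【Comments】: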

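【Solution 3】:

In Anaconda, tensorflow-gpu=1.12 with cudatoolkit=9.0 is compatible with GPUs that have compute capability 3.0. Here are the conda commands to create a new environment and install the libraries required for 3.0 GPUs.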

conda create -n tf-gpu
conda activate tf-gpu
conda install tensorflow-gpu=1.12
conda install cudatoolkit=9.0

Then you can try it as follows.

>python
import tensorflow as tf
tf.Session()

Here is my output

name: GeForce GT 650M major: 3 minor: 0 memoryClockRate(GHz): 0.95
pciBusID: 0000:01:00.0
totalMemory: 3.94GiB freeMemory: 3.26GiB
2019-12-09 13:26:11.753591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-12-09 13:26:12.050152: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-09 13:26:12.050199: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2019-12-09 13:26:12.050222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2019-12-09 13:26:12.050481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2989 MB memory) -> physical GPU (device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0, compute capability: 3.0)
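Besides reading the startup log, the visible devices can be listed programmatically. This is a small sketch for TensorFlow 1.x, assuming the device_lib module from tensorflow.python.client is available in the installed version; the helper names are my own. The TensorFlow import is kept inside the function so that the filtering helper can be tried without TensorFlow installed.

```python
def gpu_device_names(devices):
    """Return the names of entries whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def local_gpu_names():
    """List the GPUs TensorFlow 1.x can see; needs tensorflow-gpu installed."""
    # Imported lazily so gpu_device_names() works without TensorFlow.
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())
```

Calling local_gpu_names() inside the tf-gpu environment should return a non-empty list when the card is accepted; an empty list suggests the GPU was ignored, as in the original question.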

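Enjoy!

【Comments】:

Thanks. I spent a lot of time fighting dependencies and drivers on my old GT 750M laptop, but Conda solved my problem.
Conda solved this for me as well. Older NVIDIA cards seem to work with specific older tensorflow-gpu versions together with correspondingly older versions of the dependency packages.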

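【Solution 4】:

@Taako, sorry for the late reply. I did not save the compiled wheel file shown above. However, here is a new one for TensorFlow 1.9. I hope it helps you enough. Please note the following details used for the build.

TensorFlow: 1.9
CUDA Toolkit: 9.2
cuDNN: 7.1.4
NCCL: 2.2.13

Here is the link to the wheel file: wheel file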

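【Comments】:

I have also built a wheel for Tensorflow 1.12, CUDNN 7.2.1, NCCL: 2.2.13. If you need to contact me, you can message me in the MATLAB and Octave chat room: chat.stackoverflow.com/rooms/81987/chatlab-and-talktave
Guys, is it possible to compile TF2 for Windows with CUDA compute capability 3.0? There are some tutorials for compiling TF 1.x.
@Mehdi I managed to get this working on Windows by compiling TF 2.1.0 from source. The TF 2.2.0 build failed because of XLA, even with all XLA flags disabled for bazel. Also be careful with newer Python versions: I hit some strange errors with the prebuilt pip package under Python 3.8, so I used Python 3.6 to work around them.
@Chris, could you please share your build?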

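【Solution 5】:

I have installed TensorFlow 1.8. It recommends CUDA 9.0. I am using a GTX 650M card with CUDA compute capability 3.0, and it now works like a charm. The OS is Ubuntu 18.04. The detailed steps are below:

Install dependencies

I have included ffmpeg and some related packages for my OpenCV 3.4 build; skip them if you don't need them. Run the following commands: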

sudo apt-get update 
sudo apt-get dist-upgrade -y
sudo apt-get autoremove -y
sudo apt-get upgrade
sudo add-apt-repository ppa:jonathonf/ffmpeg-3 -y
sudo apt-get update
sudo apt-get install build-essential -y
sudo apt-get install ffmpeg -y
sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev -y
sudo apt-get install python-dev libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev -y
sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev -y
sudo apt-get install libxvidcore-dev libx264-dev -y
sudo apt-get install unzip qtbase5-dev python-dev python3-dev python-numpy python3-numpy -y
sudo apt-get install libopencv-dev libgtk-3-dev libdc1394-22 libdc1394-22-dev libjpeg-dev libpng12-dev libtiff5-dev libjasper-dev -y
sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libxine2-dev libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev -y
sudo apt-get install libv4l-dev libtbb-dev libfaac-dev libmp3lame-dev libopencore-amrnb-dev libopencore-amrwb-dev libtheora-dev -y
sudo apt-get install libvorbis-dev libxvidcore-dev v4l-utils vtk6 -y
sudo apt-get install liblapacke-dev libopenblas-dev libgdal-dev checkinstall -y
sudo apt-get install libgtk-3-dev -y
sudo apt-get install libatlas-base-dev gfortran -y
sudo apt-get install qt-sdk -y
sudo apt-get install python2.7-dev python3.5-dev python-tk -y
sudo apt-get install cython libgflags-dev -y
sudo apt-get install tesseract-ocr -y
sudo apt-get install tesseract-ocr-eng -y 
sudo apt-get install tesseract-ocr-ell -y
sudo apt-get install gstreamer1.0-python3-plugin-loader -y
sudo apt-get install libdc1394-22-dev -y
sudo apt-get install openjdk-8-jdk
sudo apt-get install pkg-config zip g++-6 gcc-6 zlib1g-dev unzip  git
sudo wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo pip install -U pip
sudo pip install -U numpy
sudo pip install -U pandas
sudo pip install -U wheel
sudo pip install -U six
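Several packages appear more than once in the list above, which is harmless but makes the list harder to maintain. The following is a small sketch (the function name is my own) that reports packages named on more than one "apt-get install" line; the sample input is a shortened excerpt, not the full list.

```python
from collections import Counter


def duplicate_packages(script_text):
    """Return packages named more than once across 'apt-get install' lines."""
    counts = Counter()
    for line in script_text.splitlines():
        words = line.split()
        if "apt-get" not in words or "install" not in words:
            continue
        # Everything after 'install' that is not an option is a package name.
        for word in words[words.index("install") + 1:]:
            if not word.startswith("-"):
                counts[word] += 1
    return sorted(name for name, n in counts.items() if n > 1)


script = """\
sudo apt-get install cmake git libgtk2.0-dev pkg-config -y
sudo apt-get install unzip qtbase5-dev -y
sudo apt-get install pkg-config zip g++-6 gcc-6 zlib1g-dev unzip git
"""
print(duplicate_packages(script))
```

Saving the real command list to a file and passing its contents to duplicate_packages() gives the full set of repeats.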

Install the NVIDIA driver

Run the following commands:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-390 -y

Reboot and run the command below; it should give you details like those in the image below:

gcc-6 and g++-6 check.

CUDA 9.0 requires gcc-6 and g++-6. Run the following commands:

cd /usr/bin 
sudo rm -rf gcc gcc-ar gcc-nm gcc-ranlib g++
sudo ln -s gcc-6 gcc
sudo ln -s gcc-ar-6 gcc-ar
sudo ln -s gcc-nm-6 gcc-nm
sudo ln -s gcc-ranlib-6 gcc-ranlib
sudo ln -s g++-6 g++
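The commands above replace the system-wide compiler symlinks, so it is worth confirming which major version gcc and g++ resolve to afterwards. Here is a minimal sketch, assuming the compilers answer the -dumpversion option with a dotted version string (the helper names are my own):

```python
import shutil
import subprocess


def major_version(dumpversion_output):
    """Extract the major version from `gcc -dumpversion` style output."""
    return int(dumpversion_output.strip().split(".")[0])


def compiler_major(binary="gcc"):
    """Return the major version of `binary`, or None if it is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run([path, "-dumpversion"], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return major_version(result.stdout)


if __name__ == "__main__":
    for name in ("gcc", "g++"):
        print(name, compiler_major(name))
```

For the setup described here both lines should report 6; any other number means the symlinks do not point where expected.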

Install CUDA 9.0

Go to https://developer.nvidia.com/cuda-90-download-archive. Select the options: Linux->x86_64->Ubuntu->17.04->deb(local). Download the main file and the two patches. Run the following commands:

sudo dpkg -i cuda-repo-ubuntu1704-9-0-local_9.0.176-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

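On your PC, navigate to the first patch and double-click it; it will run automatically. Do the same for the second patch.

Add the following lines to your ~/.bashrc file and reload it: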

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
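The ${VAR:+:${VAR}} form in these two lines appends ":" plus the old value only when the variable is already set and non-empty, so an unset variable does not leave a stray trailing colon (an empty entry, which is generally treated as the current directory). The same logic, sketched in Python with a helper name of my own:

```python
def prepend_path(new_dir, existing):
    """Mimic the shell idiom new_dir${VAR:+:${VAR}}.

    The ':' separator and the old value are added only when the
    variable is set and non-empty.
    """
    return new_dir + (":" + existing if existing else "")


print(prepend_path("/usr/local/cuda-9.0/bin", "/usr/bin:/bin"))
print(prepend_path("/usr/local/cuda-9.0/lib64", ""))
```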

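Install cudnn 7.1.4 for CUDA 9.0

Download the tar file from https://developer.nvidia.com/cudnn and extract it into your Downloads folder. The download requires an NVIDIA developer login; registration is free. Run the following commands: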

cd ~/Downloads/cudnn-9.0-linux-x64-v7.1/cuda
sudo cp include/* /usr/local/cuda/include/
sudo cp lib64/libcudnn.so.7.1.4 lib64/libcudnn_static.a /usr/local/cuda/lib64/
cd /usr/lib/x86_64-linux-gnu
sudo ln -s libcudnn.so.7.1.4 libcudnn.so.7
sudo ln -s libcudnn.so.7 libcudnn.so
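Another answer in this thread points out that the library is copied to /usr/local/cuda/lib64 while the links are created in /usr/lib/x86_64-linux-gnu. The links only resolve if they live next to the file they name (or use a full path). As a sketch of the intended chain, libcudnn.so -> libcudnn.so.7 -> libcudnn.so.7.1.4, created in whichever directory actually holds the library (the function name is my own, and the demo runs in a temporary directory, not in the real CUDA folder):

```python
import os
import tempfile


def link_soname_chain(lib_dir, versioned_name):
    """Create lib.so.MAJOR -> versioned file and lib.so -> lib.so.MAJOR.

    Links are relative and are created inside lib_dir, next to the file.
    Returns the (link, target) pairs that were created.
    """
    stem, sep, version = versioned_name.partition(".so.")
    if not sep or not version:
        raise ValueError("expected a name like libfoo.so.1.2.3")
    major_name = f"{stem}.so.{version.split('.')[0]}"
    plain_name = f"{stem}.so"
    pairs = [(plain_name, major_name)]
    if major_name != versioned_name:
        # Only add the MAJOR link when it differs from the real file.
        pairs.insert(0, (major_name, versioned_name))
    for link, target in pairs:
        link_path = os.path.join(lib_dir, link)
        if os.path.lexists(link_path):
            os.remove(link_path)
        os.symlink(target, link_path)
    return pairs


# Demonstration in a throwaway directory instead of /usr/local/cuda/lib64.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "libcudnn.so.7.1.4"), "w").close()
    print(link_soname_chain(demo_dir, "libcudnn.so.7.1.4"))
```

On a real system the call would need root privileges and the directory where libcudnn.so.7.1.4 was copied.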

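Install NCCL 2.2.12 for CUDA 9.0

Download the tar file from https://developer.nvidia.com/nccl and extract it into your Downloads folder. The download requires an NVIDIA developer login; registration is free. Run the following commands: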

sudo mkdir -p /usr/local/cuda/nccl/lib /usr/local/cuda/nccl/include
cd ~/Downloads/nccl-repo-ubuntu1604-2.2.12-ga-cuda9.0_1-1_amd64/
sudo cp *.txt /usr/local/cuda/nccl
sudo cp include/*.h /usr/include/
sudo cp lib/libnccl.so.2.1.15 lib/libnccl_static.a /usr/lib/x86_64-linux-gnu/
sudo ln -s /usr/include/nccl.h /usr/local/cuda/nccl/include/nccl.h
cd /usr/lib/x86_64-linux-gnu
sudo ln -s libnccl.so.2.1.15 libnccl.so.2
sudo ln -s libnccl.so.2 libnccl.so
for i in libnccl*; do sudo ln -s /usr/lib/x86_64-linux-gnu/$i /usr/local/cuda/nccl/lib/$i; done
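The section title names 2.2.12 while the commands name libnccl.so.2.1.15; the real file name depends on the archive that was downloaded. One way to avoid hard-coding it is to look the file up. This is a sketch only (the function name is my own), demonstrated on a temporary directory:

```python
import glob
import os
import tempfile


def find_versioned_lib(lib_dir, stem="libnccl"):
    """Return the fully versioned file name, e.g. libnccl.so.2.2.12.

    Symlinks are skipped; the name with the most version components wins.
    """
    candidates = [
        os.path.basename(p)
        for p in glob.glob(os.path.join(lib_dir, stem + ".so.*"))
        if not os.path.islink(p)
    ]
    if not candidates:
        raise FileNotFoundError(f"no {stem}.so.* file in {lib_dir}")
    return max(candidates, key=lambda name: (name.count("."), name))


# Demonstration with empty placeholder files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("libnccl.so.2.2.12", "libnccl_static.a"):
        open(os.path.join(demo_dir, name), "w").close()
    print(find_versioned_lib(demo_dir))
```

The returned name can then be used in the cp and ln -s commands in place of the hard-coded version.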

Install Bazel (manual installation with the installer is recommended and works; see: https://docs.bazel.build/versions/master/install-ubuntu.html#install-with-installer-ubuntu)

Download "bazel-0.13.1-installer-darwin-x86_64.sh" from https://github.com/bazelbuild/bazel/releases and run the following commands:

chmod +x bazel-0.13.1-installer-darwin-x86_64.sh
./bazel-0.13.1-installer-darwin-x86_64.sh --user
export PATH="$PATH:$HOME/bin"
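As another answer in this thread notes, the "darwin" installer is the macOS build; on Ubuntu the "linux" one is the likely match. A small sketch that derives the file name from the running platform, assuming the bazel-VERSION-installer-OS-ARCH.sh naming pattern seen in the file name above (the function name is my own):

```python
import platform


def bazel_installer_name(version, system=None, machine="x86_64"):
    """Build the installer file name for the given operating system.

    system defaults to platform.system() ('Linux', 'Darwin', ...).
    """
    system = (system or platform.system()).lower()
    if system not in ("linux", "darwin"):
        raise ValueError(f"no .sh installer assumed for {system!r}")
    return f"bazel-{version}-installer-{system}-{machine}.sh"


print(bazel_installer_name("0.13.1", "Linux"))
```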

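Compile TensorFlow

We will compile with CUDA, with XLA JIT (oh yes), and with jemalloc as malloc support, so we answer yes to those. Run the following commands and answer the prompts as shown in the configure run below.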

git clone https://github.com/tensorflow/tensorflow
cd tensorflow
git checkout r1.8
./configure
You have bazel 0.13.0 installed.
Please specify the location of python. [Default is /usr/bin/python]:
Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: y
jemalloc as malloc support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n
No Apache Kafka Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: y
XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]:
Please specify the location where CUDA 9.1 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.1.4
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.
Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]: 2.2.12
Please specify the location where NCCL 2 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:/usr/local/cuda/nccl
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.0]
Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/x86_64-linux-gnu-gcc-7]: /usr/bin/gcc-6
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
 --config=mkl          # Build with MKL support.

 --config=monolithic   # Config for mostly static monolithic build.

Configuration finished

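Now, to compile TensorFlow, run the command below. This is very RAM-hungry and takes time. If you have plenty of RAM you can remove "--local_resources 2048,.5,1.0" from the line below; otherwise this setting works with 2 GB of RAM.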

bazel build --config=opt --config=cuda --local_resources 2048,.5,1.0 //tensorflow/tools/pip_package:build_pip_package
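In the Bazel versions of that era, the three numbers given to --local_resources are the available RAM in MB, the number of CPU cores, and the available I/O capacity (1.0 being an average workstation). To adapt the limits to another machine, the command can be assembled like this; it is only a sketch, the helper names are my own, and newer Bazel releases handle resource limits differently:

```python
def local_resources_flag(ram_mb, cpu_cores, io_capacity=1.0):
    """Format the value for Bazel's --local_resources option.

    The comma-separated numbers are RAM in MB, CPU cores and
    available I/O capacity.
    """
    if ram_mb <= 0 or cpu_cores <= 0 or io_capacity <= 0:
        raise ValueError("all resource values must be positive")
    return f"{int(ram_mb)},{float(cpu_cores)},{float(io_capacity)}"


def build_command(ram_mb=2048, cpu_cores=0.5):
    """Return the build command from this answer as an argument list."""
    return [
        "bazel", "build", "--config=opt", "--config=cuda",
        "--local_resources", local_resources_flag(ram_mb, cpu_cores),
        "//tensorflow/tools/pip_package:build_pip_package",
    ]


print(" ".join(build_command()))
```

Raising ram_mb and cpu_cores on a bigger machine should shorten the build; the defaults reproduce the conservative setting used above.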

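Once the compilation finishes, you will see something like the image below confirming that it succeeded.

Build the wheel file by running the following: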

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Install the generated wheel file with pip:

sudo pip install /tmp/tensorflow_pkg/tensorflow*.whl

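Now, to explore on the device, you can run TensorFlow; the image below shows it in an IPython terminal.

【Comments】:

Thanks, Manoj. This explains the TensorFlow installation very well. It will be a good reference for the future.
@Manoj Kumar Das could you upload your compiled .whl file? I would really appreciate it.
I have also built a wheel for Tensorflow 1.12, CUDNN 7.2.1, NCCL: 2.2.13. If you need to contact me, you can message me in the MATLAB and Octave chat room: chat.stackoverflow.com/rooms/81987/chatlab-and-talktave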