【Posted】: 2019-05-27 13:56:21
【Question】:
My code is very simple right now:
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.cuda.current_device()
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-20-3380d2c12118> in <module>
----> 1 torch.cuda.current_device()
~/.conda/envs/tensorflow/lib/python3.6/site-packages/torch/cuda/__init__.py in current_device()
349 def current_device():
350 r"""Returns the index of a currently selected device."""
--> 351 _lazy_init()
352 return torch._C._cuda_getDevice()
353
~/.conda/envs/tensorflow/lib/python3.6/site-packages/torch/cuda/__init__.py in _lazy_init()
161 "Cannot re-initialize CUDA in forked subprocess. " + msg)
162 _check_driver()
--> 163 torch._C._cuda_init()
164 _cudart = _load_cudart()
165 _cudart.cudaGetErrorName.restype = ctypes.c_char_p
RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THC/THCGeneral.cpp:51
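The traceback shows the failure happening during lazy CUDA initialization, before any tensor work. As a minimal sketch (not a fix), the call can be wrapped so that a failed initialization is recorded alongside the version information instead of aborting the session. The helper name `collect_cuda_diagnostics` and the dictionary keys are hypothetical and chosen only for illustration; `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.current_device()` are the PyTorch attributes being queried.

```python
import importlib


def collect_cuda_diagnostics():
    """Gather version info without letting a CUDA init failure abort the script.

    Hypothetical helper: the name and the dictionary keys are illustrative only.
    """
    info = {"torch_installed": False}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return info  # PyTorch is not installed in this environment

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    # CUDA runtime version the PyTorch binary was built with (None for CPU-only builds)
    info["built_with_cuda"] = torch.version.cuda
    try:
        info["cuda_available"] = torch.cuda.is_available()
        # This is the call that raised "cuda runtime error (30)" in the traceback
        info["current_device"] = torch.cuda.current_device()
    except (RuntimeError, AssertionError) as exc:
        info["cuda_init_error"] = str(exc)
    return info


if __name__ == "__main__":
    for key, value in collect_cuda_diagnostics().items():
        print(f"{key}: {value}")
```

Running this in the affected environment would show which CUDA runtime the installed PyTorch build expects, which can then be compared with the driver and toolkit versions reported by the system tools.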
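Searching online suggests this is a version problem, but I swear I have tried every combination of CUDA 10.0 and 10.1, tensorflow-gpu 1.13 and 1.12, and so on, and nothing seems to work.
NVIDIA driver (nvidia-smi):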
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.14       Driver Version: 430.14       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce 930MX       Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   36C    P8    N/A /  N/A |    139MiB /  2004MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       986      G   /usr/lib/xorg/Xorg                            64MiB |
|    0      1242      G   /usr/bin/gnome-shell                          72MiB |
+-----------------------------------------------------------------------------+
CUDA version (nvcc --version):
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
tensorflow-gpu version (pip list | grep tensorflow):
tensorflow 1.13.1
tensorflow-estimator 1.13.0
PyTorch version (pip list | grep torch):
pytorch-pretrained-bert 0.6.2
torch 1.1.0
torchvision 0.3.0
Can anyone spot a compatibility problem and explain why it occurs and how to fix it?
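One part of the compatibility question can be checked mechanically. The "CUDA Version" field printed by nvidia-smi is the highest CUDA version the installed driver supports, while nvcc reports the locally installed toolkit. The sketch below parses the two values quoted above and compares them. The variable and function names are hypothetical, and the check says nothing about the CUDA runtime bundled with the PyTorch binary, so a passing result does not by itself explain error 30.

```python
import re

# Values copied from the nvidia-smi and nvcc output quoted in the question.
NVIDIA_SMI_LINE = "| NVIDIA-SMI 430.14 Driver Version: 430.14 CUDA Version: 10.2 |"
NVCC_LINE = "Cuda compilation tools, release 10.0, V10.0.130"


def parse_version(pattern, text):
    """Return (major, minor) captured by `pattern`, or None if it does not match."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Highest CUDA version supported by the driver, as reported by nvidia-smi
driver_max_cuda = parse_version(r"CUDA Version:\s*(\d+)\.(\d+)", NVIDIA_SMI_LINE)
# Version of the locally installed CUDA toolkit, as reported by nvcc
toolkit_cuda = parse_version(r"release\s+(\d+)\.(\d+)", NVCC_LINE)

# Tuples compare element by element, so (10, 0) <= (10, 2) is True.
driver_supports_toolkit = toolkit_cuda <= driver_max_cuda
print(driver_max_cuda, toolkit_cuda, driver_supports_toolkit)
# prints: (10, 2) (10, 0) True
```

By this check alone, the driver (up to CUDA 10.2) is new enough for the 10.0 toolkit, so the mismatch between the two numbers is not in itself an incompatibility.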
【Comments】:
-
Have you tried this: github.com/tensorflow/tensorflow/issues/… ? Or just rebooting?
-
What does your question have to do with tensorflow? You only show PyTorch code.
-
@BramVanroy You're right. I wanted to upload some tensorflow code as well, but then I realized it wasn't necessary.
-
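@DSDS Thanks for your answer! I didn't try it; I gave up on this problem and switched to cloud computing with Google Colab, so I think I'll delete this question. Thanks, everyone!
Tags: python tensorflow pytorch nvidia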