【Question Title】: Release GPU memory after process killed
【Posted】: 2020-07-20 08:13:12
【Question Description】:

I am trying to run TensorFlow 1.10 code using PyCharm and a Jupyter shell.

When I restart the kernel after running some code, I get the following error.

WARNING:root:kernel 7ee39326-4723-4562-a82e-d651dc4710d7 restarted
Traceback (most recent call last):
  File "/home/jho/anaconda3/envs/mask/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/jho/anaconda3/envs/mask/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/jho/anaconda3/envs/mask/lib/python3.6/site-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/home/jho/anaconda3/envs/mask/lib/python3.6/site-packages/traitlets/config/application.py", line 663, in launch_instance
    app.initialize(argv)
  File "<decorator-gen-124>", line 2, in initialize
  File "/home/jho/anaconda3/envs/mask/lib/python3.6/site-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/home/jho/anaconda3/envs/mask/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 547, in initialize
    self.init_sockets()
  File "/home/jho/anaconda3/envs/mask/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 266, in init_sockets
    self.shell_port = self._bind_socket(self.shell_socket, self.shell_port)
  File "/home/jho/anaconda3/envs/mask/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 213, in _bind_socket
    return self._try_bind_socket(s, port)
  File "/home/jho/anaconda3/envs/mask/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 189, in _try_bind_socket
    s.bind("tcp://%s:%i" % (self.ip, port))
  File "zmq/backend/cython/socket.pyx", line 550, in zmq.backend.cython.socket.Socket.bind
  File "zmq/backend/cython/checkrc.pxd", line 25, in zmq.backend.cython.checkrc._check_rc
zmq.error.ZMQError: Address already in use
[W 09:57:18.673 NotebookApp] KernelRestarter: restart failed
[W 09:57:18.674 NotebookApp] Kernel 7ee39326-4723-4562-a82e-d651dc4710d7 died, removing from map.
ERROR:root:kernel 7ee39326-4723-4562-a82e-d651dc4710d7 restarted failed!

I think this is caused by ImageZMQ not being restarted. I then wanted to run other code, but my GPU memory has not been released.

Here is my nvidia-smi output.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00    Driver Version: 440.64.00    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2070    On   | 00000000:01:00.0  On |                  N/A |
| 30%   34C    P2    33W / 225W |   7953MiB /  7979MiB |      7%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1169      G   /usr/lib/xorg/Xorg                            18MiB |
|    0      1208      G   /usr/bin/gnome-shell                          50MiB |
|    0      1537      G   /usr/lib/xorg/Xorg                            91MiB |
|    0      1670      G   /usr/bin/gnome-shell                          45MiB |
|    0      1994      G   gnome-control-center                           2MiB |
|    0      2570      G   ...p/pycharm-professional/192/jbr/bin/java    70MiB |
+-----------------------------------------------------------------------------+

The Python process was killed, but the GPU's global memory has not been released.

Is there a way to release it?

【Question Discussion】:

    Tags: tensorflow cuda gpu jupyter nvidia


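    【Solution 1】:

    If you are using Keras, call "K.clear_session()" after importing the library (K being the Keras backend module); this clears the current Keras session and everything it holds in memory.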
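
    The suggestion above could look roughly like the sketch below. The helper name clear_keras_session is hypothetical, and the two import paths are assumptions about how Keras is installed (bundled with TensorFlow, or as the standalone package). It has to be called inside the Python process that still holds the session, so it cannot act on a process that has already exited.

```python
def clear_keras_session():
    """Call K.clear_session() if a Keras backend can be imported.

    Returns True when the session was cleared, False when no Keras
    backend is available in this environment.
    """
    try:
        # Assumption: Keras bundled with TensorFlow.
        from tensorflow.keras import backend as K
    except ImportError:
        try:
            # Assumption: standalone Keras package.
            from keras import backend as K
        except ImportError:
            return False
    # Reset the global Keras state, as suggested in the answer.
    K.clear_session()
    return True


if __name__ == "__main__":
    print("session cleared:", clear_keras_session())
```

    Running nvidia-smi before and after the call is one way to check whether the memory usage actually changed on your setup.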

    【Discussion】:
