【Question title】: Run TensorFlow object detection with Nvidia GTX 650 Ti (2GB)?
【Posted】: 2017-09-13 20:11:10
【Question】:

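Is there any way to run object detection on a 2GB graphics card? The motherboard has 24GB of DDR3 RAM; can't the GPU use that as well?

I did try adding session_config.gpu_options.allow_growth=True in trainer.py, but that did not help. It looks like the graphics card does not have enough memory.

Card info: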

0, name: GeForce GTX 650, pci bus id: 0000:01:00.0)
[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 4876955943962853047
, name: "/gpu:0"
device_type: "GPU"
memory_limit: 1375862784
locality {
  bus_id: 1
}
incarnation: 4236842880144430162
physical_device_desc: "device: 0, name: GeForce GTX 650, pci bus id: 0000:01:00.0"
]

train.py output:

Limit:                   219414528
InUse:                   192361216
MaxInUse:                192483072
NumAllocs:                    6030
MaxAllocSize:              6131712

2017-09-13 13:47:13.429510: W tensorflow/core/common_runtime/bfc_allocator.cc:277] ****************************************************************************************____________
2017-09-13 13:47:13.481829: W tensorflow/core/framework/op_kernel.cc:1192] Internal: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_5471 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_5476_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InternalError'>, Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_5471 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_5476_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
2017-09-13 13:47:13.955327: W tensorflow/core/framework/op_kernel.cc:1192] Internal: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_299 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_3432_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
2017-09-13 13:47:13.956056: W tensorflow/core/framework/op_kernel.cc:1192] Internal: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_299 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_3432_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1327, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1306, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_5471 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_5476_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 198, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train.py", line 194, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/home/dee/Documents/projects/tensor/models/object_detection/trainer.py", line 297, in train
    saver=saver)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 755, in train
    sess, train_op, global_step, train_step_kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 488, in train_step
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
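
To put the numbers in the output above into perspective, here is a minimal sketch that converts the allocator statistics and the GPU `memory_limit` from the device listing into MiB. The helper name `parse_allocator_stats` is hypothetical and only used for illustration; the figures are copied from the logs in the question, and the sketch does not explain why the two limits differ.

```python
import re

# Allocator statistics copied verbatim from the train.py output in the question.
STATS_TEXT = """
Limit:                   219414528
InUse:                   192361216
MaxInUse:                192483072
NumAllocs:                    6030
MaxAllocSize:              6131712
"""

# memory_limit reported for /gpu:0 in the device listing in the question.
DEVICE_MEMORY_LIMIT = 1375862784

MIB = 1024 * 1024


def parse_allocator_stats(text):
    """Hypothetical helper: map each 'Name: value' line to an integer."""
    return {name: int(value) for name, value in re.findall(r"(\w+):\s+(\d+)", text)}


stats = parse_allocator_stats(STATS_TEXT)
limit_mib = stats["Limit"] / MIB
in_use_mib = stats["InUse"] / MIB
used_fraction = stats["InUse"] / stats["Limit"]
device_limit_mib = DEVICE_MEMORY_LIMIT / MIB

print("Allocator limit:  {:.2f} MiB".format(limit_mib))
print("Allocator in use: {:.2f} MiB ({:.1%} of limit)".format(in_use_mib, used_fraction))
print("Device limit:     {:.2f} MiB".format(device_limit_mib))
```

At the moment of the failure the allocator was already using most of its limit, and that limit is itself well below the `memory_limit` reported for the device.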

【Comments】:

    Tags: object-detection tensorflow-gpu low-memory


    【Solution 1】:

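    Indeed, the Dst tensor is not initialized message indicates that your GPU is running out of memory. You can try reducing the batch size to the minimum and also lowering the resolution of the images you feed to the model. Also try the SSD Mobilenet model, since it is very lightweight.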
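
As a rough illustration of why batch size and image resolution matter, the sketch below estimates the size of a single float32 input batch and compares it with the allocator `Limit` from the question's log. The batch sizes and resolutions are illustrative assumptions, not values from the question, and `input_batch_bytes` is a hypothetical helper. It counts only the input tensor; weights, activations and gradients usually take far more memory, so treat the result as a lower bound.

```python
# 'Limit' from the allocator statistics in the question's train.py output.
ALLOCATOR_LIMIT = 219414528


def input_batch_bytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Hypothetical helper: bytes needed for one dense input batch.

    bytes_per_value=4 assumes float32, matching tensor_type=DT_FLOAT in the log.
    """
    return batch_size * height * width * channels * bytes_per_value


# Illustrative (batch_size, square image side) combinations; assumptions only.
settings = [(24, 600), (8, 600), (8, 300), (1, 300)]

for batch_size, side in settings:
    size = input_batch_bytes(batch_size, side, side)
    share = size / ALLOCATOR_LIMIT
    print("batch={:>2} image={}x{}: {:>11,} bytes ({:.1%} of allocator limit)".format(
        batch_size, side, side, size, share))
```

Memory for the input scales linearly with batch size and quadratically with the image side, so halving the resolution cuts the input batch to a quarter of its size.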

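    To answer the second part of the question: I had always assumed that modern GPUs would fall back to a hybrid mode, in which the driver/GPU starts streaming resources from system RAM over the PCIe bus to make up for the "missing" VRAM. Since system RAM is 3-5 times slower than GDDR5 and has higher latency, running out of VRAM would translate into a significant performance loss. However, I ran into the same problem on a GTX 1060 with 6GB of VRAM, where the CUDA process crashed because the GPU ran out of memory.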
