TensorFlow Object Detection API training issues: answers

【Question title】: TensorFlow Object Detection API training issues
【Posted】: 2017-11-27 17:42:20
【Question description】:

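I am training on Paperspace and have run into a problem I have not seen before. I have used the same machine previously without any issues. Training does not seem to start at all. I have already reduced the batch size to 10 (the default is 24).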

Has anyone else run into this?

This is the output I get when running train.py in models/research/object_detection. It has been running for about an hour.

WARNING:tensorflow:From /home/paperspace/Documents/models/research/object_detection/trainer.py:210: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
2017-11-27 12:08:46.994554: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2
2017-11-27 12:08:47.109823: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-27 12:08:47.110204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: Quadro P4000 major: 6 minor: 1 memoryClockRate(GHz): 1.48
pciBusID: 0000:00:05.0
totalMemory: 7.92GiB freeMemory: 7.60GiB
2017-11-27 12:08:47.110230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Quadro P4000, pci bus id: 0000:00:05.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from ssd_mobilenet_v1_coco_11_06_2017/model.ckpt
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Saving checkpoint to path training/model.ckpt

【Question comments】:

Tags: python-3.x tensorflow

【Solution 1】:

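I think the TFRecord files were never generated. Check in the research folder whether generatetf.record actually produced both the train and test record files. If it did not, generate them first. Then delete everything from the training folder except the model (faster_rcnn) and the label.pbtxt file, and start training again!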
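
The check suggested above can be sketched in plain Python, without importing TensorFlow. This is only a rough sketch: the paths `data/train.record` and `data/test.record` are hypothetical placeholders, not taken from the question, and the helper names are made up for illustration. It relies on the TFRecord on-disk framing (an 8-byte little-endian length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload) and skips the CRCs rather than verifying them, so it confirms that records are present, not that they are valid.

```python
import os
import struct


def count_tfrecord_records(path):
    """Count records in a TFRecord file by walking its length-prefixed framing.

    The CRC fields are skipped, not verified.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError("truncated length header in %s" % path)
            (length,) = struct.unpack("<Q", header)
            # Skip: 4-byte CRC of the length, payload, 4-byte CRC of the payload.
            body = f.read(4 + length + 4)
            if len(body) < length + 8:
                raise ValueError("truncated record in %s" % path)
            count += 1
    return count


def check_records(paths):
    """Map each path to its record count, or None if the file is missing."""
    report = {}
    for path in paths:
        if os.path.isfile(path):
            report[path] = count_tfrecord_records(path)
        else:
            report[path] = None
    return report


if __name__ == "__main__":
    # Hypothetical locations; use the input paths from your own pipeline config.
    for path, n in check_records(["data/train.record", "data/test.record"]).items():
        if n is None:
            print("%s: missing" % path)
        elif n == 0:
            print("%s: present but empty" % path)
        else:
            print("%s: %d records" % (path, n))
```

If either file is reported as missing or empty, regenerating the records before restarting training, as the answer suggests, is a reasonable first step.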

【Discussion】: