Overcoming "GraphDef cannot be larger than 2GB" in TensorFlow: answers

【Question title】: Overcome "GraphDef cannot be larger than 2GB" in TensorFlow
【Posted】: 2016-04-01 05:56:48
【Question】:

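I am using TensorFlow's ImageNet-trained model to extract the features of the last pooling layer as representation vectors for a new image dataset.

As shipped, the model predicts on a single new image like this: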

python classify_image.py --image_file new_image.jpeg

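I edited the main function so that it takes a folder of images, returns the predictions for all of them in one run, and writes the feature vectors to a CSV file. Here is how I did it: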

def main(_):
  maybe_download_and_extract()
  #image = (FLAGS.image_file if FLAGS.image_file else
  #         os.path.join(FLAGS.model_dir, 'cropped_panda.jpg'))
  #edit to take a directory of image files instead of a one file
  if FLAGS.data_folder:
    images_folder=FLAGS.data_folder
    list_of_images = os.listdir(images_folder)
  else: 
    raise ValueError("Please specify image folder")

  with open("feature_data.csv", "wb") as f:
    feature_writer = csv.writer(f, delimiter='|')

    for image in list_of_images:
      print(image) 
      current_features = run_inference_on_image(images_folder+"/"+image)
      feature_writer.writerow([image]+current_features)

It ran fine for about 21 images, but then crashed with the following error:

  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1912, in as_graph_def
    raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.

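I thought that calling run_inference_on_image(images_folder+"/"+image) would overwrite the previous image's data so that only the new image is considered, but that does not seem to be the case. How can I resolve this?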

【Comments】:

Tags: python tensorflow

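【Answer 1】:

The problem here is that each call to run_inference_on_image() adds nodes to the same graph, which eventually exceeds the maximum size. There are at least two ways to fix this:

The easy but slow way is to use a different default graph for each call to run_inference_on_image():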

for image in list_of_images:
  # ...
  with tf.Graph().as_default():
    current_features = run_inference_on_image(images_folder+"/"+image)
  # ...

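The more involved but more efficient way is to modify run_inference_on_image() so that it runs on multiple images. Move your for loop so that it surrounds the sess.run() call. You will then no longer need to rebuild the entire model on every call, which should make processing each image much faster.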
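The second approach could be sketched roughly as follows. This is a minimal illustration, not the original script: the helper names list_images and extract_features are hypothetical, the feed name 'DecodeJpeg/contents:0' is taken from the thread, and 'pool_3:0' is assumed to be the name of the last pooling tensor. The helper accepts any object with a run(fetches, feed_dict) method, so the caller builds the graph once and only sess.run() sits inside the loop.

```python
import os


def list_images(folder):
    """Return sorted full paths of the entries in `folder`.

    Assumes the folder contains only image files.
    """
    return [os.path.join(folder, name) for name in sorted(os.listdir(folder))]


def extract_features(sess, output_tensor, image_paths,
                     feed_name='DecodeJpeg/contents:0'):
    """Yield (path, features) pairs, reusing one session and one graph.

    `sess` is anything with a run(fetches, feed_dict) method, for example
    a tf.Session whose graph was already built by create_graph().
    """
    for path in image_paths:
        with open(path, 'rb') as f:
            image_data = f.read()
        # Only sess.run() is inside the loop, so no nodes are added to the
        # graph per image and the GraphDef does not grow.
        yield path, sess.run(output_tensor, {feed_name: image_data})


# Assumed usage inside main() of classify_image.py (TF 1.x API):
#
#   create_graph()                        # build the graph once
#   with tf.Session() as sess:
#       pool_3 = sess.graph.get_tensor_by_name('pool_3:0')
#       paths = list_images(FLAGS.data_folder)
#       for path, feats in extract_features(sess, pool_3, paths):
#           feature_writer.writerow([path] + list(np.squeeze(feats)))
```

Keeping graph construction outside the helper makes it clear that the model is loaded exactly once, regardless of how many images are processed.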

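【Comments】:

I went with the second option, and it is much faster. Thanks for the idea!
One question: is there a way to pass a batch of images instead of a single image in the prediction step, predictions = sess.run(pool_3_tensor, {'DecodeJpeg/contents:0': image_data})?
I think that particular feed point only works with a single image. It would be possible to change the graph to take a batch of images, but that would require creating a prefetching thread (using e.g. tf.train.batch()) to combine the images into a batch (they would all need to be the same size), and then feeding the batch into the network at a slightly later point. You would have to use the input_map argument to tf.import_graph_def() to change the tensor that is used as input. Since the structure of that particular graph is not documented, this could be challenging...
Hi, could you explain what you mean by surrounding the sess.run call? I have tried many different variations, but I still get the same error. Thanks! :)
I meant "call sess.run() inside a for loop". You will have to restructure things a bit so that you have a list of images to loop over, but the change should not be too hard.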

【Answer 2】:

You can move create_graph() to somewhere before the loop for image in list_of_images: (the loop over the files).

What this does is run inference multiple times on the same graph.

【Comments】:

For clarity, could you give an example? Thanks.

【Answer 3】:

The simplest way is to call create_graph() at the very start of the main function. That way, the graph is created only once.

【Comments】:

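【Answer 4】:

The cause of this kind of error has already been explained well. I ran into the same error while using the tf dataset API, and learned that the data gets appended to the existing graph as it is iterated over in the session. So what I did was call tf.reset_default_graph() before the dataset iterator to make sure the previous graph was cleared.

Hope this helps in this situation.

【Comments】: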