【Posted】: 2017-04-27 15:33:38
【Question】:
I'm interested in modifying the TensorFlow implementation of Show and Tell, in particular this v0.12 snapshot, so that it accepts an image as a numpy array instead of reading it from disk.
Loading a file by name with the upstream code produces a Python string after
with tf.gfile.GFile(filename, "r") as f:
    image = f.read()
in run_inference.py, which then becomes an ndarray with no shape. However, I cannot replicate that.
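To illustrate what "an ndarray with no shape" means here, the following minimal sketch uses only numpy. The byte string is a made-up stand-in for the result of f.read(), not real JPEG data, and the zero array is a placeholder for a decoded image.

```python
import numpy as np

# Stand-in for the bytes returned by f.read(); a real file would hold JPEG data.
encoded = b"\xff\xd8\xff\xe0 fake jpeg payload"

# A single byte string converts to a zero-dimensional array: one scalar element.
as_array = np.asarray(encoded)
print(as_array.shape)
print(as_array.ndim)

# A decoded image, by contrast, is a 3-D array of pixel values.
pixels = np.zeros((960, 640, 3), dtype=np.float32)
print(pixels.shape)
```

The string as a whole is a single scalar value, so its shape is (), whereas a decoded image carries its height, width and channel dimensions.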
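I tried the following approaches:
Loading the numpy array directly
I wrote this function to load a Pillow image from a filename, convert it to a numpy array, and pass it to the beam_search function in run_inference.py: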
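def load_image(filename):
    # Assumed import: PILImage is not defined in the snippet as posted.
    from PIL import Image as PILImage
    from keras.preprocessing.image import img_to_array
    # Returns a float32 array of shape (height, width, channels).
    arr = img_to_array(PILImage.open(filename))
    return arr
...
captions = generator.beam_search(sess, image)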
In this case a shape mismatch shows up later, with the following stack trace:
Traceback (most recent call last):
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 107, in <module>
    tf.app.run()
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 97, in main
    captions = generator.beam_search(sess, image)
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/caption_generator.py", line 142, in beam_search
    initial_state = self.model.feed_image(sess, encoded_image)
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_wrapper.py", line 41, in feed_image
    feed_dict={"image_feed:0": encoded_image})
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 943, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (960, 640, 3) for Tensor u'image_feed:0', which has shape '()'
Process finished with exit code 1
Can I somehow trick numpy into thinking the array has no shape?
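One possible route, sketched under the assumption that image_feed:0 expects the encoded bytes of an image file (as the upstream f.read() call suggests), is to re-encode the array as JPEG in memory rather than change its shape. The helper name array_to_jpeg_bytes is hypothetical, and the zero array stands in for the output of load_image.

```python
import io

import numpy as np
from PIL import Image


def array_to_jpeg_bytes(arr):
    """Hypothetical helper: encode an HxWx3 array as JPEG bytes in memory."""
    # JPEG needs 8-bit channels; img_to_array yields float32 values in [0, 255].
    arr_u8 = np.clip(arr, 0, 255).astype(np.uint8)
    buf = io.BytesIO()
    Image.fromarray(arr_u8).save(buf, format="JPEG")
    return buf.getvalue()


# Placeholder for the array returned by load_image(filename).
arr = np.zeros((960, 640, 3), dtype=np.float32)
encoded = array_to_jpeg_bytes(arr)

# The encoded bytes are a single scalar value, like the result of f.read().
print(np.asarray(encoded).shape)

# Assumed usage, not executed here because it needs the im2txt model:
# captions = generator.beam_search(sess, encoded)
```

Note that this re-encoding is lossy, so the pixels the model sees may differ slightly from the original array.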
Converting to tf.string
Here I used the following function:
def encode_image(filename):
    g2 = tf.Graph()
    from keras.preprocessing.image import img_to_array
    with g2.as_default() as g:
        with g.name_scope("g2") as g2_scope:
            arr = img_to_array(PILImage.open(filename))
            image = tf.image.encode_jpeg(arr)
    return image
...
captions = generator.beam_search(sess, image)
This does not work either:
Traceback (most recent call last):
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 107, in <module>
    tf.app.run()
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/run_inference.py", line 97, in main
    captions = generator.beam_search(sess, image)
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_utils/caption_generator.py", line 142, in beam_search
    initial_state = self.model.feed_image(sess, encoded_image)
  File "/home/pmelissi/repos/tensorflow-models/im2txt/bazel-bin/im2txt/run_inference.runfiles/im2txt/im2txt/inference_wrapper.py", line 41, in feed_image
    feed_dict={"image_feed:0": encoded_image})
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/home/pmelissi/miniconda2/envs/im2txt/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 924, in _run
    raise TypeError('The value of a feed cannot be a tf.Tensor object. '
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.
The last line of this stack trace looks helpful, but I could not find any documentation on what structure is expected:
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.
So what should a valid input look like? The internals of the preprocessing are not particularly clear to me.
Thank you for your time!
Edit: I attached a gist of the modified inference script for the big picture.
Edit 2: The call path to sess.run is as follows:
1: run_inference.py
captions = generator.beam_search(sess, image)
2: caption_generator.py
def beam_search(self, sess, encoded_image):
    initial_state = self.model.feed_image(sess, encoded_image)
3: inference_wrapper.py
def feed_image(self, sess, encoded_image):
    initial_state = sess.run(fetches="lstm/initial_state:0",
                             feed_dict={"image_feed:0": encoded_image})
    return initial_state
Edit 3: I forgot to mention that I am restricted to TensorFlow v0.12, which is why I am using this snapshot of the im2txt repo.
【Comments】:
-
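Feeding it as a numpy array is correct. It looks like the array size is not set properly in the model graph (or it was set before, and your changes no longer perform that step). How are you setting up the graph? What does the code look like when you call sess.run(...)? It looks like tensorflow just doesn't know the expected dimensions.
-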
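I've uploaded a gist with the updated code. Lines 93-96 are the only ones that change the behavior. When I keep 93 and 94 (the original code) and comment out 95 and 96, the code works, but not in any other case. The thing is that normally both np_val.shape and subfeed_t.get_shape() are (). Thanks!
Tags: python arrays numpy tensorflow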