TensorFlow - Setting the shape of an image while reading and writing a TFRecords file?

【Question title】: TensorFlow - Setting the shape of an image while reading and writing a TFRecords file?
【Posted】: 2016-03-13 19:36:31
【Question】:

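While trying to use the TFRecords format, I ran into a problem setting the shape of my image data. I have been following the how-to for reading data, and I took the code for converting the image data to a TFRecords file and for reading the data back from the TFRecords file from the MNIST example. However, that example code expects each image as one long vector holding all of its pixel data.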

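I have been trying to change this code to work with NumPy arrays that are still in the original image shape. In my code below, images is a NumPy array of shape [number_of_images, height, width, channels]. I am not sure whether the problem lies in how I write the data to the TFRecords file or in how I read it back. Either way, when I try to set the shape of the decoded image, I get the error ValueError: Shapes (?,) and (464, 624, 3) must have the same rank (note: 464 x 624 x 3 is the image size). Any suggestions about what I might be doing wrong?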

Relevant code (slightly modified from the example code):

def convert_to_tfrecord(images, labels, name, data_directory):
    number_of_examples = labels.shape[0]
    rows = images.shape[1]  # images is the 4D ndarray with the images in their original shape.
    cols = images.shape[2]
    depth = images.shape[3]
    ...
    for index in range(number_of_examples):
        image_raw = images[index].tostring()
        example = tf.train.Example(features=tf.train.Features(feature={
            'height': _int64_feature(rows),
            'width': _int64_feature(cols),
            'channels': _int64_feature(depth),
            'image_raw': _bytes_feature(image_raw),  # same key that read_and_decode parses
            ...
        }))
        writer.write(example.SerializeToString())

...

def read_and_decode(filename_queue):
    ...
    features = tf.parse_single_example(
        serialized_example,
        features={
            'image_raw': tf.FixedLenFeature([], tf.string),
            ...
        })
    ...
    image = tf.decode_raw(features['image_raw'], tf.uint8)
    image.set_shape([464, 624, 3])  # This is where the error occurs.
    image = tf.cast(image, tf.float32) * (1. / 255) - 0.5
    ...

【Question comments】:

Tags: python numpy tensorflow

【Solution 1】:

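Note that set_shape does not change the shape of the underlying buffer. It only sets a graph-level annotation describing the set of possible shapes that can be seen at this tensor.

To change the actual shape, you need to use tf.reshape.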
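The error in the question follows from this: tf.decode_raw returns a rank-1 tensor of unknown length (shape (?,)), and set_shape can only refine a shape of the same rank, so it rejects [464, 624, 3]. Below is a minimal sketch of the tf.reshape fix. It assumes TensorFlow 2.x with eager execution, where the op is spelled tf.io.decode_raw, and it uses a small made-up 4 x 6 x 3 image in place of the 464 x 624 x 3 images from the question.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for one image; the question's images are 464 x 624 x 3.
height, width, channels = 4, 6, 3
original = np.arange(height * width * channels, dtype=np.uint8).reshape(
    height, width, channels)

# Serializing to bytes keeps only the pixel values, not the shape.
# (tobytes() is the current NumPy name for the tostring() used in the question.)
image_raw = original.tobytes()

# decode_raw always yields a flat, rank-1 tensor.
flat = tf.io.decode_raw(image_raw, tf.uint8)

# tf.reshape actually rearranges the data into the desired rank-3 shape.
image = tf.reshape(flat, [height, width, channels])

# Same normalization as in the question's read_and_decode.
image = tf.cast(image, tf.float32) * (1. / 255) - 0.5
```

In graph-mode code such as the question's, the same tf.reshape call would stand in for the image.set_shape([464, 624, 3]) line.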

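【Comments】:

So the image data gets flattened when it is put into a bytes feature? In other words, the bytes feature keeps no shape information, which is why I have to reshape the image?
Yes, the shape is lost. Whoever encodes the data should add the shape information as extra features, for example image/height and image/width, or image_height and image_width.
@YaroslavBulatov, I did encode the height and width in the TFRecords file, but when I use height = features['height'], height is a tensor and cannot be passed to set_shape(). How do I get the value of height correctly?
set_shape is meant for encoding a static shape into the Graph, i.e. something that does not change between session.run calls. If you know that all your images have the same shape, you can get the first one's shape with actual_height = sess.run(height) and then pass actual_height to set_shape.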
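An alternative to fetching the height with sess.run, not proposed in the thread itself, is to pass the parsed tensors straight to tf.reshape, which accepts a tensor as its shape argument. The sketch below assumes TensorFlow 2.x with eager execution (tf.io.parse_single_example, tf.io.FixedLenFeature, tf.io.decode_raw). The _int64_feature and _bytes_feature helpers are written out here on the assumption that they match the ones in the MNIST example, and the 4 x 6 x 3 image is made up for illustration.

```python
import numpy as np
import tensorflow as tf


def _int64_feature(value):
    # Assumed to match the helper of the same name in the MNIST example.
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def _bytes_feature(value):
    # Assumed to match the helper of the same name in the MNIST example.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


# Hypothetical small image standing in for the real data.
original = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)

# Write side: store the shape next to the raw bytes.
example = tf.train.Example(features=tf.train.Features(feature={
    'height': _int64_feature(original.shape[0]),
    'width': _int64_feature(original.shape[1]),
    'channels': _int64_feature(original.shape[2]),
    'image_raw': _bytes_feature(original.tobytes()),
}))
serialized = example.SerializeToString()

# Read side: parse the shape features together with the image bytes.
features = tf.io.parse_single_example(serialized, features={
    'height': tf.io.FixedLenFeature([], tf.int64),
    'width': tf.io.FixedLenFeature([], tf.int64),
    'channels': tf.io.FixedLenFeature([], tf.int64),
    'image_raw': tf.io.FixedLenFeature([], tf.string),
})
flat = tf.io.decode_raw(features['image_raw'], tf.uint8)

# Build the target shape from tensors; no Python integers are needed.
shape = tf.stack([features['height'], features['width'], features['channels']])
image = tf.reshape(flat, shape)
```

In graph mode the static shape of image stays unknown after such a reshape, so code that needs a fixed shape at graph-construction time (batching, for instance) would still need known dimensions, as the last comment describes.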