Detecting corrupt images in TensorFlow: answers

【Question title】: Detecting corrupt images in Tensorflow
【Posted】: 2021-10-13 03:18:53
【Question】:

I'm having trouble locating some problematic images in a dataset.

My model starts training, but I get the following error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid PNG data, size 135347
         [[{{node case/cond/cond_jpeg/decode_image/cond_jpeg/cond_png/DecodePng}} = DecodePng[channels=3, dtype=DT_UINT8, _device="/device:CPU:0"](case/cond/cond_jpeg/decode_image/cond_jpeg/cond_png/cond_gif/DecodeGif/Switch:1, ^case/Assert/AssertGuard/Merge)]]
         [[node IteratorGetNext (defined at object_detection/model_main.py:105)  = IteratorGetNext[output_shapes=[[24], [24,300,300,3], [24,2], [24,3], [24,100], [24,100,4], [24,100,2], [24,100,2], [24,100], [24,100], [24,100], [24]], output_types=[DT_INT32, DT_FLOAT, DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_BOOL, DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"](IteratorV2)]]

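So I wrote a small script that runs before I generate the TFRecords, to try to catch any problem images. It is basically the tutorial code, but with a batch size of 1. This was the simplest way I could think of to catch the error.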

def preprocess_image(image):
    image = tf.image.decode_png(image, channels=3)
    image = tf.image.resize_images(image, [192, 192])
    image /= 255.0  # normalize to [0,1] range

    return image

def load_and_preprocess_image(path):
    image = tf.read_file(path)
    return preprocess_image(image)

mobile_net = tf.keras.applications.MobileNetV2(input_shape=(192, 192, 3), include_top=False)
mobile_net.trainable=False

path_ds = tf.data.Dataset.from_tensor_slices(images)

image_ds = path_ds.map(load_and_preprocess_image, num_parallel_calls=4)

def change_range(image):
    return (2*image-1)

keras_ds = image_ds.map(change_range)
keras_ds = keras_ds.batch(1)

for i, batch in tqdm(enumerate(iter(keras_ds))):
    try:
        feature_map_batch = mobile_net(batch)
    except KeyboardInterrupt:
        break
    except:
        print(images[i])

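This duly crashes, but the exception is not handled: it is simply thrown and the script dies. So, two questions:

Is there a way to force the exception to be handled properly? It seems not, according to "Tensorflow, try and except doesn't handle exception".
Is there a better way to look for corrupt inputs?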
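
A possible reason the try/except in the loop above never fires: the traceback names IteratorGetNext, which suggests the decode error is raised when the next element is pulled from the dataset, and the for statement does that outside the try block. The sketch below pulls each element inside the try instead. The helper name enumerate_skipping_errors and its on_error callback are hypothetical, and the sketch assumes the iterator can keep producing elements after it raises; that is not verified here for every TensorFlow version, and it does not hold for a plain Python generator.

```python
def enumerate_skipping_errors(iterable, on_error):
    """Yield (index, item) pairs, reporting elements whose retrieval raises.

    Hypothetical helper: next() is called inside the try block, so an error
    raised while producing an element is caught instead of escaping the loop.
    """
    iterator = iter(iterable)
    index = 0
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            return
        except Exception as err:  # KeyboardInterrupt is not caught here
            on_error(index, err)
            index += 1
            continue
        yield index, item
        index += 1


# Possible use with the names from the question (assumed, not tested here):
# for i, batch in enumerate_skipping_errors(
#         keras_ds, lambda i, err: print(images[i], err)):
#     feature_map_batch = mobile_net(batch)
```

Because the index advances on failures as well as successes, it stays aligned with the images list as long as the dataset preserves input order.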

I isolated one image that fails, but OpenCV, SciPy, Matplotlib and Skimage all open it. For example, I tried this:

import scipy
images = images[1258:]
print(scipy.misc.imread(images[0]))

import matplotlib.pyplot as plt
print(plt.imread(images[0]))

import cv2
print(cv2.imread(images[0]))

import skimage
print(skimage.io.imread(images[0]))

... try to run inference in Tensorflow

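I get four matrices printed out. I assume these libraries all use libpng or something similar.

Image 1258 then crashes Tensorflow. Looking at the DecodePng source, it looks like the failure is actually inside the TF png library.

I realise I could probably write my own data loader, but that seems like a faff.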
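
Short of writing a full data loader, one cheap pre-check is to validate the PNG container itself: the 8-byte signature, the chunk layout, and each chunk's CRC. The sketch below uses only the standard library; the function name check_png_bytes is hypothetical. It does not reproduce TensorFlow's decoder and makes no claim about why the image in question was rejected; it only flags files that violate the container format, which lenient decoders may tolerate.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def check_png_bytes(data):
    """Return a list of structural problems found in PNG bytes (empty if none)."""
    if data[:8] != PNG_SIGNATURE:
        return ["missing PNG signature"]
    problems = []
    pos = 8
    seen_iend = False
    while pos < len(data):
        if pos + 8 > len(data):
            problems.append("truncated chunk header at offset %d" % pos)
            break
        # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4
        if end > len(data):
            problems.append("truncated %r chunk at offset %d" % (ctype, pos))
            break
        body = data[pos + 8:pos + 8 + length]
        (stored_crc,) = struct.unpack(">I", data[end - 4:end])
        # The CRC covers the chunk type and the chunk data, not the length.
        if zlib.crc32(ctype + body) & 0xFFFFFFFF != stored_crc:
            problems.append("CRC mismatch in %r chunk at offset %d" % (ctype, pos))
        pos = end
        if ctype == b"IEND":
            seen_iend = True
            break
    if seen_iend and pos < len(data):
        problems.append("%d trailing bytes after IEND" % (len(data) - pos))
    if not seen_iend and not problems:
        problems.append("no IEND chunk")
    return problems
```

A file that passes this check can still fail to decode, for example if the compressed pixel data is internally inconsistent, so an empty result only rules out container-level damage.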

Edit:

This also works as a snippet:

tf.enable_eager_execution()

for i, image in enumerate(images):
    try:
        with tf.gfile.GFile(image, 'rb') as fid:
            image_data = fid.read()

        image_tensor = tf.image.decode_png(
                        image_data,
                        channels=3,
                        name=None
                    )
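    except tf.errors.InvalidArgumentError as err:
        # image_tensor is unbound (or stale) when decoding fails, so report the path instead
        print("Failed: ", i, image, err)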

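【Comments】:

Have you tried decode_raw instead of decode_png?
That loads the file, but not as an image: it loads the raw bytes as a 1D tensor.
Would this work: img = tf.decode_raw('image_raw', tf.uint8) img = tf.reshape(img, img_shape)?
No, because PNG images are compressed (and there is a header as well). The shape of the array changes every time; img_shape is not known until the image has been loaded.
I'm pretty sure this should work, especially if you store the image shape explicitly when converting to records. example_features = {'height': tf.FixedLenFeature((), tf.int64) Also, I think you could try catching this with tf.Assert.

Tags: python tensorflow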

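【Solution 1】:

Open a new Python file and copy in the code below. Set the directory that holds your images, then run the code. If you have corrupt files, the message Corrupt JPEG data: premature end of data segment appears in the printed list, next to the affected file.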

import os
import cv2

# Placeholder: set this to your image directory, for example
# 'C:/tensorflow/models/research/object_detection/images/train'
yourDirectory = 'path/to/images'

for filename in os.listdir(yourDirectory):
    if filename.endswith(".jpg"):
        path = os.path.join(yourDirectory, filename)
        # Print the path first so any decoder warning shows up right after it
        print(path)
        cv2.imread(path)
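
Reading warnings off the console gets tedious for large directories. As a rough complement, the sketch below collects suspect files programmatically by checking that each JPEG starts with the start-of-image marker (FF D8) and ends with the end-of-image marker (FF D9). The function name find_truncated_jpegs is hypothetical, and this is only a heuristic: a truncated file usually lacks the final marker, but some valid files carry extra bytes after it, and damage in the middle of a file goes unnoticed.

```python
import os


def find_truncated_jpegs(directory):
    """Return paths of .jpg/.jpeg files that lack the JPEG start or end marker."""
    suspects = []
    for filename in sorted(os.listdir(directory)):
        if not filename.lower().endswith((".jpg", ".jpeg")):
            continue
        path = os.path.join(directory, filename)
        with open(path, "rb") as f:
            data = f.read()
        starts_ok = data[:2] == b"\xff\xd8"   # SOI marker
        ends_ok = data[-2:] == b"\xff\xd9"    # EOI marker
        if not (starts_ok and ends_ok):
            suspects.append(path)
    return suspects
```

Files reported here are worth opening with a full decoder before being removed from the dataset.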

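【Comments】:

【Solution 2】:

A rather late and unexpected self-answer to this question.

The problem turned out to be (most likely) bad RAM. After some strange things happened in Linux, such as the file system going read-only and random tabs crashing in Firefox, I decided to run Memtest. I have 2x8GB DIMMs installed. It turned out there was a bad block around the 4GB mark (on both sticks), which meant errors would only pop up (a) when the system was under fairly high load and (b) if utilisation went above about 8GB. I also checked for things like a bad hard disk, but it is a fairly new SSD. I had previously seen very sporadic, random reboots on Windows with the same system, but I assumed that was just Microsoft forcing updates.

So I'm posting this here for posterity. If you see strange things, such as images being corrupted in a non-repeatable way, running Memtest as a sanity check takes only a few minutes. Serious errors should show up within 30 seconds, and it is worth running overnight (multiple passes) to double check.

The solutions posted above are still useful, and I'm still not convinced about TF rolling their own PNG loader, but it is always worth checking your hardware!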
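
To help tell on-disk corruption apart from corruption that only appears in memory, one option is to record a checksum for every file while the dataset is known to be good and compare again after a failure. The sketch below does this with SHA-256 from the standard library; the function names are hypothetical. The interpretation is a heuristic, not a diagnosis: a changed checksum means the bytes on disk differ from the recorded ones, while matching checksums combined with intermittent decode failures make a hardware check such as Memtest more worthwhile.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(paths):
    """Map each path to its current checksum."""
    return {path: sha256_of(path) for path in paths}


def changed_files(manifest):
    """Return paths that are missing or whose checksum no longer matches."""
    return [
        path
        for path, recorded in sorted(manifest.items())
        if not os.path.exists(path) or sha256_of(path) != recorded
    ]
```

Note that re-reading a file shortly after writing it may be served from the operating system's cache, so a comparison made after a reboot is the more meaningful one.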

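【Comments】:

Thanks! I have this problem too: images get corrupted during training. However, if I look up the image afterwards, it is still corrupt on disk. Do you think it could still be the RAM?
Note that this is not a solution; it is a special case.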