Python conversion of PIL image to numpy array very slow

【Question title】: Python conversion of PIL image to numpy array very slow
【Posted】: 2019-02-26 10:28:51
【Question description】:

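I am evaluating a Tensorflow model on OpenCV video frames. I need to convert each incoming PIL image into a reshaped numpy array so that I can run inference on it. However, on my laptop (16 GiB of memory, 2.6 GHz Intel Core i7 processor), converting the PIL image to a numpy array takes more than 900 ms. I need to bring this down to a few milliseconds so that I can process multiple frames per second from my camera.

Can anyone suggest how to make the method below run faster?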

def load_image_into_numpy_array(pil_image):
    (im_width, im_height) = pil_image.size
    data = pil_image.getdata()

    data_array = np.array(data)

    return data_array.reshape((im_height, im_width, 3)).astype(np.uint8)

On further inspection, I realized that np.array(data) takes most of the time, more than 900 ms. So converting the image data to a numpy array is the real culprit.
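
The per-step measurement described above could be reproduced with something like the following sketch. It is not the asker's actual benchmark: the synthetic 1280x720 image and the `timed` helper are assumptions made for illustration, and the absolute numbers will vary by machine and library version.

```python
import time

import numpy as np
from PIL import Image

# Synthetic RGB frame standing in for a camera frame (assumption: 1280x720).
pil_image = Image.new("RGB", (1280, 720), color=(10, 20, 30))


def timed(label, fn):
    """Run fn once, print the elapsed time in milliseconds, return its result."""
    start = time.perf_counter()
    result = fn()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    print(f"{label}: {elapsed_ms:.1f} ms")
    return result


# Time each step of the original function separately.
data = timed("getdata", pil_image.getdata)
data_array = timed("np.array(data)", lambda: np.array(data))
frame = timed(
    "reshape + astype",
    lambda: data_array.reshape((720, 1280, 3)).astype(np.uint8),
)
```

The slow step builds the array by iterating over a Python sequence of per-pixel tuples, which is why it is expected to dominate the total.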

【Comments on the question】:

At which step do you get the PIL image?
I obtain the PIL image earlier and pass it to this function.
How big is the image?
(720, 1280, 3)

Tags: python numpy opencv tensorflow computer-vision

【Solution 1】:

You can let numpy handle the conversion instead of reshaping the data yourself.

import numpy as np

def pil_image_to_numpy_array(pil_image):
    return np.asarray(pil_image)

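You are converting the image to (height, width, channels) format. That is already the default conversion numpy.asarray performs on a PIL image, so no explicit reshape is needed.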
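
As a quick sanity check of that claim, the sketch below compares the manual getdata-and-reshape path with np.asarray on a small synthetic image. The image size, the colour, and the helper name `slow_conversion` are assumptions chosen for illustration.

```python
import numpy as np
from PIL import Image

# Small synthetic RGB image (assumption): width=64, height=48.
pil_image = Image.new("RGB", (64, 48), color=(255, 128, 0))


def slow_conversion(image):
    """Manual path from the question: flatten, then reshape to (H, W, 3)."""
    width, height = image.size
    flat = np.array(image.getdata())
    return flat.reshape((height, width, 3)).astype(np.uint8)


# Direct conversion: numpy reads the pixel buffer in one step.
fast = np.asarray(pil_image)

print(fast.shape, fast.dtype)
print(np.array_equal(fast, slow_conversion(pil_image)))
```

Depending on the Pillow and numpy versions, the array returned by np.asarray may be read-only; np.array(pil_image) makes a writable copy if the frame has to be modified in place.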

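【Comments】:

Thanks for the solution. The reshape is needed so that I get the image in the right shape for inference... see the load_image_into_numpy_array method in github.com/tensorflow/models/blob/master/research/…
I'm not sure why an explicit conversion is needed; you are converting the image to (height, width, channels) format. When you convert a PIL image to a numpy array with the default numpy method, you already get the array in (height, width, channels) order. Sorry if I'm missing something here.
Oh, OK... thanks for the information. I was just following what the tensorflow people do... By the way, np.asarray(pil_image) is super fast... it takes only 1 ms...
Interestingly, when I print pil_image.size I get (1280, 720), but when I print np.asarray(pil_image).shape I get (720, 1280, 3). I wonder whether I have the height and width swapped, which could lead to incorrect results.
PIL uses (width, height) order, see pillow.readthedocs.io/en/3.1.x/reference/Image.html#attributes, while numpy uses (height, width), as mentioned in stackoverflow.com/questions/43272848/….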
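
The difference in ordering discussed above can be seen in a minimal sketch like this one, assuming a blank 1280x720 RGB image as a stand-in for a real frame. Note that on a numpy array, `.shape` gives the dimensions while `.size` gives the total number of elements.

```python
import numpy as np
from PIL import Image

# Blank RGB image (assumption): 1280 pixels wide, 720 pixels high.
pil_image = Image.new("RGB", (1280, 720))

# PIL reports (width, height).
width, height = pil_image.size

# numpy reports (rows, columns, channels), i.e. (height, width, channels).
array = np.asarray(pil_image)
rows, cols, channels = array.shape

print("PIL size:", pil_image.size)
print("numpy shape:", array.shape)
print("numpy size (element count):", array.size)
```

Nothing is swapped in the pixel data itself; the two libraries simply list the same dimensions in a different order.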

【Solution 2】:

Thanks a lot! It runs very fast!

from io import BytesIO

import numpy as np
import tensorflow as tf
from PIL import Image

def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.

    Puts image into numpy array to feed into tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
        path: a file path (this can be local or on colossus)

    Returns:
        uint8 numpy array with shape (img_height, img_width, 3)
    """
    # Read the raw encoded bytes through TensorFlow's file API.
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    # Decode with PIL; np.array then copies the pixels into an (H, W, C) array.
    image = Image.open(BytesIO(img_data))

    return np.array(image)

An image of shape (3684, 4912, 3) takes 0.3 to 0.4 seconds.
