实时使用 cv2.CascadeClassifier() 时显示图像的延迟答案

【问题标题】：delay of shown images when using cv2.CascadeClassifier() in real-time实时使用 cv2.CascadeClassifier() 时显示图像的延迟
【发布时间】：2021-11-14 06:54:14
【问题描述】：

我正在为 Tello 无人机开发一个 ROS 项目，我使用 this 驱动程序。当我只是订阅来自无人机摄像头的CompressedImage 消息并在屏幕上显示图像时，我没有任何问题，一切正常。

但是，一旦我尝试对cv2.CascadeClassifier 使用面部检测，帧就会实时延迟大约 30 秒。因此，图像仅在大约 30 秒后显示在屏幕上。有谁知道如何最大限度地减少这种延迟以获得实时的良好结果？

这是目前为止的代码：

#!/usr/bin/env python

import rospy
from sensor_msgs.msg import CompressedImage
import av
import cv2
import numpy
import threading
import traceback


class StandaloneVideoStream(object):
    def __init__(self):
        self.cond = threading.Condition()
        self.queue = []
        self.closed = False

    def read(self, size):
        self.cond.acquire()
        try:
            if len(self.queue) == 0 and not self.closed:
                self.cond.wait(2.0)
            data = bytes()
            while 0 < len(self.queue) and len(data) + len(self.queue[0]) < size:
                data = data + self.queue[0]
                del self.queue[0]
        finally:
            self.cond.release()
        return data

    def seek(self, offset, whence):
        return -1

    def close(self):
        self.cond.acquire()
        self.queue = []
        self.closed = True
        self.cond.notifyAll()
        self.cond.release()

    def add_frame(self, buf):
        self.cond.acquire()
        self.queue.append(buf)
        self.cond.notifyAll()
        self.cond.release()


stream = StandaloneVideoStream()


def callback(msg):
    stream.add_frame(msg.data)


def main():
    rospy.init_node('face_detection')

    rospy.Subscriber('/tello/image_raw/h264', CompressedImage, callback)

    container = av.open(stream)

    for frame in container.decode(video=0):
        image_msg = cv2.cvtColor(numpy.array(frame.to_image()), cv2.COLOR_RGB2BGR) 

        stop_data = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
        found = stop_data.detectMultiScale(image_msg, minSize =(20, 20))

        amount_found = len(found)

        if amount_found != 0:
            for (x, y, width, height) in found:  
                cv2.rectangle(image_msg, (x, y), (x + height, y + width), (0, 255, 0), 5)

        cv2.imshow('Frame', image_msg)
        cv2.waitKey(1)


if __name__ == '__main__':
    try:
        main()
    except BaseException:
        traceback.print_exc()
    finally:
        stream.close()
        cv2.destroyAllWindows()

编辑：

当我打印出图像的形状 (image_msg) 时，我得到了 (720, 960, 3) 高度、宽度和 3 个通道的尺寸

这显示了流的大小（以字节为单位）

...
    Tello: 15:54:16.106:  Info: video data 599118 bytes 290.2KB/sec
    Tello: 15:54:18.106:  Info: video data 502212 bytes 245.2KB/sec
    Tello: 15:54:20.108:  Info: video data 503748 bytes 245.7KB/sec
    Tello: 15:54:22.109:  Info: video data 503182 bytes 245.6KB/sec
    Tello: 15:54:22.446:  Info: video recv: 1460 bytes 1b00 +103
    Tello: 15:54:22.813:  Info: video recv: 1460 bytes 2400 +173
    Tello: 15:54:23.190:  Info: video recv: 1460 bytes 2f00 +177
    Tello: 15:54:23.554:  Info: video recv: 1460 bytes 3a00 +178
    Tello: 15:54:23.918:  Info: video recv: 1460 bytes 4500 +176
    Tello: 15:54:24.268:  Info: video recv: 1460 bytes 5000 +160
    Tello: 15:54:24.268:  Info: video data 502157 bytes 227.1KB/sec
    Tello: 15:54:24.585:  Info: video recv: 1460 bytes 5c00 +140
    Tello: 15:54:24.917:  Info: video recv: 1460 bytes 6600 +142
    Tello: 15:54:25.266:  Info: video recv: 1460 bytes 7000 +157
    Tello: 15:54:25.545:  Info: video recv: 1460 bytes 7a00 +102
    Tello: 15:54:25.878:  Info: video recv: 1460 bytes 8201 +140
    Tello: 15:54:26.178:  Info: video recv: 1460 bytes 8d00 +102
    Tello: 15:54:26.271:  Info: video data 534194 bytes 260.5KB/sec
...

【问题讨论】：

进来的图片尺寸是多少？
@BTables 我更新了我的问题。感谢您的帮助！
你想在什么硬件上运行这个节点？
@BTables 我正在使用具有 16 GB RAM、AMD Ryzen 5 4600H、512 GB PCIe SSD 的笔记本电脑。感谢你的宝贵时间！昨天，我还尝试将上面的代码拆分为两个单独的节点（一个节点用于帧处理，一个节点用于图像处理）所以我额外编写了一个订阅者和发布者。我将帧处理节点上的图像从 numpy 数组转换为带有ros_numpy.msgify(Image, image_msg, encoding='rgb8') 的 ROS 图像，以将它们发布并返回到我拥有订阅者的图像处理节点。但这没有任何区别。有同样的延迟。
@cv576 在您的问题中也包含硬件详细信息。这将有助于澄清，没有人必须通过 cmets。

标签： python ros cv2 face-detection rospy

【解决方案1】：

如果此代码的循环时间为 30 秒，并且注释掉的“stop_data =”和“found =”行明显更快，那么这就是瓶颈。您有 3 个选项（按严重性）：1）更改参数，2）更改输入数据，3）更改算法。我假设您已尝试 (1) 更改参数并且您不想 (3) 更改算法，因此您唯一的选择是 (2) 更改输入数据。

尝试将图像缩小到 240x360 或 480x720 之类的分辨率。它应该更快。你可以使用cv2.pyrDown() (c++ doc)(Python ex)，这是一种高斯平滑下采样，它可以使图像比简单的pick-every-nth-pixel下采样更平滑。

【讨论】：