OpenCV, PyTorch, ValueError: tile cannot extend outside image

【Question title】: OpenCV, PyTorch, ValueError: tile cannot extend outside image
【Posted】: 2021-09-19 03:02:25
【Question description】:

I am getting this error: ValueError: tile cannot extend outside image

It happens while running inference with face-recognition software that checks whether you are wearing a COVID mask.

Here is the code:

    transformations = Compose([
        ToPILImage(),
        Resize((100, 100)),
        ToTensor(),
    ])
    

[...]

    for frame in vreader(str(videopath)):
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        faces = faceDetector.detect(frame)
        for face in faces:
            xStart, yStart, width, height = face
            
            # clamp coordinates that are outside of the image
            xStart, yStart = max(xStart, 0), max(yStart, 0)
            
            # predict mask label on extracted face
            faceImg = frame[yStart:yStart+height, xStart:xStart+width]
            output = model(transformations(faceImg).unsqueeze(0).to(device))
            _, predicted = torch.max(output.data, 1)
            
            # draw face frame
            cv2.rectangle(frame,
                          (xStart, yStart),
                          (xStart + width, yStart + height),
                          (126, 65, 64),
                          thickness=2)

The main problem stems from this snippet:

    output = model(transformations(faceImg).unsqueeze(0).to(device))

It may be the `detect` function in facedetector.py, a separate component that is only used to find the faces in the picture:

    def detect(self, image):
        """ detect faces in image
        """
        net = self.classifier
        height, width = image.shape[:2]
        blob = blobFromImage(resize(image, (300, 300)), 1.0,
                             (300, 300), (104.0, 177.0, 123.0))
        net.setInput(blob)
        detections = net.forward()
        faces = []
        for i in range(0, detections.shape[2]):
            confidence = detections[0, 0, i, 2]
            if confidence < self.confidenceThreshold:
                continue
            box = detections[0, 0, i, 3:7] * np.array([width, height, width, height])
            startX, startY, endX, endY = box.astype("int")
            faces.append(np.array([startX, startY, endX-startX, endY-startY]))
        return faces

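I am trying to run inference on a 1280x720 video. I don't know what is going wrong. Inference starts and, from what I can tell, the model works, but shortly afterwards it runs into the error...

What do you think?

Here is the full stack trace of the error: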

File "video.py", line 66, in tagVideo
output = model(transformations(faceImg).unsqueeze(0).to(device))

File "C:\Users\User\Ana\anaconda3\envs\Venv\lib\site-packages\torchvision\transforms\transforms.py", line 60, in __call__
img = t(img)
File "C:\Users\User\Ana\anaconda3\envs\Venv\lib\site-packages\torchvision\transforms\transforms.py", line 179, in __call__
return F.to_pil_image(pic, self.mode)
File "C:\Users\User\Ana\anaconda3\envs\Venv\lib\site-packages\torchvision\transforms\functional.py", line 292, in to_pil_image
return Image.fromarray(npimg, mode=mode)
File "C:\Users\User\Ana\anaconda3\envs\Venv\lib\site-packages\PIL\Image.py", line 2793, in fromarray
return frombuffer(mode, size, obj, "raw", rawmode, 0, 1)
File "C:\Users\User\Ana\anaconda3\envs\Venv\lib\site-packages\PIL\Image.py", line 2733, in frombuffer
return frombytes(mode, size, data, decoder_name, args)
File "C:\Users\User\Ana\anaconda3\envs\Venv\lib\site-packages\PIL\Image.py", line 2679, in frombytes
im.frombytes(data, decoder_name, args)
File "C:\Users\User\Ana\anaconda3\envs\Venv\lib\site-packages\PIL\Image.py", line 796, in frombytes
d.setimage(self.im)

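【Question comments】:

What is the shape of the tensor you are passing to the model? I suggest checking that. Also, in your transformations, why are you converting to a PIL Image? That could cause resizing problems.
@SarthakJain The shape of the tensor is (C x H x W). I am following a tutorial, as I am fairly new to TensorFlow. Do you have any suggestions for improving this? Thank you so much.
I found this answer, which looks like a helpful workaround: github.com/python-pillow/Pillow/issues/…. I also think that printing out the detections and checking how their values map onto the PIL image would help with debugging, in case you want to verify whether the output values are being post-processed incorrectly.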

Tags: python tile

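【Solution 1】:

This error occurs when you have invalid bboxes or array values (in the case of segmentation) that therefore cannot be used to index the image.

For example, a bbox like [10, 20, 30, 40] works fine, but a bbox like [10, -5, 30, 40] does not, because of the negative value.

Likewise, passing in an empty bbox such as [] also causes this error.

So I suggest printing your bboxes to see whether you are getting unexpected arrays like these.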
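
One way to act on that check is to clip each box to the frame before cropping and to skip boxes that do not overlap the frame at all. The following is only a minimal sketch under assumptions: `clamp_bbox` is a hypothetical helper (it is not part of the question's code or of any library), boxes are taken to be `[x, y, width, height]` as in the question's loop, and the all-zero array stands in for a 1280x720 video frame.

```python
import numpy as np


def clamp_bbox(x, y, w, h, img_w, img_h):
    """Clip an (x, y, width, height) box to the image bounds.

    Hypothetical helper for illustration. Returns the clipped box,
    or None when the box has no overlap with the image.
    """
    x0 = min(max(int(x), 0), img_w)
    y0 = min(max(int(y), 0), img_h)
    x1 = min(max(int(x) + int(w), 0), img_w)
    y1 = min(max(int(y) + int(h), 0), img_h)
    if x1 <= x0 or y1 <= y0:
        return None
    return x0, y0, x1 - x0, y1 - y0


# Stand-in for one video frame: height 720, width 1280, 3 channels.
frame = np.zeros((720, 1280, 3), dtype=np.uint8)
img_h, img_w = frame.shape[:2]

for box in ([10, 20, 30, 40], [10, -5, 30, 40], [2644, 1513, 988, 631]):
    clamped = clamp_bbox(*box, img_w, img_h)
    if clamped is None:
        print(box, "-> skipped (no overlap with the frame)")
        continue
    x, y, w, h = clamped
    crop = frame[y:y + h, x:x + w]
    print(box, "->", clamped, "crop shape", crop.shape)
```

In the question's loop, the same idea would mean skipping a face (with `continue`) whenever the clipped box is empty, instead of passing the crop on to the transforms.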

Sarthak Jain

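【Comments】:

I got a bbox with these values: [2644 1513 988 631]. I don't know what is wrong with it.
Hi, is your bbox larger than your image? Is any value in your bbox greater than the height or width of the image?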
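
To illustrate why that bbox is a problem, the sketch below slices a frame of the size mentioned in the question (1280x720) with the reported values. It assumes the four numbers are `[xStart, yStart, width, height]`, as produced by the question's `detect` function; the all-zero array is only a stand-in for a real frame. NumPy does not raise on an out-of-range slice, it silently returns an empty array, and an empty crop is the kind of input that `ToPILImage` cannot turn into an image.

```python
import numpy as np

# Stand-in for one frame of the 1280x720 video (height 720, width 1280).
frame = np.zeros((720, 1280, 3), dtype=np.uint8)

# Bbox reported in the comment above, read as [xStart, yStart, width, height].
x_start, y_start, width, height = 2644, 1513, 988, 631

# Same slicing as in the question's loop: both ranges start past the frame edge.
face_img = frame[y_start:y_start + height, x_start:x_start + width]

print(face_img.shape)      # (0, 0, 3): the crop contains no pixels
print(face_img.size == 0)  # True
```

A simple guard such as `if face_img.size == 0: continue` before calling the transforms would skip these detections; whether the detector should be producing them in the first place is a separate thing to check.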