Multiple object recognition opencv: answers

【Question title】: Multiple object recognition opencv
【Posted】: 2022-01-18 18:09:41
【Question description】:

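I am trying to find objects in an image given a template. Example image:

Template example

So far I have come up with the following approach:

Find keypoints with a feature detector, e.g. SIFT
Match the keypoints
Cluster them

It looks like this: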

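import cv2
import matplotlib.pyplot as plt

# img is the scene image and query is the template, both loaded beforehand with cv2.imread
sift = cv2.SIFT_create()
# find the keypoints and descriptors with SIFT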
kp1, des1 = sift.detectAndCompute(img,None)
kp2, des2 = sift.detectAndCompute(query,None)
# BFMatcher with default params
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1,des2,k=2)
# Apply ratio test
good = []
for m,n in matches:
    if m.distance < 0.5*n.distance:
        good.append([m])
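# cv2.drawMatchesKnn expects list of lists as matches.
img3 = cv2.drawMatchesKnn(img,kp1,query,kp2,good,None,flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
# OpenCV images are BGR; convert to RGB so matplotlib shows the right colours
plt.imshow(cv2.cvtColor(img3, cv2.COLOR_BGR2RGB))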
plt.show()

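Result

But I am stuck here. How can I use these matches to actually find the bounding boxes of the objects present in the image? I tried building a grid based on the template's keypoints and size:

Then I used cv2.matchTemplate to search for the object in the region around each grid cell (a sliding window), but it did not work well. How should I approach this?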
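
The clustering step in the list above is not covered by the snippet. One possible sketch, assuming that matched keypoints of the same object lie close together, is to group the matched positions by distance and take the extent of each group. The function names, the sample points and the `max_dist` value are all made up for illustration, and each box only covers the matched keypoints, not necessarily the whole object.

```python
from math import hypot


def cluster_points(points, max_dist):
    """Group 2-D points; points within max_dist of each other (directly or via a chain) share a cluster."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            if hypot(xi - xj, yi - yj) <= max_dist:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i, pt in enumerate(points):
        clusters.setdefault(find(i), []).append(pt)
    return list(clusters.values())


def bounding_box(cluster):
    """Return (x_min, y_min, x_max, y_max) of a list of points."""
    xs = [p[0] for p in cluster]
    ys = [p[1] for p in cluster]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up positions standing in for kp1[m.queryIdx].pt of the good matches.
points = [(10, 12), (14, 15), (18, 11), (200, 40), (205, 44)]
boxes = sorted(bounding_box(c) for c in cluster_points(points, max_dist=20))
print(boxes)
```

The pairwise loop is quadratic in the number of matches, which is usually acceptable after the ratio test has discarded most of them.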

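【Question comments】:

Two suggestions to improve your question: (1) You should post a minimal reproducible example. It should contain all the code we need to reproduce your problem (including imports, file opening, etc.), so that we can cut and paste it and run it directly. (2) Your example and template images should be the exact files you run your code on, not screenshots of an imshow window.
I think template matching may be the way to solve this. Your template image should be cropped more tightly, to just the front of the package you want to locate. For an algorithm that finds multiple matches by iterating over minMaxLoc results, see this answer.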

Tags: python opencv object-recognition

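【Solution 1】:

I hope it is not too late, but it is better to close this question.

I tried to develop some code that solves your problem following your approach.

First, I created a mask to identify the whitish regions.

Then, I thresholded the V channel of the HSV colour space and combined it with the other mask.

Then, I found all the connected components of the mask.

Then, I computed the SIFT descriptors of the input image and the query image. For each good match, I took the keypoint position and linked it to the connected component at that position.

The last step is to draw the bounding box of every connected component that has a keypoint assigned to it.

I tried other methods such as cv2.matchTemplate, but they did not work. Also, I think the results could be better: I had to crop the images from your post, and the keypoints obtained were not very good. The drink cartons, however, are hard to segment individually; if you find a better way to segment them, this should work much better.

Hope it works!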

import cv2
import matplotlib.pyplot as plt
import numpy as np

img = cv2.imread("stack2.png")
query = cv2.imread("stack3.png")

OBJECT_WIDTH_LIMITER = 200  # Variable to delimit the max width of the BBoxes

# Obtain a mask for identifying each product
# First obtain a mask with the whitish colours
white_mask = cv2.inRange(img, (180, 180, 180), (255, 255, 255))
white_mask = white_mask.astype(float) / 255
white_mask = cv2.morphologyEx(white_mask, cv2.MORPH_OPEN, np.ones((1, 10), np.uint8))

# Transform image to HSV, threshold the V channel with Otsu's method
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)
_, mask = cv2.threshold(v, 0, 1, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Remove the whitest parts of the image from the mask
mask[white_mask == 1] = 0

# Optional small opening to remove speckles (disabled)
# mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((2, 2), np.uint8))

# Detect all the connected components
n, conComp, stats, centroids = cv2.connectedComponentsWithStats(mask)

# Create SIFT object
sift = cv2.SIFT_create()
# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img, None)
kp2, des2 = sift.detectAndCompute(query, None)
# BFMatcher with default params
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1, des2, k=2)
# Apply ratio test
good = []
for m, n in matches:
    if m.distance < 0.5 * n.distance:
        good.append([m])

# Each entry of good is a one-element list [DMatch]; queryIdx indexes kp1 (the keypoints of img)
# Collect the (x, y) position in img of every good match
good_keypoints = [kp1[match[0].queryIdx].pt for match in good]

# Look up the connected component that each keypoint belongs to
# If the connected component is wider than OBJECT_WIDTH_LIMITER, crop the connected component
# Create a mask with all the connected components that belong to keypoints
cc_filtered = np.zeros((img.shape[0], img.shape[1]), np.uint8)
for kp in good_keypoints:
    ccNumber = conComp[int(kp[1]), int(kp[0])]

    mask = np.zeros((img.shape[0], img.shape[1]), np.uint8)
    if ccNumber != 0:
        # Clamp the horizontal window around the keypoint to the image width.
        # kp[0] is an x coordinate, so the upper bound is shape[1] (width), not shape[0] (height).
        left_limit = max(0, int(kp[0]) - OBJECT_WIDTH_LIMITER)
        right_limit = min(img.shape[1], int(kp[0]) + OBJECT_WIDTH_LIMITER)

        mask[conComp == ccNumber] = 1
        mask[:, right_limit:] = 0
        mask[:, :left_limit] = 0
        cc_filtered[mask == 1] = ccNumber

# Draw the bounding box of each remaining connected component
n, conComp, stats, centroids = cv2.connectedComponentsWithStats(cc_filtered)
for ccNumber in range(n):
    if ccNumber != 0:
        tl = (stats[ccNumber, cv2.CC_STAT_LEFT], stats[ccNumber, cv2.CC_STAT_TOP])
        br = (
            stats[ccNumber, cv2.CC_STAT_LEFT] + stats[ccNumber, cv2.CC_STAT_WIDTH],
            stats[ccNumber, cv2.CC_STAT_TOP] + stats[ccNumber, cv2.CC_STAT_HEIGHT],
        )
        cv2.rectangle(img, tl, br, (0, 255, 0), 5)

plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()

【Comments】: