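Using OpenCV to find the most similar image that contains another image

【Question title】: Using opencv to find the most similar image that contains another image
【Posted】: 2021-07-29 23:46:20
【Question description】:

In case the title is unclear: say I have a list of images (10k+) and a target image that I am searching for.

Here is an example of the target image:

Here are examples of the images I want to search through to find something "similar" (ex1, ex2 and ex3):

Here is the matching I do (I use KAZE):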

from matplotlib import pyplot as plt
import numpy as np
import cv2
from typing import List
import os
import imutils


def calculate_matches(des1: List[cv2.KeyPoint], des2: List[cv2.KeyPoint]):
    """
    does a matching algorithm to match if keypoints 1 and 2 are similar
    @param des1: a numpy array of floats that are the descriptors of the keypoints
    @param des2: a numpy array of floats that are the descriptors of the keypoints
    @return:
    """
    # bf matcher with default params
    bf = cv2.BFMatcher(cv2.NORM_L2)
    matches = bf.knnMatch(des1, des2, k=2)
    topResults = []
    for m, n in matches:
        if m.distance < 0.7 * n.distance:
            topResults.append([m])

    return topResults


def compare_images_kaze():
    cwd = os.getcwd()
    target = os.path.join(cwd, 'opencv_target', 'target.png')
    images_list = os.listdir('opencv_images')
    for image in images_list:
        # get my 2 images
        img2 = cv2.imread(target)
        img1 = cv2.imread(os.path.join(cwd, 'opencv_images', image))
        for i in range(0, 360, int(360 / 8)):
            # rotate my image by i
            img_target_rotation = imutils.rotate_bound(img2, i)

            # Initiate KAZE object with default values
            kaze = cv2.KAZE_create()
            kp1, des1 = kaze.detectAndCompute(img1, None)
            kp2, des2 = kaze.detectAndCompute(img2, None)
            matches = calculate_matches(des1, des2)

            try:
                score = 100 * (len(matches) / min(len(kp1), len(kp2)))
            except ZeroDivisionError:
                score = 0
            print(image, score)
            img3 = cv2.drawMatchesKnn(img1, kp1, img_target_rotation, kp2, matches,
                                      None, flags=2)
            img3 = cv2.cvtColor(img3, cv2.COLOR_BGR2RGB)
            plt.imshow(img3)
            plt.show()
            plt.clf()


if __name__ == '__main__':
    compare_images_kaze()

Here are the results of my code:

ex1.png 21.052631578947366
ex2.png 0.0
ex3.png 42.10526315789473

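It does OK! It was able to tell that ex1 is similar and ex2 is not, but it states that ex3 is similar (even more similar than ex1). Is there any extra pre-processing or post-processing (maybe ML, assuming ML is actually useful here), or just a change to my method, that would keep only ex1 as similar and not ex3?

(Please note that the score I compute is something I found online. I am not sure whether it is an accurate way to go about it.)

Added more examples below.

Another set of examples:

Here is what I am searching for:

I want the image above to be similar to the middle and bottom images (NOTE: I rotated my target image by 45 degrees and compared it to the images below.)

Feature matching (as described in the answers below) was useful for finding the similarity with the second image, but not with the third image (even after rotating it properly).

【Question comments】:

Tags: python opencv machine-learning image-processing image-recognition

【Solution 1】:

Detecting the most similar image

The code

You can use template matching, where the image you want to find inside the other images is the template. I saved the small image as template.png, and the other three images as img1.png, img2.png and img3.png.

The code below uses cv2.matchTemplate to compute a confidence that the template is present in an image. Run it on each image; the image with the highest confidence is the one that contains the template: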

import cv2

template = cv2.imread("template.png", 0)
files = ["img1.png", "img2.png", "img3.png"]

for name in files:
    img = cv2.imread(name, 0)
    print(f"Confidence for {name}:")
    print(cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED).max())

Output:

Confidence for img1.png:
0.8906427
Confidence for img2.png:
0.4427919
Confidence for img3.png:
0.5933967

Explanation:

Import the opencv module, and read in the template image as grayscale by setting the second parameter of cv2.imread to 0:

import cv2

template = cv2.imread("template.png", 0)

Define the list of images in which you want to look for the template:

files = ["img1.png", "img2.png", "img3.png"]

Loop through the filenames and read each file in as a grayscale image:

for name in files:
    img = cv2.imread(name, 0)

Finally, use cv2.matchTemplate to detect the template in each image. There are many detection methods you can use, but for this I decided to use cv2.TM_CCOEFF_NORMED:

    print(f"Confidence for {name}:")
    print(cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED).max())

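With cv2.TM_CCOEFF_NORMED the output ranges between -1 and 1, and values near 1 indicate a close match. As you can see, it successfully detected that the first image is the most likely to contain the template image (it has the highest confidence).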

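The visualization

The code

If detecting which image contains the template is not enough and you also want a visualization, you can try the code below: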

import cv2
import numpy as np

def confidence(img, template):
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
    res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
    conf = res.max()
    return np.where(res == conf), conf

files = ["img1.png", "img2.png", "img3.png"]

template = cv2.imread("template.png")
h, w, _ = template.shape

for name in files:
    img = cv2.imread(name)
    ([y], [x]), conf = confidence(img, template)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
    text = f'Confidence: {round(float(conf), 2)}'
    # putText expects fontFace before fontScale
    cv2.putText(img, text, (x, y), cv2.FONT_HERSHEY_PLAIN, 1, (0, 0, 0), 2)
    cv2.imshow(name, img)
    
cv2.imshow('Template', template)
cv2.waitKey(0)

Output:

Explanation:

Import the necessary libraries:

import cv2
import numpy as np

Define a function that takes a full image and a template image. Convert both images to grayscale so that the matching runs on a single channel:

def confidence(img, template):
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)

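Use cv2.matchTemplate to detect the template in the image, then return the position of the point with the highest confidence together with that confidence: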

    res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
    conf = res.max()
    return np.where(res == conf), conf

Define the list of images in which you want to look for the template, and read in the template image:

files = ["img1.png", "img2.png", "img3.png"]
template = cv2.imread("template.png")

Get the size of the template image, to be used later when drawing the rectangle on the images:

h, w, _ = template.shape

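Loop through the filenames and read in each image. Using the confidence function defined earlier, get the x, y position of the top-left corner of the detected template, and the confidence of the detection: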

for name in files:
    img = cv2.imread(name)
    ([y], [x]), conf = confidence(img, template)

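Draw a rectangle on the image with its top-left corner at the detected position, then put the confidence text on the image. Finally, show the image: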

    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
    text = f'Confidence: {round(float(conf), 2)}'
    # putText expects fontFace before fontScale
    cv2.putText(img, text, (x, y), cv2.FONT_HERSHEY_PLAIN, 1, (0, 0, 0), 2)
    cv2.imshow(name, img)

Also show the template, for comparison:

cv2.imshow('Template', template)
cv2.waitKey(0)
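One caveat about the unpacking `([y], [x])`: it only works when exactly one position holds the maximum, because np.where returns every matching position. Below is a minimal sketch of a tie-tolerant alternative that uses np.argmax with np.unravel_index. The helper name best_location and the toy score map are made up for illustration.

```python
import numpy as np

def best_location(res):
    """Return (x, y) of the first maximum in a 2-D score map, plus its value."""
    # argmax works on the flattened array and returns the first maximum,
    # so tied maxima do not break the unpacking.
    y, x = np.unravel_index(np.argmax(res), res.shape)
    return (int(x), int(y)), float(res[y, x])

# Toy score map with two tied maxima (illustrative values only).
res = np.array([[0.1, 0.9, 0.2],
                [0.9, 0.3, 0.4]], dtype=np.float32)
loc, conf = best_location(res)
print(loc, conf)
```

cv2.minMaxLoc(res) is another option; it also reports the maximum location as (x, y).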

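【Comments】:

Thank you for your suggestion. I tried template matching and it works for this case, but it does not quite work for other cases (I can post examples if you like). Do you have any other suggestions?
@mike_gundy123 I might have other suggestions. I would need to see the other examples first, though.
OK, added another set of examples in case it helps!
@mike_gundy123 Thanks, I added another answer.

【Solution 2】:

I am not sure whether the given images resemble your actual task or data, but for this kind of image you could try simple template matching, cf. this OpenCV tutorial.

Basically, I just adapted the tutorial with a few modifications: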

import cv2
import matplotlib.pyplot as plt

# Read images
examples = [cv2.imread(img) for img in ['ex1.png', 'ex2.png', 'ex3.png']]
target = cv2.imread('target.png')
h, w = target.shape[:2]

# Iterate examples
for i, img in enumerate(examples):

    # Template matching
    # cf. https://docs.opencv.org/4.5.2/d4/dc6/tutorial_py_template_matching.html
    res = cv2.matchTemplate(img, target, cv2.TM_CCOEFF_NORMED)

    # Get location of maximum
    _, max_val, _, top_left = cv2.minMaxLoc(res)

    # Set up threshold for decision target found or not
    thr = 0.7
    if max_val > thr:

        # Show found target in example
        bottom_right = (top_left[0] + w, top_left[1] + h)
        cv2.rectangle(img, top_left, bottom_right, (0, 255, 0), 2)

    # Visualization
    plt.figure(i, figsize=(10, 5))
    plt.subplot(1, 2, 1), plt.imshow(img[..., [2, 1, 0]]), plt.title('Example')
    plt.subplot(1, 2, 2), plt.imshow(res, vmin=0, vmax=1, cmap='gray')
    plt.title('Matching result'), plt.colorbar(), plt.tight_layout()

plt.show()

These are the results:

----------------------------------------
System information
----------------------------------------
Platform:      Windows-10-10.0.16299-SP0
Python:        3.9.1
PyCharm:       2021.1.1
Matplotlib:    3.4.1
OpenCV:        4.5.1
----------------------------------------

EDIT: To emphasize the information carried by the different colors, you can use the hue channel from the HSV color space for the template matching:

import cv2
import matplotlib.pyplot as plt

# Read images
examples = [
    [cv2.imread(img) for img in ['ex1.png', 'ex2.png', 'ex3.png']],
    [cv2.imread(img) for img in ['ex12.png', 'ex22.png', 'ex32.png']]
]
targets = [
    cv2.imread('target.png'),
    cv2.imread('target2.png')
]

# Iterate examples and targets
for i, (ex, target) in enumerate(zip(examples, targets)):
    for j, img in enumerate(ex):

        # Rotate last image from second data set
        if (i == 1) and (j == 2):
            img = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)

        h, w = target.shape[:2]

        # Get hue channel from HSV color space
        target_h = cv2.cvtColor(target, cv2.COLOR_BGR2HSV)[..., 0]
        img_h = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)[..., 0]

        # Template matching
        # cf. https://docs.opencv.org/4.5.2/d4/dc6/tutorial_py_template_matching.html
        res = cv2.matchTemplate(img_h, target_h, cv2.TM_CCOEFF_NORMED)

        # Get location of maximum
        _, max_val, _, top_left = cv2.minMaxLoc(res)

        # Set up threshold for decision target found or not
        thr = 0.6
        if max_val > thr:

            # Show found target in example
            bottom_right = (top_left[0] + w, top_left[1] + h)
            cv2.rectangle(img, top_left, bottom_right, (0, 255, 0), 2)

        # Visualization
        plt.figure(i * 10 + j, figsize=(10, 5))
        plt.subplot(1, 2, 1), plt.imshow(img[..., [2, 1, 0]]), plt.title('Example')
        plt.subplot(1, 2, 2), plt.imshow(res, vmin=0, vmax=1, cmap='gray')
        plt.title('Matching result'), plt.colorbar(), plt.tight_layout()
        plt.savefig('{}.png'.format(i * 10 + j))

plt.show()

The new results:

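【Comments】:

Hmm, I see, but I am not too sure how well that would work. Do you think there is a way to combine the two approaches (KAZE and template matching) to get a more robust solution?
I can't give any advice here, since I have never worked with local feature detection before. But have you already implemented the template matching? Have you tested it on your data set? Do you have evidence that template matching is insufficient, or that you need a more robust solution at all? Are the shown images your actual data set?
These images are not my data set, but they are close enough representations. Template matching also works better than my previous KAZE approach, but not well enough for my needs. That is why I am asking whether there is a way to somehow combine the two and get a better solution.
Without seeing the actual data set, or at least some toy data images, it is hard to give general advice. It is also unclear what "not good enough" means. Is the accuracy at 55 %? Is it at 90 % but you want more? Is it even possible, given the properties of the data set, to get better than 90 %? Only you can answer these questions, since you are the only one with access to the data. Of course you can combine the two solutions; this is called an ensemble classifier. Whether that suits your actual data set is hard for people here to tell.
Yeah, that is understandable. My bad. I can add another example of the type of images I am working with, if that might help clear things up.

【Solution 3】:

The concept

We can use the cv2.matchTemplate method to detect where an image is located inside another image, but for your second set of images you also need rotation. In addition, we need to take color into account.

cv2.matchTemplate takes an image, a template (another image) and a template matching method, and returns a grayscale array in which the brightest point is the position where the template is most confidently detected.

We can use the template at 4 different angles and keep the one that gives the highest confidence. When we detect a candidate position that matches the template, we use a function (which we will define ourselves) to check whether the most frequent colors of the template are present in the patch of the image we detected. If they are not, the patch is ignored regardless of the confidence returned.

The code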

import cv2
import numpy as np

def frequent_colors(img, vals=3):
    colors, count = np.unique(np.vstack(img), return_counts=True, axis=0)
    sorted_by_freq = colors[np.argsort(count)]
    return sorted_by_freq[-vals:]

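def get_templates(img):
    template = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Yield the unrotated template first, then the three rotated versions
    yield template
    for code in (cv2.ROTATE_90_CLOCKWISE, cv2.ROTATE_180,
                 cv2.ROTATE_90_COUNTERCLOCKWISE):
        yield cv2.rotate(template, code)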
        
def detect(img, template, min_conf=0.45):
    colors = frequent_colors(template)
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    conf_max = min_conf
    shape = 0, 0, 0, 0
    for tmp in get_templates(template):
        h, w = tmp.shape
        res = cv2.matchTemplate(img_gray, tmp, cv2.TM_CCOEFF_NORMED)
        for y, x in zip(*np.where(res > conf_max)):
            conf = res[y, x]
            if conf > conf_max:
                seg = img[y:y + h, x:x + w]
                if all(np.any(np.all(seg == color, -1)) for color in colors):
                    conf_max = conf
                    shape = x, y, w, h
    return shape

files = ["img1_2.png", "img2_2.png", "img3_2.png"]
template = cv2.imread("template2.png")

for name in files:
    img = cv2.imread(name)
    x, y, w, h = detect(img, template)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
    cv2.imshow(name, img)

cv2.imshow('Template', template)
cv2.waitKey(0)

The output

The explanation

Import the necessary libraries:

import cv2
import numpy as np

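Define a function, frequent_colors, that takes an image and returns its most frequent colors. The optional parameter vals sets how many colors to return; if vals is 3, the 3 most frequent colors are returned: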

def frequent_colors(img, vals=3):
    colors, count = np.unique(np.vstack(img), return_counts=True, axis=0)
    sorted_by_freq = colors[np.argsort(count)]
    return sorted_by_freq[-vals:]
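To see what frequent_colors returns, here is a small worked example on a made-up 2x3 image. The function body is copied from the answer; the pixel values are illustrative only.

```python
import numpy as np

def frequent_colors(img, vals=3):
    # Flatten H x W x 3 into (H*W) x 3, then count each distinct color row.
    colors, count = np.unique(np.vstack(img), return_counts=True, axis=0)
    # argsort is ascending, so the most frequent color ends up last.
    sorted_by_freq = colors[np.argsort(count)]
    return sorted_by_freq[-vals:]

# Three occurrences of a, two of b, one of c.
a, b, c = (0, 0, 255), (255, 0, 0), (0, 255, 0)
img = np.array([[a, a, b],
                [a, b, c]], dtype=np.uint8)

top2 = frequent_colors(img, vals=2)
print(top2.tolist())
```

The result is ordered from less frequent to more frequent. Colors are compared exactly, so anti-aliased or lossy-compressed images may produce many near-duplicate colors.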

Define a function, get_templates, that takes an image and yields it (in grayscale) at 4 different angles: original, 90 clockwise, 180 and 90 counterclockwise:

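def get_templates(img):
    template = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Yield the unrotated template first, then the three rotated versions
    yield template
    for code in (cv2.ROTATE_90_CLOCKWISE, cv2.ROTATE_180,
                 cv2.ROTATE_90_COUNTERCLOCKWISE):
        yield cv2.rotate(template, code)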

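Define a function, detect, that takes an image and a template image and returns the x, y, w, h of the bounding box of the template detected in the image. This function uses the frequent_colors and get_templates functions defined earlier. The min_conf parameter is the minimum confidence needed to count a detection as an actual detection: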

def detect(img, template, min_conf=0.45):

Find the three most frequent colors in the template and store them in the variable colors. Also define a grayscale version of the main image:

    colors = frequent_colors(template)
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Define the initial value for the highest confidence detected so far, and the initial value for the detected patch:

    conf_max = min_conf
    shape = 0, 0, 0, 0

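Loop over the grayscale template at the 4 angles, get the shape of each one (it changes with rotation), and use cv2.matchTemplate to get the grayscale array of detection scores for the image: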

    for tmp in get_templates(template):
        h, w = tmp.shape
        res = cv2.matchTemplate(img_gray, tmp, cv2.TM_CCOEFF_NORMED)

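Loop over the x, y coordinates where the confidence is greater than conf_max (which starts out as min_conf), and store the confidence in the variable conf. If conf is still greater than the current maximum confidence (conf_max), go on to check whether all three of the template's most frequent colors are present in that patch of the image: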

        for y, x in zip(*np.where(res > conf_max)):
            conf = res[y, x]
            if conf > conf_max:
                seg = img[y:y + h, x:x + w]
                if all(np.any(np.all(seg == color, -1)) for color in colors):
                    conf_max = conf
                    shape = x, y, w, h
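The color test on the patch can be tried in isolation. Below is a minimal sketch with a made-up 2x2 patch; has_all_colors is a hypothetical helper that wraps the same expression used in detect.

```python
import numpy as np

def has_all_colors(seg, colors):
    # seg == color broadcasts to H x W x 3; np.all(..., -1) marks pixels whose
    # three channels all match; np.any asks whether at least one pixel does.
    return all(np.any(np.all(seg == color, -1)) for color in colors)

# Illustrative 2x2 BGR patch.
seg = np.array([[[0, 0, 255], [255, 0, 0]],
                [[0, 0, 255], [10, 20, 30]]], dtype=np.uint8)

present = np.array([[0, 0, 255], [255, 0, 0]], dtype=np.uint8)
missing = np.array([[0, 0, 255], [0, 255, 0]], dtype=np.uint8)

print(has_all_colors(seg, present), has_all_colors(seg, missing))
```

The check passes only when every listed color occurs at least once in the patch, and the match must be exact on all three channels.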

Finally, return shape. If the template was not detected in the image, shape keeps its initial value, 0, 0, 0, 0:

    return shape

Lastly, loop through each image and use the detect function we defined to get the x, y, w, h of the bounding box. Use cv2.rectangle to draw the bounding box on the image:

files = ["img1_2.png", "img2_2.png", "img3_2.png"]
template = cv2.imread("template2.png")

for name in files:
    img = cv2.imread(name)
    x, y, w, h = detect(img, template)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
    cv2.imshow(name, img)

cv2.imshow('Template', template)
cv2.waitKey(0)

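【Comments】:

【Solution 4】:

First of all, the data appears in a graph; can't you get the overlapping values from the underlying numerical data?

Have you tried performing some edge detection on the color change from white-blue to blue-red, fitting some circles to those edges, and then checking whether they overlap?

Since the input data is quite tightly controlled (no organic photos or videos), maybe you don't have to go the ML route.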
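If circles were fitted as suggested, the overlap check itself is simple geometry: two circles overlap when the distance between their centers is smaller than the sum of their radii. Below is a minimal sketch. The function name and the sample circles are made up for illustration, and the circle-fitting step is assumed to have happened already.

```python
import math

def circles_overlap(c1, c2):
    """Each circle is (x, y, r). True if the two discs share any area."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    return math.hypot(x2 - x1, y2 - y1) < r1 + r2

print(circles_overlap((0, 0, 5), (6, 0, 2)))   # centers 6 apart, radii sum 7
print(circles_overlap((0, 0, 5), (10, 0, 2)))  # centers 10 apart, radii sum 7
```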

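【Comments】:

Sorry, what do you mean by the first part? I can try doing edge detection with this. My main problem would be how to turn edge detection into some sort of "similarity" score.
I was remarking that the data is presented on a graph with axes (0:300, 0:400). That data was plotted somehow. Why use cv2 to determine the overlap rather than working with the underlying numerical data used to generate the graph?
Yes, the raw data (keypoints and descriptors) is used in calculate_matches(), and the graph is a visual way of seeing what the KAZE algorithm considers "similar" (if that makes sense).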