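How to delete repeated coordinates from a vector<Point2f>: answers

【Question title】: How to delete repeating coordinates of vector<Point2f>
【Posted】: 2014-10-01 14:20:13
【Question】:

I push point coordinates into a vector, and some of the points are repeated. I want to delete the repeats and keep only the unique points.

For example: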

vector<Point2f>  points;

points[0]=Point2f(1,1);
points[1]=Point2f(2,3);
points[2]=Point2f(1,1);
points[3]=Point2f(2,3);
points[4]=Point2f(1,1);
points[5]=Point2f(4,1);

I want to get this result:

points[0]=Point2f(1,1);
points[1]=Point2f(2,3);
points[2]=Point2f(4,1);

P.S. The order of the elements must not change.

Here is what I have tried:

#include <opencv2/core/core.hpp>

#include <vector>
#include<iostream>

using namespace std;
using namespace cv;

int main()
{
    vector<Point2f>  pointTemp(6);  // size the vector first: operator[] on an empty vector is undefined behavior

    pointTemp[0]=Point2f(1,1);
    pointTemp[1]=Point2f(2,3);
    pointTemp[2]=Point2f(1,1);
    pointTemp[3]=Point2f(2,3);
    pointTemp[4]=Point2f(1,1);
    pointTemp[5]=Point2f(4,1);

    for(vector<Point2f>::iterator it=pointTemp.begin();it!=pointTemp.end();it++)
    {
        for(vector<Point2f>::iterator it1=it+1;it1!=pointTemp.end();)
        {
            if(it->x==it1->x&&it->y==it1->y)
            {
                it1=pointTemp.erase(it1);
            }
            else
            {
                it1++;
            }
        }
    }
    //cout<<pointTemp.size()<<endl;

    return 0;
}

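【Discussion】:

Could you please post what you have tried?
Define the == and < operators for the Point2f class and use the sort-unique-erase idiom with the help of the standard algorithms library.
@user657267 That is what I would do. But: == makes sense, < not so much. So I would use a custom predicate for sorting.
@juanchopanza If you want to sort something, then by definition it needs to be weakly orderable, at least conceptually. Would defining operator< cause problems elsewhere?
@ChrisWebb I have posted my code.

Tags: c++ opencv vector point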

【Solution 1】:

#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>

struct Point2f
{
    float x;
    float y;
};

int main(int argc, char const *argv[])
{
    std::vector<Point2f> points =
    {
        {1, 1}, {2, 3}, {1, 1}, {2, 3}, {1, 1}, {4, 1}
    };

    auto print = [&]()
    {
        for (const auto &point : points)
        {
            std::cout << "(" << point.x << " " << point.y << ") ";
        }
        std::cout << std::endl;
    };

    // first sort
    std::sort(points.begin(), points.end(), [](const Point2f & lhs, const Point2f & rhs)
    {
        return lhs.x < rhs.x && lhs.y < rhs.y;
    });

    // print
    print();

    // remove duplicated element
    auto it = std::unique(points.begin(), points.end(), [](const Point2f & lhs, const Point2f & rhs)
    {
        return lhs.x == rhs.x && lhs.y == rhs.y;
    });
    points.resize(std::distance(points.begin(), it));

    // print
    print();

    return 0;
}

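【Discussion】:

From the requirements: "the order of the (remaining) elements must not change". sort changes that.
My Code::Blocks shows me a lot of errors, such as "error: in C++98 'points' must be initialized by constructor, not by '{...}'|".
std::sort requires a strict weak ordering. You are not using one, which results in undefined behavior.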
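
To illustrate the last comment: the comparator in the code above treats a = (1,1) and b = (0,5) as equivalent, and b and c = (2,2) as equivalent, yet it orders a before c, so equivalence is not transitive. Below is a minimal sketch of a lexicographic comparator that does satisfy the requirement, assuming no coordinate is NaN. The names broken_less, point_less and sort_points are made up for this illustration, and the Point2f struct is a stand-in for cv::Point2f.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Stand-in for cv::Point2f so the sketch compiles without OpenCV.
struct Point2f
{
    float x;
    float y;
};

// The comparator from the answer, kept only for contrast:
// it is NOT a strict weak ordering.
inline bool broken_less(const Point2f& lhs, const Point2f& rhs)
{
    return lhs.x < rhs.x && lhs.y < rhs.y;
}

// Lexicographic order: compare x first, then y. std::tie builds tuples of
// references, and comparing them with < gives a strict weak ordering
// (assuming no coordinate is NaN).
inline bool point_less(const Point2f& lhs, const Point2f& rhs)
{
    return std::tie(lhs.x, lhs.y) < std::tie(rhs.x, rhs.y);
}

// Sorting with point_less places equal points next to each other,
// which is what std::unique relies on.
inline void sort_points(std::vector<Point2f>& points)
{
    std::sort(points.begin(), points.end(), point_less);
}
```

Note that sorting still reorders the points, so this only addresses the undefined behavior, not the order-preservation requirement from the question.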

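【Solution 2】:

Here is my crack at it. You may need to pass --std=c++11 as an argument to g++. Note that the insertion order of the unique elements is preserved. The hash-set lookups are O(1) on average, so detecting the duplicates is O(N) overall; each vec.erase call additionally shifts the elements behind it.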

// remove_duplicates: removes all duplicated elements from the vector passed in
void remove_duplicates(std::vector<Point2f>& vec)
{
    std::unordered_set<Point2f> pointset;  // unordered_set is a hash table implementation

    auto itor = vec.begin();
    while (itor != vec.end())
    {
        if (pointset.find(*itor) != pointset.end())   // O(1) lookup time for unordered_set
        {
            itor = vec.erase(itor); // vec.erase returns the next valid iterator
        }
        else
        {
            pointset.insert(*itor);
            itor++;
        }
    }
}

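Because it uses unordered_set, the function above needs a hash function to have been declared for Point2f beforehand. You can define it however you like. My simple implementation is below.

You may also need to define an == operator for Point2f, as well as suitable constructors, to satisfy the vector and unordered_set semantics.

Full code listing: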

#include <vector>
#include <unordered_set>


struct Point2f
{
    float x;
    float y;
    Point2f(float a, float b) : x(a), y(b) {}
    Point2f() : x(0), y(0) {}
};

bool operator==(const Point2f& pt1, const Point2f& pt2)
{
    return ((pt1.x == pt2.x) && (pt1.y == pt2.y));
}

namespace std
{
    template<>
    struct hash<Point2f>
    {
        size_t operator()(Point2f const& pt) const
        {
            return (size_t)(pt.x*100 + pt.y);
        }
    };
}


void removedupes(std::vector<Point2f> & vec)
{
    std::unordered_set<Point2f> pointset;

    auto itor = vec.begin();
    while (itor != vec.end())
    {
        if (pointset.find(*itor) != pointset.end())
        {
            itor = vec.erase(itor);
        }
        else
        {
            pointset.insert(*itor);
            itor++;
        }
    }
}


int main(int argc, char* argv[])
{
    std::vector<Point2f>  pointTemp;

    pointTemp.resize(6);

    pointTemp[0]=Point2f(1,1);
    pointTemp[1]=Point2f(2,3);
    pointTemp[2]=Point2f(1,1);
    pointTemp[3]=Point2f(2,3);
    pointTemp[4]=Point2f(1,1);
    pointTemp[5]=Point2f(4,1);

    removedupes(pointTemp);

    return 0;
}
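
The loop in the listing above calls vec.erase once per duplicate, and each call shifts every later element, so the worst case is quadratic even though the hash lookups are cheap. A possible variation is sketched below: it compacts the vector in a single pass and passes the hasher as a template argument instead of specializing std::hash. The names PointHash and remove_duplicates_stable are hypothetical, the hash-mixing formula is just one common choice, and the Point2f struct is again a stand-in for cv::Point2f.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <vector>

// Stand-in for cv::Point2f so the sketch compiles without OpenCV.
struct Point2f
{
    float x;
    float y;
};

inline bool operator==(const Point2f& a, const Point2f& b)
{
    return a.x == b.x && a.y == b.y;
}

// Hypothetical hasher: mixes the std::hash<float> values of both coordinates.
struct PointHash
{
    std::size_t operator()(const Point2f& pt) const
    {
        const std::size_t hx = std::hash<float>()(pt.x);
        const std::size_t hy = std::hash<float>()(pt.y);
        return hx ^ (hy + 0x9e3779b9u + (hx << 6) + (hx >> 2));
    }
};

// Keeps the first occurrence of every point, in the original order.
inline void remove_duplicates_stable(std::vector<Point2f>& vec)
{
    std::unordered_set<Point2f, PointHash> seen;
    std::size_t write = 0;
    for (std::size_t read = 0; read < vec.size(); ++read)
    {
        // insert().second is true only the first time a point is seen
        if (seen.insert(vec[read]).second)
        {
            vec[write++] = vec[read];
        }
    }
    vec.resize(write);  // drop the leftover tail in one step
}
```

With the six points from the question, this should leave (1,1), (2,3), (4,1) in that order.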

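【Discussion】:

Thanks for the excellent solution. I find your code a bit hard to understand. Programming is an art.
Hard to understand? I wrote it as clearly as I could. It is just that C++ and the std collection classes produce code that is hard to read. Sometimes that is the nature of C++. If you have specific questions about the code above, I am happy to explain.
Your code is clear, but I am new to C++, so it is a bit difficult for me. I will do my best to understand it. Anyway, it is a chance for me to learn something new. Thanks again.

【Solution 3】:

This can be done by first sorting the points (with std::sort) and then eliminating the repeated points (with std::unique). To do so, you need a comparison function: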

#include <algorithm>
#include <vector>

// Lexicographic compare, same as for ordering words in a dictionary:
// test the first 'letter of the word' (x coordinate); if it is the same,
// test the second 'letter' (y coordinate).
bool lexico_compare(const Point2f& p1, const Point2f& p2) {
    if(p1.x < p2.x) { return true; }
    if(p1.x > p2.x) { return false; }
    return (p1.y < p2.y);
}

bool points_are_equal(const Point2f& p1, const Point2f& p2) {
    return ((p1.x == p2.x) && (p1.y == p2.y));
}

void remove_duplicates(std::vector<Point2f>& points) {
    // Note: std::unique moves the unique elements to the front and returns
    // an iterator to the new logical end; the elements from there to
    // points.end() are leftovers that erase() removes.
    std::sort(points.begin(), points.end(), lexico_compare);
    points.erase(std::unique(points.begin(), points.end(), points_are_equal), points.end());
}

Note 1: you can shorten the code by using C++11 lambdas instead of the two functions lexico_compare and points_are_equal.
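
As a rough sketch of what Note 1 might look like, the same sort-unique-erase steps can be written with two lambdas. The function name remove_duplicates_sorted is made up here, and the Point2f struct is a stand-in for cv::Point2f.

```cpp
#include <algorithm>
#include <vector>

// Stand-in for cv::Point2f so the sketch compiles without OpenCV.
struct Point2f
{
    float x;
    float y;
};

inline void remove_duplicates_sorted(std::vector<Point2f>& points)
{
    // Lexicographic order: x first, then y.
    std::sort(points.begin(), points.end(),
        [](const Point2f& a, const Point2f& b)
        {
            return a.x < b.x || (a.x == b.x && a.y < b.y);
        });

    // After sorting, equal points are adjacent, so std::unique finds them.
    points.erase(
        std::unique(points.begin(), points.end(),
            [](const Point2f& a, const Point2f& b)
            {
                return a.x == b.x && a.y == b.y;
            }),
        points.end());
}
```

The result is sorted, so this variant does not keep the original element order.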

Note 2: if you need to keep the order of the points, you can do an indirect sort and keep track of which points are duplicates.
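
One way the indirect sort from Note 2 could be done is sketched below: sort a vector of indices instead of the points, mark every index whose point equals its predecessor in sorted order, then compact the original vector. This is only an interpretation of the note, not the answerer's code; remove_duplicates_keep_order is a hypothetical name and the Point2f struct is a stand-in for cv::Point2f.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Stand-in for cv::Point2f so the sketch compiles without OpenCV.
struct Point2f
{
    float x;
    float y;
};

inline void remove_duplicates_keep_order(std::vector<Point2f>& points)
{
    const std::size_t n = points.size();

    // Sort indices instead of points. Ties are broken by index, so the
    // earliest occurrence of each point comes first within its group.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t(0));
    std::sort(order.begin(), order.end(),
        [&points](std::size_t i, std::size_t j)
        {
            const Point2f& a = points[i];
            const Point2f& b = points[j];
            if (a.x != b.x) { return a.x < b.x; }
            if (a.y != b.y) { return a.y < b.y; }
            return i < j;
        });

    // Mark every index whose point equals its predecessor in sorted order.
    std::vector<bool> duplicate(n, false);
    for (std::size_t k = 1; k < n; ++k)
    {
        const Point2f& prev = points[order[k - 1]];
        const Point2f& cur = points[order[k]];
        if (prev.x == cur.x && prev.y == cur.y)
        {
            duplicate[order[k]] = true;
        }
    }

    // Compact in place; survivors stay in their original relative order.
    std::size_t write = 0;
    for (std::size_t read = 0; read < n; ++read)
    {
        if (!duplicate[read])
        {
            points[write++] = points[read];
        }
    }
    points.resize(write);
}
```

This costs O(N log N) for the sort plus O(N) extra memory for the index and flag vectors, and needs no hash function.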

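【Discussion】:

【Solution 4】:

Pay attention to the equal function. If the goal is to flag points that are similar enough, we should use a hash that gives similar points the same hash value, together with an approximate-equal that treats similar points as the same. In my case, I use the following: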

#include <cmath>      // fabs
#include <iostream>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

#include <Eigen/Dense>


const std::string red("\033[0;31m");
const std::string green("\033[1;32m");
const std::string yellow("\033[1;33m");
const std::string cyan("\033[0;36m");
const std::string magenta("\033[0;35m");
const std::string reset("\033[0m");

struct ApproxHash 
{
 std::size_t operator() (Eigen::Vector2d const& pt) const
 {


   size_t score = (size_t)(pt.x()*100) + (size_t)(pt.y()*10);
   std::cerr <<"Point: "<< pt.transpose()<< " has score: "<<score<<std::endl;
   return score; 
 }

};

struct ApproxEqual{
    // This is used to guarantee that no duplicates should happen when the hash collision happens. 
public:
 bool operator()(const Eigen::Vector2d & pt1, const Eigen::Vector2d & pt2) const {
    double threshold = 0.00001;
    bool xdiff = fabs(pt1.x() - pt2.x())<threshold;
    bool ydiff = fabs(pt1.y() - pt2.y())<threshold;

    bool result = (fabs(pt1.x() - pt2.x())<threshold) && (fabs(pt1.y() - pt2.y())<threshold);

    std::cerr<<cyan<<"Equal is called for: "<< pt1.transpose()<<" and "<<pt2.transpose()<<" which are " << result<<" equal. "<<" xdiff"<< xdiff<<", ydiff"<<ydiff<<reset<<std::endl;
    return result; 
}
};

void removeDuplicates(std::vector<Eigen::Vector2d>& vec)
{

    // If we would like to store values, we should use std::unordered_map.
    std::unordered_set<Eigen::Vector2d, ApproxHash, ApproxEqual> pointset;  

    auto ii = vec.begin();
    while (ii != vec.end())
    {
        std::cerr<<"Processing: "<<ii->transpose()<<std::endl;
        if (pointset.find(*ii) != pointset.end())   // O(1) lookup time for unordered_set
        {
            std::cerr<<red<<"Found duplicate: "<<ii->transpose()<<reset<<std::endl;
            ii = vec.erase(ii); // vec.erase invalidates ii and returns the next valid iterator
        }
        else
        {
            pointset.insert(*ii);
            std::cerr<<"Inserted: "<<ii->transpose()<<std::endl;
            ii++;
        }
    }
} // end of removeDuplicates

int main(int argc, char* argv[])
{


    std::vector<Eigen::Vector2d>  pointTemp;

    pointTemp.resize(15);
    pointTemp[0]=Eigen::Vector2d(1.0011121213,1);
    pointTemp[1]=Eigen::Vector2d(2.0,3.121);
    pointTemp[2]=Eigen::Vector2d(4.004,1.0);
    pointTemp[3]=Eigen::Vector2d(2.0,3.121);
    pointTemp[4]=Eigen::Vector2d(1.001112121,1);
    pointTemp[5]=Eigen::Vector2d(4.004,1.0);
    pointTemp[6]=Eigen::Vector2d(1.2,1);
    pointTemp[7]=Eigen::Vector2d(0.028297902,  0.302034);
    pointTemp[8]=Eigen::Vector2d(0.028297901,  0.302034);
    pointTemp[9]=Eigen::Vector2d(0.249941, 0.227669);
    pointTemp[10]=Eigen::Vector2d(0.249941, 0.227669);
    pointTemp[11]=Eigen::Vector2d(0.0206403,  0.304258);
    pointTemp[12]=Eigen::Vector2d(0.0206403,  0.304258);
    pointTemp[13]=Eigen::Vector2d(0.0206403,  0.304258);
    pointTemp[14]=Eigen::Vector2d(0.0282979,  0.302034);

    for (auto & point:pointTemp)
    {
      std::cout<<point.x()<<", "<< point.y()<<std::endl;
    }

    removeDuplicates(pointTemp);
    std::cerr<<green<<"Cleaned vector: "<<reset<<std::endl;

    for (auto & point:pointTemp)
    {
      std::cout<<point.x()<<", "<< point.y()<<std::endl;
    }

    return 0;
}

We can compile the example with g++ -std=c++11 -I /usr/include vectorHash.cpp -o vectorHash.

If we use an exact equal or an exact hash, then unfortunately we cannot catch similar points.
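
A caveat with tolerance-based equality is that std::unordered_set expects keys that compare equal to hash equally, and "within a threshold" is not transitive. One alternative, sketched here under stated assumptions and not taken from the answer, is to snap each point to a grid cell and compare the integer cell indices exactly. Point2d is a stand-in for Eigen::Vector2d, and Cell, CellHash, to_cell and remove_near_duplicates are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <vector>

// Stand-in for Eigen::Vector2d so the sketch compiles without Eigen.
struct Point2d
{
    double x;
    double y;
};

// Integer grid cell that a point falls into.
struct Cell
{
    std::int64_t ix;
    std::int64_t iy;
    bool operator==(const Cell& other) const
    {
        return ix == other.ix && iy == other.iy;
    }
};

struct CellHash
{
    std::size_t operator()(const Cell& c) const
    {
        const std::size_t hx = std::hash<std::int64_t>()(c.ix);
        const std::size_t hy = std::hash<std::int64_t>()(c.iy);
        return hx ^ (hy + 0x9e3779b9u + (hx << 6) + (hx >> 2));
    }
};

inline Cell to_cell(const Point2d& pt, double cell_size)
{
    return Cell{static_cast<std::int64_t>(std::floor(pt.x / cell_size)),
                static_cast<std::int64_t>(std::floor(pt.y / cell_size))};
}

// Keeps the first point seen in each grid cell, in the original order.
inline void remove_near_duplicates(std::vector<Point2d>& vec, double cell_size)
{
    std::unordered_set<Cell, CellHash> seen;
    std::size_t write = 0;
    for (std::size_t read = 0; read < vec.size(); ++read)
    {
        if (seen.insert(to_cell(vec[read], cell_size)).second)
        {
            vec[write++] = vec[read];
        }
    }
    vec.resize(write);
}
```

The trade-off is that two points closer than cell_size but on opposite sides of a cell boundary land in different cells and are both kept, so this is a coarse grouping and not a true distance test.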

【Discussion】: