Sorting a pair of vectors: answers

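【Question title】: Sorting a pair of vectors
【Posted】: 2015-01-18 20:57:55
【Question description】:

I know how to sort a vector of pairs, but how does one sort a pair of vectors? I can think of writing a custom "virtual" iterator over the pair of vectors and sorting that, but it seems quite complex. Is there a simpler way? Is there one in C++03? I would like to use std::sort.

The problem comes up when processing hardware-generated data, where a pair of arrays makes more sense than an array of pairs (the latter would run into all sorts of stride and alignment issues). I realize that keeping a pair of vectors instead of a vector of pairs would otherwise be a design flaw (the structure-of-arrays problem). I am looking for a fast solution: copying the data into a vector of pairs and back (I will hand it back to the hardware for more processing) is not an option.

Example: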

keys   = {5, 2, 3, 1, 4}
values = {a, b, d, e, c}

After sorting (by the first vector):

keys   = {1, 2, 3, 4, 5}
values = {e, b, d, c, a}

By "pair of vectors" I mean the pair of keys and values (stored e.g. as std::pair<std::vector<size_t>, std::vector<double> >). The vectors have the same length.
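For reference, the ordering being asked for is the one produced by the copy-based approach that the question rules out for memory reasons: zip into a vector of pairs, sort, split again. A minimal C++03-style sketch of that baseline follows; the name sort_by_copy is made up for illustration and is not part of the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical baseline helper (not from the question): sorts both vectors
// through a temporary vector of pairs, so it needs O(n) extra memory.
template <class K, class V>
void sort_by_copy(std::vector<K> &keys, std::vector<V> &values)
{
    assert(keys.size() == values.size());

    // zip the two vectors into one vector of pairs
    std::vector<std::pair<K, V> > zipped;
    zipped.reserve(keys.size());
    for (std::size_t i = 0; i < keys.size(); ++i)
        zipped.push_back(std::make_pair(keys[i], values[i]));

    // default pair comparison: by key first, then by value
    std::sort(zipped.begin(), zipped.end());

    // split the sorted pairs back into the original vectors
    for (std::size_t i = 0; i < zipped.size(); ++i) {
        keys[i] = zipped[i].first;
        values[i] = zipped[i].second;
    }
}
```

Any in-place solution should give the same keys and values as this function does, only without the temporary copy.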

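【Question discussion】:

@Daniel I think it means sorting both vectors according to the values of one of them, so that the pairing between them stays the same.
@caps No, not independently. The output should be the same as making a vector of std::pair, sorting it, and splitting it again. Note that whether it is sorted by the first vector, the second, or both is probably not important, since that is the job of the comparison operator.
@FredLarson Both of those would work, and are what I usually end up using, but you would think there would be a more elegant solution that does not require allocating another full set of integers... a writable zip iterator would do the trick, but I think you would have to roll your own.
I don't think there is an option that uses std::sort. If you insist on sorting the values in place, I think you will have to implement your own sorting algorithm. std::sort lets you plug in the comparison, but not the swap operation.
@IdeaHat If std::sort exposed a functor for swap as well as for compare, it would be trivial. Unfortunately, it doesn't. I wonder how hard it would be to copy a library implementation of std::sort and make that one change?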
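To illustrate the "write your own sort with a custom swap" idea from the comments above, here is a rough sketch of an in-place heapsort that swaps both vectors together. It is not std::sort and not taken from any library implementation; the namespace sketch and all function names are assumptions made for this example. It uses no extra memory and compiles as C++03, but heapsort is not stable and is usually slower than a tuned std::sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch { // hypothetical namespace, for illustration only

// Swap elements i and j in both vectors so that the pairing is kept.
template <class K, class V>
void swap_both(std::vector<K> &keys, std::vector<V> &values,
    std::size_t i, std::size_t j)
{
    std::swap(keys[i], keys[j]);
    std::swap(values[i], values[j]);
}

// Restore the max-heap property below `root`, looking only at the
// first `size` elements.
template <class K, class V>
void sift_down(std::vector<K> &keys, std::vector<V> &values,
    std::size_t root, std::size_t size)
{
    for (;;) {
        std::size_t child = 2 * root + 1;
        if (child >= size)
            return; // root is a leaf
        if (child + 1 < size && keys[child] < keys[child + 1])
            ++child; // pick the larger of the two children
        if (!(keys[root] < keys[child]))
            return; // heap property already holds
        swap_both(keys, values, root, child);
        root = child;
    }
}

// In-place heapsort of both vectors, ordered by keys (ascending).
template <class K, class V>
void sort_pair_of_vectors(std::vector<K> &keys, std::vector<V> &values)
{
    assert(keys.size() == values.size());
    const std::size_t n = keys.size();
    for (std::size_t i = n / 2; i > 0; --i) // build the heap
        sift_down(keys, values, i - 1, n);
    for (std::size_t heap_size = n; heap_size > 1; --heap_size) {
        swap_both(keys, values, 0, heap_size - 1); // move the maximum to the end
        sift_down(keys, values, 0, heap_size - 1);
    }
}

} // ~sketch
```

Calling sketch::sort_pair_of_vectors(keys, values) on the example from the question reorders both vectors in place; only keys are compared, values just follow along.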

Tags: c++ sorting vector

【Solution 1】:

Let's create a sorting/permutation iterator, so that we can just say:

int  keys[] = {   5,   2,   3,   1,   4 };
char vals[] = { 'a', 'b', 'd', 'e', 'c' };

std::sort(make_dual_iter(begin(keys), begin(vals)), 
          make_dual_iter(end(keys), end(vals)));

// output
std::copy(begin(keys), end(keys), std::ostream_iterator<int> (std::cout << "\nKeys:\t",   "\t"));
std::copy(begin(vals), end(vals), std::ostream_iterator<char>(std::cout << "\nValues:\t", "\t"));

See it Live On Coliru, printing

Keys:   1   2   3   4   5   
Values: e   b   d   c   a

Based on the ideas here, I implemented this:

namespace detail {
    template <class KI, class VI> struct helper { 
        using value_type = boost::tuple<typename std::iterator_traits<KI>::value_type, typename std::iterator_traits<VI>::value_type>;
        using ref_type   = boost::tuple<typename std::iterator_traits<KI>::reference,  typename std::iterator_traits<VI>::reference>; 

        using difference_type = typename std::iterator_traits<KI>::difference_type;
    };
}

template <typename KI, typename VI, typename H = typename detail::helper<KI, VI> > 
class dual_iter : public boost::iterator_facade<dual_iter<KI, VI>, // CRTP
    typename H::value_type, std::random_access_iterator_tag, typename H::ref_type, typename H::difference_type> 
{ 
public: 
    dual_iter() = default;
    dual_iter(KI ki, VI vi) : _ki(ki), _vi(vi) { } 

    KI _ki; 
    VI _vi; 

private: 
    friend class boost::iterator_core_access; 

    void increment() { ++_ki; ++_vi; } 
    void decrement() { --_ki; --_vi; } 

    bool equal(dual_iter const& other) const { return (_ki == other._ki); } 

    typename detail::helper<KI, VI>::ref_type dereference() const { 
        return (typename detail::helper<KI, VI>::ref_type(*_ki, *_vi)); 
    } 

    void advance(typename H::difference_type n) { _ki += n; _vi += n; } 
    typename H::difference_type distance_to(dual_iter const& other) const { return ( other._ki - _ki); } 
};

Now the factory function is simply:

template <class KI, class VI> 
    dual_iter<KI, VI> make_dual_iter(KI ki, VI vi) { return {ki, vi}; }

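Note that I have been a bit lazy by relying on boost/tuple/tuple_comparison.hpp for the ordering: it compares the key first and then the value. This could pose a problem for a stable sort when multiple keys share the same value. However, in that case it is hard to define what a "stable" sort would mean anyway, so I don't think it matters for now.

Full listing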

Live On Coliru

#include <boost/iterator/iterator_adaptor.hpp>
#include <boost/tuple/tuple_comparison.hpp>

namespace boost { namespace tuples {

    // MSVC might not require this
    template <typename T, typename U>
    inline void swap(boost::tuple<T&, U&> a, boost::tuple<T&, U&> b) noexcept {
        using std::swap;
        swap(boost::get<0>(a), boost::get<0>(b));
        swap(boost::get<1>(a), boost::get<1>(b));
    }

} }

namespace detail {
    template <class KI, class VI> struct helper { 
        using value_type = boost::tuple<typename std::iterator_traits<KI>::value_type, typename std::iterator_traits<VI>::value_type>;
        using ref_type   = boost::tuple<typename std::iterator_traits<KI>::reference,  typename std::iterator_traits<VI>::reference>; 

        using difference_type = typename std::iterator_traits<KI>::difference_type;
    };
}

template <typename KI, typename VI, typename H = typename detail::helper<KI, VI> > 
class dual_iter : public boost::iterator_facade<dual_iter<KI, VI>, // CRTP
    typename H::value_type, std::random_access_iterator_tag, typename H::ref_type, typename H::difference_type> 
{ 
public: 
    dual_iter() = default;
    dual_iter(KI ki, VI vi) : _ki(ki), _vi(vi) { } 

    KI _ki; 
    VI _vi; 

private: 
    friend class boost::iterator_core_access; 

    void increment() { ++_ki; ++_vi; } 
    void decrement() { --_ki; --_vi; } 

    bool equal(dual_iter const& other) const { return (_ki == other._ki); } 

    typename detail::helper<KI, VI>::ref_type dereference() const { 
        return (typename detail::helper<KI, VI>::ref_type(*_ki, *_vi)); 
    } 

    void advance(typename H::difference_type n) { _ki += n; _vi += n; } 
    typename H::difference_type distance_to(dual_iter const& other) const { return ( other._ki - _ki); } 
}; 

template <class KI, class VI> 
    dual_iter<KI, VI> make_dual_iter(KI ki, VI vi) { return {ki, vi}; }

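#include <algorithm> // std::sort, std::copy
#include <iterator>  // std::begin, std::end, std::ostream_iterator
#include <iostream>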
using std::begin;
using std::end;

int main()
{
    int  keys[] = {   5,   2,   3,   1,   4 };
    char vals[] = { 'a', 'b', 'd', 'e', 'c' };

    std::sort(make_dual_iter(begin(keys), begin(vals)), 
              make_dual_iter(end(keys), end(vals)));

    std::copy(begin(keys), end(keys), std::ostream_iterator<int> (std::cout << "\nKeys:\t",   "\t"));
    std::copy(begin(vals), end(vals), std::ostream_iterator<char>(std::cout << "\nValues:\t", "\t"));
}

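【Discussion】:

That is quite concise with Boost. Let me see if I got it: boost::iterator_facade implements all the iterator functions for you, based only on increment, decrement, equal, dereference, advance and distance_to? If so, that is very useful, judging by the length of my Boost-free dual iterator. The comparison is not a problem, I can easily write my own comparison predicate for std::sort.
I would say "indeed" on all accounts :)
Dare to write a variadic template version of dual_iter?
@dalle sure: coliru.stacked-crooked.com/a/95de0569a933fde5 (I used c++14 so I did not have to write [make_]index_sequence<>. I think there are plenty of opportunities to simplify further with c++14 if you really want to)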

【Solution 2】:

Just for comparison, this is how much code the split-iterator approach requires:

template <class V0, class V1>
class CRefPair { // overrides copy semantics of std::pair
protected:
    V0 &m_v0;
    V1 &m_v1;

public:
    CRefPair(V0 &v0, V1 &v1)
        :m_v0(v0), m_v1(v1)
    {}

    void swap(CRefPair &other)
    {
        std::swap(m_v0, other.m_v0);
        std::swap(m_v1, other.m_v1);
    }

    operator std::pair<V0, V1>() const // both g++ and msvc sort requires this (to get a pivot)
    {
        return std::pair<V0, V1>(m_v0, m_v1);
    }

    CRefPair &operator =(std::pair<V0, V1> v) // both g++ and msvc sort requires this (for insertion sort)
    {
        m_v0 = v.first;
        m_v1 = v.second;
        return *this;
    }

    CRefPair &operator =(const CRefPair &other) // required by g++ (for _GLIBCXX_MOVE)
    {
        m_v0 = other.m_v0;
        m_v1 = other.m_v1;
        return *this;
    }
};

template <class V0, class V1>
inline bool operator <(std::pair<V0, V1> a, CRefPair<V0, V1> b) // required by both g++ and msvc
{
    return a < std::pair<V0, V1>(b); // default pairwise lexicographical comparison
}

template <class V0, class V1>
inline bool operator <(CRefPair<V0, V1> a, std::pair<V0, V1> b) // required by both g++ and msvc
{
    return std::pair<V0, V1>(a) < b; // default pairwise lexicographical comparison
}

template <class V0, class V1>
inline bool operator <(CRefPair<V0, V1> a, CRefPair<V0, V1> b) // required by both g++ and msvc
{
    return std::pair<V0, V1>(a) < std::pair<V0, V1>(b); // default pairwise lexicographical comparison
}

namespace std {

template <class V0, class V1>
inline void swap(CRefPair<V0, V1> &a, CRefPair<V0, V1> &b)
{
    a.swap(b);
}

} // ~std

template <class It0, class It1>
class CPairIterator : public std::random_access_iterator_tag {
public:
    typedef typename std::iterator_traits<It0>::value_type value_type0;
    typedef typename std::iterator_traits<It1>::value_type value_type1;
    typedef std::pair<value_type0, value_type1> value_type;
    typedef typename std::iterator_traits<It0>::difference_type difference_type;
    typedef /*typename std::iterator_traits<It0>::distance_type*/difference_type distance_type; // no distance_type in g++, only in msvc
    typedef typename std::iterator_traits<It0>::iterator_category iterator_category;
    typedef CRefPair<value_type0, value_type1> reference;
    typedef reference *pointer; // not so sure about this, probably can't be implemented in a meaningful way, won't be able to overload ->
    // keep the iterator traits happy

protected:
    It0 m_it0;
    It1 m_it1;

public:
    CPairIterator(const CPairIterator &r_other)
        :m_it0(r_other.m_it0), m_it1(r_other.m_it1)
    {}

    CPairIterator(It0 it0 = It0(), It1 it1 = It1())
        :m_it0(it0), m_it1(it1)
    {}

    reference operator *()
    {
        return reference(*m_it0, *m_it1);
    }

    value_type operator *() const
    {
        return value_type(*m_it0, *m_it1);
    }

    difference_type operator -(const CPairIterator &other) const
    {
        assert(m_it0 - other.m_it0 == m_it1 - other.m_it1);
        // the iterators always need to have the same position
        // (incomplete check but the best we can do without having also begin / end in either vector)

        return m_it0 - other.m_it0;
    }

    bool operator ==(const CPairIterator &other) const
    {
        assert(m_it0 - other.m_it0 == m_it1 - other.m_it1);
        return m_it0 == other.m_it0;
    }

    bool operator !=(const CPairIterator &other) const
    {
        return !(*this == other);
    }

    bool operator <(const CPairIterator &other) const
    {
        assert(m_it0 - other.m_it0 == m_it1 - other.m_it1);
        return m_it0 < other.m_it0;
    }

    bool operator >=(const CPairIterator &other) const
    {
        return !(*this < other);
    }

    bool operator <=(const CPairIterator &other) const
    {
        return !(other < *this);
    }

    bool operator >(const CPairIterator &other) const
    {
        return other < *this;
    }

    CPairIterator operator +(distance_type d) const
    {
        return CPairIterator(m_it0 + d, m_it1 + d);
    }

    CPairIterator operator -(distance_type d) const
    {
        return *this + -d;
    }

    CPairIterator &operator +=(distance_type d)
    {
        return *this = *this + d;
    }

    CPairIterator &operator -=(distance_type d)
    {
        return *this = *this + -d;
    }

    CPairIterator &operator ++()
    {
        return *this += 1;
    }

    CPairIterator &operator --()
    {
        return *this += -1;
    }

    CPairIterator operator ++(int) // msvc sort actually needs this, g++ does not
    {
        CPairIterator old = *this;
        ++ (*this);
        return old;
    }

    CPairIterator operator --(int)
    {
        CPairIterator old = *this;
        -- (*this);
        return old;
    }
};

template <class It0, class It1>
inline CPairIterator<It0, It1> make_pair_iterator(It0 it0, It1 it1)
{
    return CPairIterator<It0, It1>(it0, it1);
}

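It is a bit rough around the edges, and maybe I am just bad at overloading the comparisons, but the number of differences needed to support the different implementations of std::sort makes me think that the hacky solution might actually be more portable. The sorting itself is much nicer, though: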

struct CompareByFirst {
    bool operator ()(std::pair<size_t, char> a, std::pair<size_t, char> b) const
    {
        return a.first < b.first;
    }
};

std::vector<char> vv; // filled by values
std::vector<size_t> kv; // filled by keys

std::sort(make_pair_iterator(kv.begin(), vv.begin()),
    make_pair_iterator(kv.end(), vv.end()), CompareByFirst());
// nice

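And of course it gives the correct result.

【Discussion】:

【Solution 3】:

Inspired by a comment from Mark Ransom, this is a horrible hack and an example of how not to do it. I wrote it only for amusement, because I was wondering how complicated it would get. This is not an answer to my question and I will not use it. I just wanted to share a bizarre idea. Please do not downvote.

Actually, ignoring multithreading, I believe this can be done: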

template <class KeyType, class ValueVectorType>
struct MyKeyWrapper { // all is public to save getters
    KeyType k;
    bool operator <(const MyKeyWrapper &other) const { return k < other.k; }
};

template <class KeyType, class ValueVectorType>
struct ValueVectorSingleton { // all is public to save getters, but kv and vv should be only accessible by getters
    static std::vector<MyKeyWrapper<KeyType, ValueVectorType> > *kv;
    static ValueVectorType *vv;

    static void StartSort(std::vector<MyKeyWrapper<KeyType, ValueVectorType> > &_kv, ValueVectorType &_vv)
    {
        assert(!kv && !vv); // can't sort two at once (if multithreading)
        assert(_kv.size() == _vv.size());
        kv = &_kv, vv = &_vv; // not an attempt of an atomic operation
    }

    static void EndSort()
    {
        kv = 0, vv = 0; // not an attempt of an atomic operation
    }
};

template <class KeyType, class ValueVectorType>
std::vector<MyKeyWrapper<KeyType, ValueVectorType> >
    *ValueVectorSingleton<KeyType, ValueVectorType>::kv = 0;
template <class KeyType, class ValueVectorType>
ValueVectorType *ValueVectorSingleton<KeyType, ValueVectorType>::vv = 0;

namespace std {

template <class KeyType, class ValueVectorType>
void swap(MyKeyWrapper<KeyType, ValueVectorType> &a,
    MyKeyWrapper<KeyType, ValueVectorType> &b)
{
    assert((ValueVectorSingleton<KeyType, ValueVectorType>::vv &&
        ValueVectorSingleton<KeyType, ValueVectorType>::kv)); // if this triggers, someone forgot to call StartSort()
    ValueVectorType &vv = *ValueVectorSingleton<KeyType, ValueVectorType>::vv;
    std::vector<MyKeyWrapper<KeyType, ValueVectorType> > &kv =
        *ValueVectorSingleton<KeyType, ValueVectorType>::kv;
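    size_t ai = &a - &kv.front(), bi = &b - &kv.front(); // get indices in key vector (element minus base, so the result is non-negative)
    std::swap(a.k, b.k); // swap keys (std::swap(a, b) would select this very overload again and recurse)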
    std::swap(vv[ai], vv[bi]); // and any associated values
}

} // ~std

Sorting is then done as:

std::vector<char> vv; // filled by values
std::vector<MyKeyWrapper<size_t, std::vector<char> > > kv; // filled by keys, casted to MyKeyWrapper

ValueVectorSingleton<size_t, std::vector<char> >::StartSort(kv, vv);
std::sort(kv.begin(), kv.end());
ValueVectorSingleton<size_t, std::vector<char> >::EndSort();
// trick std::sort into using the custom std::swap which also swaps the other vectors

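This is obviously appalling and trivial to abuse in horrible ways, but it is arguably much shorter than the pair of iterators and probably similar in performance. And it actually works.

Note that swap() could be implemented inside ValueVectorSingleton, and the one injected into the std namespace would just call it. That would avoid having to make vv and kv public. The addresses of a and b could also be checked further to make sure that they lie inside kv and not in some other vector. Also, this is limited to sorting by the values of only one vector (it cannot sort by the corresponding values in both vectors at the same time). And the template parameters could simply be KeyType and ValueType; this was written in a hurry.

【Discussion】:

Thanks. I guess I am glad that you are horrified (much like a horror movie maker would be glad to have pleased the audience).

【Solution 4】:

Here is a solution I once used to sort an array together with an array of indices (maybe it came from somewhere around here?):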

template <class iterator>
class IndexComparison
{
public:
    IndexComparison (iterator const& _begin, iterator const& _end) :
      begin (_begin),
      end (_end)
    {}

    bool operator()(size_t a, size_t b) const
    {
        return *std::next(begin,a) < *std::next(begin,b);
    }

private:
    const iterator begin;
    const iterator end;
};

Usage:

std::vector<int> values{5,2,5,1,9};
std::vector<size_t> indices(values.size());
std::iota(indices.begin(),indices.end(),0);

std::sort(indices.begin(),indices.end()
        , IndexComparison<decltype(values.cbegin())>(values.cbegin(),values.cend()));

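Afterwards, the integers in the vector indices are permuted such that they correspond to increasing values in the vector values. It is easy to extend this from the less-than comparison to general comparison functions.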
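The extension to a general comparison function could look roughly like the sketch below. IndexComparisonBy and sorted_indices are names invented for this example (they are not part of the answer), and the class assumes random-access iterators and a predicate that is callable on a const object.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical generalisation of IndexComparison: compares two indices by
// applying an arbitrary predicate to the elements they refer to.
template <class Iterator, class Compare>
class IndexComparisonBy
{
public:
    IndexComparisonBy(Iterator begin, Compare comp)
        : m_begin(begin), m_comp(comp)
    {}

    bool operator()(std::size_t a, std::size_t b) const
    {
        return m_comp(m_begin[a], m_begin[b]); // needs random-access iterators
    }

private:
    Iterator m_begin;
    Compare m_comp;
};

// Returns the permutation that puts `values` into the order given by `comp`;
// `values` itself is left untouched.
template <class T, class Compare>
std::vector<std::size_t> sorted_indices(const std::vector<T> &values, Compare comp)
{
    typedef typename std::vector<T>::const_iterator It;

    std::vector<std::size_t> indices(values.size());
    std::iota(indices.begin(), indices.end(), std::size_t(0)); // 0, 1, 2, ...
    std::sort(indices.begin(), indices.end(),
        IndexComparisonBy<It, Compare>(values.begin(), comp));
    return indices;
}
```

For example, sorted_indices(values, std::greater<int>()) would give the indices in order of decreasing values. The relative order of indices that refer to equal values is unspecified, since std::sort is not stable.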

Next, in order to also sort the values, you can do another

std::sort(values.begin(),values.end());

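using the same comparison function. This is the solution for the lazy ones. Of course, you can alternatively use the sorted indices, according to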

auto temp=values;
for(size_t i=0;i<indices.size();++i)
{
     values[i]=temp[indices[i]];
}

DEMO

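EDIT: I just realized that the above sorts in the opposite direction to the one you were asking for.

【Discussion】:

Nice. Note though that your demo gives a different result than the example above, but I guess you are sorting by the other vector (which is fine, it proves that it works).
No need to change it, the comparison operator can be changed at any time. It demonstrates the concept well. A similar approach was also mentioned in the comments. The problem is that this requires a temporary array to rearrange the second vector. In my case memory is short: this is actually for an out-of-core algorithm and I want to be able to process as much data in RAM at a time as possible.
No copy is needed: just use the indices you found on the fly, i.e. use values[indices[i]]. Besides, you now have another trivial solution. Note that chapter 8.4 of Numerical Recipes contains some information on ranking and indexing, and the third edition also contains a C++ class that can be used.
Going to sleep, it is already 1 AM here in Europe. Since I do not want to accept either of my own answers, I will accept yours if there is no better one tomorrow. Thanks for the effort.
After sorting, I need to send the data back to the HW, and for that it needs to be in its final order. There is an algorithm that converts a permutation into one that can be applied in place, but that is an additional O(n) or more, I don't remember exactly.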
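One common way of applying such a permutation in place is to follow its cycles. The sketch below is not necessarily the algorithm the last comment has in mind, and the name apply_permutation_in_place is made up for this example. It produces the same result as values[i] = temp[indices[i]] without the temporary copy, in O(n) swaps, but it overwrites indices with the identity permutation as it goes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: rearranges `values` so that afterwards
// values[i] == old_values[indices[i]], using O(1) extra memory.
// `indices` must be a permutation of 0..n-1 and is reset to the
// identity permutation as a side effect.
template <class T>
void apply_permutation_in_place(std::vector<T> &values,
    std::vector<std::size_t> &indices)
{
    assert(values.size() == indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i) {
        std::size_t current = i;
        // walk the cycle that starts at i; every step puts one value into
        // its final position and marks that position as finished
        while (indices[current] != i) {
            const std::size_t next = indices[current];
            std::swap(values[current], values[next]);
            indices[current] = current; // finished
            current = next;
        }
        indices[current] = current; // closes the cycle
    }
}
```

If several arrays have to be reordered by the same permutation (for instance both the keys and the values), either swap all of them inside the same loop or pass a copy of the index vector for all but the last call, since the indices are consumed.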