【Question title】: Sort by proxy (or: sort one container by the contents of another) in C++
【Posted】: 2010-08-03 16:57:44
【Question】:

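I have a set of data that is split into two arrays (let's call them data and keys). That is, for any given item with index i, I can access that item's data with data[i] and that item's key with keys[i]. I cannot change this structure (e.g., by interleaving keys and data into a single array), because I need to pass the data array to a library function that expects a particular data layout.

How can I sort both arrays according to the contents of the keys array (preferably using standard library functions)?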

【Comments】:

    Tags: c++ sorting stl


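    【Solution 1】:

    Create a vector of objects that hold indices into the two arrays. Define operator< for that object so that it compares based on keys[index]. Sort that vector. When you're done, walk through the vector and put the original objects into the order defined by those proxy objects: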

    // warning: untested code.
    // Assumes keys and data are the two parallel arrays (visible at this scope)
    // and key_size is the number of keys/data items.
    struct sort_proxy {
        size_t i;
    
        bool operator<(sort_proxy const &other) const {
            return keys[i] < keys[other.i];
        }
    };
    
    std::vector<sort_proxy> proxies(key_size);
    
    for (size_t i = 0; i < key_size; i++)
        proxies[i].i = i;
    
    std::sort(proxies.begin(), proxies.end());
    
    // in-place reordering left as an exercise for the reader. :-)
    // result_keys and result_data are separate output arrays of the same size.
    for (size_t i = 0; i < proxies.size(); i++) {
        result_keys[i] = keys[proxies[i].i];
        result_data[i] = data[proxies[i].i];
    }
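The in-place reordering that the answer leaves as an exercise can be done by following the cycles of the permutation. The sketch below is one possible approach, not part of the original answer. It assumes the arrays are std::vector, and `apply_permutation_in_place` and `order` are hypothetical names, where `order[i]` holds the index of the element that belongs at position i (that is, `proxies[i].i`).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: rearranges keys and data in place so that position i
// ends up holding the element that was originally at index order[i].
// order is taken by value because it is overwritten while cycles are followed.
template <class K, class D>
void apply_permutation_in_place(std::vector<K>& keys, std::vector<D>& data,
                                std::vector<std::size_t> order) {
    assert(order.size() == keys.size() && order.size() == data.size());
    using std::swap;
    for (std::size_t start = 0; start < order.size(); ++start) {
        std::size_t current = start;
        // Walk one cycle; each swap puts the correct element at `current`
        // and carries the displaced element along to the next position.
        while (order[current] != start) {
            const std::size_t next = order[current];
            swap(keys[current], keys[next]);
            swap(data[current], data[next]);
            order[current] = current;  // mark this position as done
            current = next;
        }
        order[current] = current;  // close the cycle
    }
}
```

This still needs the index vector itself, but it avoids allocating result_keys and result_data; each element is moved by at most a few swaps.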
    

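    【Discussion】:

      【Solution 2】:

      Here is a sample implementation that defines a new iterator type to provide a paired view over two sequences. I have tried to make it standards-compliant and correct, but since the details of the C++ standard are hideously complex, I am almost certain I have failed. I will say that this code appears to work when built with clang++ or g++.

      This code is not recommended for general use, since it is longer and harder to understand than the other answers, and it possibly invokes the dreaded "undefined behavior".

      However, it does have the advantage of constant time and space overhead, since it provides a view of the existing data rather than actually building a temporary alternative representation or a permutation vector. The most obvious (to me) performance problem with this code is that individual elements of the two containers have to be copied during the swap operation. Despite several attempts, I have not found a way to specialize std::swap such that std::sort or std::random_shuffle avoids the default temporary-copy-based swap implementation. It is possible that the C++0x rvalue reference system (see std::move and Jon Purdy's answer) could solve this.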

      #ifndef HDR_PAIRED_ITERATOR
      #define HDR_PAIRED_ITERATOR
      
      #include <iterator>
      #include <utility>  // std::pair, std::make_pair
      
      /// pair_view mostly looks like a std::pair,
      /// and can decay to a std::pair, but is really a pair of references
      template <typename ItA, typename ItB>
      struct pair_view {
          typedef typename ItA::value_type first_type;
          typedef typename ItB::value_type second_type;
      
          typedef std::pair<first_type, second_type> pair_type;
      
          pair_view() {}
          pair_view(const ItA &a, const ItB &b):
              first(*a), second(*b) {}
      
          pair_view &operator=(const pair_view &x)
              { first = x.first; second = x.second; return *this; }
          pair_view &operator=(const pair_type &x)
              { first = x.first; second = x.second; return *this; }
      
          typename ItA::reference first;
          typename ItB::reference second;
          operator pair_type() const
              { return std::make_pair(first, second); }
      
          friend bool operator==(const pair_view &a, const pair_view &b)
              { return (a.first == b.first) && (a.second == b.second); }
          friend bool operator<(const pair_view &a, const pair_view &b)
              { return (a.first < b.first) || ((a.first == b.first) && (a.second < b.second)); }
          friend bool operator!=(const pair_view &a, const pair_view &b)
              { return !(a == b); }
          friend bool operator>(const pair_view &a, const pair_view &b)
              { return (b < a); }
          friend bool operator<=(const pair_view &a, const pair_view &b)
              { return !(b < a); }
          friend bool operator>=(const pair_view &a, const pair_view &b)
              { return !(a < b); }
      
          friend bool operator==(const pair_view &a, const pair_type &b)
              { return (a.first == b.first) && (a.second == b.second); }
          friend bool operator<(const pair_view &a, const pair_type &b)
              { return (a.first < b.first) || ((a.first == b.first) && (a.second < b.second)); }
          friend bool operator!=(const pair_view &a, const pair_type &b)
              { return !(a == b); }
          friend bool operator>(const pair_view &a, const pair_type &b)
              { return (b < a); }
          friend bool operator<=(const pair_view &a, const pair_type &b)
              { return !(b < a); }
          friend bool operator>=(const pair_view &a, const pair_type &b)
              { return !(a < b); }
      
          friend bool operator==(const pair_type &a, const pair_type &b)
              { return (a.first == b.first) && (a.second == b.second); }
          friend bool operator<(const pair_type &a, const pair_type &b)
              { return (a.first < b.first) || ((a.first == b.first) && (a.second < b.second)); }
          friend bool operator!=(const pair_type &a, const pair_type &b)
              { return !(a == b); }
          friend bool operator>(const pair_type &a, const pair_type &b)
              { return (b < a); }
          friend bool operator<=(const pair_type &a, const pair_type &b)
              { return !(b < a); }
          friend bool operator>=(const pair_type &a, const pair_type &b)
              { return !(a < b); }
      };
      
      template <typename ItA, typename ItB>
      struct paired_iterator {
          // --- standard iterator traits
          typedef typename pair_view<ItA, ItB>::pair_type value_type;
          typedef pair_view<ItA, ItB> reference;
          typedef paired_iterator<ItA,ItB> pointer;
      
          typedef typename std::iterator_traits<ItA>::difference_type difference_type;
          typedef std::random_access_iterator_tag iterator_category;
      
          // --- methods not required by the Random Access Iterator concept
          paired_iterator(const ItA &a, const ItB &b):
              a(a), b(b) {}
      
          // --- iterator requirements
      
          // default construction
          paired_iterator() {}
      
          // copy construction and assignment
          paired_iterator(const paired_iterator &x):
              a(x.a), b(x.b) {}
          paired_iterator &operator=(const paired_iterator &x)
              { a = x.a; b = x.b; return *this; }
      
          // pre- and post-increment
          paired_iterator &operator++()
              { ++a; ++b; return *this; }
          paired_iterator operator++(int)
              { paired_iterator tmp(*this); ++(*this); return tmp; }
      
          // pre- and post-decrement
          paired_iterator &operator--()
              { --a; --b; return *this; }
          paired_iterator operator--(int)
              { paired_iterator tmp(*this); --(*this); return tmp; }
      
          // arithmetic
          paired_iterator &operator+=(const difference_type &n)
              { a += n; b += n; return *this; }
          friend paired_iterator operator+(const paired_iterator &x, const difference_type &n)
              { return paired_iterator(x.a+n, x.b+n); }
          friend paired_iterator operator+(const difference_type &n, const paired_iterator &x)
              { return paired_iterator(x.a+n, x.b+n); }
          paired_iterator &operator-=(const difference_type &n)
              { a -= n; b -= n; return *this; }
          friend paired_iterator operator-(const paired_iterator &x, const difference_type &n)
              { return paired_iterator(x.a-n, x.b-n); }
          friend difference_type operator-(const paired_iterator &x, const paired_iterator &y)
              { return (x.a - y.a); }
      
          // (in-)equality and ordering
          friend bool operator==(const paired_iterator &x, const paired_iterator &y)
              { return (x.a == y.a) && (x.b == y.b); }
          friend bool operator<(const paired_iterator &x, const paired_iterator &y)
              { return (x.a < y.a); }
      
          // derived (in-)equality and ordering operators
          friend bool operator!=(const paired_iterator &x, const paired_iterator &y)
              { return !(x == y); }
          friend bool operator>(const paired_iterator &x, const paired_iterator &y)
              { return (y < x); }
          friend bool operator<=(const paired_iterator &x, const paired_iterator &y)
              { return !(y < x); }
          friend bool operator>=(const paired_iterator &x, const paired_iterator &y)
              { return !(x < y); }
      
          // dereferencing and random access
          reference operator*() const
              { return reference(a,b); }
          reference operator[](const difference_type &n) const
              { return reference(a+n, b+n); }
      private:
          ItA a;
          ItB b;
      };
      
      template <typename ItA, typename ItB>
      paired_iterator<ItA, ItB> make_paired_iterator(const ItA &a, const ItB &b)
      { return paired_iterator<ItA, ItB>(a, b); }
      
      #endif
      
      
      #include <vector>
      #include <algorithm>
      #include <iostream>
      #include <string>
      
      template <typename ItA, typename ItB>
      void print_kvs(const ItA &k0, const ItB &v0, const ItA &kn, const ItB &vn) {
          ItA k(k0);
          ItB v(v0);
          while (k != kn || v != vn) {
              if (k != kn && v != vn)
                  std::cout << "[" << *k << "] = " << *v << "\n";
              else if (k != kn)
                  std::cout << "[" << *k << "]\n";
              else if (v != vn)
                  std::cout << "[?] = " << *v << "\n";
      
              if (k != kn) ++k;
              if (v != vn) ++v;
          }
          std::cout << std::endl;
      }
      
      int main() {
          std::vector<int> keys;
          std::vector<std::string> data;
      
          keys.push_back(0); data.push_back("zero");
          keys.push_back(1); data.push_back("one");
          keys.push_back(2); data.push_back("two");
          keys.push_back(3); data.push_back("three");
          keys.push_back(4); data.push_back("four");
          keys.push_back(5); data.push_back("five");
          keys.push_back(6); data.push_back("six");
          keys.push_back(7); data.push_back("seven");
          keys.push_back(8); data.push_back("eight");
          keys.push_back(9); data.push_back("nine");
      
          print_kvs(keys.begin(), data.begin(), keys.end(), data.end());
      
          std::cout << "Shuffling\n";
          std::random_shuffle(
              make_paired_iterator(keys.begin(), data.begin()),
              make_paired_iterator(keys.end(), data.end())
          );
      
          print_kvs(keys.begin(), data.begin(), keys.end(), data.end());
      
          std::cout << "Sorting\n";
          std::sort(
              make_paired_iterator(keys.begin(), data.begin()),
              make_paired_iterator(keys.end(), data.end())
          );
      
          print_kvs(keys.begin(), data.begin(), keys.end(), data.end());
      
          std::cout << "Sort descending\n";
          std::sort(
              make_paired_iterator(keys.begin(), data.begin()),
              make_paired_iterator(keys.end(), data.end()),
              std::greater< std::pair<int,std::string> >()
          );
      
          print_kvs(keys.begin(), data.begin(), keys.end(), data.end());
      
          return 0;
      }
      

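      【Discussion】:

        【Solution 3】:

        You can use a std::map (this assumes the keys are unique, since a map stores only one value per key):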

        #include <iostream>
        #include <map>
        #include <string>
        #include <vector>
        using namespace std;
        
        int main() {
          vector<int> keys;
          vector<string> data;
          keys.push_back(5); data.push_back("joe");
          keys.push_back(2); data.push_back("yaochun");
          keys.push_back(1); data.push_back("holio");
        
          // load the keys and data to the map (they will automatically be inserted in sorted order by key)
          map<int, string> sortedVals;
          for(int i = 0; i < (int)keys.size(); ++i) {
            sortedVals[keys[i]] = data[i];
          }
        
          // copy the map values back to vectors  
          int ndx=0;
          for(map<int, string>::iterator it = sortedVals.begin(); it != sortedVals.end(); ++it) {
            keys[ndx] = it->first;
            data[ndx] = it->second;
            ++ndx;
          }
        
          // verify
          for(int i = 0; i < (int)keys.size(); ++i) {
            cout<<keys[i]<<" "<<data[i]<<endl;
          }
        
          return 0;
        }
        

        Here is the output:

        ---------- Capture Output ----------
        > "c:\windows\system32\cmd.exe" /c c:\temp\temp.exe
        1 holio
        2 yaochun
        5 joe
        
        > Terminated with exit code 0.
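Because std::map keeps a single value per key, the loop above would silently drop items whose keys repeat. If duplicate keys are possible, std::multimap is one alternative. The following is a minimal sketch rather than part of the original answer; `sort_by_keys_multimap` is a hypothetical name, and it relies on the C++11 guarantee that equivalent keys in a multimap stay in insertion order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: sorts keys and data together by key, keeping
// every item even when keys repeat.
void sort_by_keys_multimap(std::vector<int>& keys, std::vector<std::string>& data) {
    assert(keys.size() == data.size());

    // Load all (key, data) pairs; the multimap orders them by key.
    std::multimap<int, std::string> sorted;
    for (std::size_t i = 0; i < keys.size(); ++i)
        sorted.insert(std::make_pair(keys[i], data[i]));

    // Copy the pairs back into the two vectors in key order.
    std::size_t ndx = 0;
    for (auto it = sorted.begin(); it != sorted.end(); ++it, ++ndx) {
        keys[ndx] = it->first;
        data[ndx] = it->second;
    }
}
```

Like the map version, this copies every element twice and allocates one node per item, so it trades efficiency for brevity.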
        

        【Discussion】:

          【Solution 4】:

          You can sort using a functor, for example:

          #include <algorithm>
          #include <vector>
          
          template <class T>
          struct IndexFunctor {
            IndexFunctor(const std::vector<T>& v_) : v(v_) {}
            bool operator ()(int a, int b) const {
              return v[a] < v[b];
            }
            const std::vector<T>& v;
          };
          
          template <class K, class D>
          void SortByKeys(std::vector<K>& keys, std::vector<D>& data) {
            // Calculate the valid order inside a permutation array p.
            const int n = static_cast<int>(keys.size());
            std::vector<int> p(n);
            for (int i = 0; i < n; ++i) p[i] = i;
            std::sort(p.begin(), p.end(), IndexFunctor<K>(keys));
          
            // Reorder the keys and data in temporary arrays, it cannot be done in place.
            std::vector<K> aux_keys(n);
            std::vector<D> aux_data(n);
            for (int i = 0; i < n; ++i) {
              aux_keys[i] = keys[p[i]];
              aux_data[i] = data[p[i]];
            }
          
            // Assign the ordered vectors by swapping, it should be faster.
            keys.swap(aux_keys);
            data.swap(aux_data);
          }
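With C++11 or later, the functor can be replaced by a lambda and std::iota. The sketch below follows the same permutation-vector idea; `sorted_order` and `sort_by_keys` are hypothetical names, and std::stable_sort is a substitution (not in the original answer) so that items with equal keys keep their relative order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: returns p such that keys[p[0]], keys[p[1]], ... is sorted.
template <class K>
std::vector<std::size_t> sorted_order(const std::vector<K>& keys) {
    std::vector<std::size_t> p(keys.size());
    std::iota(p.begin(), p.end(), std::size_t{0});  // p = 0, 1, 2, ...
    std::stable_sort(p.begin(), p.end(),
                     [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
    return p;
}

// Hypothetical helper: reorders both vectors according to the sorted keys,
// moving elements into temporaries and then swapping the buffers in.
template <class K, class D>
void sort_by_keys(std::vector<K>& keys, std::vector<D>& data) {
    assert(keys.size() == data.size());
    const std::vector<std::size_t> p = sorted_order(keys);

    std::vector<K> sorted_keys;
    std::vector<D> sorted_data;
    sorted_keys.reserve(p.size());
    sorted_data.reserve(p.size());
    for (std::size_t i : p) {
        sorted_keys.push_back(std::move(keys[i]));
        sorted_data.push_back(std::move(data[i]));
    }
    keys.swap(sorted_keys);
    data.swap(sorted_data);
}
```

Using std::size_t for the indices avoids the int casts in the original, and moving elements instead of copying them helps when the data items are expensive to copy.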
          

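          【Discussion】:

          • Nice to see another TC'er here :).
          • Thanks =). I joined yesterday.
          • Can you explain what you mean by your last sentence? To me, the "obvious" answer that doesn't need any extra storage is to define a new iterator type that provides a single view over the two containers (to give to std::sort). Doing that takes a lot of code, though, so I was hoping someone could come up with a more concise low-overhead solution.
          • @John: Now that I have thought about it more, that would only help with the auxiliary vectors; the solution still needs the p vector.
          【Solution 5】: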

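          This problem really got me thinking. I came up with a solution that makes use of some C++0x features to get a very STL-like parallel_sort algorithm. In order to perform the sort "in place", I had to write a back_remove_iterator as the counterpart of back_insert_iterator, to allow the algorithm to read from and write to the same container. You can skip over those parts and go straight to the interesting stuff.

          I haven't put it through any hardcore testing, but it seems reasonably efficient in both time and space, principally due to the use of std::move() to prevent unnecessary copying.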

          #include <algorithm>
          #include <iostream>
          #include <iterator>
          #include <string>
          #include <utility>
          #include <vector>
          
          
          //
          // An input iterator that removes elements from the back of a container.
          // Provided only because the standard library neglects one.
          //
          template<class Container>
          class back_remove_iterator :
              public std::iterator<std::input_iterator_tag, void, void, void, void> {
          public:
          
          
              back_remove_iterator() : container(0) {}
              explicit back_remove_iterator(Container& c) : container(&c) {}
          
              back_remove_iterator& operator=
                  (typename Container::const_reference value) { return *this; }
          
              typename Container::value_type operator*() {
          
                  typename Container::value_type value(container->back());
                  container->pop_back();
                  return value;
          
              } // operator*()
          
              back_remove_iterator& operator++() { return *this; }
              back_remove_iterator operator++(int) { return *this; }
          
          
              Container* container;
          
          
          }; // class back_remove_iterator
          
          
          //
          // Equivalence operator for back_remove_iterator. An iterator compares equal
          // to the end iterator either if it is default-constructed or if its
          // container is empty.
          //
          template<class Container>
          bool operator==(const back_remove_iterator<Container>& a,
              const back_remove_iterator<Container>& b) {
          
              return !a.container ? !b.container || b.container->empty() :
                  !b.container ? !a.container || a.container->empty() :
                  a.container == b.container;
          
          } // operator==()
          
          
          //
          // Inequivalence operator for back_remove_iterator.
          //
          template<class Container>
          bool operator!=(const back_remove_iterator<Container>& a,
              const back_remove_iterator<Container>& b) {
          
              return !(a == b);
          
          } // operator!=()
          
          
          //
          // A handy way to default-construct a back_remove_iterator.
          //
          template<class Container>
          back_remove_iterator<Container> back_remover() {
          
              return back_remove_iterator<Container>();
          
          } // back_remover()
          
          
          //
          // A handy way to construct a back_remove_iterator.
          //
          template<class Container>
          back_remove_iterator<Container> back_remover(Container& c) {
          
              return back_remove_iterator<Container>(c);
          
          } // back_remover()
          
          
          //
          // A comparison functor that sorts std::pairs by their first element.
          //
          template<class A, class B>
          struct sort_pair_by_first {
          
              bool operator()(const std::pair<A, B>& a, const std::pair<A, B>& b) {
          
                  return a.first < b.first;
          
              } // operator()()
          
          }; // struct sort_pair_by_first
          
          
          //
          // Performs a parallel sort of the ranges [keys_first, keys_last) and
          // [values_first, values_last), preserving the ordering relation between
          // values and keys. Sends key and value output to keys_out and values_out.
          //
          // This works by building a vector of std::pairs, sorting them by the key
          // element, then returning the sorted pairs as two separate sequences. Note
          // the use of std::move() for a vast performance improvement.
          //
          template<class A, class B, class I, class J, class K, class L>
          void parallel_sort(I keys_first, I keys_last, J values_first, J values_last,
                             K keys_out, L values_out) {
          
              typedef std::vector< std::pair<A, B> > Pairs;
              Pairs sorted;
          
              while (keys_first != keys_last)
                  sorted.push_back({std::move(*keys_first++), std::move(*values_first++)});
          
              std::sort(sorted.begin(), sorted.end(), sort_pair_by_first<A, B>());
          
              for (auto i = sorted.begin(); i != sorted.end(); ++i)
                  *keys_out++ = std::move(i->first),
                  *values_out++ = std::move(i->second);
          
          } // parallel_sort()
          
          
          int main(int argc, char** argv) {
          
              //
              // There is an ordering relation between keys and values,
              // but the sets still need to be sorted. Sounds like a job for...
              //
              std::vector<int> keys{0, 3, 1, 2};
              std::vector<std::string> values{"zero", "three", "one", "two"};
          
              //
              // parallel_sort! Unfortunately, the key and value types do need to
              // be specified explicitly. This could be helped with a utility
              // function that accepts back_remove_iterators.
              //
              parallel_sort<int, std::string>
                  (back_remover(keys), back_remover<std::vector<int>>(),
                  back_remover(values), back_remover<std::vector<std::string>>(),
                  std::back_inserter(keys), std::back_inserter(values));
          
              //
              // Just to prove that the mapping is preserved.
              //
              for (unsigned int i = 0; i < keys.size(); ++i)
                  std::cout << keys[i] << ": " << values[i] << '\n';
          
              return 0;
          
          } // main()
          

          I hope this proves useful, or at least interesting.
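Stripped of the iterator machinery, the core of parallel_sort is: move both sequences into a vector of pairs, sort by the first element, then move the results back. The sketch below shows only that core, under the assumption that both sequences are std::vector; `sort_zipped` is a hypothetical name, and std::stable_sort is used here instead of std::sort so that equal keys keep their original order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: sorts keys and values together by key.
template <class K, class V>
void sort_zipped(std::vector<K>& keys, std::vector<V>& values) {
    assert(keys.size() == values.size());

    // Zip: move each (key, value) into a temporary vector of pairs.
    std::vector<std::pair<K, V>> zipped;
    zipped.reserve(keys.size());
    for (std::size_t i = 0; i < keys.size(); ++i)
        zipped.emplace_back(std::move(keys[i]), std::move(values[i]));

    // Sort the pairs by key only.
    std::stable_sort(zipped.begin(), zipped.end(),
                     [](const std::pair<K, V>& a, const std::pair<K, V>& b) {
                         return a.first < b.first;
                     });

    // Unzip: move the sorted elements back into the original vectors.
    for (std::size_t i = 0; i < zipped.size(); ++i) {
        keys[i] = std::move(zipped[i].first);
        values[i] = std::move(zipped[i].second);
    }
}
```

Calling `sort_zipped(keys, values)` on the vectors from main() above should leave them in the same order that parallel_sort produces, at the cost of one temporary vector of pairs.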

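          【Discussion】:

            【Solution 6】:

            It turns out that Boost contains an iterator that does pretty much what the paired_iterator from my other answer does:

            Boost.Iterator Zip Iterator

            This seems to be the best option.

            【Discussion】:

            • I did not manage to do this kind of sort with zip_iterator. Can you explain how?
            【Solution 7】:

            I am not sure whether exploiting knowledge of std::swap's implementation details is UB. I think it is not.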

            #include <iostream>
            #include <iomanip>
            
            #include <type_traits>
            #include <utility>
            #include <iterator>
            #include <algorithm>
            #include <numeric>
            #include <deque>
            #include <forward_list>
            #include <vector>
            
            #include <cstdlib>
            #include <cassert>
            
            template< typename pattern_iterator, typename target_iterator >
            void
            pattern_sort(pattern_iterator pbeg, pattern_iterator pend, target_iterator tbeg, target_iterator tend)
            {
                //assert(std::distance(pbeg, pend) == std::distance(tbeg, tend));
                using pattern_traits = std::iterator_traits< pattern_iterator >;
                using target_traits = std::iterator_traits< target_iterator >;
                static_assert(std::is_base_of< std::forward_iterator_tag, typename pattern_traits::iterator_category >{});
                static_assert(std::is_base_of< std::forward_iterator_tag, typename target_traits::iterator_category >{});
                struct iterator_adaptor
                {
            
                    iterator_adaptor(typename pattern_traits::reference pattern,
                                     typename target_traits::reference target)
                        : p(&pattern)
                        , t(&target)
                    { ; }
            
                    iterator_adaptor(iterator_adaptor &&)
                        : p(nullptr)
                        , t(nullptr)
                    { ; }
            
                    void
                    operator = (iterator_adaptor && rhs) &
                    {
                        if (!!rhs.p) {
                            assert(!!rhs.t);
                            std::swap(p, rhs.p);
                            std::iter_swap(t, rhs.t);
                        }
                    }
            
                    bool
                    operator < (iterator_adaptor const & rhs) const
                    {
                        return (*p < *rhs.p);
                    }
            
                private :
            
                    typename pattern_traits::pointer p;
                    typename target_traits::pointer t;
            
                };
                std::deque< iterator_adaptor > proxy_; // std::vector can be used instead
                //proxy_.reserve(static_cast< std::size_t >(std::distance(pbeg, pend))); // it's (maybe) worth it if proxy_ is std::vector and if walking through the whole [tbeg, tend) range is not too expensive an operation (in case target_iterator is weaker than a RandomAccessIterator)
                auto t = tbeg;
                auto p = pbeg;
                while (p != pend) {
                    assert(t != tend);
                    proxy_.emplace_back(*p, *t);
                    ++p;
                    ++t;
                }
                std::sort(std::begin(proxy_), std::end(proxy_));
            }
            
            int
            main()
            {
                std::forward_list< int > keys{5, 4, 3, 2, 1};
                std::vector< std::size_t > indices(static_cast< std::size_t >(std::distance(std::cbegin(keys), std::cend(keys))));
                std::iota(std::begin(indices), std::end(indices), std::size_t{0}); // indices now: 0, 1, 2, 3, 4    
                std::copy(std::cbegin(keys), std::cend(keys), std::ostream_iterator< int >(std::cout, " ")); std::cout << std::endl;
                std::copy(std::cbegin(indices), std::cend(indices), std::ostream_iterator< std::size_t >(std::cout, " ")); std::cout << std::endl;
                pattern_sort(std::cbegin(keys), std::cend(keys), std::begin(indices), std::end(indices)); std::cout << std::endl;
                std::copy(std::cbegin(keys), std::cend(keys), std::ostream_iterator< int >(std::cout, " ")); std::cout << std::endl;
                std::copy(std::cbegin(indices), std::cend(indices), std::ostream_iterator< std::size_t >(std::cout, " ")); std::cout << std::endl;
                // now one can use indices to access keys and data as ordered containers
                return EXIT_SUCCESS;
            }
            

            【Discussion】:
