【Question title】: How to index and query STL map containers by multiple keys?
【Posted】: 2010-12-23 22:02:48
【Question】:

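I have a requirement where records are stored as

Name :  Employee_Id  :  Address

where both Name and Employee_Id should be keys, i.e. search must be supported on both Name and Employee_Id.

I can think of using a map to store this structure

std::map< std::pair<std::string,std::string> , std::string >  
//      <         < Name   ,   Employee-Id> , Address     > 

but I'm not sure what the search function would look like.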

【Question comments】:

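  • Boost Multi Index Container handles this very well: boost.org/doc/libs/1_41_0/libs/multi_index/doc/index.html
  • You can only have one sort criterion for a given map. With plain STL, you can use two maps, one ordered by Name and the other by Employee_Id, or use a Boost Multi Index container.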
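
To make the "one sort criterion" point concrete, here is a minimal sketch of what searching the pair-keyed map from the question could look like. The helper names (`find_by_name`, `find_by_id`, `pair_key_example`) and the sample records are made up for illustration. A lookup by name can use the ordering via `lower_bound`, but a lookup by Employee_Id alone has to walk the whole map.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Key is (Name, Employee_Id), value is Address, as in the question.
using Key = std::pair<std::string, std::string>;
using EmployeeMap = std::map<Key, std::string>;
using Record = EmployeeMap::value_type;

// Search by name: keys are ordered by name first, so the first key that is
// not less than (name, "") is the first entry with that name, if there is one.
// Hypothetical helper, returns nullptr when nothing matches.
const Record* find_by_name(const EmployeeMap& m, const std::string& name) {
    auto it = m.lower_bound(Key(name, std::string()));
    if (it != m.end() && it->first.first == name) return &*it;
    return nullptr;
}

// Search by id: the ordering does not help here, so every entry is visited.
// Hypothetical helper, returns nullptr when nothing matches.
const Record* find_by_id(const EmployeeMap& m, const std::string& id) {
    for (auto it = m.begin(); it != m.end(); ++it) {
        if (it->first.second == id) return &*it;
    }
    return nullptr;
}

// Small usage example with made-up records; call it from main().
void pair_key_example() {
    EmployeeMap m;
    m[Key("Alice", "E1")] = "Address1";
    m[Key("Bob", "E2")] = "Address2";
    assert(find_by_name(m, "Bob")->second == "Address2");
    assert(find_by_id(m, "E1")->first.first == "Alice");
    assert(find_by_name(m, "Carol") == nullptr);
}
```

The linear scan in `find_by_id` is the cost of having only one ordering, which is why the comments above suggest a second map or Boost Multi Index.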

Tags: c++ stl containers


【Solution 1】:

Boost.MultiIndex

Here is the Boost example.

That example uses ordered indices, but you can also use hashed indices:

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <string>
#include <iostream>

struct employee
{
    int         id_;
    std::string name_;
    std::string address_;

    employee(int id,std::string name,std::string address):id_(id),name_(name),address_(address) {}

};

struct id{};
struct name{};
struct address{};
struct id_hash{};
struct name_hash{};


typedef boost::multi_index_container<
    employee,
    boost::multi_index::indexed_by<
        boost::multi_index::ordered_unique<boost::multi_index::tag<id>,  BOOST_MULTI_INDEX_MEMBER(employee,int,id_)>,
        boost::multi_index::ordered_unique<boost::multi_index::tag<name>,BOOST_MULTI_INDEX_MEMBER(employee,std::string,name_)>,
        boost::multi_index::ordered_unique<boost::multi_index::tag<address>, BOOST_MULTI_INDEX_MEMBER(employee,std::string,address_)>,
        boost::multi_index::hashed_unique<boost::multi_index::tag<id_hash>,  BOOST_MULTI_INDEX_MEMBER(employee,int,id_)>,
        boost::multi_index::hashed_unique<boost::multi_index::tag<name_hash>,  BOOST_MULTI_INDEX_MEMBER(employee,std::string,name_)>
    >
> employee_set;

typedef boost::multi_index::index<employee_set,id>::type employee_set_ordered_by_id_index_t;
typedef boost::multi_index::index<employee_set,name>::type employee_set_ordered_by_name_index_t;
typedef boost::multi_index::index<employee_set,name_hash>::type employee_set_hashed_by_name_index_t;

typedef boost::multi_index::index<employee_set,id>::type::const_iterator  employee_set_ordered_by_id_iterator_t;
typedef boost::multi_index::index<employee_set,name>::type::const_iterator  employee_set_ordered_by_name_iterator_t;


typedef boost::multi_index::index<employee_set,id_hash>::type::const_iterator  employee_set_hashed_by_id_iterator_t;
typedef boost::multi_index::index<employee_set,name_hash>::type::const_iterator  employee_set_hashed_by_name_iterator_t;


int main()
{
    employee_set employee_set_;

    employee_set_.insert(employee(1, "Employer1", "Address1"));
    employee_set_.insert(employee(2, "Employer2", "Address2"));
    employee_set_.insert(employee(3, "Employer3", "Address3"));
    employee_set_.insert(employee(4, "Employer4", "Address4"));

    // search by id using an ordered index 
    {
        const employee_set_ordered_by_id_index_t& index_id = boost::multi_index::get<id>(employee_set_);
        employee_set_ordered_by_id_iterator_t id_itr = index_id.find(2);
        if (id_itr != index_id.end() ) {
            const employee& tmp = *id_itr;
            std::cout << tmp.id_ << ", " << tmp.name_ << ", "  << tmp .address_ << std::endl;
        } else {
            std::cout << "No records have been found\n";
        }
    }

    // search by non existing id using an ordered index 
    {
        const employee_set_ordered_by_id_index_t& index_id = boost::multi_index::get<id>(employee_set_);
        employee_set_ordered_by_id_iterator_t id_itr = index_id.find(2234);
        if (id_itr != index_id.end() ) {
            const employee& tmp = *id_itr;
            std::cout << tmp.id_ << ", " << tmp.name_ << ", "  << tmp .address_ << std::endl;
        } else {
            std::cout << "No records have been found\n";
        }
    }

    // search by name using an ordered index
    {
        const employee_set_ordered_by_name_index_t& index_name = boost::multi_index::get<name>(employee_set_);
        employee_set_ordered_by_name_iterator_t name_itr = index_name.find("Employer3");
        if (name_itr != index_name.end() ) {
            const employee& tmp = *name_itr;
            std::cout << tmp.id_ << ", " << tmp.name_ << ", "  << tmp .address_ << std::endl;
        } else {
            std::cout << "No records have been found\n";
        }
    }

    // search by name using a hashed index
    {
        employee_set_hashed_by_name_index_t& index_name = boost::multi_index::get<name_hash>(employee_set_);
        employee_set_hashed_by_name_iterator_t name_itr = index_name.find("Employer4");
        if (name_itr != index_name.end() ) {
            const employee& tmp = *name_itr;
            std::cout << tmp.id_ << ", " << tmp.name_ << ", "  << tmp .address_ << std::endl;
        } else {
            std::cout << "No records have been found\n";
        }
    }

    // search by name using a hashed index, where the name does not exist in the container
    {
        employee_set_hashed_by_name_index_t& index_name = boost::multi_index::get<name_hash>(employee_set_);
        employee_set_hashed_by_name_iterator_t name_itr = index_name.find("Employer46545");
        if (name_itr != index_name.end() ) {
            const employee& tmp = *name_itr;
            std::cout << tmp.id_ << ", " << tmp.name_ << ", "  << tmp .address_ << std::endl;
        } else {
            std::cout << "No records have been found\n";
        }
    }

    return 0;
}

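【Comments】:

    【Solution 2】:

    If EmployeeID is the unique identifier, why use other keys at all? I would use EmployeeID as the internal key everywhere, and keep separate mappings from external, human-readable identifiers (such as the name) to it.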
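
A rough sketch of this suggestion (not code from the answer) might look like the following. The class and member names (`EmployeeDirectory`, `add`, `find_by_id`, `find_by_name`) are made up for illustration, and the sketch assumes that names are unique as well, which the question implies but does not state.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Employee {
    std::string name;
    std::string address;
};

// Hypothetical wrapper: the id is the only real key, the name maps to an id.
class EmployeeDirectory {
public:
    // Returns false, and changes nothing, if the id or the name is already taken.
    bool add(int id, const std::string& name, const std::string& address) {
        if (by_id_.count(id) || id_by_name_.count(name)) return false;
        by_id_[id] = Employee{name, address};
        id_by_name_[name] = id;
        return true;
    }

    // One lookup. Returns nullptr when the id is unknown.
    const Employee* find_by_id(int id) const {
        auto it = by_id_.find(id);
        return it == by_id_.end() ? nullptr : &it->second;
    }

    // Two lookups: name -> id, then id -> record.
    const Employee* find_by_name(const std::string& name) const {
        auto it = id_by_name_.find(name);
        return it == id_by_name_.end() ? nullptr : find_by_id(it->second);
    }

private:
    std::map<int, Employee> by_id_;          // primary storage, keyed by id
    std::map<std::string, int> id_by_name_;  // secondary mapping, name -> id
};

// Small usage example with made-up records; call it from main().
void directory_example() {
    EmployeeDirectory dir;
    assert(dir.add(1, "Alice", "Address1"));
    assert(dir.add(2, "Bob", "Address2"));
    assert(dir.find_by_id(2)->name == "Bob");
    assert(dir.find_by_name("Alice")->address == "Address1");
    assert(dir.find_by_name("Carol") == nullptr);
}
```

Each record is stored once, and a search by name costs two map lookups instead of one.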

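    【Comments】:

    • From a performance point of view, this means that fetching an item by name takes 2 index searches instead of 1. Index searches are naturally fast, but the performance hit is larger when disk access is involved, because you will be doing an additional disk read.
    【Solution 3】:

    If you want to use std::map, you can have two separate containers, each with a different key (name, emp id). The values should be pointers to the structure, so that you do not keep multiple copies of the same data.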
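
One way this could be sketched is shown below. Using `std::shared_ptr` for "pointer to the structure" is my assumption, since the answer does not say which kind of pointer it means, and the names `EmployeeRecord`, `TwoMaps`, `add_employee` and `lookup` are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct EmployeeRecord {
    std::string name;
    std::string id;
    std::string address;
};

using EmployeePtr = std::shared_ptr<const EmployeeRecord>;
using Index = std::map<std::string, EmployeePtr>;

// Two independent indices that point at the same record objects.
struct TwoMaps {
    Index by_name;
    Index by_id;
};

// Stores one copy of the record and registers it under both keys.
// Returns false, and changes nothing, if either key is already present.
bool add_employee(TwoMaps& maps, const EmployeeRecord& e) {
    if (maps.by_name.count(e.name) || maps.by_id.count(e.id)) return false;
    EmployeePtr p = std::make_shared<EmployeeRecord>(e);
    maps.by_name.emplace(e.name, p);
    maps.by_id.emplace(e.id, p);
    return true;
}

// Returns an empty pointer when the key is not in the given index.
EmployeePtr lookup(const Index& index, const std::string& key) {
    auto it = index.find(key);
    if (it == index.end()) return EmployeePtr();
    return it->second;
}

// Small usage example with made-up records; call it from main().
void two_maps_example() {
    TwoMaps maps;
    assert(add_employee(maps, EmployeeRecord{"Alice", "E1", "Address1"}));
    assert(add_employee(maps, EmployeeRecord{"Bob", "E2", "Address2"}));
    // Both indices lead to the very same object, not to two copies.
    assert(lookup(maps.by_name, "Alice") == lookup(maps.by_id, "E1"));
    assert(lookup(maps.by_id, "E2")->address == "Address2");
    assert(!lookup(maps.by_name, "Carol"));
}
```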

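    【Comments】:

    • That is one possibility. But it raises an obvious question about ownership: does the first container own the data, does the second, or does ownership live outside both? If you are the only person writing and reading the code, this need not be a problem. As soon as other people start using your containers, though, confusion is bound to arise.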
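
One possible way to settle the ownership question raised here is to hide both maps inside a class, let one map own the records, and let the other hold non-owning pointers into it. This is only a sketch under that assumption. `Staff`, `StaffRegistry` and its member functions are hypothetical names, and the design relies on the fact that `std::map` does not move its elements when other elements are inserted or erased.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Staff {
    std::string name;
    std::string id;
    std::string address;
};

class StaffRegistry {
public:
    StaffRegistry() = default;
    // Copying would leave by_name_ pointing into the other object's by_id_.
    StaffRegistry(const StaffRegistry&) = delete;
    StaffRegistry& operator=(const StaffRegistry&) = delete;

    // Returns false, and changes nothing, if the id or the name is already taken.
    bool add(const Staff& s) {
        if (by_id_.count(s.id) || by_name_.count(s.name)) return false;
        auto it = by_id_.emplace(s.id, s).first;
        by_name_.emplace(s.name, &it->second);
        return true;
    }

    // Removes the record from both maps. Returns false if the id is unknown.
    bool remove_by_id(const std::string& id) {
        auto it = by_id_.find(id);
        if (it == by_id_.end()) return false;
        by_name_.erase(it->second.name);  // drop the non-owning pointer first
        by_id_.erase(it);                 // then destroy the record itself
        return true;
    }

    const Staff* find_by_id(const std::string& id) const {
        auto it = by_id_.find(id);
        return it == by_id_.end() ? nullptr : &it->second;
    }

    const Staff* find_by_name(const std::string& name) const {
        auto it = by_name_.find(name);
        return it == by_name_.end() ? nullptr : it->second;
    }

private:
    std::map<std::string, Staff> by_id_;           // owns the records
    std::map<std::string, const Staff*> by_name_;  // non-owning, points into by_id_
};

// Small usage example with made-up records; call it from main().
void registry_example() {
    StaffRegistry reg;
    assert(reg.add(Staff{"Alice", "E1", "Address1"}));
    assert(reg.find_by_name("Alice") == reg.find_by_id("E1"));
    assert(reg.remove_by_id("E1"));
    assert(reg.find_by_name("Alice") == nullptr);
}
```

Callers only ever see `const Staff*` results, so it stays clear that the registry, not the caller, owns the data.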
    【Solution 4】:

    An example with two keys:

    #include <memory>
    #include <map>
    #include <string>
    #include <iostream>
    
    template <class KEY1,class KEY2, class OTHER >
    class MultiKeyMap {
      public:
      struct Entry
      {
        KEY1 key1;
        KEY2 key2;
        OTHER otherVal;
        Entry( const KEY1 &_key1,
               const KEY2 &_key2,
               const OTHER &_otherVal):
               key1(_key1),key2(_key2),otherVal(_otherVal) {};
        Entry() {};
      };
      private:
      struct ExtendedEntry;
      typedef std::shared_ptr<ExtendedEntry> ExtendedEntrySptr;
      struct ExtendedEntry {
        Entry entry;
        typename std::map<KEY1,ExtendedEntrySptr>::iterator it1;
        typename std::map<KEY2,ExtendedEntrySptr>::iterator it2;
        ExtendedEntry() {};
        ExtendedEntry(const Entry &e):entry(e) {};
      };
      std::map<KEY1,ExtendedEntrySptr> byKey1;
      std::map<KEY2,ExtendedEntrySptr> byKey2;
    
      public:
      void del(ExtendedEntrySptr p)
      {
        if (p)
        {
          byKey1.erase(p->it1);
          byKey2.erase(p->it2);
        }
      }
    
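      void insert(const Entry &entry) {
        // Ignore the insert if either key is already present, so that the
        // two maps never disagree about which entries exist.
        if (byKey1.count(entry.key1) || byKey2.count(entry.key2))
          return;
        auto p=ExtendedEntrySptr(new ExtendedEntry(entry));
        p->it1=byKey1.insert(std::make_pair(entry.key1,p)).first;
        p->it2=byKey2.insert(std::make_pair(entry.key2,p)).first;
      }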
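      // The lookups use find() rather than operator[], because operator[]
      // would insert a null entry for every key that is not present.
      std::pair<Entry,bool> getByKey1(const KEY1 &key1) const
      {
        auto it=byKey1.find(key1);
        if (it!=byKey1.end())
          return std::make_pair(it->second->entry,true);
        return std::make_pair(Entry(),false);
      }
      std::pair<Entry,bool> getByKey2(const KEY2 &key2) const
      {
        auto it=byKey2.find(key2);
        if (it!=byKey2.end())
          return std::make_pair(it->second->entry,true);
        return std::make_pair(Entry(),false);
      }
      void deleteByKey1(const KEY1 &key1)
      {
        auto it=byKey1.find(key1);
        if (it!=byKey1.end())
          del(it->second);
      }
      void deleteByKey2(const KEY2 &key2)
      {
        auto it=byKey2.find(key2);
        if (it!=byKey2.end())
          del(it->second);
      }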
    };
    
    
    int main(int argc, const char *argv[])
    {
      typedef MultiKeyMap<int,std::string,int> M;
      M map1;
      map1.insert(M::Entry(1,"aaa",7));
      map1.insert(M::Entry(2,"bbb",8));
      map1.insert(M::Entry(3,"ccc",9));
      map1.insert(M::Entry(7,"eee",9));
      map1.insert(M::Entry(4,"ddd",9));
      map1.deleteByKey1(7);
      auto a=map1.getByKey1(2);
      auto b=map1.getByKey2("ddd");
      auto c=map1.getByKey1(7);
      std::cout << "by key1=2   (should be bbb ): "<< (a.second ? a.first.key2:"Null") << std::endl;
      std::cout << "by key2=ddd (should be ddd ): "<< (b.second ? b.first.key2:"Null") << std::endl;
      std::cout << "by key1=7   (does not exist): "<< (c.second ? c.first.key2:"Null") << std::endl;
      return 0;
    }
    

    Output:

    by key1=2   (should be bbb ): bbb
    by key2=ddd (should be ddd ): ddd
    by key1=7   (does not exist): Null
    

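    【Comments】:

    【Solution 5】:

    C++14 std::set::find non-key search solution

    This approach saves you from storing the keys twice, once in the indexed object and again as the key of a map, as is done in: https://stackoverflow.com/a/44526820/895245

    A minimal example of the core technique, which should be easier to understand first, is given in: How to make a C++ map container where the key is part of the value?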
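
Since the linked minimal example is not reproduced here, the following is my own short sketch of the core idea, not the code from that link. A comparator that declares `is_transparent` lets `std::set::find` accept something other than the element type (C++14 and later). `Item`, `ItemLess`, `find_item` and `transparent_example` are hypothetical names.

```cpp
#include <cassert>
#include <set>
#include <type_traits>

// The key is stored inside the value itself.
struct Item {
    int key;
    int payload;
};

struct ItemLess {
    // Declaring is_transparent enables the heterogeneous find() overloads.
    typedef std::true_type is_transparent;
    bool operator()(const Item& a, const Item& b) const { return a.key < b.key; }
    bool operator()(const Item& a, int k) const { return a.key < k; }
    bool operator()(int k, const Item& b) const { return k < b.key; }
};

// Looks an item up by a bare int, without building a dummy Item.
// Returns nullptr when no item has that key.
const Item* find_item(const std::set<Item, ItemLess>& items, int key) {
    auto it = items.find(key);
    return it == items.end() ? nullptr : &*it;
}

// Small usage example with made-up data; call it from main().
void transparent_example() {
    std::set<Item, ItemLess> items{{1, 10}, {2, 20}, {3, 30}};
    assert(find_item(items, 2)->payload == 20);
    assert(find_item(items, 7) == nullptr);
}
```

The full solution below applies the same trick to two sets of pointers, one ordered by x and one by y.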

    #include <cassert>
    #include <set>
    #include <type_traits>
    #include <vector>
    
    struct Point {
        int x;
        int y;
        int z;
    };
    
    class PointIndexXY {
        public:
            void insert(Point *point) {
                sx.insert(point);
                sy.insert(point);
            }
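            void erase(Point *point) {
                // Remove the point from both indices.
                sx.erase(point);
                sy.erase(point);
            }
            // Both finders return nullptr when no point matches.
            Point* findX(int x) {
                auto it = sx.find(x);
                return it == sx.end() ? nullptr : *it;
            }
            Point* findY(int y) {
                auto it = sy.find(y);
                return it == sy.end() ? nullptr : *it;
            }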
        private:
            struct PointCmpX {
                typedef std::true_type is_transparent;
                bool operator()(const Point* lhs, int rhs) const { return lhs->x < rhs; }
                bool operator()(int lhs, const Point* rhs) const { return lhs < rhs->x; }
                bool operator()(const Point* lhs, const Point* rhs) const { return lhs->x < rhs->x; }
            };
            struct PointCmpY {
                typedef std::true_type is_transparent;
                bool operator()(const Point* lhs, int rhs) const { return lhs->y < rhs; }
                bool operator()(int lhs, const Point* rhs) const { return lhs < rhs->y; }
                bool operator()(const Point* lhs, const Point* rhs) const { return lhs->y < rhs->y; }
            };
            std::set<Point*, PointCmpX> sx;
            std::set<Point*, PointCmpY> sy;
    };
    
    int main() {
        std::vector<Point> points{
            {1, -1, 1},
            {2, -2, 4},
            {0,  0, 0},
            {3, -3, 9},
        };
        PointIndexXY idx;
        for (auto& point : points) {
            idx.insert(&point);
        }
        Point *p;
        p = idx.findX(0);
        assert(p->y == 0 && p->z == 0);
        p = idx.findX(1);
        assert(p->y == -1 && p->z == 1);
        p = idx.findY(-2);
        assert(p->x == 2 && p->z == 4);
    }
    

    【Comments】:
