Attempt to create a thread-safe std::map

【Question title】: Attempt to create thread safe std::map
【Posted】: 2014-10-15 20:28:21
【Question】:

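Suppose we have a std::map container and want to make it thread-safe for inserting, erasing, searching and editing records. At the same time, we want threads to be able to work on different records in parallel (reading and editing them). To that end, I created a separate class for the record-editing operations, which is protected by a mutex.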

class Data
{
public:
    Data(const std::string& data) : _mutex(), _data(data) { }
    void setData(const std::string& data)
    {
        std::lock_guard<std::mutex> locker(_mutex);
        _data = data;
    }

    const std::string& getData() const { return _data; }

private:
    std::mutex _mutex;
    std::string _data;
};

class Storage
{
public:
    void insertData(size_t key, const std::string& data)
    {
        std::lock_guard<std::mutex> locker(_mutex);
        _storage[key] = data;
    }

    void eraseData(size_t key)
    {
        std::lock_guard<std::mutex> locker(_mutex);
        _storage.erase(key);
    }

    const std::string& getData(size_t key) const { return _storage[key].getData(); }

    void setData(size_t key, const std::string& data) { _storage[key].setData(data); }

private:
    std::mutex _mutex;
    std::map<size_t, Data> _storage;
};

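Now suppose one thread grabs the "local" mutex of some record in order to edit it (a call to Data::setData). At the same time, another thread grabs the "global" mutex to erase that same record (a call to Storage::eraseData). What problems does this cause? What else can go wrong with this code?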

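【Comments】:

Tree-like data structures are not multithreading friendly at all.
I would also look at TBB's thread-safe map implementation, which is part of its thread-safe container library.
@PetrPervukhin: They can be. Especially if they are designed to lock nodes individually, are a bit sloppy, and do not rebalance on every update. If threads operate on independent tree nodes, they do not interfere with each other.
@ZanLynx: You are right. But I still feel a skip list would be better if there are a lot of inserts/deletes. I have not tested them, though.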

Tags: c++ multithreading c++11 mutex

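【Solution 1】:

First, fix your concurrency problem. Here is a C++14 solution, because the C++11 version is more verbose and C++11 does not have all the locking primitives we want: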

template<class T>
struct thread_safe {
  template<class F>
  auto read( F&& f ) const {
    std::shared_lock<decltype(mutex)> lock(mutex);
    return std::forward<F>(f)(t);
  }
  template<class F>
  auto write( F&& f ) {
    std::unique_lock<decltype(mutex)> lock(mutex);
    return std::forward<F>(f)(t);
  }
  template<class O>
  thread_safe(O&&o):t(std::forward<O>(o)) {}

  thread_safe() = default;

  operator T() const {
    // copy the value out while holding the shared lock
    return read([](T const& t){return t;});
  }

  // it is really this simple:
  thread_safe( thread_safe const& o ):t( o ) {}

  // forward to above thread safe copy ctor:
  thread_safe( thread_safe & o ):thread_safe( const_cast<thread_safe const&>(o) ) {}
  thread_safe( thread_safe && o ):thread_safe(o) {}
  thread_safe( thread_safe const&& o ):thread_safe(o) {}

  thread_safe& operator=( thread_safe const& o ) {
    write( [&o](auto& target) {
      target = o;
    });
    return *this;
  }
  template<class O>
  thread_safe& operator=( O&& o ) {
    write([&o](auto& t){ t = std::forward<O>(o); });
    return *this;
  }
private:
  T t;
  mutable std::shared_timed_mutex mutex;
};

This is a thread-safe wrapper for an arbitrary class.

We can use it directly:

typedef thread_safe< std::map< size_t, thread_safe<std::string> > > my_map;

That gives us our two-level thread-safe map.

Example use, setting entry 33 to "hello":

my_map map_instance;  // my_map is a type, so we need an object of it
map_instance.write( [&](auto&& m){
  m[33] = "hello";
} );

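This gives multiple readers and a single writer, both on each element and on the map as a whole. Returning iterators from a read or write call is not safe.

Naturally, you should test and audit the above code. I have not.

The core idea is really simple. To read, you must .read the thread-safe object. The lambda you pass in gets a const& to the underlying data. For std:: types, const access is guaranteed to be safe for multiple concurrent readers.

To write, you must .write. This takes an exclusive lock, which blocks other .reads. The lambda here gets a & to the underlying data.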
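
A minimal sketch of what the two calls look like at a call site. The wrapper here is a stripped-down stand-in for the one above (only read and write, no conversion or assignment operators), and the names guarded and demo_read_write are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

// Stripped-down stand-in for the wrapper: only read() and write().
template<class T>
struct guarded {
  template<class F>
  auto read( F&& f ) const {
    std::shared_lock<std::shared_timed_mutex> lock(mutex);  // many readers at once
    return std::forward<F>(f)(t);                            // f sees T const&
  }
  template<class F>
  auto write( F&& f ) {
    std::unique_lock<std::shared_timed_mutex> lock(mutex);  // one writer, no readers
    return std::forward<F>(f)(t);                            // f sees T&
  }
private:
  T t;
  mutable std::shared_timed_mutex mutex;
};

// Hypothetical helper: writes one entry, then reads it back by value.
inline std::string demo_read_write() {
  guarded< std::map<std::size_t, std::string> > storage;

  storage.write( [](auto& m) { m[33] = "hello"; } );

  // Copy the value out while the shared lock is held; do not return
  // references or iterators into the map.
  return storage.read( [](auto const& m) {
    auto it = m.find(33);
    return it == m.end() ? std::string() : it->second;
  } );
}
```

The read lambda uses find rather than operator[], because operator[] is a non-const member that may insert and so is not available through the const& that read hands out.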

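I added operator T, operator= and copy construction to make the type more regular. The cost is that you can accidentally generate a lot of lock/unlock traffic. The upside is that m[33] = "hello" just works, which is awesome.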

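【Comments】:

@gorill As a friend pointed out, the inner thread_safe<std::string> is mostly redundant: to get at a non-const std::string you need non-const access to the map, which means you already hold a unique lock. If you make it std::unique_ptr<thread_safe<std::string>>, that is no longer true.
@gorill The core problem is that some non-const methods of containers are as thread-safe as const ones (that is, they are read operations even though they are not const), such as begin(), end() and find(). I am not sure of the best way to handle that. Hmm.
@Yakk, the best way is to completely reimplement the rb-tree in a thread-safe manner ))))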
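
To make the std::unique_ptr remark above concrete, here is a sketch under a few assumptions: guarded is a stripped-down stand-in for the answer's wrapper, and element, table, insert_empty, set_value and get_value are hypothetical names. A const std::unique_ptr<X> still hands out a non-const X&, so a thread that holds only the shared lock on the map can reach an element and modify it. The per-element lock is then doing real work, and edits to different records can proceed in parallel:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

// Stripped-down stand-in for the wrapper: only read() and write().
template<class T>
struct guarded {
  template<class F>
  auto read( F&& f ) const {
    std::shared_lock<std::shared_timed_mutex> lock(mutex);
    return std::forward<F>(f)(t);
  }
  template<class F>
  auto write( F&& f ) {
    std::unique_lock<std::shared_timed_mutex> lock(mutex);
    return std::forward<F>(f)(t);
  }
private:
  T t;
  mutable std::shared_timed_mutex mutex;
};

using element = guarded<std::string>;
using table   = guarded< std::map<std::size_t, std::unique_ptr<element>> >;

// Structural change (insert): needs the exclusive lock on the whole map.
inline void insert_empty( table& tbl, std::size_t key ) {
  tbl.write( [key](auto& m) {
    auto& slot = m[key];
    if (!slot) slot = std::make_unique<element>();
  } );
}

// Edit an existing record: shared lock on the map, exclusive lock on that
// one element only. Returns false if the key is absent.
inline bool set_value( table const& tbl, std::size_t key, std::string const& value ) {
  return tbl.read( [&](auto const& m) {
    auto it = m.find(key);
    if (it == m.end()) return false;
    it->second->write( [&value](std::string& s) { s = value; } );
    return true;
  } );
}

// Read a record by value; returns an empty string if the key is absent.
inline std::string get_value( table const& tbl, std::size_t key ) {
  return tbl.read( [key](auto const& m) {
    auto it = m.find(key);
    if (it == m.end()) return std::string();
    return it->second->read( [](std::string const& s) { return s; } );
  } );
}
```

Because the element lock is always taken while the map's shared lock is held, an erase (which would need the map's exclusive lock) cannot remove a record while another thread is still editing it.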

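【Solution 2】:

You have two big problems:

What happens if one thread calls insertData while another thread calls getData? The call to operator[] may crash, because the map is being modified while it is being accessed.
What happens if one thread calls eraseData while another thread is still using the reference it got back from getData? The reference may be invalid, causing a crash.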
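
One way to close both holes, sketched under the assumption that a single coarse lock is acceptable: every operation takes the same mutex, lookups use find instead of operator[] (which may insert), and getData copies the value out instead of returning a reference, so nothing outlives the lock. The class name LockedStorage and the bool return values are illustrative additions, not part of the question's code:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>

class LockedStorage
{
public:
    void insertData(std::size_t key, const std::string& data)
    {
        std::lock_guard<std::mutex> locker(_mutex);
        _storage[key] = data;
    }

    // Returns true if a record was actually removed.
    bool eraseData(std::size_t key)
    {
        std::lock_guard<std::mutex> locker(_mutex);
        return _storage.erase(key) != 0;
    }

    // Copies the value into out while the lock is held.
    // Returns false (and leaves out untouched) if the key is absent.
    bool getData(std::size_t key, std::string& out) const
    {
        std::lock_guard<std::mutex> locker(_mutex);
        auto it = _storage.find(key);
        if (it == _storage.end()) return false;
        out = it->second;
        return true;
    }

    // Updates an existing record; does not create one if the key is absent.
    bool setData(std::size_t key, const std::string& data)
    {
        std::lock_guard<std::mutex> locker(_mutex);
        auto it = _storage.find(key);
        if (it == _storage.end()) return false;
        it->second = data;
        return true;
    }

private:
    mutable std::mutex _mutex;  // mutable so const readers can lock it
    std::map<std::size_t, std::string> _storage;
};
```

The trade-off is that edits to different records no longer run in parallel, which the question wanted. Getting that back safely needs a second level of locking that is only ever taken while the map itself is locked against erasure.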

【Comments】: