Fastest (and safest) method to replace characters in std::string: answers

【Question title】: Fastest (and safest) method to replace characters in std::string
【Posted】: 2015-12-12 05:25:11
【Question】:

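I have to replace a fixed number of characters in a long string, and I'd like to know the fastest way to do it that is still standard-compliant.

Here is sample code with 6 different methods. In the comment above each method I added the time (in milliseconds) needed for 1 million operations in my test environment, with optimizations enabled.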

const char* pluto = "Cia1234567Ciao!";
std::string rep = "87654321";
std::string r1 = pluto, r2 = pluto, r3 = pluto, r4 = pluto, r5 = pluto, r6 = pluto;

// (1) 300 msec
r1.replace(3, 7, rep.substr(1));  

// (2) 40 msec
std::copy(rep.begin() + 1, rep.end(), r2.begin() + 3);

// (3) 32 msec
for (int i = 1; i < 8; ++i)
    r3[2 + i] = rep[i];

// (4) 14 msec
{
    const char *c = rep.c_str() + 1;
    for (int i = 0; i < 7; ++i)
        r4[3 + i] = *c++;
}

// (5) 3 msec (BEST)
memcpy(&r5[3], &rep[1], 7);

// (6) 100 msec
r6.replace(3, 7, rep.c_str() + 1);

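So the fastest method appears to be (5), but I'm worried that it may not work correctly with the "copy-on-write" std::string optimization used by many compilers.

IMHO (5) is also the most readable.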
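
Before comparing timings it is worth confirming that the six approaches really produce the same string. The following is a minimal sketch of such a check; the function name `apply_all_methods` and the use of a `std::vector` to hold the results are assumptions made for illustration and are not part of the original code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Applies each of the six approaches from the question to a fresh copy of
// the same input and returns the six results in order.
std::vector<std::string> apply_all_methods() {
    const char* pluto = "Cia1234567Ciao!";
    const std::string rep = "87654321";
    std::vector<std::string> r(6, std::string(pluto));

    r[0].replace(3, 7, rep.substr(1));                        // (1)
    std::copy(rep.begin() + 1, rep.end(), r[1].begin() + 3);  // (2)
    for (int i = 1; i < 8; ++i)                               // (3)
        r[2][2 + i] = rep[i];
    {                                                         // (4)
        const char* c = rep.c_str() + 1;
        for (int i = 0; i < 7; ++i)
            r[3][3 + i] = *c++;
    }
    std::memcpy(&r[4][3], &rep[1], 7);                        // (5)
    r[5].replace(3, 7, rep.c_str() + 1);                      // (6)
    return r;
}
```

Every element of the returned vector should equal "Cia7654321Ciao!", since each method overwrites the seven characters starting at index 3 with the last seven characters of `rep`.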

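I wonder why (4) is twice as fast as (3); I thought std::string's operator[] was already well optimized...

Update:

After reading the comments, I updated my code to use the Google benchmark library. The results for (3) and (4) now appear identical, while the other differences still hold: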

Run on (2 X 3000 MHz CPU s)
2015-11-24 14:46:50
Benchmark                   Time(ns)    CPU(ns) Iterations
-----------------------------------------------------------
(1) bench_replace_substr        293        264     2651515
(2) bench_std_copy               39         39    19662921
(3) bench_op_bracket             15         15    39772727
(4) bench_op_bracket_2           15         15    44871795
(5) bench_memcpy                  4          4    75000000
(6) bench_replace                80         80     8333333

So the difference between (3) and (4) has disappeared, but the rest of the results are the same :)
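
For readers without a benchmark library at hand, a measurement of this kind can be roughly sketched with `std::chrono`. This is not the code behind the numbers above: the function name `time_memcpy_replace` and the `volatile` sink are assumptions made for illustration. The sink only makes it harder for the compiler to discard the work; it does not guarantee that the copy stays inside the loop, so a dedicated benchmark library remains the more trustworthy tool.

```cpp
#include <cassert>
#include <chrono>
#include <cstring>
#include <string>

// Times `iterations` runs of the memcpy-based replacement (method 5) and
// returns the elapsed time in nanoseconds. `final_result` receives the
// string as it stands after the last iteration.
long long time_memcpy_replace(int iterations, std::string& final_result) {
    const std::string rep = "87654321";
    std::string target = "Cia1234567Ciao!";
    volatile char sink = 0;  // volatile writes cannot be removed by the compiler

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::memcpy(&target[3], &rep[1], 7);
        sink = target[3 + i % 7];  // read back one copied byte per iteration
    }
    const auto stop = std::chrono::steady_clock::now();

    (void)sink;
    final_result = target;
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Dividing the returned value by `iterations` gives an approximate per-operation cost, which includes the loop and the sink write as overhead.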

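【Comments】:

Since C++11, std::string is no longer allowed to use COW.
Are optimizations enabled? For me it's 15 3 0 0 0 4, in microseconds.
A 7-byte memcpy() takes 3 msec? Something is wrong with your performance numbers.
These numbers look suspicious (but well done for at least providing them ;) ). For example, I see absolutely no reason for the big difference between 3) and 4). Benchmarking such fast operations reliably can be very hard. For instance, are you sure the operation you want to measure isn't being optimized away in some cases? This is an interesting talk about microbenchmarks.
@gabry That doesn't really show that the compiler hasn't optimized away everything except the final result.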

Tags: c++ string optimization

【Solution 1】:

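The memcpy approach has been standard-compliant since at least C++11, because:

As explained in this answer, copy-on-write implementations of std::string are not allowed, because they violate the standard's iterator/reference invalidation requirements.
The characters of a std::string are stored in contiguous memory. Quoting 21.4.1.5:

The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().

It is therefore the fastest standard-compliant method in your list (at least according to your benchmark results).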
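
The contiguity identity quoted above can be checked directly. The following is a minimal sketch; the helper name `is_contiguous` is hypothetical, and on a conforming C++11 implementation it should return true for any string.

```cpp
#include <cassert>
#include <string>

// Returns true if &*(s.begin() + n) == &*s.begin() + n holds for every
// valid index n of s, i.e. if the characters are stored contiguously.
bool is_contiguous(const std::string& s) {
    const auto size = static_cast<std::string::difference_type>(s.size());
    for (std::string::difference_type n = 0; n < size; ++n) {
        if (&*(s.begin() + n) != &*s.begin() + n)
            return false;
    }
    return true;
}
```

This guarantee is what makes it valid to treat `&r5[3]` as a pointer to an ordinary char array of the remaining characters and hand it to memcpy.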

In fact, this should be safe even with a non-standard-compliant implementation that uses copy-on-write, because the non-const operator[] should copy the string, e.g.:

std::string s1("foo");
std::string s2 = s1;
std::cout << static_cast<const void*>(s1.data()) << " "
          << static_cast<const void*>(s2.data()) << "\n";
s2[0];
std::cout << static_cast<const void*>(s1.data()) << " "
          << static_cast<const void*>(s2.data()) << "\n";

prints

0x1782028 0x1782028
0x1782028 0x1782058

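when I compile it with gcc 4.8.4 and a fairly old version of libstdc++ and run it. Note that the pointers differ after the call to the non-const operator[], which means the data has been copied.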

Knowing that the non-const operator[] performs some checks in a COW implementation, calling the const operator[] on the source string may speed things up:

const std::string &crep = rep;
memcpy(&r5[3], &crep[1], 7);

This is indeed faster on my system:

Benchmark              Time(ns)    CPU(ns) Iterations
-----------------------------------------------------
bench_memcpy_const            2          2  314215561 
bench_memcpy                  3          3  276899830
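
Since the question also asks for the safest method, note that memcpy does no bounds checking of its own. The following is a sketch of a hypothetical wrapper, `overwrite_chars`, that validates both ranges before copying. It is not part of the original answer, and it assumes `dst` and `src` are distinct objects, because memcpy must not be used on overlapping ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper: overwrites `count` characters of `dst`, starting at
// `dst_pos`, with `count` characters of `src`, starting at `src_pos`.
// Throws std::out_of_range instead of reading or writing out of bounds.
// Assumes dst and src do not alias (use std::memmove if they might).
void overwrite_chars(std::string& dst, std::size_t dst_pos,
                     const std::string& src, std::size_t src_pos,
                     std::size_t count) {
    if (dst_pos > dst.size() || count > dst.size() - dst_pos)
        throw std::out_of_range("overwrite_chars: destination range");
    if (src_pos > src.size() || count > src.size() - src_pos)
        throw std::out_of_range("overwrite_chars: source range");
    if (count == 0)
        return;  // nothing to copy
    // src is a const reference, so &src[src_pos] uses the const operator[].
    std::memcpy(&dst[dst_pos], &src[src_pos], count);
}
```

With the strings from the question, `overwrite_chars(r5, 3, rep, 1, 7)` performs the same copy as method (5). The two range checks add a small cost, so this trades a little of the speed measured above for protection against out-of-range offsets.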

【Comments】: