I compiled it with MinGW (TDM) 4.8.1, with the option -fdump-tree-optimized and without -O2.
The first one behaves like
string tmp = a+b; // i.e. create a new string g as a copy of a, g += b, tmp = g (then dispose of g)
tmp += c;
return tmp; // and dispose of tmp
The second one does it another way
string tmp = a; // just copy a to tmp
tmp += b;
tmp += c;
return tmp; // and dispose of tmp
The dumps look like this
void * D.20477;
struct basic_string D.20179;
<bb 2>:
D.20179 = std::operator+<char, std::char_traits<char>, std::allocator<char> > (a_1(D), b_2(D)); [return slot optimization]
*_3(D) = std::operator+<char, std::char_traits<char>, std::allocator<char> > (&D.20179, c_4(D)); [return slot optimization]
<bb 3>:
<bb 4>:
std::basic_string<char>::~basic_string (&D.20179);
D.20179 ={v} {CLOBBER};
<L1>:
return _3(D);
<L2>:
std::basic_string<char>::~basic_string (&D.20179);
_5 = __builtin_eh_pointer (1);
__builtin_unwind_resume (_5);
and
void * D.20482;
struct string r [value-expr: *<retval>];
<bb 2>:
std::basic_string<char>::basic_string (r_1(D), a_2(D));
std::basic_string<char>::operator+= (r_1(D), b_3(D));
<bb 3>:
std::basic_string<char>::operator+= (r_1(D), c_4(D));
<bb 4>:
<L0>:
return r_1(D);
<L1>:
std::basic_string<char>::~basic_string (r_1(D));
_5 = __builtin_eh_pointer (1);
__builtin_unwind_resume (_5);
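So, after applying the -O2 optimization, the compiler keeps the ConcatB function in almost the same form. It does some magic with ConcatA (inlining functions, adding constant values to the memory allocation part, declaring new functions), but the most valuable part stays the same.
ConcatA: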
D.20292 = std::operator+<char, std::char_traits<char>, std::allocator<char> > (a_2(D), b_3(D)); [return slot optimization]
*_5(D) = std::operator+<char, std::char_traits<char>, std::allocator<char> > (&D.20292, c_6(D));
ConcatB:
std::basic_string<char>::basic_string (r_3(D), a_4(D));
std::basic_string<char>::append (r_3(D), b_6(D));
std::basic_string<char>::append (r_3(D), c_8(D));
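So it is obvious that ConcatB is better than ConcatA: it performs fewer allocations, and allocations are very expensive when you are trying to optimize such a small piece of code.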
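If you want to check the allocation claim on your own toolchain, a rough sketch like the one below may help. The bodies of ConcatA and ConcatB are my reconstruction from the dumps above (two operator+ calls versus a copy plus two appends), not necessarily the exact code from the question. CountingAlloc, g_allocs, Str and CountAllocs are hypothetical names introduced only for this illustration. The numbers you get depend on the standard library, the language standard and any small-string optimization, so I give no expected counts.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical global counter, for illustration only.
static std::size_t g_allocs = 0;

// Minimal allocator that counts every allocate() call and otherwise
// forwards to std::allocator. Not part of the original question.
template <class T>
struct CountingAlloc {
    using value_type = T;
    CountingAlloc() = default;
    template <class U>
    CountingAlloc(const CountingAlloc<U>&) {}
    T* allocate(std::size_t n) {
        ++g_allocs;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) {
        std::allocator<T>().deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// String type that reports its allocations through g_allocs.
using Str = std::basic_string<char, std::char_traits<char>, CountingAlloc<char>>;

// Assumed shape of ConcatA: chained operator+.
Str ConcatA(const Str& a, const Str& b, const Str& c) {
    return a + b + c;
}

// Assumed shape of ConcatB: copy a, then append in place.
Str ConcatB(const Str& a, const Str& b, const Str& c) {
    Str r = a;
    r += b;
    r += c;
    return r;
}

// Runs f and returns how many allocations happened while it ran.
template <class F>
std::size_t CountAllocs(F f) {
    const std::size_t before = g_allocs;
    f();
    return g_allocs - before;
}
```

Call CountAllocs with a lambda that invokes ConcatA or ConcatB on strings long enough to bypass any small-string buffer (say 40 characters each), and compare the two results for your compiler and flags.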