Understanding std::accumulate

【Question title】: Understanding std::accumulate
【Posted】: 2012-09-28 05:13:00
【Question】:

I want to know why the third parameter of std::accumulate (aka reduce) is needed. For those who don't know what accumulate is, it's used like this:

vector<int> V{1,2,3};  
int sum = accumulate(V.begin(), V.end(), 0);
// sum == 6

Calling accumulate is equivalent to:

int sum = 0;  // 0 is the value of the 3rd parameter
for (auto x : V)  sum += x;

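There is also an optional fourth parameter, which lets you replace addition with any other operation.

The rationale I've heard is that if you need, say, to multiply the elements of a vector instead of adding them up, you need a different (non-zero) initial value: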

vector<int> V{1,2,3};
int product = accumulate(V.begin(), V.end(), 1, multiplies<int>());

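But why not do it like Python: take the first element, *V.begin(), as the initial value and use the range starting from V.begin()+1? Something like this:

int sum = accumulate(V.begin()+1, V.end(), *V.begin());

This would work for any operation. Why is the third parameter needed at all?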

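【Comments on the question】:

accumulate(V.begin()+1, V.end(), *V.begin()); doesn't work if V is empty.
Another reason is that the sum may no longer fit in the element type, so you have to be able to specify a different type that can hold the value.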
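
Both comments can be illustrated with a short sketch. The function names sum_as_long_long and sum_or_zero are made up for this illustration; the only standard facility used is std::accumulate, whose result type is the type of the initial value.

```cpp
#include <numeric>
#include <vector>

// The accumulator has the type of the initial value, not the element type.
// 0LL makes it a long long, so the running total does not overflow int
// even when the individual elements are large ints.
long long sum_as_long_long(const std::vector<int>& v)
{
    return std::accumulate(v.begin(), v.end(), 0LL);
}

// An empty range is well defined too: the initial value is returned as-is.
int sum_or_zero(const std::vector<int>& v)
{
    return std::accumulate(v.begin(), v.end(), 0);
}
```

With the "first element as seed" form there is no way to express either case: an empty V has no first element, and the seed would always have the element type.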

Tags: c++ stl accumulate

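【Solution 1】:

You're making a mistaken assumption: that the type T is the same as the value type of the InputIterator.

But std::accumulate is generic and allows all sorts of creative accumulations and reductions.

Example #1: Accumulating salaries across employees

Here's a simple example: an Employee class with many data fields.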

class Employee {
/** All kinds of data: name, ID number, phone, email address... */
public:
 int monthlyPay() const;
};

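You can't meaningfully "accumulate" a set of employees. That makes no sense; it's undefined. But you can define an accumulation over the employees. Let's say we want to sum up the monthly pay of all employees. std::accumulate can do that: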

/** Lambda defining how to add a single Employee's
 *  monthly pay to our existing tally */
auto accumulate_func = [](int accumulator, const Employee& emp) {
    return accumulator + emp.monthlyPay();
};

// And here's how you call the actual calculation:
int TotalMonthlyPayrollCost(const vector<Employee>& V)
{
 return std::accumulate(V.begin(), V.end(), 0, accumulate_func);
}

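So in this example, we're accumulating an int value over a collection of Employee objects. Here, the accumulated sum isn't the same type as the objects we're actually summing over.

Example #2: Accumulating an average

You can use accumulate for more complex accumulator types as well: maybe you want to append values to a vector; maybe you're tracking some arcane statistic across the input; and so on. What you accumulate doesn't have to be just a number; it can be something more complex.

For example, here's a simple example of using accumulate to calculate the average of a vector of ints: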

// This time our accumulator isn't an int -- it's a structure that lets us
// accumulate an average.
struct average_accumulate_t
{
    int sum;
    size_t n;
    double GetAverage() const { return ((double)sum)/n; }
};

// Here's HOW we add a value to the average:
auto func_accumulate_average = 
    [](average_accumulate_t accAverage, int value) {
        return average_accumulate_t(
            {accAverage.sum+value, // value is added to the total sum
            accAverage.n+1});      // increment number of values seen
    };

double CalculateAverage(const vector<int>& V)
{
    average_accumulate_t res =
        std::accumulate(V.begin(), V.end(), average_accumulate_t({0,0}), func_accumulate_average);
    return res.GetAverage();
}

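Example #3: Accumulating a running average

Another reason you need the initial value is that it isn't always the default/neutral value for the calculation you're making.

Let's build on the average example we've already seen. But now we want a class that can hold a running average; that is, we can keep feeding in new values across multiple calls and check the average so far.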

class RunningAverage
{
    average_accumulate_t _avg;
public:
    RunningAverage():_avg({0,0}){} // initialize to empty average

    double AverageSoFar() const { return _avg.GetAverage(); }

    void AddValues(const vector<int>& v)
    {
        _avg = std::accumulate(v.begin(), v.end(), 
            _avg, // NOT the default initial {0,0}!
            func_accumulate_average);
    }

};

int main()
{
    RunningAverage r;
    r.AddValues(vector<int>({1,1,1}));
    std::cout << "Running Average: " << r.AverageSoFar() << std::endl; // 1.0
    r.AddValues(vector<int>({-1,-1,-1}));
    std::cout << "Running Average: " << r.AverageSoFar() << std::endl; // 0.0
}

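This is a case where we absolutely rely on being able to set the initial value for std::accumulate: we need to be able to start the accumulation from different starting points.

In summary, std::accumulate is good for any time you're iterating over an input range and building up one single result across that range. But the result doesn't need to be the same type as the range's elements, and you can't make any assumptions about what initial value to use, which is why you must supply an initial instance to use as the accumulating result.

【Comments】:

This point should be emphasized more. Just the other day I tracked down a nasty bug: I wanted to do float operations over a vector, but because I just used 0 as my initial value, the compiler did all the operations in int.
In auto accumulate_func = [](int accumulator, const Employee& emp) an opening brace is missing.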
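
The float pitfall from the first comment is easy to reproduce. The sketch below is only an illustration (truncated_sum and exact_sum are hypothetical names); it relies on the fact that std::accumulate stores each partial result in a variable of the initial value's type.

```cpp
#include <numeric>
#include <vector>

// Pitfall: the literal 0 is an int, so the accumulator is an int and every
// partial sum (int + double) is converted back to int, dropping the fraction.
int truncated_sum(const std::vector<double>& v)
{
    return std::accumulate(v.begin(), v.end(), 0);
}

// Fix: 0.0 is a double, so the whole accumulation is done in double.
double exact_sum(const std::vector<double>& v)
{
    return std::accumulate(v.begin(), v.end(), 0.0);
}
```

For the input {0.5, 0.5, 0.5, 0.5}, truncated_sum yields 0 because each partial sum of 0.5 is truncated to 0, while exact_sum yields 2.0.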

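【Solution 2】:

That's the way things are, and it's annoying for code that knows for sure a range isn't empty and wants to start accumulating from the first element of the range. Depending on the operation used to accumulate, it's not always obvious what the "zero" value to use is.

On the other hand, if you only provide a version that requires non-empty ranges, it's annoying for callers who don't know for sure that their ranges aren't empty. An additional burden is put on them.

One perspective is that the best of both worlds is of course to provide both. For example, Haskell provides both foldl1 and foldr1 (which require non-empty lists) alongside foldl and foldr (which mirror std::accumulate).

Another perspective is that, since one can be implemented in terms of the other with a trivial transformation (as you've demonstrated: std::accumulate(std::next(b), e, *b, f); std::next is C++11, but the point still stands), it's preferable to keep the interface as small as possible with no real loss of expressive power.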
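
As a rough sketch of that "trivial transformation", a foldl1-style helper can be layered on top of std::accumulate. The name accumulate1 and the choice of std::optional (C++17) to report an empty range are assumptions made for this illustration; nothing like it exists in the standard library.

```cpp
#include <functional>
#include <iterator>
#include <numeric>
#include <optional>
#include <vector>

// Hypothetical helper: fold a range using its first element as the seed.
// Returns std::nullopt for an empty range, where there is no seed to use.
template <typename InputIt, typename BinaryOp>
auto accumulate1(InputIt first, InputIt last, BinaryOp op)
    -> std::optional<typename std::iterator_traits<InputIt>::value_type>
{
    using T = typename std::iterator_traits<InputIt>::value_type;
    if (first == last) {
        return std::nullopt;
    }
    T seed = *first;  // copy the first element as the initial value
    return std::accumulate(std::next(first), last, seed, op);
}

// Usage: a product without having to spell out the neutral element 1.
std::optional<int> product(const std::vector<int>& v)
{
    return accumulate1(v.begin(), v.end(), std::multiplies<int>());
}
```

Note that the result type is forced to be the element type here, which is exactly the expressive power the three-argument form keeps and this form gives up.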

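【Solution 3】:

Because standard library algorithms are supposed to work for arbitrary ranges of (compatible) iterators. So the first argument to accumulate doesn't have to be begin(); it could be any iterator between begin() and one before end(). It could also be a reverse iterator.

The whole idea is to decouple algorithms from data. Your suggestion, if I understand it correctly, requires a certain structure in the data.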
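
To make the "arbitrary range" point concrete, here is a small sketch; sum_interior and join_reversed are hypothetical names chosen for the illustration.

```cpp
#include <numeric>
#include <string>
#include <vector>

// Sum only the interior of the vector, skipping the first and last element.
// Vectors with fewer than two elements have no interior, so return 0.
int sum_interior(const std::vector<int>& v)
{
    if (v.size() < 2) {
        return 0;
    }
    return std::accumulate(v.begin() + 1, v.end() - 1, 0);
}

// Concatenate the words back to front by passing reverse iterators.
std::string join_reversed(const std::vector<std::string>& words)
{
    return std::accumulate(words.rbegin(), words.rend(), std::string());
}
```

In both calls the algorithm only sees a pair of iterators and an initial value; it never needs to know where the container itself begins.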

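【Solution 4】:

If you wanted accumulate(V.begin()+1, V.end(), *V.begin()), you could just write that. But what if you thought v.begin() might be v.end() (i.e. v is empty)? What if v.begin() + 1 is not implemented (because v only implements ++, not generalized addition)? What if the type of the accumulator is not the type of the elements? For example: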

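// 0L (not 0) makes the accumulator a long, matching the lambda's parameter.
std::accumulate(v.begin(), v.end(), 0L, [](long count, char c) {
    // std::isalpha (from <cctype>) expects a value representable as unsigned char
    return std::isalpha(static_cast<unsigned char>(c)) ? count + 1 : count;
});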

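【Comments】:

v.begin() + 1 was solved in C++11 with the introduction of std::next. The latter point, on the other hand (the type difference between the dereferenced iterator and the accumulated value), is the most compelling argument I've seen so far.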
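
A short sketch of the std::next point, using std::forward_list, whose iterators support ++ but not + 1. The function name max_by_fold is hypothetical, and the sketch assumes the caller guarantees a non-empty list.

```cpp
#include <algorithm>
#include <forward_list>
#include <iterator>
#include <numeric>

// `xs.begin() + 1` does not compile for std::forward_list, but std::next
// works for any iterator category.
// Precondition (assumed by this sketch): xs is not empty.
int max_by_fold(const std::forward_list<int>& xs)
{
    return std::accumulate(std::next(xs.begin()), xs.end(), xs.front(),
                           [](int best, int x) { return std::max(best, x); });
}
```

Maximum is a case where seeding with the first element is convenient, since there is no obvious neutral value short of the smallest representable int.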

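【Solution 5】:

It's indeed not needed. Our codebase has 2- and 3-argument overloads that use a T{} value.

However, std::accumulate is pretty old; it comes from the original STL. Our codebase has fancy std::enable_if logic to distinguish between "2 iterators and an initial value" and "2 iterators and a reduction operator". That requires C++11. Our code also uses a trailing return type (auto accumulate(...) -> ...) to calculate the return type, another C++11 feature.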
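
The answerer's actual code is not shown, so the following is only a guess at what a two-argument form seeded with T{} might look like. The names accumulate_default, total and concat are invented for the sketch, and the std::enable_if dispatch mentioned above is deliberately left out.

```cpp
#include <iterator>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical two-iterator wrapper: the seed is a value-initialized
// element, T{}, where T is the iterator's value type.
template <typename InputIt>
typename std::iterator_traits<InputIt>::value_type
accumulate_default(InputIt first, InputIt last)
{
    using T = typename std::iterator_traits<InputIt>::value_type;
    return std::accumulate(first, last, T{});
}

// Usage: T{} is 0.0 for double and the empty string for std::string.
double total(const std::vector<double>& v)
{
    return accumulate_default(v.begin(), v.end());
}

std::string concat(const std::vector<std::string>& v)
{
    return accumulate_default(v.begin(), v.end());
}
```

T{} is only a sensible seed when it is neutral for the operation: it works for addition and concatenation, but a product seeded with T{} would always be zero.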
