Neural network backpropagation question: answers

【Question title】: NeuralNetwork back propagation question
【Posted】: 2011-10-28 12:02:40
【Question description】:

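After reading a lot of other people's neural network code, I'm convinced something is wrong with mine. It works, and I can train a network, but in order to train the next perceptron in the hidden layer I have to train the previous one first. Shouldn't I be able to train all the units in a hidden layer in parallel?

Here is the code that calculates the error of the hidden layers: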

    for(int i=n->numOfPerceptronLayers-2;i>=1;i--) { // for all hidden layers
        float sum = 0.0; // <- This here is the problem
        for(int j=0;j<n->perceptronLayers[i].numOfPerceptrons;j++) { // For all the units in the current hidden layer
            for(int k=0;k<n->perceptronLayers[i].perceptrons[j].numOfConnections;k++) { // Loop through the current units connections to the previous layer (output layer)
                sum += n->perceptronLayers[i+1].perceptrons[k].error * n->perceptronLayers[i+1].perceptrons[k].weights[j];
            }
            n->perceptronLayers[i].perceptrons[j].error = n->perceptronLayers[i].perceptrons[j].output * (1.0 - n->perceptronLayers[i].perceptrons[j].output) * sum;
        }
    }

It should look like this (but this doesn't work):

    for(int i=n->numOfPerceptronLayers-2;i>=1;i--) { // for all hidden layers
        for(int j=0;j<n->perceptronLayers[i].numOfPerceptrons;j++) { // For all the units in the current hidden layer
            float sum = 0.0;
            for(int k=0;k<n->perceptronLayers[i].perceptrons[j].numOfConnections;k++) { // Loop through the current units connections to the previous layer (output layer)
                sum += n->perceptronLayers[i+1].perceptrons[k].error * n->perceptronLayers[i+1].perceptrons[k].weights[j];
            }
            n->perceptronLayers[i].perceptrons[j].error = n->perceptronLayers[i].perceptrons[j].output * (1.0 - n->perceptronLayers[i].perceptrons[j].output) * sum;
        }
    }

Why does the sum variable have to be declared for the whole layer rather than for each individual perceptron?

【Comments】:

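Could you be more specific about what "it doesn't work" means? Could you also add the exact mathematical formula you are trying to code? I get the impression something may have been lost in translation here.
The mathematical formula is backpropagation learning; I'm trying to calculate the errors of the perceptrons in the hidden layers. "It doesn't work" means that the training process doesn't work: with the second piece of code, the network never learns what it is supposed to.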
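
For reference, both snippets in the question appear to be computing the usual hidden-layer error term of backpropagation. The formula below is reconstructed from the code, assuming a sigmoid activation (which the output * (1.0 - output) factor suggests); the symbols are notation introduced here, not the asker's:

```latex
% \delta_j^{(i)}  : error of unit j in layer i        (perceptrons[j].error)
% o_j^{(i)}       : output of unit j in layer i       (perceptrons[j].output)
% w_{kj}^{(i+1)}  : weight from unit j in layer i to unit k in layer i+1
%                   (perceptronLayers[i+1].perceptrons[k].weights[j])
\delta_j^{(i)} = o_j^{(i)} \left(1 - o_j^{(i)}\right) \sum_{k} \delta_k^{(i+1)} \, w_{kj}^{(i+1)}
```

Note that the sum over k starts from zero for every unit j; nothing in the formula carries over from one j to the next.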

Tags: c math parallel-processing neural-network backpropagation

【Solution 1】:

Unless I'm missing something, I believe the first code segment is wrong and the second one is correct.

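In the first code segment, using a single "sum" variable for the whole layer makes the error accumulate across each subsequently processed perceptron. The error computed for perceptron j is therefore based on a sum that already contains the contributions of perceptrons 0 through j-1.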
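
To make the difference concrete, here is a minimal sketch of the two accumulation patterns side by side. The function names and the flat weight layout are made up for illustration and are not the asker's data structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat layout: next_weights[k * num_hidden + j] is the weight
   from hidden unit j to unit k of the next layer. */

/* Layer-wide accumulator, as in the first snippet: sum is never reset. */
void hidden_sums_shared(const float *next_error, const float *next_weights,
                        size_t num_next, size_t num_hidden, float *sums_out)
{
    assert(num_next > 0 && num_hidden > 0);
    float sum = 0.0f;                      /* declared once for the layer */
    for (size_t j = 0; j < num_hidden; j++) {
        for (size_t k = 0; k < num_next; k++)
            sum += next_error[k] * next_weights[k * num_hidden + j];
        sums_out[j] = sum;                 /* still holds units 0..j-1 too */
    }
}

/* Per-unit accumulator, as in the second snippet: sum restarts at zero. */
void hidden_sums_per_unit(const float *next_error, const float *next_weights,
                          size_t num_next, size_t num_hidden, float *sums_out)
{
    assert(num_next > 0 && num_hidden > 0);
    for (size_t j = 0; j < num_hidden; j++) {
        float sum = 0.0f;                  /* declared once per unit */
        for (size_t k = 0; k < num_next; k++)
            sum += next_error[k] * next_weights[k * num_hidden + j];
        sums_out[j] = sum;
    }
}
```

With next-layer errors {0.5, -0.25} and weights {1, 2, 3} and {4, 0, 2}, the per-unit version yields -0.5, 1.0, 1.0, while the shared version yields the running totals -0.5, 0.5, 1.5. Only the first unit agrees.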

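The second segment fixes this, but you say it doesn't work. The only reasonable conclusion is that the real problem lies elsewhere in your code, because the first segment is the one that shouldn't work.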

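As an aside: you should indeed be able to train all the perceptrons of a layer in parallel, because each perceptron depends only on its forward connections for its share of the error (in standard feed-forward backpropagation).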
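
A rough sketch of why this parallelizes, under assumptions: the names and the flat array layout are hypothetical, and a sigmoid activation is assumed. Each unit's error is computed by a function that only reads the next layer's errors and weights and writes a single output slot:

```c
#include <assert.h>
#include <stddef.h>

/* Error term for hidden unit j. Hypothetical flat layout:
   next_weights[k * num_hidden + j] is the weight from hidden unit j to
   unit k of the next layer. Only reads its inputs; no shared state. */
float hidden_delta(size_t j, const float *outputs,
                   const float *next_error, const float *next_weights,
                   size_t num_next, size_t num_hidden)
{
    assert(j < num_hidden);
    float sum = 0.0f;                      /* private to this unit */
    for (size_t k = 0; k < num_next; k++)
        sum += next_error[k] * next_weights[k * num_hidden + j];
    /* Sigmoid derivative, assumed from the question's formula. */
    return outputs[j] * (1.0f - outputs[j]) * sum;
}

/* Fill deltas[] for a whole layer. Each iteration writes only deltas[j],
   so the iterations can run in any order or be split across threads. */
void layer_deltas(const float *outputs, const float *next_error,
                  const float *next_weights, size_t num_next,
                  size_t num_hidden, float *deltas)
{
    for (size_t j = 0; j < num_hidden; j++)
        deltas[j] = hidden_delta(j, outputs, next_error, next_weights,
                                 num_next, num_hidden);
}
```

Because no iteration reads what another iteration wrote, the loop in layer_deltas is a candidate for something like an OpenMP parallel for. The layers themselves still have to be processed in order, from the output layer backwards.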

【Comments】:

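Hmmmm... one thing I noticed is that if I add sum *= j+1 after calculating sum, it oddly works. Do you have any idea which part could be causing the problem? I know you don't have my code, but is it most likely the piece that updates the weights?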

【Solution 2】:

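I seem to have found the problem. Basically, my TrainPerceptron(Perceptron* p, float error, float moment) function, which trains a single perceptron, was given the perceptron's error through a parameter even though the Perceptron struct already has an error attribute. I was passing the error attribute to the function, but I guess something got mixed up, because after I removed that parameter and used the error stored in the Perceptron struct, it worked.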
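
The body of TrainPerceptron is not shown in the thread, so the following is only a sketch of the shape of that change, not the actual fix. SketchPerceptron, train_perceptron_sketch, MAX_CONNECTIONS, the rate parameter, and the momentum update rule are all assumptions made for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONNECTIONS 8  /* hypothetical fixed capacity for the sketch */

/* Hypothetical stand-in for the asker's Perceptron struct. */
typedef struct {
    size_t numOfConnections;
    float weights[MAX_CONNECTIONS];
    float lastDelta[MAX_CONNECTIONS];  /* previous weight change (momentum) */
    float error;                       /* set by the backward pass */
    float output;
} SketchPerceptron;

/* Update one unit's weights. There is no separate error parameter: the
   function reads p->error, so the caller cannot pass a stale or
   mismatched value. Assumed rule: delta = rate * error * input
   + moment * previous delta. */
void train_perceptron_sketch(SketchPerceptron *p, const float *inputs,
                             float rate, float moment)
{
    assert(p != NULL && inputs != NULL);
    assert(p->numOfConnections <= MAX_CONNECTIONS);
    for (size_t i = 0; i < p->numOfConnections; i++) {
        float delta = rate * p->error * inputs[i] + moment * p->lastDelta[i];
        p->weights[i] += delta;
        p->lastDelta[i] = delta;
    }
}
```

Keeping a single source for the error (the struct field) removes one way for the backward pass and the weight update to disagree about which value is current.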

【Comments】:

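I'm glad you found it, but I get the feeling you are counting on "it works" to tell you whether you have made a mistake, and I want to warn you that this is a bad way to approach mathematical problems like this one. You should verify, step by step, that your code follows the formula exactly. Even if it is wrong, it may well still "work". You might just be creating a new network architecture that learns something, but if you do that without realizing it, you will end up very confused later on. Check your code first instead of testing it, and then test it properly.
For example, run backpropagation learning with known values. Use another piece of software, run it with the same values, and verify that your algorithm gives exactly the same results.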
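
One way to act on this advice is to work a small case by hand (or take the values from another tool) and compare them with what the code produces. The helper below is a hypothetical sketch of such a comparison; first_mismatch is not part of the asker's code:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first element where actual and expected differ
   by more than tol, or -1 if all n elements agree within tol. */
int first_mismatch(const float *actual, const float *expected,
                   size_t n, float tol)
{
    assert(tol >= 0.0f);
    for (size_t i = 0; i < n; i++) {
        float diff = actual[i] - expected[i];
        if (diff < 0.0f)
            diff = -diff;              /* absolute difference */
        if (diff > tol)
            return (int)i;
    }
    return -1;
}
```

As a hand-worked reference value, assuming a sigmoid unit: a hidden unit with output 0.5, connected by weights 1 and 4 to two next-layer units with errors 0.5 and -0.25, should get the error 0.5 * (1 - 0.5) * (0.5 * 1 + (-0.25) * 4) = 0.25 * (-0.5) = -0.125. If the code reports anything else for that unit, the backward pass does not follow the formula.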