Cross entropy applied to backpropagation in neural network

【Question title】: Cross entropy applied to backpropagation in neural network
【Posted】: 2017-06-01 23:36:28
【Question】:

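I watched Dave Miller's excellent video on building a neural network from scratch in C++ here: https://vimeo.com/19569529

Here is the full source code referenced in the video: http://inkdrop.net/dave/docs/neural-net-tutorial.cpp

It uses mean squared error as the cost function. I'm interested in using a neural network for binary classification, so I would like to use cross entropy as the cost function instead. If possible, I'd like to add this to the existing code, since I'm already working with it.

How exactly would that be applied here?

Is the only difference how the error is calculated for the output layer, or do the equations change all the way through backpropagation?

Does anything change at all? Or are MSE and cross entropy used only to report the overall error, with no bearing on backpropagation?

Edit for clarity:

Here are the relevant functions.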

//output layer - seems like error is just target value minus calculated value
void Neuron::calcOutputGradients(double targetVal)
{
    double delta = targetVal - m_outputVal;
    m_gradient = delta * Neuron::transferFunctionDerivative(m_outputVal);
}

double Neuron::sumDOW(const Layer &nextLayer) const
{
    double sum = 0.0;

    // Sum our contributions of the errors at the nodes we feed.

    for (unsigned n = 0; n < nextLayer.size() - 1; ++n) {
        sum += m_outputWeights[n].weight * nextLayer[n].m_gradient;
    }

    return sum;
}

void Neuron::calcHiddenGradients(const Layer &nextLayer)
{
    double dow = sumDOW(nextLayer);
    m_gradient = dow * Neuron::transferFunctionDerivative(m_outputVal);
}


void Neuron::updateInputWeights(Layer &prevLayer)
{
    // The weights to be updated are in the Connection container in the neurons in the preceding layer

    for (unsigned n = 0; n < prevLayer.size(); ++n) {
        Neuron &neuron = prevLayer[n];
        double oldDeltaWeight = neuron.m_outputWeights[m_myIndex].deltaWeight;    

        //calculate new weight for neuron with momentum
        double newDeltaWeight = eta * neuron.getOutputVal() * m_gradient + alpha * oldDeltaWeight;

        neuron.m_outputWeights[m_myIndex].deltaWeight = newDeltaWeight;
        neuron.m_outputWeights[m_myIndex].weight += newDeltaWeight;
    }
}

Tags: c++ machine-learning neural-network backpropagation gradient-descent

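【Solution 1】:

I finally found the answer here: https://visualstudiomagazine.com/articles/2014/04/01/neural-network-cross-entropy-error.aspx

You only need to change how the error is calculated for the output layer. The hidden-layer gradients and the weight updates are computed from the output-layer gradient exactly as before.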
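The reason only the output layer changes can be sketched with the chain rule. The symbols here are assumptions introduced for illustration: t is targetVal, y is m_outputVal, z is the neuron's weighted input, f is the transfer function, and the cross-entropy line assumes a logistic sigmoid output, y = σ(z). The sign convention follows the code, where m_gradient holds the negative derivative of the error with respect to z.

```latex
\begin{align}
E_{\text{MSE}} &= \tfrac{1}{2}(t - y)^2
  &\Rightarrow\quad -\frac{\partial E_{\text{MSE}}}{\partial z} &= (t - y)\, f'(z) \\
E_{\text{CE}} &= -\bigl[t \ln y + (1 - t)\ln(1 - y)\bigr]
  &\Rightarrow\quad -\frac{\partial E_{\text{CE}}}{\partial z}
  &= \frac{t - y}{y(1 - y)} \cdot y(1 - y) = t - y \\
\delta_j^{\text{hidden}} &= f'(z_j) \sum_k w_{jk}\, \delta_k
  && \text{(no direct dependence on } E\text{)}
\end{align}
```

The factor y(1 - y) is the sigmoid's derivative, which cancels the denominator of the cross-entropy derivative. The hidden-layer rule only consumes the gradients of the next layer, so it is the same for both cost functions.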

The relevant function to change is:

void Neuron::calcOutputGradients(double targetVal)

For mean squared error, use:

double delta = targetVal - m_outputVal;
m_gradient = delta * Neuron::transferFunctionDerivative(m_outputVal);

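For cross entropy, provided the output neuron uses a logistic sigmoid (or softmax) activation so that its output lies in (0, 1), the activation derivative cancels out and you can simply use: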

m_gradient = targetVal - m_outputVal;
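As a sanity check on that one-line change, here is a minimal standalone sketch that compares target minus output against a finite-difference estimate of the cross-entropy gradient. It is not part of the tutorial's code: the free functions sigmoid, crossEntropy, outputGradientCrossEntropy and numericNegGradient are hypothetical names introduced for illustration, and the sketch assumes a sigmoid output neuron, which the excerpt above does not confirm.

```cpp
#include <cmath>

// Logistic sigmoid: maps any real input into (0, 1), the range cross entropy needs.
// Assumption: the output neuron uses this activation.
double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Binary cross-entropy error for one output y in (0, 1) and a target t in {0, 1}.
double crossEntropy(double t, double y) {
    return -(t * std::log(y) + (1.0 - t) * std::log(1.0 - y));
}

// Output-layer gradient in the tutorial's sign convention (target minus output).
// Hypothetical free-function form of the body of Neuron::calcOutputGradients.
double outputGradientCrossEntropy(double targetVal, double outputVal) {
    return targetVal - outputVal;
}

// Central finite difference of -dE/dz with respect to the weighted input z,
// used only to check the analytic shortcut above.
double numericNegGradient(double t, double z, double h = 1e-6) {
    double ePlus = crossEntropy(t, sigmoid(z + h));
    double eMinus = crossEntropy(t, sigmoid(z - h));
    return -(ePlus - eMinus) / (2.0 * h);
}
```

For any weighted input z and target t of 0 or 1, outputGradientCrossEntropy(t, sigmoid(z)) and numericNegGradient(t, z) should agree to within numerical precision. If the output transfer function is something else, such as tanh, the cancellation does not hold and this shortcut should not be used as is.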
