【Question Title】: How to increase accuracy of network running on MNIST
【Posted】: 2019-04-04 19:24:29
【Question Description】:

I followed this code: https://github.com/HyTruongSon/Neural-Network-MNIST-CPP

It is easy to understand and reaches 94% accuracy. I have to convert it into a deeper network (somewhere from 5 to 10 layers). To start simple, I added just one extra hidden layer, with 256 neurons in each hidden layer. However, no matter how long I train, the accuracy never goes above 50%. Here is how I modified the code. I added the extra layer like this:

// From layer 1 to layer 2. Or: Input layer - Hidden layer 1
double *w1[n1 + 1], *delta1[n1 + 1], *out1;

// From layer 2 to layer 3. Or: Hidden layer 1 - Hidden layer 2
double *w2[n2 + 1], *delta2[n2 + 1], *in2, *out2, *theta2;

// From layer 3 to layer 4. Or: Hidden layer 2 - Output layer
double *w3[n3 + 1], *delta3[n3 + 1], *in3, *out3, *theta3;

// Output layer
double *in4, *out4, *theta4;
double expected[n4 + 1];
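Since these layers are raw pointers, each newly added matrix also has to be allocated and given small random initial weights before training, or the deeper network cannot learn at all. A minimal sketch of such a helper (the name `alloc_weights` and the 0.1 scale are illustrative assumptions, not the repo's exact values):

```cpp
#include <cstdlib>

// Allocate a 1-indexed (rows+1) x (cols+1) weight matrix and fill it with
// small uniform random values in [-0.05, 0.05]. Small symmetric initial
// weights keep the sigmoid units away from their flat saturation regions.
double **alloc_weights(int rows, int cols) {
    double **w = new double *[rows + 1];
    for (int i = 0; i <= rows; ++i) {
        w[i] = new double[cols + 1];
        for (int j = 0; j <= cols; ++j) {
            w[i][j] = ((double)std::rand() / RAND_MAX - 0.5) * 0.1;
        }
    }
    return w;
}
```

With the question's array-of-pointers declarations the inner allocation loop would be done in place; the helper above just shows the initialization pattern that every new layer (here `w2` and `delta2`) needs.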

The feed-forward part was modified like this:

void perceptron() {
    for (int i = 1; i <= n2; ++i) {
        in2[i] = 0.0;
    }

    for (int i = 1; i <= n3; ++i) {
        in3[i] = 0.0;
    }
    for (int i = 1; i <= n4; ++i) {
        in4[i] = 0.0;
    }

    for (int i = 1; i <= n1; ++i) {
        for (int j = 1; j <= n2; ++j) {
            in2[j] += out1[i] * w1[i][j];
        }
    }

    for (int i = 1; i <= n2; ++i) {
        out2[i] = sigmoid(in2[i]);
    }

    for (int i = 1; i <= n2; ++i) {
        for (int j = 1; j <= n3; ++j) {
            in3[j] += out2[i] * w2[i][j];
        }
    }

    for (int i = 1; i <= n3; ++i) {
        out3[i] = sigmoid(in3[i]);
    }

    for (int i = 1; i <= n3; ++i) {
        for (int j = 1; j <= n4; ++j) {
            in4[j] += out3[i] * w3[i][j];
        }
    }

    for (int i = 1; i <= n4; ++i) {
        out4[i] = sigmoid(in4[i]);
    }
}

Back-propagation was changed like this:

void back_propagation() {
    double sum;

    for (int i = 1; i <= n4; ++i) {
        theta4[i] = out4[i] * (1 - out4[i]) * (expected[i] - out4[i]);
    }

    for (int i = 1; i <= n3; ++i) {
        sum = 0.0;
        for (int j = 1; j <= n4; ++j) {
            sum += w3[i][j] * theta4[j];
        }
        theta3[i] = out3[i] * (1 - out3[i]) * sum;
    }

    for (int i = 1; i <= n3; ++i) {
        for (int j = 1; j <= n4; ++j) {
            delta3[i][j] = (learning_rate * theta4[j] * out3[i]) + (momentum * delta3[i][j]);
            w3[i][j] += delta3[i][j];
        }
    }


    for (int i = 1; i <= n2; ++i) {
        for (int j = 1; j <= n3; ++j) {
            delta2[i][j] = (learning_rate * theta3[j] * out2[i]) + (momentum * delta2[i][j]);
            w2[i][j] += delta2[i][j];
        }
    }


    for (int i = 1; i <= n1; ++i) {
        for (int j = 1; j <= n2; ++j) {
            delta1[i][j] = (learning_rate * theta2[j] * out1[i]) + (momentum * delta1[i][j]);
            w1[i][j] += delta1[i][j];
        }
    }
}

I am posting my modifications because I may have gone wrong somewhere in here. I even set the epochs variable to 1000 and let it train for 24 hours, and there was still no progress :(. I am really frustrated with this, and I cannot figure out where I might be wrong.

【Question Discussion】:

    Tags: c++ neural-network backpropagation gradient-descent multi-layer


    【Solution 1】:

    Did you forget to add the back-propagation step for the theta2 term, from layer 3 back to layer 2?

    for (int i = 1; i <= n2; ++i) {
        sum = 0.0;
        for (int j = 1; j <= n3; ++j) {
           sum += w2[i][j] * theta3[j];
        }
        theta2[i] = out2[i] * (1 - out2[i]) * sum;
    }
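A general way to catch a missing term like this is numerical gradient checking: perturb a weight, measure the change in loss, and compare against the analytic gradient. A minimal sketch on a hypothetical one-weight sigmoid unit (the function names are illustrative, not from the question's code):

```cpp
#include <cmath>

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Squared-error loss of a single sigmoid unit: out = sigmoid(w * x).
double loss(double w, double x, double target) {
    double out = sigmoid(w * x);
    return 0.5 * (target - out) * (target - out);
}

// Analytic gradient dLoss/dw -- the same out * (1 - out) * (expected - out)
// form as the theta terms in the question, times the input.
double analytic_grad(double w, double x, double target) {
    double out = sigmoid(w * x);
    return -(target - out) * out * (1.0 - out) * x;
}

// Central-difference estimate of the same gradient.
double numeric_grad(double w, double x, double target) {
    const double eps = 1e-6;
    return (loss(w + eps, x, target) - loss(w - eps, x, target)) / (2.0 * eps);
}
```

If the backward pass skips a layer's theta, the analytic and numeric gradients for that layer's incoming weights disagree immediately, long before 24 hours of training are wasted.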
    

    【Discussion】:

    • I can't believe I made such a naive mistake. Let me train it again. Hopefully it works :)
    • Hope it gives you a nice result :). If it works, remember to upvote and/or accept the answer.
    • Do you think changing the sigmoid to a softmax in the last layer would also help improve accuracy? What other steps should I take?
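Regarding the softmax follow-up: a numerically stable softmax over the output activations can be sketched as below, using 1-indexed vectors to match the question's convention (the function name is an illustrative assumption). Subtracting the maximum before exponentiating prevents overflow and does not change the result:

```cpp
#include <cmath>
#include <vector>

// Replace the final sigmoid with softmax over indices 1..n:
// out[i] = exp(in[i]) / sum_j exp(in[j]).
void softmax_layer(const std::vector<double> &in, std::vector<double> &out, int n) {
    // Find the largest activation for the max-subtraction stability trick.
    double max_in = in[1];
    for (int i = 2; i <= n; ++i) {
        if (in[i] > max_in) max_in = in[i];
    }
    // Exponentiate the shifted activations and accumulate their sum.
    double sum = 0.0;
    for (int i = 1; i <= n; ++i) {
        out[i] = std::exp(in[i] - max_in);
        sum += out[i];
    }
    // Normalize so the outputs form a probability distribution.
    for (int i = 1; i <= n; ++i) {
        out[i] /= sum;
    }
}
```

Softmax is usually paired with a cross-entropy loss rather than squared error; with that pairing the output-layer delta simplifies to `expected[i] - out4[i]`.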