[Question title]: Keras CNN model predicts the same value for all inputs and does not increase accuracy during training
[Posted]: 2020-03-25 12:01:19
[Question]:

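I am trying to follow the Nvidia self-driving CNN paper (https://images.nvidia.com/content/tegra/automotive/images/2016/solutions/pdf/end-to-end-dl-using-px.pdf). However, when I run the code, my accuracy stays constant during training and the loss is very small. The model also predicts the same value for every input, very close to 0. The expected outputs are mostly between -4 and +4.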

import tensorflow.compat.v1 as tf
import scipy.misc
import random
from tensorflow import keras
from tensorflow.keras import datasets, layers, models
import numpy as np

model = models.Sequential()
model.add(layers.Conv2D(24, (5, 5), strides=(2, 2), input_shape=(66, 200, 3)))
model.add(layers.Activation('relu'))
model.add(layers.Conv2D(36, (5, 5), strides=(2, 2)))
model.add(layers.Activation('relu'))
model.add(layers.Conv2D(48, (5, 5), strides=(2, 2)))
model.add(layers.Activation('relu'))
model.add(layers.Conv2D(64, (3, 3), strides=(1, 1)))
model.add(layers.Activation('relu'))
model.add(layers.Conv2D(64, (3, 3), strides=(1, 1)))
model.add(layers.Activation('relu'))
model.add(layers.Flatten())
model.add(layers.Dense(1164))
model.add(layers.Activation('relu'))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(100))
model.add(layers.Activation('relu'))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(50))
model.add(layers.Activation('relu'))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(10))
model.add(layers.Activation('relu'))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(1))
model.add(layers.Activation('linear'))

model.compile(optimizer = 'adam', loss= 'mse', metrics=['accuracy'])
epochs = 30
batchSize = 100
xs, ys = LoadTrainSet()
print("train batch loaded")
x, y = LoadTestSet()
print("test batch loaded")

xs = np.array(xs)
x = np.array(x)
#ys = tf.math.l2_normalize(np.array(ys))
#y = tf.math.l2_normalize(np.array(y))
#(Suggestion by Simon)Replaced by:
epsilon = 1e-12
ys =  np.array(ys) / tf.math.sqrt(tf.math.reduce_mean(np.array(ys)**2), epsilon)
y =  np.array(y) / tf.math.sqrt(tf.math.reduce_mean(np.array(y)**2), epsilon)

history = model.fit(xs, ys, batch_size=batchSize, epochs=epochs)
testLoss, testAcc = model.evaluate(x, y, verbose=2)

Training:

Epoch 4/30
5001/5001 [==============================] - 9s 2ms/sample - loss: 1.9974e-04 - accuracy: 0.0382
Epoch 5/30
5001/5001 [==============================] - 9s 2ms/sample - loss: 2.0004e-04 - accuracy: 0.0382
Epoch 6/30
5001/5001 [==============================] - 8s 2ms/sample - loss: 2.0040e-04 - accuracy: 0.0382
Epoch 7/30
5001/5001 [==============================] - 8s 2ms/sample - loss: 1.9986e-04 - accuracy: 0.0382
Epoch 8/30
5001/5001 [==============================] - 8s 2ms/sample - loss: 2.0064e-04 - accuracy: 0.0382
Epoch 9/30
5001/5001 [==============================] - 8s 2ms/sample - loss: 2.0014e-04 - accuracy: 0.0382
Epoch 10/30
5001/5001 [==============================] - 8s 2ms/sample - loss: 1.9993e-04 - accuracy: 0.0382

Predictions:

MODEL PREDICTIONS:
[[0.00018978]
 [0.00018978]
 [0.00018978]
 [0.00018978]
 [0.00018978]
 [0.00018978]
 [0.00018978]
 [0.00018978]
 [0.00018978]
 [0.00018978]]
ACTUAL VALUES:
[[ 0.01337768]
 [-0.00774151]
 [-0.00143646]
 [ 0.        ]
 [ 0.00287291]
 [-0.00287291]
 [ 0.        ]
 [-0.00199569]
 [ 0.02122884]
 [ 0.01083373]]

This is my first post, so please excuse any mistakes. Any help would be greatly appreciated.

[Comments]:

    Tags: python tensorflow neural-network conv-neural-network


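    [Solution 1]:

    Since your labels are quite small (

    Alternatively, you could try mean absolute error, because its gradient does not shrink when the errors are small.

    Edit: On a second look, I missed that you are in fact normalizing your data with tf.math.l2_normalize. But I think you need to pass it an axis argument (it looks like you are normalizing by the norm of the entire label set treated as a single tensor). With that method you divide by the square root of the sum of squares of the original labels. Instead, you should divide by the square root of the mean of the squares.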
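To illustrate the point about mean absolute error, here is a minimal NumPy sketch comparing the gradients of the two losses with respect to the predictions. The helper names `mse_grad` and `mae_grad` and the sample targets are made up for illustration; they are not part of the question's code.

```python
import numpy as np

def mse_grad(pred, target):
    # Gradient of mean((pred - target)^2) with respect to pred.
    return 2.0 * (pred - target) / pred.size

def mae_grad(pred, target):
    # Gradient of mean(|pred - target|) with respect to pred (0 where they are equal).
    return np.sign(pred - target) / pred.size

# Hypothetical small targets, similar in scale to the normalized labels in the question.
target = np.array([0.013, -0.008, 0.021, -0.003])
pred = np.zeros_like(target)  # a model that always outputs 0

g_mse = mse_grad(pred, target)
g_mae = mae_grad(pred, target)
print("MSE gradient:", g_mse)
print("MAE gradient:", g_mae)
```

The MSE gradient scales with the size of the error, so it becomes tiny when both labels and predictions are near zero, while the MAE gradient keeps the same magnitude. In Keras, switching losses would amount to passing `loss='mae'` to `model.compile`.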
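The difference between the two normalizations can be sketched in plain NumPy. The functions `l2_normalize` and `rms_normalize` below are illustrative stand-ins written for this example (the first mimics whole-array L2 normalization as described above), and the labels are random values drawn from the range mentioned in the question, not the real data set.

```python
import numpy as np

def l2_normalize(values, epsilon=1e-12):
    # Divide by the square root of the SUM of squares over the whole array.
    values = np.asarray(values, dtype=np.float64)
    return values / np.sqrt(max(np.sum(values ** 2), epsilon))

def rms_normalize(values, epsilon=1e-12):
    # Divide by the square root of the MEAN of squares (root mean square).
    values = np.asarray(values, dtype=np.float64)
    return values / max(np.sqrt(np.mean(values ** 2)), epsilon)

# Assumed stand-in labels: 5001 values between -4 and +4.
rng = np.random.default_rng(0)
labels = rng.uniform(-4.0, 4.0, size=5001)

l2_scaled = l2_normalize(labels)
rms_scaled = rms_normalize(labels)

print("max |label| after L2 normalization: ", np.max(np.abs(l2_scaled)))
print("max |label| after RMS normalization:", np.max(np.abs(rms_scaled)))
print("mean square after L2 normalization: ", np.mean(l2_scaled ** 2))
```

The two results differ by a factor of sqrt(N), so with thousands of samples the L2-normalized labels are squeezed close to zero. After whole-array L2 normalization the mean of the squared labels is exactly 1/N; for N = 5001 that is about 2.0e-4, which is consistent with the loss in the training log for a model that predicts roughly 0 for every input.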

    [Discussion]:
