【Question title】: Accuracy drops after changing training data with one iteration
【Posted】: 2017-02-09 13:48:57
【Question】:

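I have been working on training LeNet in TensorFlow to classify German traffic signs. I modified the first and last layers of LeNet so that the network accepts both 1-channel and 3-channel images (first layer) and outputs 43 classes (final fully connected layer).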

import tensorflow as tf
from tensorflow.contrib.layers import flatten

def LeNet(x, inputdepth):
    # Hyperparameters
    mu = 0
    sigma = 0.1

    # Solution: Layer 1: Convolutional. Input = 32x32xinputdepth. Output = 28x28x6
    conv1_W = tf.Variable(tf.truncated_normal(shape=(5, 5, inputdepth, 6), mean=mu, stddev=sigma))
    conv1_b = tf.Variable(tf.zeros(6))
    conv1 = tf.nn.conv2d(x, conv1_W, strides=[1, 1, 1, 1], padding='VALID') + conv1_b

    # Solution: Activation
    conv1 = tf.nn.relu(conv1)

    # Solution: Pooling. Input = 28x28x6. Output = 14x14x6
    conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')

    # Solution: Layer 2: Convolutional. Output = 10x10x16
    conv2_W = tf.Variable(tf.truncated_normal(shape=(5, 5, 6, 16), mean=mu, stddev=sigma))
    conv2_b = tf.Variable(tf.zeros(16))
    conv2 = tf.nn.conv2d(conv1, conv2_W, strides=[1, 1, 1, 1], padding='VALID') + conv2_b

    # Solution: Activation
    conv2 = tf.nn.relu(conv2)

    # Solution: Pooling. Input = 10x10x16. Output = 5x5x16
    conv2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')

    # Solution: Flatten. Input = 5x5x16. Output = 400
    fc0 = flatten(conv2)

    # Solution: Layer 3: Fully Connected. Input = 400, Output = 120
    fc1_W = tf.Variable(tf.truncated_normal(shape=(400, 120), mean=mu, stddev=sigma))
    fc1_b = tf.Variable(tf.zeros(120))
    fc1 = tf.matmul(fc0, fc1_W) + fc1_b

    # Solution: Activation
    fc1 = tf.nn.relu(fc1)

    # Solution: Layer 4: Fully Connected. Input = 120, Output = 84
    fc2_W = tf.Variable(tf.truncated_normal(shape=(120, 84), mean=mu, stddev=sigma))
    fc2_b = tf.Variable(tf.zeros(84))
    fc2 = tf.matmul(fc1, fc2_W) + fc2_b

    # Solution: Activation
    fc2 = tf.nn.relu(fc2)

    # Solution: Layer 5: Fully Connected. Input = 84, Output = 43
    fc3_W = tf.Variable(tf.truncated_normal(shape=(84, 43), mean=mu, stddev=sigma))
    fc3_b = tf.Variable(tf.zeros(43))
    logits = tf.matmul(fc2, fc3_W) + fc3_b

    return logits
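
The layer sizes quoted in the comments above can be checked with a little arithmetic. The sketch below assumes 'VALID' padding throughout, as in the network code; the helper name `valid_output_size` is made up for illustration and is not part of TensorFlow.

```python
def valid_output_size(input_size, kernel_size, stride):
    """Spatial output size of a conv/pool layer with 'VALID' padding (no zero padding)."""
    return (input_size - kernel_size) // stride + 1

# (kernel size, stride) for conv1, pool1, conv2, pool2 in the network above.
layers = [(5, 1), (2, 2), (5, 1), (2, 2)]

sizes = [32]  # input images are 32x32
for kernel_size, stride in layers:
    sizes.append(valid_output_size(sizes[-1], kernel_size, stride))

# The second convolution has 16 output channels, so the flattened vector
# has (final height) * (final width) * 16 entries.
flattened = sizes[-1] * sizes[-1] * 16

print(sizes)      # spatial size after each layer
print(flattened)  # input width of the first fully connected layer
```

The input depth (1 or 3) only changes the shape of the first convolution kernel, so the spatial sizes and the 400-wide flattened vector are the same for grayscale and RGB inputs.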

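Since the network is configured to accept both 1-channel and 3-channel images (through the depth parameter), I am trying various preprocessing methods on the input training images (grayscale conversion, normalization to [0, 1], and scaling to [-0.5, 0.5]) and evaluating the accuracy for each one. I have six variants of the processed data: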

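Original RGB images
Images converted to grayscale
Grayscale images normalized to [0, 1]
Grayscale images scaled to zero mean and unit variance, [-0.5, 0.5]
RGB images normalized to [0, 1]
RGB images scaled to zero mean and unit variance, [-0.5, 0.5]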
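
A minimal sketch of how six such variants might be produced, assuming `uint8` images of shape (N, 32, 32, 3). The helper names, the random stand-in data, and the luminance weights are assumptions for illustration, not the asker's code. Note that `scale_centered` only maps pixel values into [-0.5, 0.5]; it does not standardize to unit variance.

```python
import numpy as np

def to_grayscale(rgb):
    """Weighted sum over the channel axis; keeps a trailing axis so the shape is (N, H, W, 1)."""
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)  # assumed luminance weights
    gray = rgb.astype(np.float32) @ weights
    return gray[..., np.newaxis]

def normalize_01(images):
    """Map pixel values from [0, 255] to [0, 1]."""
    return images.astype(np.float32) / 255.0

def scale_centered(images):
    """Map pixel values from [0, 255] to [-0.5, 0.5]."""
    return images.astype(np.float32) / 255.0 - 0.5

# Random stand-in for the real training images.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
gray = to_grayscale(X)

variants = {
    'RGB': X.astype(np.float32),
    'RGBNormalized': normalize_01(X),
    'ScaledRGB': scale_centered(X),
    'Grayscale': gray,
    'GrayScaleNormalized': normalize_01(gray),
    'GrayScaleScaled': scale_centered(gray),
}

for name, data in variants.items():
    print(name, data.shape, float(data.min()), float(data.max()))
```

Keeping the trailing channel axis on the grayscale arrays means the last dimension is always 1 or 3, so the input depth can be read directly from the array shape.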

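I want to build a pipeline in a loop that takes one type of preprocessed data per iteration and runs training and validation on it. My code is as follows: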

inputData = [
             ('RGB',X_train, X_valid),
             ('RGBNormalized', normalizedRGB_train, normalizedRGB_valid),
             ('ScaledRGB', scaledRGB_train, scaledRGB_valid),
             ('Grayscale',grayimage_train, grayimage_valid),
             ('GrayScaleNormalized',normalizedGray_train, normalizedGray_valid), 
             ('GrayScaleScaled',scaledGray_train, scaledGray_valid) 
            ]

inputData is a list of tuples. In each tuple, elem[0] is the name, elem[1] is the training set, and elem[2] is the validation set. My pipeline is as follows:

def evaluate(X_data, y_data):
    num_examples = len(X_data)
    total_accuracy = 0
    sess = tf.get_default_session()
    for offset in range(0,num_examples, BATCH_SIZE):
        batch_x, batch_y = X_data[offset:offset+BATCH_SIZE], y_data[offset:offset+BATCH_SIZE]
        accuracy = sess.run(accuracy_operation, feed_dict = {x:batch_x, y:batch_y})
        total_accuracy += (accuracy * len(batch_x))
    return total_accuracy / num_examples

import numpy as np
import tensorflow as tf
from sklearn.utils import shuffle


# Simulation Control Parameters
EPOCHS = 10
BATCH_SIZE = 128
rate = 0.0001

# Variable to store the accuracy of the model
model_performance = np.zeros((len(inputData),EPOCHS))
modelIndex = 0


for name,trainingData, validationData in inputData:
    if np.shape(trainingData)[-1] == 3:
        depth = 3
    else:
        depth = 1

    # Create tensors for input data
    x = tf.placeholder(tf.float32, (None, 32, 32,depth))
    y = tf.placeholder(tf.int32, (None))
    one_hot_y = tf.one_hot(y,43)

    # Tensor Operations
    logits = LeNet(x,depth)
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,one_hot_y)
    loss_operation = tf.reduce_mean(cross_entropy)
    optimizer = tf.train.AdamOptimizer(learning_rate=rate)
    training_operation = optimizer.minimize(loss_operation)
    correct_prediction = tf.equal(tf.argmax(logits,1), tf.argmax(one_hot_y,1))
    accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


    # Pipeline for training and evaluation
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        num_examples = len(X_train)

        print("Training on...",name,'data', 'with input size of',np.shape(trainingData))
        print()
        for i in range(EPOCHS):
            X_train, y_train = shuffle(trainingData, y_train)
            for offset in range(0, num_examples, BATCH_SIZE):
                end = offset + BATCH_SIZE
                batch_x, batch_y = X_train[offset:end], y_train[offset:end]
                sess.run(training_operation, feed_dict = {x: batch_x, y: batch_y})

            validation_accuracy = evaluate(validationData, y_valid)
            print("EPOCH {} ...".format(i+1))
            print("Validation Accuracy = {:.3f}".format(validation_accuracy))
            print()
            model_performance[modelIndex][i] = validation_accuracy

        modelIndex = modelIndex + 1

    sess.close()

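If I train the network on the input data without any preprocessing, the accuracy is in the 80-90% range. Running the network inside the loop, however, shows a strange drop in accuracy, as shown below: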

    Training on... RGB data with input size of (34799, 32, 32, 3)

EPOCH 1 ...
Training Accuracy = 0.038
Validation Accuracy = 0.598

EPOCH 2 ...
Training Accuracy = 0.057
Validation Accuracy = 0.055

EPOCH 3 ...
Training Accuracy = 0.057
Validation Accuracy = 0.055

EPOCH 4 ...
Training Accuracy = 0.058
Validation Accuracy = 0.054

EPOCH 5 ...
Training Accuracy = 0.057
Validation Accuracy = 0.054

Training on... RGBNormalized data with input size of (34799, 32, 32, 3)

EPOCH 1 ...
Training Accuracy = 0.054
Validation Accuracy = 0.042

EPOCH 2 ...
Training Accuracy = 0.047
Validation Accuracy = 0.049

EPOCH 3 ...
Training Accuracy = 0.054
Validation Accuracy = 0.048

EPOCH 4 ...
Training Accuracy = 0.057
Validation Accuracy = 0.054

EPOCH 5 ...
Training Accuracy = 0.054
Validation Accuracy = 0.048

Training on... ScaledRGB data with input size of (34799, 32, 32, 3)

EPOCH 1 ...
Training Accuracy = 0.056
Validation Accuracy = 0.054

EPOCH 2 ...
Training Accuracy = 0.057
Validation Accuracy = 0.054

EPOCH 3 ...
Training Accuracy = 0.057
Validation Accuracy = 0.054

EPOCH 4 ...
Training Accuracy = 0.057
Validation Accuracy = 0.055

EPOCH 5 ...
Training Accuracy = 0.057
Validation Accuracy = 0.055
Training on... Grayscale data with input size of (34799, 32, 32, 1)

EPOCH 1 ...
Training Accuracy = 0.056
Validation Accuracy = 0.051

EPOCH 2 ...
Training Accuracy = 0.058
Validation Accuracy = 0.049

EPOCH 3 ...
Training Accuracy = 0.056
Validation Accuracy = 0.049

EPOCH 4 ...
Training Accuracy = 0.055
Validation Accuracy = 0.050

EPOCH 5 ...
Training Accuracy = 0.056
Validation Accuracy = 0.050
Training on... GrayScaleNormalized data with input size of (34799, 32, 32, 1)

EPOCH 1 ...
Training Accuracy = 0.055
Validation Accuracy = 0.074

EPOCH 2 ...
Training Accuracy = 0.057
Validation Accuracy = 0.054

EPOCH 3 ...
Training Accuracy = 0.056
Validation Accuracy = 0.061

EPOCH 4 ...
Training Accuracy = 0.057
Validation Accuracy = 0.055

EPOCH 5 ...
Training Accuracy = 0.057
Validation Accuracy = 0.054

Training on... GrayScaleScaled data with input size of (34799, 32, 32, 1)

EPOCH 1 ...
Training Accuracy = 0.055
Validation Accuracy = 0.049

EPOCH 2 ...
Training Accuracy = 0.056
Validation Accuracy = 0.060

EPOCH 3 ...
Training Accuracy = 0.058
Validation Accuracy = 0.054

EPOCH 4 ...
Training Accuracy = 0.056
Validation Accuracy = 0.062

EPOCH 5 ...
Training Accuracy = 0.056
Validation Accuracy = 0.061

Any idea where I am making a mistake?

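【Comments】:

Can you post the training accuracy logs as well?
I edited the question to include the training and validation accuracy for 5 epochs on each dataset.
Your network does not seem to be training at all. It is hard to tell where the problem is just by looking at the code, but the TensorFlow graph and the training operations generally look correct, so I suspect the problem is in your data. Check how you load and shuffle the data.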

Tags: python tensorflow

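【Answer 1】:

One reason could be that the data is not being shuffled correctly. If so, the network is not being trained properly on the images, regardless of which preprocessing technique you use.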
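
One concrete way the shuffle in the question's loop appears to misalign the data: the line `X_train, y_train = shuffle(trainingData, y_train)` overwrites `y_train` with the shuffled labels, while `trainingData` keeps its original order. From the second epoch onward, and for every later dataset, images are then paired with labels from a different ordering. That is consistent with the log, where validation accuracy is 0.598 after the first epoch and near chance afterwards. The sketch below reproduces the pattern on toy data; `shuffle_together` is a hypothetical NumPy stand-in for `sklearn.utils.shuffle`, and the toy arrays are assumptions for illustration.

```python
import numpy as np

def shuffle_together(features, labels, rng):
    """Shuffle two arrays with the same permutation (stand-in for sklearn.utils.shuffle)."""
    perm = rng.permutation(len(features))
    return features[perm], labels[perm]

rng = np.random.default_rng(0)

# Toy data: "image" i carries label i, so alignment is easy to measure.
features_original = np.arange(1000)
labels_original = np.arange(1000)

# Pattern from the question: the labels are overwritten, the features are not.
labels = labels_original.copy()
buggy_alignment = []
for epoch in range(3):
    X_epoch, labels = shuffle_together(features_original, labels, rng)
    buggy_alignment.append(float(np.mean(X_epoch == labels)))

# Safer pattern: shuffle into fresh names and never overwrite the source arrays.
safe_alignment = []
for epoch in range(3):
    X_epoch, y_epoch = shuffle_together(features_original, labels_original, rng)
    safe_alignment.append(float(np.mean(X_epoch == y_epoch)))

print("fraction of correctly paired samples, question's pattern:", buggy_alignment)
print("fraction of correctly paired samples, fresh names:", safe_alignment)
```

In the question's loop, the equivalent change would be to shuffle into per-epoch names (for example `X_epoch, y_epoch`) and to leave the original `X_train` and `y_train` untouched across epochs and datasets.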

【Comments】: