Tensorflow Gradient Descent bug: weights aren't changing and cost set to 1.0

【Question title】: Tensorflow Gradient Descent bug: weights aren't changing and cost set to 1.0
【Posted】: 2017-07-22 01:49:19
【Question description】:

I am trying to build a convolutional neural network, but I have run into some very strange problems.

First, here is my code:

import tensorflow as tf
import numpy as np
import matplotlib.image as mpimg
import glob

x = []
y = 1

for filename in glob.glob('trainig_data/*.jpg'):
    im = mpimg.imread(filename)
    x.append(im)
    if len(x) == 10:
        break
epochs = 5

weights = [tf.Variable(tf.random_normal([5,5,3,32],0.1)),
           tf.Variable(tf.random_normal([5,5,32,64],0.1)),
           tf.Variable(tf.random_normal([5,5,64,128],0.1)),
           tf.Variable(tf.random_normal([75*75*128,1064],0.1)),
           tf.Variable(tf.random_normal([1064,1],0.1))]

def CNN(x, weights):
    output = tf.nn.conv2d([x], weights[0], [1,1,1,1], 'SAME')
    output = tf.nn.relu(output)
    output = tf.nn.conv2d(output, weights[1], [1,2,2,1], 'SAME')
    output = tf.nn.relu(output)
    output = tf.nn.conv2d(output, weights[2], [1,2,2,1], 'SAME')
    output = tf.nn.relu(output)
    output = tf.reshape(output, [-1,75*75*128])
    output = tf.matmul(output, weights[3])
    output = tf.nn.relu(output)
    output = tf.matmul(output, weights[4])
    output = tf.reduce_sum(output)
    return output


sess = tf.Session()
prediction = CNN(tf.cast(x[0],tf.float32), weights)
cost = tf.reduce_mean(tf.square(prediction-y))
train = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
init = tf.global_variables_initializer()

sess.run(init)
for e in range(epochs):
    print('epoch:',e+1)
    for x_i in x:
        prediction = CNN(tf.cast(x_i,tf.float32), weights)
        sess.run([cost, train])
        print(sess.run(cost))
print('optimization finished!')
print(sess.run(prediction))

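Now here are my problems:

The values of the weights and filters don't change
The variable 'cost' is always 1.0
The prediction always outputs 0

After some debugging I found that the problem must come from the optimizer, because before I put the weights through the optimizer, the cost and the prediction were not 1.0 and 0.

I hope this is enough information for you to help me solve my problem.

P.S. I have already tried using tf.truncated_normal instead of tf.random_normal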

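【Comments】:

There is clearly something wrong with the inputs and the way you feed them. Did you check what is in x before creating the session?
x is a list filled with numpy arrays
I understand that part, but can you tell us its exact dimensions? You may be passing it to the network incorrectly.
batch*width*height*channels, or in other words 10*300*300*3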
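The dimensions mentioned in the comments can be checked before any graph is built. The following is a minimal sketch, assuming the images really are 300x300 RGB as stated; the zero-filled arrays are stand-ins for the loaded files, and `same_conv_output_size` is a hypothetical helper, not a TensorFlow function. It stacks the list into one batch and follows the spatial size through the three 'SAME'-padded convolutions (strides 1, 2, 2) to confirm the `75*75*128` used in the reshape.

```python
import math
import numpy as np

def same_conv_output_size(size, stride):
    """Spatial output size of a 'SAME'-padded convolution: ceil(size / stride)."""
    return math.ceil(size / stride)

# Stand-in for the list of images read in the question (assumed 300x300 RGB).
x = [np.zeros((300, 300, 3), dtype=np.uint8) for _ in range(10)]

# Stack the list into a single batch array and convert to float.
batch = np.stack(x).astype(np.float32)
print(batch.shape)  # (10, 300, 300, 3)

# Follow the spatial size through the three conv layers of the question.
size = batch.shape[1]
for stride in (1, 2, 2):
    size = same_conv_output_size(size, stride)
print(size, size * size * 128)  # 75 720000
```

Under these assumptions the flattened size matches the first dense weight matrix, so the shapes themselves are consistent.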

Tags: python tensorflow neural-network conv-neural-network gradient-descent

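【Solution 1】:

I think the problem is in your code. You need to define placeholders to feed your input, and you don't have any. You are passing a TensorFlow cast of a constant value (the first image, x[0]) to the model. Every time you call prediction = CNN(...) inside the epoch loop, your code adds a new copy of the model to the TensorFlow computation graph. Because cost and train were built from the first prediction, those later calls never affect what sess.run([cost, train]) computes. Overall, you are defining a model each time and giving it a constant image. Here is a link to a TensorFlow CNN model for MNIST images that I prepared earlier: https://github.com/dipendra009/MNIST_TF-Slim/blob/master/MNIST_TensorFlow.ipynb. I hope it helps. Also, have a look at the TensorFlow documentation on placeholders; it will help you understand this better.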
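What the answer describes can be sketched as follows. This is only an illustration under stated assumptions, not the asker's network: the image size, layer sizes, random stand-in data, and the names `build_graph` and `run_demo` are hypothetical. It uses the TF1-style API through `tensorflow.compat.v1`; on TensorFlow 1.x a plain `import tensorflow as tf` would play the same role. The graph (placeholders, model, cost, train op) is built exactly once, and the loop only feeds data. The standard deviation is passed by keyword because the second positional argument of `tf.random_normal` is the mean.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF1-style graph API

tf.disable_eager_execution()

IMAGE_SIZE = 8    # hypothetical; the question uses 300x300 images
NUM_CHANNELS = 3

def build_graph():
    """Build placeholders, model, cost and train op once."""
    x_p = tf.placeholder(tf.float32, shape=(None, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS))
    y_p = tf.placeholder(tf.float32, shape=(None, 1))
    # stddev given by keyword: the second positional argument is the mean.
    w_conv = tf.Variable(tf.random_normal([3, 3, NUM_CHANNELS, 4], stddev=0.1))
    w_out = tf.Variable(tf.random_normal([IMAGE_SIZE * IMAGE_SIZE * 4, 1], stddev=0.1))
    hidden = tf.nn.relu(tf.nn.conv2d(x_p, w_conv, strides=[1, 1, 1, 1], padding='SAME'))
    flat = tf.reshape(hidden, [-1, IMAGE_SIZE * IMAGE_SIZE * 4])
    prediction = tf.matmul(flat, w_out)
    cost = tf.reduce_mean(tf.square(prediction - y_p))
    train = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
    return x_p, y_p, w_out, cost, train

def run_demo(steps=5, seed=0):
    """Train for a few steps on random stand-in data, feeding via feed_dict."""
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    rng = np.random.RandomState(seed)
    images = rng.rand(10, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS).astype(np.float32)
    labels = np.ones((10, 1), dtype=np.float32)
    x_p, y_p, w_out, cost, train = build_graph()
    costs = []
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        w_before = sess.run(w_out)
        for _step in range(steps):
            # No graph construction here: only data is fed to existing ops.
            c, _unused = sess.run([cost, train], feed_dict={x_p: images, y_p: labels})
            costs.append(float(c))
        w_after = sess.run(w_out)
    return costs, w_before, w_after

costs, w_before, w_after = run_demo()
print(costs)
print('weights changed:', not np.allclose(w_before, w_after))
```

Keeping graph construction out of the loop also stops the graph from growing on every iteration, which is what repeated calls to `CNN(...)` in the question cause.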

【Discussion】:

Here is my improved training code: for e in range(epochs): print('epoch:',e+1) for x_i in x: sess.run([cost, train], feed_dict={x_p:x_i}) print(sess.run(cost, feed_dict={x_p:x_i})) print('optimization finished!') But nothing changed
You have to define placeholders in the graph to take the inputs. Did you look at the link I shared: github.com/dipendra009/MNIST_TF-Slim/blob/master/…
Sorry, I don't quite understand what you mean by "define placeholders in the graph"
Define the placeholders like this: train_data_node = tf.placeholder(tf.float32, shape=(BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS)) train_labels_node = tf.placeholder(tf.int64, shape= (BATCH_SIZE, )) eval_data = tf.placeholder(tf.float32, shape=(EVAL_BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS)). Then use them to define your model, e.g.: logits = model(train_data_node, True). Then run the model as: feed_dict = {train_data_node: batch_data, train_labels_node: batch_labels} _, l, lr, predictions = sess.run([optimizer, loss, learning_rate, train_prediction], feed_dict=feed_dict).
This didn't change anything
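The thread ends without a confirmed fix. One reading that is consistent with the three reported symptoms, though not established anywhere in the thread, is that the ReLU units have all gone inactive: with a target of y = 1, a prediction of 0 gives a squared-error cost of exactly 1.0, and a ReLU whose input is negative passes no gradient, so the weights stop changing. The NumPy sketch below illustrates this with a tiny hypothetical one-hidden-layer network; the weights and input are invented for the example and are not taken from the question.

```python
import numpy as np

def forward_backward(w1, w2, x, y):
    """One-hidden-layer ReLU net with squared-error cost; returns cost and gradients."""
    pre = x @ w1                      # hidden pre-activations
    hidden = np.maximum(pre, 0.0)     # ReLU
    pred = float(hidden @ w2)
    cost = (pred - y) ** 2
    d_pred = 2.0 * (pred - y)
    grad_w2 = d_pred * hidden
    d_pre = (d_pred * w2) * (pre > 0)  # ReLU passes gradient only where pre > 0
    grad_w1 = np.outer(x, d_pre)
    return cost, grad_w1, grad_w2

x = np.array([0.5, 1.0, 2.0])   # non-negative input, like pixel values
y = 1.0
w1 = -np.ones((3, 4))           # hypothetical weights after an overshooting update
w2 = np.ones(4)

cost, grad_w1, grad_w2 = forward_backward(w1, w2, x, y)
print(cost)                                          # 1.0
print(np.abs(grad_w1).max(), np.abs(grad_w2).max())  # 0.0 0.0
```

If this is what happens in the question's network, smaller initial weights, scaled inputs, or a lower learning rate would be the usual things to try, but that remains an assumption here.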