[Posted]: 2018-09-17 05:41:57
[Problem description]:
I have built a very simple multi-layer perceptron with a single hidden layer, using a sigmoid transfer function, and simulated data with 2 inputs.
I tried to set it up following the Simple Feedforward Neural Network using TensorFlow example on GitHub. I won't post the whole thing here, but my cost function is set up like this:
# Backward propagation
loss = tensorflow.losses.mean_squared_error(labels=y, predictions=yhat)
cost = tensorflow.reduce_mean(loss, name='cost')
updates = tensorflow.train.GradientDescentOptimizer(0.01).minimize(cost)
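(As an aside, tf.losses.mean_squared_error already returns a scalar under its default reduction, so the extra reduce_mean is harmless here.) What GradientDescentOptimizer(0.01).minimize(cost) does on each step can be sketched in plain numpy. This is a deliberately simplified linear model with no hidden layer, and X_batch, y_batch and w are made-up stand-ins for illustration, not the variables from my code:

```python
import numpy as np

# Minimal sketch of one gradient-descent step, w <- w - lr * d(cost)/dw,
# on a mean-squared-error cost. Linear model for simplicity; the data
# values are invented.
lr = 0.01
X_batch = np.array([[1.0, 2.0]])   # one example, 2 input features
y_batch = np.array([[3.0]])
w = np.zeros((2, 1))

yhat = X_batch @ w                              # forward pass: yhat = 0
cost = np.mean((y_batch - yhat) ** 2)           # MSE = 9.0 here
grad = -2.0 * X_batch.T @ (y_batch - yhat) / y_batch.size
w = w - lr * grad                               # w becomes [[0.06], [0.12]]
```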
Then I simply loop over a number of epochs, with the aim of optimizing my weights at every step via the updates operation:
with tensorflow.Session() as sess:
    init = tensorflow.global_variables_initializer()
    sess.run(init)
    for epoch in range(10):
        # Train with each example
        for i in range(len(train_X)):
            feed_dict = {X: train_X[i: i + 1], y: train_y[i: i + 1]}
            res = sess.run([updates, loss], feed_dict)
            print "epoch {}, step {}. w_1: {}, loss: {}".format(epoch, i, w_1.eval(), res[1])
        train_result = sess.run(predict, feed_dict={X: train_X, y: train_y})
        train_errors = abs((train_y - train_result) / train_y)
        train_mean_error = numpy.mean(train_errors, axis=1)
        test_result = sess.run(predict, feed_dict={X: test_X, y: test_y})
        test_errors = abs((test_y - test_result) / test_y)
        test_mean_error = numpy.mean(test_errors, axis=1)
        print("Epoch = %d, train error = %.5f%%, test error = %.5f%%"
              % (epoch, 100. * train_mean_error[0], 100. * test_mean_error[0]))
    sess.close()
I expected the output of this program to show that, at every epoch and every step, the weights get updated and the value of loss drops substantially over time.
However, while I do see the loss value and the errors decreasing, the weights only change after the first step and then stay constant for the rest of the program.
What is going on here?
This is what gets printed to the screen for the first 2 epochs:
epoch 0, step 0. w_1: [[0. 0.]
[0. 0.]], loss: 492.525634766
epoch 0, step 1. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 482.724365234
epoch 0, step 2. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 454.100799561
epoch 0, step 3. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 418.499267578
epoch 0, step 4. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 387.509033203
Epoch = 0, train error = 84.78731%, test error = 88.31780%
epoch 1, step 0. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 355.381134033
epoch 1, step 1. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 327.519226074
epoch 1, step 2. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 301.841705322
epoch 1, step 3. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 278.177368164
epoch 1, step 4. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 257.852508545
Epoch = 1, train error = 69.24779%, test error = 76.38461%
Apart from not changing, it is also interesting that the weights within each row have identical values. The loss itself keeps decreasing. Here is what the last epoch looks like:
epoch 9, step 0. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 13.5048065186
epoch 9, step 1. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 12.4460296631
epoch 9, step 2. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 11.4702644348
epoch 9, step 3. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 10.5709943771
epoch 9, step 4. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], loss: 10.0332946777
Epoch = 9, train error = 13.49328%, test error = 33.56935%
What am I doing wrong here? I know the weights are being updated somewhere, because I can see the training and test errors changing, but why can't I see it?
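One way to answer "why can't I see it" is to compare successive weight snapshots numerically instead of by eye, since updates smaller than numpy's default print precision simply round away in the printed matrix. A sketch with hypothetical values (prev and curr stand in for two consecutive w_1.eval() results; the 1e-9 change is invented for illustration):

```python
import numpy as np

# Hypothetical consecutive snapshots of a weight matrix; the tiny change
# is real in the array but too small to show up in its default repr.
prev = np.array([[0.5410637, 0.5410637],
                 [0.5803371, 0.5803371]])
curr = prev + 1e-9

max_delta = np.abs(curr - prev).max()
print("max |delta w_1| = {:.3e}".format(max_delta))  # nonzero despite
                                                     # identical-looking prints
```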
EDIT: As requested by squadrick, here is the code for w_1 and y_hat:
# Layer's sizes
x_size = train_X.shape[1] # Number of input nodes
y_size = train_y.shape[1] # Number of outcomes
# Symbols
X = tensorflow.placeholder("float", shape=[None, x_size], name='X')
y = tensorflow.placeholder("float", shape=[None, y_size], name='y')
# Weight initializations
w_1 = tensorflow.Variable(tensorflow.zeros((x_size, x_size)))
w_2 = tensorflow.Variable(tensorflow.zeros((x_size, y_size)))
# Forward propagation
h = tensorflow.nn.sigmoid(tensorflow.matmul(X, w_1))
yhat = tensorflow.matmul(h, w_2)
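The identical values within each row of w_1 follow directly from the all-zero initialization: every hidden unit computes sigmoid(0) = 0.5 on the first pass, so both columns of w_1 receive the same gradient and can never separate. A minimal numpy backprop sketch; the shapes mirror the 2-input, 2-hidden, 1-output network above, but the data values and the nonzero w_2 are made up:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up data; shapes mirror the question's 2-2-1 network.
X = np.array([[1.0, 2.0]])
y = np.array([[3.0]])
w_1 = np.zeros((2, 2))
w_2 = np.ones((2, 1)) * 0.5        # nonzero so the gradient reaches w_1

h = sigmoid(X @ w_1)               # every entry is sigmoid(0) = 0.5
yhat = h @ w_2
# Backprop for loss = (y - yhat)^2:
d_yhat = -2.0 * (y - yhat)
d_h = d_yhat @ w_2.T               # identical value for both hidden units
d_w1 = X.T @ (d_h * h * (1 - h))   # both columns of d_w1 come out equal
```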
EDIT2: squadrick's suggestion to look at w_2 was interesting; when I add w_2 to the print statement using the following:
print "epoch {}, step {}. w_1: {}, w_2: {}, loss: {}".format(epoch, i, w_1.eval(), w_2.eval(), res[1])
I can see that it does indeed update:
epoch 0, step 0. w_1: [[0. 0.]
[0. 0.]], w_2: [[0.22192918]
[0.22192918]], loss: 492.525634766
epoch 0, step 1. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], w_2: [[0.44163907]
[0.44163907]], loss: 482.724365234
epoch 0, step 2. w_1: [[0.5410637 0.5410637]
[0.5803371 0.5803371]], w_2: [[0.8678319]
[0.8678319]], loss: 454.100799561
So now the problem seems to be that only w_2 is being updated, not w_1. I am still not sure why this is happening.
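The chain rule also explains why w_1 lags behind w_2: the gradient reaching w_1 is scaled by w_2, so while w_2 is all zeros, w_1 receives exactly zero gradient. This matches w_1 still being zero when step 0 printed. A numpy sketch of that first step, with made-up input:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up example input; shapes mirror the question's network, and both
# layers start at zero as in the question.
X = np.array([[1.0, 2.0]])
y = np.array([[3.0]])
w_1 = np.zeros((2, 2))
w_2 = np.zeros((2, 1))

h = sigmoid(X @ w_1)               # all 0.5
yhat = h @ w_2                     # 0: prediction ignores the input
d_yhat = -2.0 * (y - yhat)
d_w2 = h.T @ d_yhat                # nonzero, since h is 0.5 everywhere
d_w1 = X.T @ ((d_yhat @ w_2.T) * h * (1 - h))  # exactly zero: scaled by w_2
```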
[Discussion]:
- Can you post the code where you create w1 and the code that computes yhat?
- @squadrick I have appended those to the end of the post.
- Print out w_2 and see whether it changes over time.
- @squadrick You were right - w_2 is being updated, but w_1 is not. Do you know why that happens? I have updated the question with the new print statement.
Tags: python tensorflow