【Question Title】: Why does the training result of my TensorFlow example code keep increasing?
【Posted】: 2017-02-01 08:29:30
【Question Description】:

Hello, I am learning TensorFlow. Here is my code, a simple multi-variable TensorFlow example. The environment is Python 3.5.3, TensorFlow 0.12.1, Windows 7.

import tensorflow as tf

# Input data & output data
x1_data = [1.0, 0.0, 3.0, 0.0, 5.0]
x2_data = [0.0, 2.0, 0.0, 4.0, 5.0]
y_data =  [1.0, 2.0, 3.0, 4.0, 5.0]

# W1, W2, b random generation
# W1 = 1, W2 = 1, b = 0 is the intended fit (though the fifth sample,
# x1 = 5, x2 = 5 -> y = 5, means no parameters can fit the data exactly)
W1 = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
W2 = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.random_uniform([1], -1.0, 1.0))

# Our hypothesis
hypothesis = W1 * x1_data + W2 * x2_data + b
# Simplified cost function
cost = tf.reduce_mean(tf.square(hypothesis - y_data))

# Minimize
a = tf.Variable(0.1) # Learning Rate
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost)

# Initialise
init = tf.global_variables_initializer()

# Launch
sess = tf.Session()
sess.run(init)

# Train loop
for step in range(10):
    sess.run(train)
    print(step, sess.run(cost), sess.run(W1), sess.run(W2), sess.run(b))

I expected the cost to decrease over the training loop.

Instead it grows without bound.

The same code with a single variable works fine; there the cost decreases.

I don't understand why the two-variable version increases...

0 52.0504 [ 1.47101164] [ 2.24049234] [ 0.86718893]
1 157.129 [-1.74108529] [-1.84496927] [-0.22162986]
2 478.055 [ 4.02118969] [ 5.11457825] [ 1.86127353]
3 1457.33 [-5.99311352] [-7.13181305] [-1.60902405]
4 4445.18 [ 11.50830746] [ 14.20653534] [ 4.60829926]
5 13561.2 [-19.06884766] [-23.10119247] [-6.10722733]
6 41374.3 [ 34.32733154] [ 42.03698349] [ 12.74352837]
7 126232.0 [-58.95558929] [-71.76408386] [-20.05929375]
8 385134.0 [ 103.96767426] [ 126.9929657] [ 37.3527832]
9 1.17505e+06 [-180.62704468] [-220.19728088] [-62.82305145]
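For reference (an editor's addition, not part of the original question): the same batch gradient descent can be reproduced in plain NumPy, which confirms the divergence is a learning-rate effect rather than anything TensorFlow-specific. The sketch below assumes the same data and the same mean-squared-error cost; at a rate of 0.1 the cost grows step over step, while at 0.01 it shrinks.

```python
import numpy as np

# Same data as the question: two input features, one target
X = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 0.0],
              [0.0, 4.0],
              [5.0, 5.0]])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

def run_gd(lr, steps=10, seed=0):
    """Batch gradient descent on mean squared error; returns cost per step."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1.0, 1.0, size=2)   # W1, W2, as in the question
    b = rng.uniform(-1.0, 1.0)
    costs = []
    for _ in range(steps):
        err = X @ w + b - y
        costs.append(np.mean(err ** 2))
        # gradients of mean(err^2) with respect to w and b
        w -= lr * (2.0 * X.T @ err / len(y))
        b -= lr * (2.0 * err.mean())
    return costs

print(run_gd(0.1)[-1])   # diverges, as in the question's output
print(run_gd(0.01)[-1])  # decreases
```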

【Comments】:

    Tags: python python-3.x tensorflow


    【Solution 1】:

    The first fix I found is to lower the learning rate to 0.01. The steps seem to change your parameters too drastically. This might not happen if you used some regularization technique such as L2.
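To see why 0.01 works where 0.1 does not (a sketch added here, not part of the original answer): for a quadratic cost such as this mean squared error, plain gradient descent converges only when the learning rate is below 2/λ_max, where λ_max is the largest eigenvalue of the cost's Hessian. For this data the bound falls between 0.01 and 0.1, which is why one rate diverges and the other does not:

```python
import numpy as np

# Design matrix with a bias column, built from the question's data
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [3.0, 0.0, 1.0],
              [0.0, 4.0, 1.0],
              [5.0, 5.0, 1.0]])

# Hessian of mean((A @ theta - y)^2) is (2/n) * A^T A, independent of theta
H = 2.0 * A.T @ A / A.shape[0]
lam_max = np.linalg.eigvalsh(H).max()

print("largest Hessian eigenvalue:", lam_max)
print("stable learning-rate bound 2/lam_max:", 2.0 / lam_max)
```

So 0.1 is above the stability bound and every step overshoots the minimum by a growing amount, while 0.01 is safely below it.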

    Second, your code can be tidied up: use TensorFlow matrix operations and initialize the bias to zero. Oddly, the matrix version originally appeared to work even with a 0.1 learning rate, but that was most likely because input_X was declared as a trainable tf.Variable, so minimize() was adjusting the inputs as well as the weights; with the inputs held fixed you still need the smaller learning rate.

    import tensorflow as tf
    import numpy as np
    
    # Input data & output data
    x1_data = [1.0, 0.0, 3.0, 0.0, 5.0]
    x2_data = [0.0, 2.0, 0.0, 4.0, 5.0]
    y_data =  [1.0, 2.0, 3.0, 4.0, 5.0]
    
    # Inputs are data, not parameters: use a constant so minimize() does not update them
    input_X = tf.constant(np.vstack((x1_data, x2_data)).astype(np.float32))
    W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0))
    b = tf.Variable(tf.zeros([1, 1]))
    
    # Our hypothesis
    hypothesis = tf.add(tf.matmul(W, input_X), b)
    # Simplified cost function
    cost = tf.reduce_mean(tf.square(hypothesis - y_data))
    
    # Minimize
    a = 0.01  # learning rate; with the inputs held constant, 0.1 still overshoots
    optimizer = tf.train.GradientDescentOptimizer(a)
    train = optimizer.minimize(cost)
    
    # Initialise
    init = tf.global_variables_initializer()
    
    # Launch
    sess = tf.Session()
    sess.run(init)
    
    # Train loop
    for step in range(10):
        sess.run(train)
        print(step, sess.run(cost), sess.run(W), sess.run(b))
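One further check (an editor's addition, not part of the original answer): the question's comment treats W1 = 1, W2 = 1, b = 0 as the ideal parameters, but the fifth sample (x1 = 5, x2 = 5 → y = 5) contradicts that, since those weights would predict 10. No parameter setting fits this data exactly, and the true least-squares optimum can be found in closed form:

```python
import numpy as np

# Design matrix with a bias column, from the question's data
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [3.0, 0.0, 1.0],
              [0.0, 4.0, 1.0],
              [5.0, 5.0, 1.0]])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Solve min ||A @ theta - y||^2 exactly
theta, residual, rank, _ = np.linalg.lstsq(A, y, rcond=None)
print("least-squares [W1, W2, b]:", theta)
print("residual sum of squares:", residual)
```

The residual is strictly positive and the optimum is not (1, 1, 0), so gradient descent here can only approach this compromise solution, never zero cost.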
    

    【Discussion】:
