Why we need to pass values using feed_dict to print the loss value in TensorFlow

【Question title】: Why we need to pass values using feed_dict to print loss value in TensorFlow
【Posted】: 2018-07-18 17:16:08
【Question】:

Below is a small TensorFlow program:

import tensorflow as tf

# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)

# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W * x + b
y = tf.placeholder(tf.float32)

# loss
loss = tf.reduce_sum(tf.square(linear_model - y))

# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

# training data
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]

# training loop
init = tf.global_variables_initializer()

with tf.Session() as sess:
  sess.run(init)

  for i in range(1000):
    sess.run(train, {x: x_train, y: y_train})

  # evaluate training accuracy
  curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})

  print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))

In the for loop we have the following code:

with tf.Session() as sess:
  sess.run(init)

  for i in range(1000):
    sess.run(train, {x: x_train, y: y_train})

  # evaluate training accuracy
  curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})

  print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))

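My question is: when we run sess.run(train, {x: x_train, y: y_train}), the loss is also computed, so why do we need to pass feed_dict again when we want to retrieve the loss value, as in the line below? Can anyone help me understand this?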

 curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})

【Comments】:

Tags: tensorflow machine-learning deep-learning

【Solution 1】:

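You have defined two placeholders in your code: x and y. A tf.placeholder is a container that can be fed a different value on each execution of the program.

When you use tf.placeholder, TensorFlow defines its computational graph internally using this container (the placeholder). sess.run() runs this computational graph, but the graph by itself is meaningless because the placeholder containers are empty; they do not hold anything. So whenever an operation you run depends on a placeholder, you need to pass a value for that placeholder into the graph through the feed_dict argument of sess.run().

The advantage of placeholders is that the values you put into them for one execution of sess.run() are not remembered. That is, a second call to sess.run() will again see empty placeholders, and you will again have to put values into them via feed_dict. This is why you have to send values for the placeholders on every call to sess.run().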

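A useful analogy might be to think of your TensorFlow computational graph as a physical machine with input pipes (x and y) and output pipes (loss). The machine consumes the data from the input pipes (so the data does not persist across calls), and it also spits things out of the output pipes; if you do not catch the output, you lose it. The machine (the graph) does not store input values or intermediate results. It only defines a workflow that applies different operations to the data.

Ops like train are the levers of the machine: when pulled, they make something happen inside it. For the machine to do any work, you must put something into the input pipes. When you call sess.run(train, {x: x_train, y: y_train}), the machine uses up the data in the placeholders, calculates the loss (which it sends out through the loss output pipe, where you did not catch it), and modifies its internal variables via backpropagation. Now the input pipes are empty again, and the old value of loss is lost! Hence, when you want to calculate the loss, you put the data into the input pipes and ask the machine to output the loss through the loss pipe.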
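The machine analogy can be sketched in plain Python. The ToyMachine class below is a hypothetical stand-in written only for illustration; it is not TensorFlow's API or implementation. Its attributes W and b play the role of variables that persist between calls, while the feed argument plays the role of placeholders that have to be refilled on every call. The model, data, and learning rate are assumed to be the ones from the question's code.

```python
class ToyMachine:
    """Hypothetical toy model of a graph plus session; not TensorFlow's API."""

    def __init__(self, W=0.3, b=-0.3, learning_rate=0.01):
        # Variables: state that persists from one run() call to the next.
        self.W = W
        self.b = b
        self.learning_rate = learning_rate

    def run(self, fetches, feed=None):
        # Placeholders: empty at the start of every call unless fed.
        feed = feed or {}
        results = {"W": self.W, "b": self.b, "train": None}
        if "loss" in fetches or "train" in fetches:
            if "x" not in feed or "y" not in feed:
                raise ValueError("x and y must be fed on every call that uses them")
            xs, ys = feed["x"], feed["y"]
            errors = [self.W * x + self.b - y for x, y in zip(xs, ys)]
            # The loss is computed here whether or not the caller asked for it.
            results["loss"] = sum(e * e for e in errors)
            if "train" in fetches:
                # One gradient-descent step on sum((W*x + b - y)^2).
                grad_W = sum(2 * e * x for e, x in zip(errors, xs))
                grad_b = sum(2 * e for e in errors)
                self.W -= self.learning_rate * grad_W
                self.b -= self.learning_rate * grad_b
        # Only the requested values are returned; everything else is discarded.
        return [results[name] for name in fetches]


machine = ToyMachine()
data = {"x": [1, 2, 3, 4], "y": [0, -1, -2, -3]}

for _ in range(1000):
    machine.run(["train"], data)  # loss is computed inside, then thrown away

# To see the loss, feed the inputs again and ask for it explicitly.
curr_W, curr_b, curr_loss = machine.run(["W", "b", "loss"], data)
print("W: %s b: %s loss: %s" % (curr_W, curr_b, curr_loss))
```

Every training call computes the loss internally, but nothing is handed back unless "loss" is among the fetches, which is why the final call feeds the data once more.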

You might be tempted to do this:

loss_value, _ = sess.run([loss, train], {x: x_train, y: y_train})

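But unfortunately, TensorFlow gives no guarantees about the order in which sess.run() evaluates its operations. So with the line above you will not know whether the returned loss_value is the loss before or after the training op ran. The straightforward way to be sure is to run the training op first and then run the loss op, in two separate calls to sess.run(), as you have done in your code.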
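To see why the ordering matters, here is a rough plain-Python sketch (not TensorFlow code; the helper names loss_fn and train_step are made up for this illustration). It evaluates the loss of the question's model immediately before and immediately after a single gradient-descent step, assuming the same starting values, data, and learning rate. The two numbers differ, so a fetch that could return either one would be ambiguous.

```python
def loss_fn(W, b, xs, ys):
    # Sum of squared errors, mirroring tf.reduce_sum(tf.square(W * x + b - y)).
    return sum((W * x + b - y) ** 2 for x, y in zip(xs, ys))


def train_step(W, b, xs, ys, learning_rate=0.01):
    # One plain gradient-descent update of W and b on the loss above.
    errors = [W * x + b - y for x, y in zip(xs, ys)]
    grad_W = sum(2 * e * x for e, x in zip(errors, xs))
    grad_b = sum(2 * e for e in errors)
    return W - learning_rate * grad_W, b - learning_rate * grad_b


x_train, y_train = [1, 2, 3, 4], [0, -1, -2, -3]
W, b = 0.3, -0.3

loss_before = loss_fn(W, b, x_train, y_train)  # loss with the old W and b
W, b = train_step(W, b, x_train, y_train)      # the "train" op
loss_after = loss_fn(W, b, x_train, y_train)   # loss with the updated W and b

print("before: %s after: %s" % (loss_before, loss_after))
```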

【Comments】:

【Solution 2】:

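loss is evaluated from y and linear_model.
Note that:

y is a placeholder, and
computing linear_model requires the placeholder x

So, once placeholders are involved, you must pass in the data using feed_dict.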

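By the way, running sess.run(train, {x: x_train, y: y_train}) invokes gradient descent to optimize the loss function,

while running curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train}) retrieves, for printing, the current value of loss as it stands after the train op has been executed.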

【Comments】: