Why did I receive lots of warning messages when running the TensorFlow example?

【Question title】: Why did I receive lots of warning messages when running the TensorFlow example?
【Posted】: 2017-05-08 06:52:21
【Question】:

I am following the tutorial: https://www.tensorflow.org/get_started/get_started

Why am I getting so many warning messages like the ones below? Also, my final loss is different. The documentation says:

{'global_step': 1000, 'loss': 1.9650059e-11}

whereas my loss is: {'loss': 6.3995182e-09, 'global_step': 1000}

import tensorflow as tf
# NumPy is often used to load, manipulate and preprocess data.
import numpy as np

# Declare list of features. We only have one real-valued feature. There are many
# other types of columns that are more complicated and useful.
features = [tf.contrib.layers.real_valued_column("x", dimension=1)]

# An estimator is the front end to invoke training (fitting) and evaluation
# (inference). There are many predefined types like linear regression,
# logistic regression, linear classification, logistic classification, and
# many neural network classifiers and regressors. The following code
# provides an estimator that does linear regression.
estimator = tf.contrib.learn.LinearRegressor(feature_columns=features)

# TensorFlow provides many helper methods to read and set up data sets.
# Here we use `numpy_input_fn`. We have to tell the function how many batches
# of data (num_epochs) we want and how big each batch should be.
x = np.array([1., 2., 3., 4.])
y = np.array([0., -1., -2., -3.])
input_fn = tf.contrib.learn.io.numpy_input_fn({"x":x}, y, batch_size=4,
                                              num_epochs=1000)

# We can invoke 1000 training steps by invoking the `fit` method and passing the
# training data set.
estimator.fit(input_fn=input_fn, steps=1000)

# Here we evaluate how well our model did. In a real example, we would want
# to use a separate validation and testing data set to avoid overfitting.
print(estimator.evaluate(input_fn=input_fn))

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f1555e351d0>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': None}
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpol66d18y
WARNING:tensorflow:Rank of input Tensor (1) should be the same as output_rank (2) for column. Will attempt to expand dims. It is highly recommended that you resize your input, as this behavior may change.
WARNING:tensorflow:From /home/abigail/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/estimators/head.py:615: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
Instructions for updating:
Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into /tmp/tmpol66d18y/model.ckpt.
INFO:tensorflow:loss = 2.25, step = 1
INFO:tensorflow:global_step/sec: 2197.95
INFO:tensorflow:loss = 0.0537609, step = 101 (0.047 sec)
INFO:tensorflow:global_step/sec: 2106.83
INFO:tensorflow:loss = 0.0114769, step = 201 (0.047 sec)
INFO:tensorflow:global_step/sec: 2184.51
INFO:tensorflow:loss = 0.00149274, step = 301 (0.046 sec)
INFO:tensorflow:global_step/sec: 2126.71
INFO:tensorflow:loss = 0.000284785, step = 401 (0.047 sec)
INFO:tensorflow:global_step/sec: 2112.6
INFO:tensorflow:loss = 3.2641e-05, step = 501 (0.048 sec)
INFO:tensorflow:global_step/sec: 2048.21
INFO:tensorflow:loss = 3.71825e-06, step = 601 (0.048 sec)
INFO:tensorflow:global_step/sec: 2154.48
INFO:tensorflow:loss = 1.1719e-06, step = 701 (0.047 sec)
INFO:tensorflow:global_step/sec: 2287.71
INFO:tensorflow:loss = 1.42258e-07, step = 801 (0.043 sec)
INFO:tensorflow:global_step/sec: 3059.53
INFO:tensorflow:loss = 7.27343e-08, step = 901 (0.033 sec)
INFO:tensorflow:Saving checkpoints for 1000 into /tmp/tmpol66d18y/model.ckpt.
INFO:tensorflow:Loss for final step: 6.50745e-09.
WARNING:tensorflow:Rank of input Tensor (1) should be the same as output_rank (2) for column. Will attempt to expand dims. It is highly recommended that you resize your input, as this behavior may change.
WARNING:tensorflow:From /home/abigail/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/estimators/head.py:615: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
Instructions for updating:
Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
INFO:tensorflow:Starting evaluation at 2017-05-08-06:39:50
INFO:tensorflow:Restoring parameters from /tmp/tmpol66d18y/model.ckpt-1000
INFO:tensorflow:Finished evaluation at 2017-05-08-06:39:51
INFO:tensorflow:Saving dict for global step 1000: global_step = 1000, loss = 6.39952e-09
WARNING:tensorflow:Skipping summary for global_step, must be a float or np.float32.
{'loss': 6.3995182e-09, 'global_step': 1000}

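【Comments】:

1. The final loss can differ because the weights are initialized randomly, so there is no need to worry about it: the loss keeps decreasing as the number of steps grows.
2. The warnings are probably due to the TF version you are using. You could try updating to the latest TF version and running it again. (I am using TF 1.0 and get the same warnings.)

Tags: tensorflow jupyter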

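【Solution 1】:

The log says explicitly that a temporary folder is being used to store the model. To get rid of that warning, you only need to change the estimator statement. Change

estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns)

to

estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns, model_dir=r"D:\test")  # raw string so "\t" is not read as a tab; any other directory works too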

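As for the loss, the result will not match the documentation exactly, because the model picks up some run-to-run error during training. So do not worry about the difference you see; what matters is that the loss is almost zero.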
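One detail about the Windows path passed as `model_dir` may be worth checking. In an ordinary Python string literal, `\t` is the tab escape, so a path such as `D:\test` does not contain the characters it appears to. The sketch below uses that path purely as an illustration; the variable names are hypothetical.

```python
# Hypothetical paths, used only to show how Python reads each literal.
plain = "D:\test"      # "\t" is parsed as a single tab character
raw = r"D:\test"       # raw string: the backslash and the "t" stay as written
escaped = "D:\\test"   # doubled backslash: same value as the raw string

print(repr(plain))     # shows the embedded tab escape
print(repr(raw))
print(raw == escaped)  # the raw and escaped spellings are the same string
```

Either the raw string or the doubled backslash should give the directory name that was intended.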


【Solution 2】:

This is working as intended.

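Because of random initialization and similar effects, you should not expect to get exactly the same numerical loss every time you run the program. (If you need deterministic output, you can try setting the TensorFlow graph random seed to a fixed value.)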
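The effect of random initialization can be illustrated without TensorFlow. The following is a minimal sketch in plain Python, assuming ordinary gradient descent on the sum of squared errors over the same four data points. It is not the optimizer that `LinearRegressor` uses internally, and the function `train` and its parameters are hypothetical names chosen for the example.

```python
import random


def train(seed, steps=1000, learning_rate=0.01):
    """Fit y = w * x + b by gradient descent from a seeded random start."""
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [0.0, -1.0, -2.0, -3.0]

    # The seed fully determines the starting point.
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)
    b = rng.uniform(-1.0, 1.0)

    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the sum of squared errors with respect to w and b.
        grad_w = sum(2.0 * e * x for e, x in zip(errors, xs))
        grad_b = sum(2.0 * e for e in errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

    loss = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys))
    return w, b, loss


# Different seeds give different starting points and final losses,
# although every run ends close to w = -1, b = 1.
for seed in (0, 1, 2):
    print(seed, train(seed))

# Repeating a run with the same seed reproduces it exactly.
print(train(0) == train(0))
```

In the log from the question, the config line shows `'_tf_random_seed': None`, which suggests that no fixed seed was set for that run.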

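The warning and info messages are benign, although I agree they look a bit scary. You can almost always ignore the info messages. For now, ignore the warning messages as well; I have asked the tutorial's author to update it so that they go away.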
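Of the warnings in the log, the one about rank is the only one that refers to the input data: it reports a rank 1 input where rank 2 is expected and recommends resizing the input. A minimal NumPy sketch of such a resize follows, assuming that a column-shaped array of shape (4, 1) is what the warning asks for. The name `x_2d` is hypothetical, and whether this silences the warning in a given TensorFlow version is untested here.

```python
import numpy as np

# The feature array from the question: four values in a flat, rank 1 array.
x = np.array([1., 2., 3., 4.])
print(x.ndim, x.shape)        # 1 (4,)

# Reshape to one column per feature: four rows, one column, rank 2.
x_2d = x.reshape(-1, 1)
print(x_2d.ndim, x_2d.shape)  # 2 (4, 1)
```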

Hope that helps!


【Solution 3】:

This is annoying. You can suppress the messages by adding this line:

tf.logging.set_verbosity(tf.logging.ERROR)
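The line above raises the logging threshold so that only errors get through. The same idea can be sketched with Python's standard `logging` module alone, assuming the messages come from a standard-library logger named `tensorflow`, as the `INFO:tensorflow:` prefix in the log suggests. The sample messages and the in-memory buffer are only there to make the effect visible.

```python
import io
import logging

# Assumption: the messages are emitted through a logger named "tensorflow".
logger = logging.getLogger("tensorflow")

# Send this logger's output to an in-memory buffer so it can be inspected.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
logger.addHandler(handler)
logger.propagate = False

# Only messages at ERROR level or above pass this threshold.
logger.setLevel(logging.ERROR)

logger.info("loss = 2.25, step = 1")                         # dropped
logger.warning("Using temporary folder as model directory")  # dropped
logger.error("something actually went wrong")                # kept

captured = buffer.getvalue()
print(captured)
```

Note that raising the threshold hides the warnings rather than fixing their cause, so it may be worth reading them once before silencing them.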
