tf.estimator.Estimator error message IndexError: tuple index out of range (answers)

【Question title】: tf.estimator.Estimator error message IndexError: tuple index out of range
【Posted】: 2017-11-18 00:47:34
【Question description】:

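I am new to TensorFlow and Python. I am trying to train a deep network on my own images, to use as a simple object detector with TensorFlow, mostly following the tutorials on Tensorflow.org. My operating system is Mac OS X Sierra 10.12.6, and I am using Python 3.6 through Anaconda 3. I have written my images to training and validation tf.records files, and I read and batch them with the following file reader and input pipeline: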

def read_file(filename_queue):
    reader = tf.TFRecordReader()
    key, record_string = reader.read(filename_queue)
    feature = {'image': tf.FixedLenFeature([], tf.string),
           'label': tf.FixedLenFeature([], tf.int64)}
    features = tf.parse_single_example(record_string, feature)
    image = tf.decode_raw(features['image'], tf.float32)
    image = tf.reshape(image, [224, 224, 1])
    image.set_shape([224, 224, 1])
    image = tf.cast(image, tf.float32) * (1 / 255.0)
    label = tf.cast(features['label'], tf.float32)
    label = tf.reshape(label, [1,])
    return image, label


def input_pipeline(filenames, batch_size, read_threads, num_epochs):
    print('input pipeline ready')
    filename_queue = tf.train.string_input_producer(
        [filenames], num_epochs=num_epochs, shuffle=True)
    image, label = [read_file(filename_queue)
                    for _ in range(read_threads)]
    min_after_dequeue = 10000
    capacity = min_after_dequeue + 3 * batch_size
    example_batch, label_batch = tf.train.shuffle_batch_join(
        [image, label],
        batch_size=batch_size,
        capacity=capacity, min_after_dequeue=min_after_dequeue)
    print('loading batch')
    return example_batch, label_batch

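I have verified that this reads and batches my input files and labels correctly. I then defined a convolutional neural network following the "Building a Convolutional Neural Network" tutorial (changing it as needed for my grayscale images), which I named cnn_model_fn. The training op and the loss function are defined inside cnn_model_fn, as in the tutorial.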

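I am trying to run training and validation with a tf.estimator.Estimator object, using input functions to load the batches into the estimator, with the following code: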

def main(unused_argv):
    # training images and labels
    example_batch, label_batch = input_pipeline(train_path, batch_size,
                                                read_threads, num_epochs)
    # validation images and labels
    Vexample_batch, Vlabel_batch = input_pipeline(val_path, batch_size,
                                                  read_threads, num_epochs)
    classifier = tf.estimator.Estimator(model_fn=cnn_model_fn,
                                        model_dir=model_dir)
    tensors_to_log = {"probabilities": "softmax_tensor"}
    logging_hook = tf.train.LoggingTensorHook(tensors=tensors_to_log,
                                              every_n_iter=batch_size)
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"images": np.array(example_batch)},
        y=np.array(label_batch),
        batch_size=batch_size,
        num_epochs=num_epochs,
        shuffle=True)

    classifier.train(
        input_fn=train_input_fn,
        steps=int(label_batch.shape[0])/batch_size * num_epochs,
        hooks=[logging_hook])

    eval_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": np.array(Vexample_batch)},
        y=np.array(Vlabel_batch),
        num_epochs=1,
        shuffle=False)
    metrics = {
        "accuracy":
            learn.MetricSpec(
                metric_fn=tf.metrics.accuracy, prediction_key="classes")},

    eval_results = classifier.evaluate(input_fn=eval_input_fn,
                                       metrics=metrics)
    print(eval_results)

if __name__ == "__main__":
    tf.app.run()

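The "classifier.train" call results in the following error message: "IndexError: tuple index out of range". I also tried not converting the image and label batches to np.arrays, and then I get this error message instead: TypeError: unhashable type: 'Dimension'. The full traceback for the first error message is given at the end of the question.

I have also tried tf.contrib.learn.estimator.fit, both with the input functions above and with the batches fed in directly, and ran into similar problems with that approach. I cannot find any further information about this particular problem, and the Tensorflow.org tutorials do not shed any more light on it. I feel like I am probably missing something very simple, but I am struggling to work it out. Any help is greatly appreciated. Here is the full traceback: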

  File "<ipython-input-1-ee71d4ff521a>", line 168, in <module>
    tf.app.run()

  File "/Users/BAMF/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))

  File "<ipython-input-1-ee71d4ff521a>", line 151, in main
    steps=int(label_batch.shape[0])/batch_size * num_epochs, hooks=[logging_hook])

  File "/Users/BAMF/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 241, in train
    loss = self._train_model(input_fn=input_fn, hooks=hooks)

  File "/Users/BAMF/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 628, in _train_model
    input_fn, model_fn_lib.ModeKeys.TRAIN)

  File "/Users/BAMF/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 499, in _get_features_and_labels_from_input_fn
    result = self._call_input_fn(input_fn, mode)

  File "/Users/BAMF/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 585, in _call_input_fn
    return input_fn(**kwargs)

  File "/Users/BAMF/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/inputs/numpy_io.py", line 109, in input_fn
    if len(set(v.shape[0] for v in ordered_dict_x.values())) != 1:

  File "/Users/BAMF/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/inputs/numpy_io.py", line 109, in <genexpr>
    if len(set(v.shape[0] for v in ordered_dict_x.values())) != 1:

IndexError: tuple index out of range

【Question discussion】:

Tags: image tensorflow conv-neural-network training-data

【Solution 1】:

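The input function that numpy_input_fn builds for classifier.train expects NumPy arrays, not tensors. So you need to convert example_batch and label_batch by evaluating them in a session, rather than wrapping them with np.array(). (Explanation)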

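# Evaluate the batch tensors in a session so numpy_input_fn gets real arrays.
sess = tf.InteractiveSession()
# string_input_producer(num_epochs=...) keeps its epoch counter in a local
# variable, so local variables must be initialized before the queues start.
sess.run(tf.local_variables_initializer())
tf.train.start_queue_runners(sess)

# Fetch images and labels in one run() call so they come from the same batch;
# separate .eval() calls would each dequeue a different batch.
train_images, train_labels = sess.run([example_batch, label_batch])
val_images, val_labels = sess.run([Vexample_batch, Vlabel_batch])

train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"images": train_images},
    y=train_labels,
    batch_size=batch_size,
    num_epochs=num_epochs,
    shuffle=True)

classifier.train(
    input_fn=train_input_fn,
    steps=int(label_batch.shape[0])/batch_size * num_epochs,
    hooks=[logging_hook])

# Note: the feature key here ("x") differs from the training key ("images");
# both must match the key that cnn_model_fn reads.
eval_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": val_images},
    y=val_labels,
    num_epochs=1,
    shuffle=False)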
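
A side note on the steps argument: the expression used in the question divides with `/`, which in Python 3 always produces a float. Below is a minimal sketch of computing an integer step count instead. The name `num_examples` is hypothetical (it stands for the number of examples in the NumPy arrays), and the formula assumes batches are drawn continuously across epochs, which may not match every input function.

```python
def total_steps(num_examples, batch_size, num_epochs):
    """Number of batches needed to see every example num_epochs times.

    Uses integer ceiling division so the result is always an int.
    """
    total_examples = num_examples * num_epochs
    return (total_examples + batch_size - 1) // batch_size


# Hypothetical numbers, for illustration only.
print(total_steps(1000, 32, 2))
print(1000 / 32 * 2)  # the "/" form yields a float instead
```

Alternatively, Estimator.train can be called with steps=None, in which case it trains until the input function runs out of data.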

Hope this helps.
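
To see why the np.array() wrapping leads to this particular IndexError, here is a small sketch that needs only NumPy. `FakeTensor` is a hypothetical stand-in for a symbolic graph tensor, not a TensorFlow class. The behavior it shows is consistent with the last frame of the traceback, where `v.shape[0]` fails inside numpy_io.py.

```python
import numpy as np


class FakeTensor:
    """Hypothetical stand-in for a symbolic tensor that holds no values yet."""


def first_dim(arr):
    """Return arr.shape[0], or the error text if the array has no dimensions."""
    try:
        return arr.shape[0]
    except IndexError as exc:
        return "IndexError: {}".format(exc)


# np.array() cannot read values out of an opaque object, so it wraps the
# object itself in a zero-dimensional array with dtype=object.
wrapped = np.array(FakeTensor())
print(wrapped.shape, wrapped.dtype)

# A zero-dimensional array has an empty shape tuple, so shape[0] fails.
print(first_dim(wrapped))

# A real batch of images has a leading batch dimension.
real_batch = np.zeros((8, 224, 224, 1), dtype=np.float32)
print(first_dim(real_batch))
```

Evaluating the tensors in a session returns arrays like `real_batch`, which is why the shape check in numpy_input_fn then passes.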

【Discussion】:

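Thanks. I incorporated these changes and also initialized all variables in the session, using tf.global_variables_initializer() and tf.local_variables_initializer(). That lets the classifier.train() step run, but now I am having trouble with evaluation. I get the error "ValueError: Can not squeeze dim[1], expected a dimension of 1, got 2 for 'remove_squeezable_dimensions/Squeeze' (op: 'Squeeze') with input shapes: [?,2]." I tried reducing the labels to a vector, but that did not help. Any idea what is causing this? I can provide the full stack trace and code if needed.
I am also confused about something: isn't the point of grouping all the training and evaluation steps into a "main" function, and calling it with 'if __name__ == "__main__":', to initialize all variables once and to coordinate and stop the threads in a compact way? From the examples I have been following, that seems to be the intended approach when using tf.estimator.ModeKeys (which I am using, although I did not show it). If I call my main function, why do I need to open a session? Should I run a session inside my main function? That seems redundant. Any clarification would help. Thanks.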
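
On the session question: one possible way to avoid the explicit session altogether is to give the Estimator an input function that builds the queue-based pipeline itself and returns the batch tensors directly, instead of going through numpy_input_fn. The Estimator calls the input function inside its own graph and runs it in the session it manages. The sketch below is only that, a sketch: `make_input_fn` and `fake_pipeline` are hypothetical names, and the feature key "images" is an assumption that has to match whatever cnn_model_fn reads. The stand-in pipeline is there so the example runs without TensorFlow.

```python
def make_input_fn(pipeline, filenames, batch_size, read_threads, num_epochs,
                  feature_key="images"):
    """Return a zero-argument input_fn that builds its batches when called."""
    def input_fn():
        # Nothing is built until the Estimator calls this function, so the
        # queues and batch tensors are created where training actually runs.
        images, labels = pipeline(filenames, batch_size, read_threads,
                                  num_epochs)
        return {feature_key: images}, labels
    return input_fn


def fake_pipeline(filenames, batch_size, read_threads, num_epochs):
    """Stand-in for the question's input_pipeline; returns placeholder names."""
    return "image_batch_tensor", "label_batch_tensor"


train_input_fn = make_input_fn(fake_pipeline, "train.tfrecords", 32, 4, 10)
features, labels = train_input_fn()
print(features, labels)
```

In the real program, the call would look something like classifier.train(input_fn=make_input_fn(input_pipeline, train_path, batch_size, read_threads, num_epochs), hooks=[logging_hook]), with input_pipeline being the function from the question.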