Using a Tensorflow input pipeline with skflow/tf learn

【Question title】: Using a Tensorflow input pipeline with skflow/tf learn
【Posted】: 2016-05-30 01:56:44
【Question】:

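I have followed the Tensorflow Reading Data guide to get my app's data in the form of TFRecords, and I am using a TFRecordReader in my input pipeline to read this data.

I'm now reading the guides on using skflow/tf.learn to build a simple regressor, but I can't see how to use my input data with these tools.

In the following code, the application fails on the regressor.fit(..) call with ValueError: setting an array element with a sequence.

Error: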

Traceback (most recent call last):
  File ".../tf.py", line 138, in <module>
    run()
  File ".../tf.py", line 86, in run
    regressor.fit(x, labels)
  File ".../site-packages/tensorflow/contrib/learn/python/learn/estimators/base.py", line 218, in fit
    self.batch_size)
  File ".../site-packages/tensorflow/contrib/learn/python/learn/io/data_feeder.py", line 99, in setup_train_data_feeder
    return data_feeder_cls(X, y, n_classes, batch_size)
  File ".../site-packages/tensorflow/contrib/learn/python/learn/io/data_feeder.py", line 191, in __init__
    self.X = check_array(X, dtype=x_dtype)
  File ".../site-packages/tensorflow/contrib/learn/python/learn/io/data_feeder.py", line 161, in check_array
    array = np.array(array, dtype=dtype, order=None, copy=False)

ValueError: setting an array element with a sequence.

Code:

import tensorflow as tf
import tensorflow.contrib.learn as learn

def inputs():
    with tf.name_scope('input'):
        filename_queue = tf.train.string_input_producer([filename])

        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)

        features = tf.parse_single_example(serialized_example, feature_spec)
        labels = features.pop('actual')
        some_feature = features['some_feature']

        features_batch, labels_batch = tf.train.shuffle_batch(
            [some_feature, labels], batch_size=batch_size, capacity=capacity,
            min_after_dequeue=min_after_dequeue)

        return features_batch, labels_batch


def run():
    with tf.Graph().as_default():
        x, labels = inputs()

        # regressor = learn.TensorFlowDNNRegressor(hidden_units=[10, 20, 10])
        regressor = learn.TensorFlowLinearRegressor()

        regressor.fit(x, labels)
        ...

It looks like the check_array call expects a real array, not a tensor. Is there anything I can do to massage my data into the right shape?

【Comments】:

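What happens if you do x = x.eval() and labels = labels.eval() before the regressor.fit call? That should evaluate the tensors into arrays, but I doubt that is the right way to do this with skflow...
@mathetes, that seems to work, but before I go down that path, is this the 'tf-y' way of doing things? My gut feeling is that the TF graph should move the data around, not my program.
Sure, sorry I wasn't specific, but that was only meant as a way of debugging. That's why I posted a comment rather than an answer. I can't help you further, though; I'm not familiar with skflow.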
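
For reference, the debugging approach from the comments can be sketched roughly as follows. This is a minimal sketch, assuming TensorFlow 1.x or earlier with the queue-based input pipeline from the question; the helper name materialize_one_batch is hypothetical and not part of any library. Because the tensors come out of a queue, evaluating them needs a session with the queue runners started, otherwise the call would block waiting for data.

```python
def materialize_one_batch(x, labels):
    """Evaluate one batch of (x, labels) tensors into NumPy arrays.

    Hypothetical debugging helper; assumes TensorFlow 1.x or earlier.
    """
    import tensorflow as tf  # imported lazily; TF 1.x assumed

    # Run in the graph that owns the tensors, not whatever graph is default.
    with tf.Session(graph=x.graph) as sess:
        coord = tf.train.Coordinator()
        # The string_input_producer / shuffle_batch queues are only filled
        # once their queue runner threads have been started.
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        try:
            x_val, labels_val = sess.run([x, labels])
        finally:
            coord.request_stop()
            coord.join(threads)
    return x_val, labels_val
```

The returned arrays could then be passed to regressor.fit(x_val, labels_val), but that trains on a single batch only, which is why this is a debugging aid rather than a way to feed the model.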

Tags: python tensorflow skflow

【Solution 1】:

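The API you are using appears to be deprecated. If you use the more modern tf.contrib.learn.LinearRegressor (>= 1.0, I think), you are supposed to specify an input_fn, which basically produces the inputs and labels. I think in your example this would be as simple as changing your run function to: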

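import tensorflow as tf

def run():
    with tf.Graph().as_default():
        # LinearRegressor requires feature_columns; 'some_feature' is the key used in the question.
        feature_columns = [tf.contrib.layers.real_valued_column('some_feature')]
        regressor = tf.contrib.learn.LinearRegressor(feature_columns=feature_columns)
        regressor.fit(input_fn=my_input_fn)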

and then defining an input function called my_input_fn. From the docs, this input function takes the form:

def my_input_fn():

    # Preprocess your data here...

    # ...then return 1) a mapping of feature columns to Tensors with
    # the corresponding feature data, and 2) a Tensor containing labels
    return feature_cols, labels

I think the documentation can get you the rest of the way. From here it is hard for me to say how you should proceed without seeing your data.
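
As a rough illustration only, an input function for the TFRecords pipeline in the question might look something like the sketch below. It assumes TensorFlow 1.x (tf.contrib was removed in 2.x). The file name, the feature spec (one float value each for 'some_feature' and 'actual'), the batch settings and the step count are all hypothetical placeholders, since the question does not show them.

```python
# Sketch only: assumes TensorFlow 1.x. All constants below are hypothetical.
FILENAME = "train.tfrecords"          # hypothetical path to the TFRecords file
BATCH_SIZE = 32
MIN_AFTER_DEQUEUE = 1000
CAPACITY = MIN_AFTER_DEQUEUE + 3 * BATCH_SIZE


def my_input_fn():
    """Return (dict of feature name -> Tensor, label Tensor) for training."""
    import tensorflow as tf  # imported lazily; TF 1.x assumed

    # Hypothetical spec: the real one depends on how the records were written.
    feature_spec = {
        "some_feature": tf.FixedLenFeature([1], tf.float32),
        "actual": tf.FixedLenFeature([1], tf.float32),
    }

    filename_queue = tf.train.string_input_producer([FILENAME])
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)

    features = tf.parse_single_example(serialized_example, feature_spec)
    labels = features.pop("actual")

    feature_batch, label_batch = tf.train.shuffle_batch(
        [features["some_feature"], labels],
        batch_size=BATCH_SIZE,
        capacity=CAPACITY,
        min_after_dequeue=MIN_AFTER_DEQUEUE)

    # The dict keys must match the names given to the feature columns.
    return {"some_feature": feature_batch}, label_batch


def train(steps=1000):
    """Fit a linear regressor from the input function (step count is arbitrary)."""
    import tensorflow as tf  # TF 1.x assumed

    feature_columns = [tf.contrib.layers.real_valued_column("some_feature")]
    regressor = tf.contrib.learn.LinearRegressor(feature_columns=feature_columns)
    # The estimator calls my_input_fn inside its own graph and session,
    # so the tensors are never converted to arrays by user code.
    regressor.fit(input_fn=my_input_fn, steps=steps)
    return regressor
```

With this arrangement the data stays inside the TF graph, which matches the asker's instinct that the graph, not the program, should move the data.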

【Comments】:

You're right, this is an old question and I have since worked it out. Thanks for providing a helpful, current solution.