MNIST data: setting up batches (answers)

【Question title】: MNIST Data set up batch
【Posted】: 2019-04-09 16:52:08
【Question description】:

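I am training a model. However, when I run the line from the tutorial, batch_x, batch_y = mnist.train.next_batch(50), I get an error saying there is no attribute 'train'. I know this is outdated code, and I tried to convert it to the new version of TensorFlow, but I cannot find an equivalent that does the same thing as that line. I bet there is a way, but I can't work out a solution.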

I found a suggestion to use tf.data.Dataset.batch(batch_size). I tried the following, but none of them work.

a. batch_x, batch_y = mnist.train.next_batch(50)

b. batch_x, batch_y =  tf.data.Dataset.batch(batch_size)

c. batch_x, batch_y =  tf.data.Dataset.batch(50)

d. batch_x, batch_y = mnist.batch(50)


with tf.Session() as sess:

  # First, run vars_initializer to initialize all variables
  sess.run(vars_initializer)

  for i in range(steps):

    #Each batch: 50 images
    batch_x, batch_y = mnist.train.next_batch(50)

    #Train the model
    #Dropout keep_prob (% to keep): 0.5 --> 50% will be dropped out
    sess.run(cnn_trainer, feed_dict={x: batch_x, y_true: batch_y, hold_prob: 0.5})

    # Test the model on every 100th training step
    if i % 100 == 0:
      print('ON STEP: {}'.format(i))
      print('ACCURACY: ')

      #Compare to find matches of y_pred and y_true
      matches = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))

      # Cast the boolean matches to tf.float32
      # Calculate the accuracy as the mean of the matches
      acc = tf.reduce_mean(tf.cast(matches, tf.float32))

      #Test the model at each 100th step
      #Using test dataset
      #Dropout: NONE because of test, not training. 
      test_accuracy = sess.run(acc, feed_dict = {x:mnist.test.images, y_true:mnist.test.labels, hold_prob:1.0})


      print(test_accuracy)
      print('\n')

【Question discussion】:

You want to get batches from the MNIST dataset?
@Shubham Panchal, yes, I am trying to get batch_x and batch_y.
Please format your code properly.

Tags: python-3.x tensorflow

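【Solution 1】:

You can use tf.keras.datasets.mnist.load_data. It returns a tuple of NumPy arrays: (x_train, y_train), (x_test, y_test).

After that, you need to create a dataset object with the Dataset API. The code below creates the training dataset; the test dataset can be created the same way.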

train, test = tf.keras.datasets.mnist.load_data()
dataset = tf.data.Dataset.from_tensor_slices((train[0], train[1]))

Then, to create batches, apply the batch function to it:

dataset = dataset.batch(1)

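To print its contents or use it in training, you need to create an iterator. The code below creates the most common kind of iterator and outputs one batch of batch_size elements (1 in this example).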

iterator = dataset.make_one_shot_iterator()
with tf.Session() as sess:
    print(sess.run(iterator.get_next()))
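
To tie this back to the training loop in the question, here is a minimal sketch of pulling (batch_x, batch_y) pairs of size 50 out of a dataset inside a session. It is an illustration rather than a drop-in fix: it assumes TensorFlow 2.x (or a late 1.x release that ships tf.compat.v1), and it uses synthetic zero-filled arrays with MNIST-like shapes instead of the real download. The names images, labels and next_batch are made up for this example.

```python
import numpy as np
import tensorflow as tf

# Assumption: graph mode is wanted, as in the question's tf.Session code.
tf.compat.v1.disable_eager_execution()

# Synthetic stand-in for (x_train, y_train); the real arrays would come
# from tf.keras.datasets.mnist.load_data().
images = np.zeros((200, 28, 28), dtype=np.float32)
labels = (np.arange(200) % 10).astype(np.int64)

dataset = tf.data.Dataset.from_tensor_slices((images, labels))
# Shuffle, repeat indefinitely, and group elements into batches of 50.
dataset = dataset.shuffle(200).repeat().batch(50)

iterator = tf.compat.v1.data.make_one_shot_iterator(dataset)
next_batch = iterator.get_next()  # build this op once, outside the loop

with tf.compat.v1.Session() as sess:
    for step in range(3):
        # Each run() call advances the iterator and returns NumPy arrays.
        batch_x, batch_y = sess.run(next_batch)
        # These arrays could then be passed through feed_dict, e.g.
        # sess.run(cnn_trainer, feed_dict={x: batch_x, y_true: batch_y, ...}),
        # after reshaping or one-hot encoding to match the placeholders.
        print(step, batch_x.shape, batch_y.shape)
```

Calling repeat() before batch() keeps the iterator from running out when the number of training steps exceeds one pass over the data.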

For more details, please read https://www.tensorflow.org/guide/datasets
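
If you only want something that behaves roughly like the old mnist.train.next_batch(50) call and do not need tf.data at all, the arrays returned by load_data can also be sliced by hand. The sketch below is a plain NumPy substitute, not a TensorFlow API: BatchFeeder is a hypothetical helper written for this example, and the data is synthetic so the snippet runs without downloading anything.

```python
import numpy as np


class BatchFeeder:
    """Hypothetical helper that imitates next_batch(n) on NumPy arrays."""

    def __init__(self, images, labels, seed=0):
        assert len(images) == len(labels)
        self.images = images
        self.labels = labels
        self.rng = np.random.default_rng(seed)
        self.order = self.rng.permutation(len(images))
        self.pos = 0

    def next_batch(self, batch_size):
        # Reshuffle and start over when the remaining samples
        # cannot fill a whole batch.
        if self.pos + batch_size > len(self.order):
            self.order = self.rng.permutation(len(self.images))
            self.pos = 0
        idx = self.order[self.pos:self.pos + batch_size]
        self.pos += batch_size
        return self.images[idx], self.labels[idx]


# Synthetic stand-in with MNIST-like shapes (assumption: real data would be
# the x_train, y_train arrays from tf.keras.datasets.mnist.load_data()).
images = np.zeros((120, 28, 28), dtype=np.float32)
labels = np.arange(120) % 10

feeder = BatchFeeder(images, labels)
batch_x, batch_y = feeder.next_batch(50)
print(batch_x.shape, batch_y.shape)
```

The returned arrays can be passed to feed_dict directly, provided their shapes match the placeholders of your model.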

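【Discussion】:

Thank you, Sharky! That makes sense to me! I also tried downgrading my keras version to 1.12.0 or lower, and that works too! Really appreciate it!

【Solution 2】:

This uses TensorFlow 1.11.0 and Keras and is meant to show how to use batch. You will have to adapt it to your needs.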

import tensorflow as tf
from tensorflow import keras as k


(x_train, y_train), (X_test, Y_test) = tf.keras.datasets.mnist.load_data()
X_train = x_train.reshape(x_train.shape[0], 28, 28,1)
y_train = tf.keras.utils.to_categorical(y_train,10)
X_test = X_test.reshape(X_test.shape[0], 28, 28,1)
Y_test = tf.keras.utils.to_categorical(Y_test,10)


train_dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))
train_dataset = train_dataset.batch(32)

test_dataset = tf.data.Dataset.from_tensor_slices((X_test, Y_test))
test_dataset = test_dataset.batch(32)


model = tf.keras.models.Sequential([
    tf.keras.layers.Convolution2D(32, (2, 2), activation='relu', input_shape=(28, 28,1)),
    tf.keras.layers.MaxPool2D(pool_size=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax')
])

tbCallback = [
    k.callbacks.TensorBoard(
        log_dir="D:/TensorBoard", histogram_freq=1, write_graph=True, write_images=True
    )
]


model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_dataset, epochs=10, steps_per_epoch=30, validation_data=test_dataset, validation_steps=1, callbacks=tbCallback)
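
As a rough illustration of what the preprocessing in the code above does before batching: the reshape adds a trailing channel axis so each image is 28x28x1, and to_categorical turns integer labels into one-hot rows. The sketch below reproduces both steps in plain NumPy on a tiny made-up sample; one_hot is a hypothetical stand-in written for this example, not the Keras function itself.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Hypothetical stand-in: integer labels to one-hot float32 rows."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Set a single 1.0 per row, at the column given by the label.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded


# Made-up sample: three blank 28x28 "images" with labels 5, 0 and 4.
images = np.zeros((3, 28, 28), dtype=np.uint8)
labels = np.array([5, 0, 4])

# Add the channel axis expected by a 2D convolution layer.
images_nhwc = images.reshape(images.shape[0], 28, 28, 1)
encoded = one_hot(labels, 10)

print(images_nhwc.shape, encoded.shape)
```

One-hot labels are what the categorical_crossentropy loss in the model above expects.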

【Discussion】:

Thanks, Mohan. I tried changing my keras to an older version. It works! Thank you very much!