【Question Title】: Tensorflow - Saving and restoring a model
【Posted】: 2017-03-27 00:04:31
【Question】:

I came across this question on Stack Overflow, which shows how to save and restore a model.

My question is how to do that in the code below, since I am not sure how to integrate it with my code:

import numpy as np
import matplotlib.pyplot as plt
import cifar_tools
import tensorflow as tf

data, labels = cifar_tools.read_data('C:\\Users\\abc\\Desktop\\Testing')

x = tf.placeholder(tf.float32, [None, 150 * 150])
y = tf.placeholder(tf.float32, [None, 2])

w1 = tf.Variable(tf.random_normal([5, 5, 1, 64]))
b1 = tf.Variable(tf.random_normal([64]))

w2 = tf.Variable(tf.random_normal([5, 5, 64, 64]))
b2 = tf.Variable(tf.random_normal([64]))

w3 = tf.Variable(tf.random_normal([38*38*64, 1024]))
b3 = tf.Variable(tf.random_normal([1024]))

w_out = tf.Variable(tf.random_normal([1024, 2]))
b_out = tf.Variable(tf.random_normal([2]))

def conv_layer(x,w,b):
    conv = tf.nn.conv2d(x,w,strides=[1,1,1,1], padding = 'SAME')
    conv_with_b = tf.nn.bias_add(conv,b)
    conv_out = tf.nn.relu(conv_with_b)
    return conv_out

def maxpool_layer(conv,k=2):
    return tf.nn.max_pool(conv, ksize=[1,k,k,1], strides=[1,k,k,1], padding='SAME')

def model():
    x_reshaped = tf.reshape(x, shape=[-1, 150, 150, 1])

    conv_out1 = conv_layer(x_reshaped, w1, b1)
    maxpool_out1 = maxpool_layer(conv_out1)
    norm1 = tf.nn.lrn(maxpool_out1, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75)
    conv_out2 = conv_layer(norm1, w2, b2)
    norm2 = tf.nn.lrn(conv_out2, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75)
    maxpool_out2 = maxpool_layer(norm2)

    maxpool_reshaped = tf.reshape(maxpool_out2, [-1, w3.get_shape().as_list()[0]])
    local = tf.add(tf.matmul(maxpool_reshaped, w3), b3)
    local_out = tf.nn.relu(local)

    out = tf.add(tf.matmul(local_out, w_out), b_out)
    return out

model_op = model()

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=model_op, labels=y))
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

correct_pred = tf.equal(tf.argmax(model_op, 1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred,tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    onehot_labels = tf.one_hot(labels, 2, on_value=1.,off_value=0.,axis=-1)
    onehot_vals = sess.run(onehot_labels)
    batch_size = 1
    for j in range(0, 5):
        print('EPOCH', j)
        for i in range(0, len(data), batch_size):
            batch_data = data[i:i+batch_size, :]
            batch_onehot_vals = onehot_vals[i:i+batch_size, :]
            _, accuracy_val = sess.run([train_op, accuracy], feed_dict={x: batch_data, y: batch_onehot_vals})
            print(i, accuracy_val)

        print('DONE WITH EPOCH')

Thanks.

【Comments】:

    Tags: python machine-learning tensorflow neural-network conv-neural-network


    【Solution 1】:

    Here is some sample code I have used in the past for restoring. This should be done after the session is created, but before running the model.

    saver = tf.train.Saver()
    
    ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)
        print(ckpt.model_checkpoint_path)
        # Recover the global step from the checkpoint filename (e.g. model.ckpt-2000)
        i_stopped = int(ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1])
    else:
        print('No checkpoint file found!')
        i_stopped = 0
    
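The `i_stopped` line above just parses the global step back out of the checkpoint filename that `saver.save(..., global_step=i)` produces (e.g. `model.ckpt-2000`). A minimal pure-Python sketch of that parsing (the helper name is mine, not part of TensorFlow):

```python
def step_from_checkpoint(model_checkpoint_path):
    """Extract the global step from a checkpoint path,
    e.g. '/path/to/ckpts/model.ckpt-2000' -> 2000."""
    filename = model_checkpoint_path.split('/')[-1]  # 'model.ckpt-2000'
    return int(filename.split('-')[-1])              # 2000

print(step_from_checkpoint('/path/to/ckpts/model.ckpt-2000'))  # 2000
```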

    For saving, this was every 1000 batches; in your case, you could instead save once per epoch:

    if i % 1000 == 0:
        # requires `import os` at the top of the file
        checkpoint_path = os.path.join(FLAGS.checkpoint_dir, 'model.ckpt')
        saver.save(sess, checkpoint_path, global_step=i)
    
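For reference, passing `global_step=i` to `saver.save` appends the step to the filename with a hyphen, which is exactly what the restore snippet above parses back out. A small sketch of that naming convention (assuming the standard `model.ckpt-<step>` pattern; the helper is hypothetical):

```python
import os

def checkpoint_name(checkpoint_dir, step):
    # saver.save(sess, path, global_step=i) writes files whose
    # prefix is '<path>-<i>', e.g. 'ckpts/model.ckpt-1000' on POSIX
    checkpoint_path = os.path.join(checkpoint_dir, 'model.ckpt')
    return '{}-{}'.format(checkpoint_path, step)

print(checkpoint_name('ckpts', 1000))
```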

    Implementing this in your code should be fairly simple. Just remember that you have to define the checkpoint directory where the model will be saved.

    Hope this helps!

    【Discussion】:

    • Thanks for the kind reply. When trying to run the code, I get: Traceback (most recent call last): File "cnn.py", line 63, in ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir) NameError: name 'FLAGS' is not defined
    • Yes, FLAGS.checkpoint_dir can be whatever you want. I would suggest defining your path explicitly, e.g. ckpt_path = /path/to/ckpts/ wherever you want to store the checkpoints, and using ckpt_path in place of FLAGS.checkpoint_dir. Other than that, everything else should be fine.
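Following the comment above, a minimal sketch of defining the checkpoint directory explicitly instead of relying on FLAGS (the directory name here is just an example; any writable path works):

```python
import os
import tempfile

# Explicit checkpoint directory replacing FLAGS.checkpoint_dir
ckpt_dir = os.path.join(tempfile.gettempdir(), 'cnn_ckpts')
os.makedirs(ckpt_dir, exist_ok=True)  # create it if it does not exist

# Prefix passed to saver.save(); files land inside ckpt_dir
checkpoint_path = os.path.join(ckpt_dir, 'model.ckpt')
```

You would then pass `ckpt_dir` to `tf.train.get_checkpoint_state` and `checkpoint_path` to `saver.save`.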