CNN model with Tensorflow: answers

【Question title】: CNN model with Tensorflow
【Posted】: 2018-06-09 15:15:04
【Question description】:

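I am working on character recognition by building a convolutional neural network model in Tensorflow. My model has 2 conv layers followed by 2 fully connected layers. I have about 78K training images and 13K test images. When I run the model, I get about 92.xx% accuracy on the test set. But when I visualize my accuracy and loss curves in Tensorboard, I get a vertical line, and I don't know why. The curves look like this: [image: Accuracy and Cross Entropy curve when viewed on tensorboard]

The distribution plots for the weights and biases also show a vertical line: [image: Left side shows testing parameters (weights and bias) and right side shows training parameters on first conv layer]

Any help with this would be greatly appreciated!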

import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.placeholder)

def conv_layer(input, size_in, size_out, name="conv"):
    with tf.name_scope(name):
        w = tf.Variable(tf.random_normal([5, 5, size_in, size_out], stddev=0.1), name="W")
        b = tf.Variable(tf.constant(0.1, shape=[size_out]), name="B")
        conv = tf.nn.conv2d(input, w, strides=[1, 1, 1, 1], padding="VALID")
        act = tf.nn.relu(conv + b)
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        tf.summary.histogram("activations", act)
        return tf.nn.max_pool(act, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

def fc_layer(input, size_in, size_out, name="fc"):
    with tf.name_scope(name):
        w = tf.Variable(tf.random_normal([size_in, size_out], stddev=0.1), name="W")  # Truncated_normal
        b = tf.Variable(tf.constant(0.1, shape=[size_out]), name="B")
        act = tf.matmul(input, w) + b
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        tf.summary.histogram("activations", act)
        return act

def model(use_two_conv, use_two_fc):
    sess = tf.Session()
    x = tf.placeholder(tf.float32, shape=[None, 1024], name="x")
    x_image = tf.reshape(x, [-1, 32, 32, 1])
    tf.summary.image('input', x_image, 3)
    y = tf.placeholder(tf.float32, shape=[None, 46], name="labels")

    if use_two_conv:
        conv1 = conv_layer(x_image, 1, 4, "conv1")
        conv_out = conv_layer(conv1, 4, 16, "conv2")
    else:
        conv1 = conv_layer(x_image, 1, 16, "conv1")
        conv_out = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

    flattened = tf.reshape(conv_out, [-1, 5 * 5 * 16])

    if use_two_fc:
        fc1 = fc_layer(flattened, 5 * 5 * 16, 200, "fc1")
        relu = tf.nn.relu(fc1)
        tf.summary.histogram("fc1/relu", relu)
        # Note: fc2 receives fc1 (pre-activation), so relu above only feeds the histogram.
        logits = fc_layer(fc1, 200, 46, "fc2")
    else:
        logits = fc_layer(flattened, 5 * 5 * 16, 46, "fc")
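
A quick sanity check on the `5 * 5 * 16` flatten size used above, as a minimal sketch in plain Python (no Tensorflow needed). It assumes the usual output-size rules for stride-1 `VALID` convolutions (`in - kernel + 1`) and stride-2 `SAME` pooling (`ceil(in / 2)`). The helper names `conv_valid`, `pool_same` and `conv_layer_out` are made up for this illustration and are not part of the asker's code.

```python
import math

def conv_valid(size, kernel=5, stride=1):
    """Output width of a VALID (unpadded) convolution."""
    return (size - kernel) // stride + 1

def pool_same(size, stride=2):
    """Output width of a SAME-padded pooling layer: ceil(size / stride)."""
    return math.ceil(size / stride)

def conv_layer_out(size):
    """One conv_layer() call from the question: 5x5 VALID conv, then 2x2 SAME max-pool."""
    return pool_same(conv_valid(size))

# use_two_conv=True: two conv_layer() calls on a 32x32 input
two_conv = conv_layer_out(conv_layer_out(32))   # 32 -> 28 -> 14 -> 10 -> 5

# use_two_conv=False: one conv_layer() call followed by an extra max-pool
one_conv = pool_same(conv_layer_out(32))        # 32 -> 28 -> 14 -> 7

print(two_conv, two_conv * two_conv * 16)
print(one_conv, one_conv * one_conv * 16)
```

Under these assumptions the two-conv path does end at 5 x 5 x 16 = 400 features, matching the reshape. The single-conv path appears to end at 7 x 7 x 16 = 784, so the hard-coded `5 * 5 * 16` would probably need adjusting if `use_two_conv` is ever set to `False`.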

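【Question comments】:

I have run into this exact problem several times, but it is hard to diagnose a code problem without the code! Could you post your training code? You probably wrote writer.add_summary(current_summary) instead of writer.add_summary(current_summary, epoch). If you post the code, I'll be happy to write this up as an answer.
Thanks @DylanF, that worked. But I have another question: I want to test a new image on this trained model. How should I restore the variables defined inside conv_layer() and fc_layer(), given that they have names like "conv1/W:0" and "conv2/W:0"? How should I use this trained model on a new image?
That really is a whole new question. Before you ask it, though, you may want to look at this question. I use Tom's answer because it is the most up-to-date approach, but the accepted answer is also a valid solution!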

Tags: tensorflow machine-learning convolution tensorboard

【Solution 1】:

I have run into this problem before. It is the result of using

writer.add_summary(current_summary)

instead of

writer.add_summary(current_summary, epoch)

(I am using generic variable names because the relevant part of the asker's code was not posted.) For example:

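    # training_op is a generic name for whatever op runs one training step.
    summary_op = tf.summary.merge_all()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        writer = tf.summary.FileWriter("/Whatever/Path", sess.graph)
        for iteration in range(1001):
            if iteration % 100 == 0:
                # Every 100th iteration, also evaluate the summaries and
                # record them with the iteration as the step (x-axis value).
                _, current_summary = sess.run([training_op, summary_op])
                writer.add_summary(current_summary, iteration)
            else:
                _ = sess.run(training_op)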
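
To see why the missing second argument produces a vertical line, here is a small plain-Python sketch. `ToyWriter` is a hypothetical stand-in written for this illustration, not a Tensorflow class. It only mimics my understanding of how `add_summary` treats its optional `global_step`: when no step is passed, the event keeps the default step of 0, so every point lands on the same x position.

```python
from collections import namedtuple

Event = namedtuple("Event", ["step", "value"])

class ToyWriter:
    """Hypothetical stand-in for a summary writer (not part of Tensorflow)."""

    def __init__(self):
        self.events = []

    def add_summary(self, value, global_step=None):
        # Assumption: with no step given, the event keeps the default step 0.
        step = 0 if global_step is None else int(global_step)
        self.events.append(Event(step, value))

losses = [2.3, 1.1, 0.6, 0.4]  # made-up loss values for the illustration

without_step = ToyWriter()
with_step = ToyWriter()
for iteration, loss in enumerate(losses):
    without_step.add_summary(loss)                 # step omitted
    with_step.add_summary(loss, iteration * 100)   # step passed explicitly

steps_without = [event.step for event in without_step.events]
steps_with = [event.step for event in with_step.events]
print(steps_without)  # all points share the same x value
print(steps_with)     # points are spread along the x axis
```

Plotting value against step, the first writer stacks all four losses at x = 0, which is the vertical line seen in Tensorboard. The second one spreads them over steps 0, 100, 200 and 300.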

【Comments】: