Implementing a Convolutional Neural Network (CNN) in TensorFlow

1. Key points:

  1. Computing the image size after convolution

    1. Needed at the final fully connected layer, where all convolved features are flattened into a 1-D vector before the fully connected step
  2. Using convolution kernels

    1. Kernel size
    2. Kernel stride
    3. Padding
  3. Number of kernel feature maps (output channels)

  4. Using the convolution function

  5. Using the pooling function

2. Common mistakes:

  1. The image size after convolution
  2. The input size of the final output layer

3. Output tensor (image) size of a convolutional layer (Conv Layer)

Definitions:

O = output image size.

I = input image size.

K = kernel size of the convolutional layer

N = number of kernels

S = stride

P = amount of zero padding

With the input padded by P zeros on each side, the general formula is
$$O = \bigg\lfloor\frac{I-K+2\times P}{S}\bigg\rfloor+1$$
In TensorFlow you do not set P directly; the `padding` argument ('SAME' or 'VALID') determines it.

padding='SAME' (zero padding)

TensorFlow pads with just enough zeros that the output size depends only on the input size and the stride:

Symbol: ($\lceil \quad \rceil$) means round up (ceiling)

$$O = \bigg\lceil\frac{I}{S}\bigg\rceil$$

Example: input 28x28, kernel 3x3, stride h=2, w=2:

$$\bigg\lceil\frac{28}{2}\bigg\rceil = 14$$

28/2 = 14, so the output is 14x14.

Note:

  1. When the stride is 1 (with SAME padding), the output image is the same size as the input image
  2. If the height and width differ, compute each dimension separately

padding='VALID' (no padding)

With no padding, only positions where the kernel fits entirely inside the image are used:

$$O = \bigg\lfloor\frac{I-K}{S}\bigg\rfloor+1$$

Example: input 28x28, kernel 3x3, stride h=2, w=2:

$$O = \bigg\lfloor\frac{28-3}{2}\bigg\rfloor+1 = 13$$

(28-3)/2 = 12.5, round down to 12, plus 1 = 13
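In TensorFlow, a conv layer's per-dimension output size is ⌈I/S⌉ for padding='SAME' and ⌊(I−K)/S⌋+1 for padding='VALID'. A short plain-Python helper to check sizes (the function name `conv_output_size` is my own, not a TensorFlow API):

```python
import math

def conv_output_size(i, k, s, padding):
    """Per-dimension output size of tf.nn.conv2d.

    i: input size, k: kernel size, s: stride.
    'SAME' pads with zeros so the output depends only on i and s;
    'VALID' uses only positions where the kernel fits entirely.
    """
    if padding == 'SAME':
        return math.ceil(i / s)        # O = ceil(I / S)
    if padding == 'VALID':
        return (i - k) // s + 1        # O = floor((I - K) / S) + 1
    raise ValueError('padding must be SAME or VALID')

# The worked examples: 28x28 input, 3x3 kernel, stride 2
print(conv_output_size(28, 3, 2, 'SAME'))   # 14
print(conv_output_size(28, 3, 2, 'VALID'))  # 13
```

Apply it once for height and once for width when the two strides differ.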

4. Output tensor (image) size of a pooling layer (Pooling Layer)

In TensorFlow, pooling is done with tf.nn.max_pool().

With SAME padding:

A stride of 1 leaves the image size unchanged

The same stride with different window sizes gives the same output size

Different strides with the same window size give different output sizes

Definitions:

O = output image size.
I = input image size.
S = stride
PS = pooling window size

**Note:** unlike a convolutional layer, a pooling layer does not change the number of channels

padding='SAME' (zero padding)

The output size formula:

Symbol: ($\lceil\quad \rceil$) means round up (ceiling)

$$O = \bigg\lceil\frac{I}{S}\bigg\rceil$$

As with convolution, the SAME output size depends only on the input size and the stride, not on the window size.

**Example:** input = [1,28,28,1], ksize=[1,2,2,1], strides=[1,3,3,1], result: (1, 10, 10, 1). Calculation:

$$O=\bigg\lceil\frac{28}{3} \bigg\rceil = 10$$

28/3 = 9.33…, round up to 10
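Because the SAME output size is ⌈I/S⌉ regardless of window size, the shapes of the four pooling results in the example below can be predicted in plain Python (the helper name is my own):

```python
import math

def same_pool_size(i, s):
    # padding='SAME': output size depends only on input size and stride
    return math.ceil(i / s)

# ksize 2/stride 1, ksize 2/stride 2, ksize 2/stride 3, ksize 6/stride 2
print([same_pool_size(28, s) for s in (1, 2, 3, 2)])  # [28, 14, 10, 14]
```

Note that the window-6/stride-2 case gives 14, the same as window 2/stride 2: the window size plays no role under SAME padding.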

Max pooling results in TensorFlow with padding='SAME' (zero padding):

Sample code:

import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
# Load the dataset
mnist = input_data.read_data_sets(r'E:\PycharmProjects\TensorflowTest\MNIST_data', one_hot=True)

in_x = tf.placeholder(dtype=tf.float32, shape=[1,28,28,1])

pool1 = tf.nn.max_pool(in_x, ksize=[1,2,2,1], strides=[1,1,1,1], padding='SAME')
pool2 = tf.nn.max_pool(in_x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')
pool3 = tf.nn.max_pool(in_x, ksize=[1,2,2,1], strides=[1,3,3,1], padding='SAME')
pool4 = tf.nn.max_pool(in_x, ksize=[1,6,6,1], strides=[1,2,2,1], padding='SAME')
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    xs, ys = mnist.test.next_batch(1)
    test_xs = np.array(xs).reshape((1,28,28,1))
    p1, p2, p3, p4 = sess.run([pool1, pool2, pool3, pool4], feed_dict={in_x: test_xs})
    print("Padding: SAME")
    print('p1: ', 'ksize=[1,2,2,1], strides=[1,1,1,1]', np.array(p1).shape)
    print('p2: ', 'ksize=[1,2,2,1], strides=[1,2,2,1]', np.array(p2).shape)
    print('p3: ', 'ksize=[1,2,2,1], strides=[1,3,3,1]', np.array(p3).shape)
    print('p4: ', 'ksize=[1,6,6,1], strides=[1,2,2,1]', np.array(p4).shape)
    p1 = np.array(p1).reshape((28, 28))
    p2 = np.array(p2).reshape((14, 14))
    p3 = np.array(p3).reshape((10, 10))
    p4 = np.array(p4).reshape((14, 14))

    f, a = plt.subplots(2, 2, figsize=(5, 5))

    a[0][0].imshow(p1)
    a[0][1].imshow(p2)
    a[1][0].imshow(p3)
    a[1][1].imshow(p4)
    plt.show()

Output:

Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\train-images-idx3-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\train-labels-idx1-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\t10k-images-idx3-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\t10k-labels-idx1-ubyte.gz
Padding: SAME
p1:  ksize=[1,2,2,1], strides=[1,1,1,1] (1, 28, 28, 1)
p2:  ksize=[1,2,2,1], strides=[1,2,2,1] (1, 14, 14, 1)
p3:  ksize=[1,2,2,1], strides=[1,3,3,1] (1, 10, 10, 1)
p4:  ksize=[1,6,6,1], strides=[1,2,2,1] (1, 14, 14, 1)

Pooled images with zero padding (SAME):

Row 1: p1 (left), p2 (right)

Row 2: p3 (left), p4 (right)


padding='VALID' (no padding)

The output size formula:

Symbol: ($\lfloor \quad \rfloor$) means round down (floor)

$$O =\bigg\lfloor\frac{I - P_S}{S}\bigg\rfloor+1$$

**Example:** input = [1,28,28,1], ksize=[1,2,2,1], strides=[1,3,3,1], result: (1, 9, 9, 1). Calculation:

$$O=\bigg\lfloor\frac{28-2}{3} \bigg\rfloor +1 = 9$$

(28-2)/3 = 8.67, round down to 8, plus 1 = 9
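The same check for VALID pooling, using ⌊(I − P_S)/S⌋ + 1 (the helper name is my own); it reproduces the four shapes reported in the example below:

```python
def valid_pool_size(i, ps, s):
    # padding='VALID': only complete pooling windows are used
    return (i - ps) // s + 1

print(valid_pool_size(28, 2, 1))  # 27
print(valid_pool_size(28, 2, 2))  # 14
print(valid_pool_size(28, 2, 3))  # 9
print(valid_pool_size(28, 6, 2))  # 12
```

Unlike the SAME case, here the window size does affect the result: window 6/stride 2 gives 12, not 14.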

Max pooling results in TensorFlow with padding='VALID' (no padding):

Sample code:

import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
# Load the dataset
mnist = input_data.read_data_sets(r'E:\PycharmProjects\TensorflowTest\MNIST_data', one_hot=True)

in_x = tf.placeholder(dtype=tf.float32, shape=[1,28,28,1])

pool5 = tf.nn.max_pool(in_x, ksize=[1,2,2,1], strides=[1,1,1,1], padding='VALID')
pool6 = tf.nn.max_pool(in_x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='VALID')
pool7 = tf.nn.max_pool(in_x, ksize=[1,2,2,1], strides=[1,3,3,1], padding='VALID')
pool8 = tf.nn.max_pool(in_x, ksize=[1,6,6,1], strides=[1,2,2,1], padding='VALID')
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    xs, ys = mnist.test.next_batch(1)
    test_xs = np.array(xs).reshape((1,28,28,1))
    p5, p6, p7, p8 = sess.run([pool5, pool6, pool7, pool8], feed_dict={in_x: test_xs})

    print("Padding: VALID")
    print('p5: ', 'ksize=[1,2,2,1], strides=[1,1,1,1]', np.array(p5).shape)
    print('p6: ', 'ksize=[1,2,2,1], strides=[1,2,2,1]', np.array(p6).shape)
    print('p7: ', 'ksize=[1,2,2,1], strides=[1,3,3,1]', np.array(p7).shape)
    print('p8: ', 'ksize=[1,6,6,1], strides=[1,2,2,1]', np.array(p8).shape)

    p5 = np.array(p5).reshape((27, 27))
    p6 = np.array(p6).reshape((14, 14))
    p7 = np.array(p7).reshape((9, 9))
    p8 = np.array(p8).reshape((12, 12))

    f, a = plt.subplots(2, 2, figsize=(5, 5))

    a[0][0].imshow(p5)
    a[0][1].imshow(p6)
    a[1][0].imshow(p7)
    a[1][1].imshow(p8)
    plt.show()

Output:

Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\train-images-idx3-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\train-labels-idx1-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\t10k-images-idx3-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\t10k-labels-idx1-ubyte.gz
Padding: VALID
p5:  ksize=[1,2,2,1], strides=[1,1,1,1] (1, 27, 27, 1)
p6:  ksize=[1,2,2,1], strides=[1,2,2,1] (1, 14, 14, 1)
p7:  ksize=[1,2,2,1], strides=[1,3,3,1] (1, 9, 9, 1)
p8:  ksize=[1,6,6,1], strides=[1,2,2,1] (1, 12, 12, 1)

Pooled images without padding (VALID):

Row 1: p5 (left), p6 (right)

Row 2: p7 (left), p8 (right)


5. CNN network structure:

5.1 Convolutional network

| Conv operation | Input shape | Activation | Kernel shape | Strides | Output shape |
| --- | --- | --- | --- | --- | --- |
| Conv layer 1 | [batch,28,28,1] | LeakyReLU | [3,3,1,16] | [1,2,2,1] | [batch,14,14,16] |
| Conv layer 2 | [batch,14,14,16] | LeakyReLU | [3,3,16,32] | [1,2,2,1] | [batch,7,7,32] |
| Conv layer 3 | [batch,7,7,32] | LeakyReLU | [3,3,32,64] | [1,2,2,1] | [batch,4,4,64] |
| Conv layer 4 | [batch,4,4,64] | LeakyReLU | [2,2,64,64] | [1,2,2,1] | [batch,2,2,64] |
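The output sizes in the table follow from applying the SAME-padding rule O = ⌈I/S⌉ layer by layer; a quick check in plain Python:

```python
import math

size = 28
sizes = []
for stride in [2, 2, 2, 2]:           # strides of the four conv layers
    size = math.ceil(size / stride)   # padding='SAME': O = ceil(I / S)
    sizes.append(size)
print(sizes)  # [14, 7, 4, 2]
```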

5.2 Fully connected network

| FC layer | Input shape | Activation | Weight shape | Bias shape | Units |
| --- | --- | --- | --- | --- | --- |
| FC layer 1 | [batch,256] | LeakyReLU | [256,100] | [100] | 100 |
| FC layer 2 | [batch,100] | LeakyReLU | [100,10] | [10] | 10 |
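As a sanity check on both tables, the trainable parameter count can be tallied by hand (kernel/weight elements plus biases; the total below is my own calculation, not from the original):

```python
# (h, w, in_channels, out_channels) for the four conv layers
conv_shapes = [(3, 3, 1, 16), (3, 3, 16, 32), (3, 3, 32, 64), (2, 2, 64, 64)]
# (in_units, out_units) for the two fully connected layers
fc_shapes = [(2 * 2 * 64, 100), (100, 10)]

total = 0
for h, w, c_in, c_out in conv_shapes:
    total += h * w * c_in * c_out + c_out   # kernel weights + bias
for n_in, n_out in fc_shapes:
    total += n_in * n_out + n_out           # weight matrix + bias
print(total)  # 66454 trainable parameters
```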

5.3 Main network

| Main network | |
| --- | --- |
| Input | placeholder, dtype tf.float32, shape [batch,28,28,1] |
| Labels | placeholder, dtype tf.float32, shape [batch,10] |
| Forward pass | conv output [batch,2,2,64] → reshape to [batch,256] (2×2×64) → fully connected output [batch,10] |
| Backward pass | loss = mean squared error between the fully connected output and the labels |

6. Implementation code:

(1) Network structure code

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import matplotlib.pyplot as plt
import time
mnist = input_data.read_data_sets(r'E:\PycharmProjects\TensorflowTest\MNIST_data', one_hot=True)

class Convolution:
    def __init__(self):
        # output: [batch,14,14,16]
        self.filter1 = tf.Variable(tf.truncated_normal([3,3,1,16], stddev=0.1))
        self.b1 = tf.Variable(tf.zeros([16]))
        # output: [batch,7,7,32]
        self.filter2 = tf.Variable(tf.truncated_normal([3,3,16,32], stddev=0.1))
        self.b2 = tf.Variable(tf.zeros([32]))
        # output: [batch,4,4,64]
        self.filter3 = tf.Variable(tf.truncated_normal([3,3,32,64], stddev=0.1))
        self.b3 = tf.Variable(tf.zeros([64]))
        # output: [batch,2,2,64]
        self.filter4 = tf.Variable(tf.truncated_normal([2,2,64,64], stddev=0.1))
        self.b4 = tf.Variable(tf.zeros([64]))

    def forward(self, in_x):
        conv1 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(in_x,
                                                     self.filter1,
                                                     [1, 2, 2, 1],
                                                     padding='SAME'), self.b1))
        conv2 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(conv1,
                                                     self.filter2,
                                                     [1, 2, 2, 1],
                                                     padding='SAME'), self.b2))
        conv3 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(conv2,
                                                     self.filter3,
                                                     [1, 2, 2, 1],
                                                     padding='SAME'), self.b3))
        conv4 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(conv3,
                                                     self.filter4,
                                                     [1, 2, 2, 1],
                                                     padding='SAME'), self.b4))
        return conv4

class MLP:
    def __init__(self):
        self.in_w = tf.Variable(tf.truncated_normal([2*2*64, 100], stddev=0.1))
        self.in_b = tf.Variable(tf.truncated_normal([100]))

        self.out_w = tf.Variable(tf.truncated_normal([100, 10], stddev=0.1))
        self.out_b = tf.Variable(tf.zeros([10]))

    def forward(self, mlp_in_x):
        mlp_layer = tf.nn.leaky_relu(tf.add(tf.matmul(mlp_in_x, self.in_w), self.in_b))
        out_layer = tf.nn.leaky_relu(tf.add(tf.matmul(mlp_layer, self.out_w), self.out_b))
        return out_layer

class CNNnet:
    def __init__(self):
        self.conv = Convolution()
        self.mlp = MLP()

        self.in_x = tf.placeholder(dtype=tf.float32, shape=[None,28,28,1])
        self.in_y = tf.placeholder(dtype=tf.float32, shape=[None,10])

        self.forward()
        self.backward()

    def forward(self):
        # conv output: [batch, 2, 2, 64]
        self.conv_layer = self.conv.forward(self.in_x)
        # flatten to [batch, 256] for the fully connected layers
        mlp_in_x = tf.reshape(self.conv_layer, [-1, 2*2*64])
        self.out_layer = self.mlp.forward(mlp_in_x)

    def backward(self):
        # mean squared error against the one-hot labels
        self.loss = tf.reduce_mean((self.out_layer - self.in_y)**2)
        self.opt = tf.train.AdamOptimizer().minimize(self.loss)
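The loss in `backward` is a plain mean squared error between the network output and the one-hot label. In NumPy terms (with made-up numbers, for illustration only):

```python
import numpy as np

out = np.array([[0.9, 0.1, 0.0]])    # hypothetical network output (3 classes)
label = np.array([[1.0, 0.0, 0.0]])  # one-hot label
# same computation as tf.reduce_mean((self.out_layer - self.in_y)**2)
loss = np.mean((out - label) ** 2)
print(round(loss, 6))  # 0.006667
```

A softmax cross-entropy loss is the more common choice for classification, but MSE on one-hot labels also trains this network, as the results below show.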

(2) Training code

if __name__ == '__main__':
    cnn = CNNnet()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver()
        loss_sum = []
        time1 = time.time()
        for epoch in range(10000):
            xs, ys = mnist.train.next_batch(100)
            loss, _ = sess.run([cnn.loss, cnn.opt],
                               feed_dict={cnn.in_x: np.reshape(xs, [100,28,28,1]), cnn.in_y: ys})
            if epoch % 200 == 0:
                loss_sum.append(loss)
                saver.save(sess, r'E:\PycharmProjects\TensorflowTest\log\CNNTrain1.ckpt')
                test_xs, test_ys = mnist.test.next_batch(5)
                out_layer = sess.run([cnn.out_layer], feed_dict={cnn.in_x: np.reshape(test_xs, [5,28,28,1])})
                out_layer = np.array(out_layer).reshape((5,10))

                out = out_layer.argmax(axis=1)
                test_y = np.array(test_ys).argmax(axis=1)
                accuracy = np.mean(out == test_y)
                print('epoch:\t', epoch, 'loss:\t', loss, 'accuracy:\t', accuracy,
                      'labels:', test_y, 'predictions:', out)
        time2 = time.time()
        print('Training time:\t', time2 - time1)
        plt.figure('CNN_Loss')
        plt.plot(loss_sum, label='Loss')
        plt.legend()
        plt.show()

Output:

epoch:	 0 loss:	 0.33637196 accuracy:	 0.0 labels: [1 3 6 5 3]
epoch:	 200 loss:	 0.01811684 accuracy:	 1.0 labels: [6 1 7 8 4] predictions: [6 1 7 8 4]
epoch:	 400 loss:	 0.009946772 accuracy:	 1.0 labels: [8 1 3 2 0] predictions: [8 1 3 2 0]
epoch:	 600 loss:	 0.00800592 accuracy:	 1.0 labels: [6 9 2 9 8] predictions: [6 9 2 9 8]
...

Loss curve:


(3) Test code: measuring accuracy

if __name__ == '__main__':
    cnn = CNNnet()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver()
        # Restore the trained weights once, before the test loop
        saver.restore(sess, r'E:\PycharmProjects\TensorflowTest\log\CNNTrain1.ckpt')
        accuracy_sum = []
        time1 = time.time()
        for epoch in range(1000):
            test_xs, test_ys = mnist.test.next_batch(100)
            out_layer = sess.run([cnn.out_layer], feed_dict={cnn.in_x: np.reshape(test_xs, [100,28,28,1])})
            out_layer = np.array(out_layer).reshape((100,10))

            out = out_layer.argmax(axis=1)
            test_y = np.array(test_ys).argmax(axis=1)
            accuracy = np.mean(out == test_y)
            accuracy_sum.append(accuracy)
            print('epoch:\t', epoch, 'accuracy:\t', accuracy)
    time2 = time.time()
    print('Test time:\t', time2 - time1)
    total_accuracy = sum(accuracy_sum) / len(accuracy_sum)
    print('Overall accuracy:\t', total_accuracy)
    plt.figure('CNN_Accuracy')
    plt.plot(accuracy_sum, 'o', label='Accuracy')
    plt.title('Accuracy:{:.2f}%'.format(total_accuracy * 100))
    plt.legend()
    plt.show()

Output:

...
epoch:	 994 accuracy:	 0.99
epoch:	 995 accuracy:	 1.0
epoch:	 996 accuracy:	 0.99
epoch:	 997 accuracy:	 0.99
epoch:	 998 accuracy:	 1.0
epoch:	 999 accuracy:	 0.99
Test time:	 5169.121257305145
Overall accuracy:	 0.9907000000004801

Accuracy plot:

