Implementing a Convolutional Neural Network (CNN) in TensorFlow
1. Key points:
- Computing the size of the image after convolution
- In the final fully connected layer, flattening all convolved features into a 1-D vector before applying the fully connected step
- Using convolution kernels:
  - kernel size
  - stride
  - padding
- Number of kernel feature maps (output channels)
- Using the convolution function
- Using the pooling function
2. Common mistakes:
- the size of the image after convolution
- the input size of the final output layer
3. Output tensor (image) size of a convolutional layer (Conv Layer)
Definitions:
O = output image size
I = input image size
K = kernel size of the conv layer
N = number of kernels
S = stride
P = padding size
The basic (textbook) formula is:
$$O = \bigg\lfloor\frac{I - K + 2P}{S}\bigg\rfloor + 1$$
padding='SAME' (zero padding)
With SAME padding, TensorFlow adds just enough zeros that the output size depends only on the input size and the stride. The symbol $\lceil\ \rceil$ denotes rounding up (ceiling); the output size is:
$$O = \bigg\lceil\frac{I}{S}\bigg\rceil$$
Example: input image 28x28, kernel 3x3, stride h=2, w=2:
$$\bigg\lceil\frac{28}{2}\bigg\rceil = 14$$
Note:
- With stride 1 (and SAME padding), the output image size equals the input image size.
- If height and width differ (in stride or size), compute each dimension separately.
padding='VALID' (no padding)
With VALID padding nothing is added, so P = 0 and the kernel must fit entirely inside the input:
$$O = \bigg\lceil\frac{I - K + 1}{S}\bigg\rceil$$
Example: input image 28x28, kernel 3x3, stride h=2, w=2:
$$\bigg\lceil\frac{28 - 3 + 1}{2}\bigg\rceil = \lceil 13 \rceil = 13$$
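The two sizing rules can be checked with a few lines of Python (an illustrative sketch; the helper name `conv_output_size` is ours, not a TensorFlow API):

```python
import math

def conv_output_size(i, k, s, padding):
    """Output size of a conv layer along one spatial dimension.

    i: input size, k: kernel size, s: stride.
    """
    if padding == 'SAME':
        # SAME: zero-padded, so only the input size and stride matter
        return math.ceil(i / s)
    if padding == 'VALID':
        # VALID: no padding; the kernel must fit inside the input
        return math.ceil((i - k + 1) / s)
    raise ValueError('padding must be SAME or VALID')

# The worked examples above: 28x28 input, 3x3 kernel, stride 2
print(conv_output_size(28, 3, 2, 'SAME'))   # 14
print(conv_output_size(28, 3, 2, 'VALID'))  # 13
```

With stride 1 and SAME padding the helper returns the input size unchanged, matching the note above.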
4. Output tensor (image) size of a pooling layer (Pooling Layer)
In TensorFlow, max pooling is done with tf.nn.max_pool(). With SAME padding:
- with stride 1, the image size is unchanged
- with the same stride but different window sizes, the output sizes are the same
- with different strides but the same window size, the output sizes differ
Definitions:
O = output image size
I = input image size
S = stride
PS = pooling window size
Note: unlike a convolutional layer, a pooling layer does not change the number of channels.
padding='SAME' (zero padding)
The symbol $\lceil\quad\rceil$ denotes rounding up (ceiling). As the results below confirm, the output size depends only on the input size and the stride, not on the window size:
$$O = \bigg\lceil\frac{I}{S}\bigg\rceil$$
Example: input [1,28,28,1], ksize=[1,2,2,1], strides=[1,3,3,1] → (1, 10, 10, 1). Calculation:
$$O = \bigg\lceil\frac{28}{3}\bigg\rceil = \lceil 9.33 \rceil = 10$$
Max-pooling results in TensorFlow with zero (SAME) padding.
Example code:
```python
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt

# Load the dataset
mnist = input_data.read_data_sets(r'E:\PycharmProjects\TensorflowTest\MNIST_data', one_hot=True)

in_x = tf.placeholder(dtype=tf.float32, shape=[1, 28, 28, 1])
pool1 = tf.nn.max_pool(in_x, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='SAME')
pool2 = tf.nn.max_pool(in_x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
pool3 = tf.nn.max_pool(in_x, ksize=[1, 2, 2, 1], strides=[1, 3, 3, 1], padding='SAME')
pool4 = tf.nn.max_pool(in_x, ksize=[1, 6, 6, 1], strides=[1, 2, 2, 1], padding='SAME')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    xs, ys = mnist.test.next_batch(1)
    test_xs = np.array(xs).reshape((1, 28, 28, 1))
    p1, p2, p3, p4 = sess.run([pool1, pool2, pool3, pool4], feed_dict={in_x: test_xs})
    print("Padding mode: SAME")
    print('p1: ', 'ksize=[1,2,2,1], strides=[1,1,1,1]', np.array(p1).shape)
    print('p2: ', 'ksize=[1,2,2,1], strides=[1,2,2,1]', np.array(p2).shape)
    print('p3: ', 'ksize=[1,2,2,1], strides=[1,3,3,1]', np.array(p3).shape)
    print('p4: ', 'ksize=[1,6,6,1], strides=[1,2,2,1]', np.array(p4).shape)
    # Visualize the four pooled images
    p1 = np.array(p1).reshape((28, 28))
    p2 = np.array(p2).reshape((14, 14))
    p3 = np.array(p3).reshape((10, 10))
    p4 = np.array(p4).reshape((14, 14))
    f, a = plt.subplots(2, 2, figsize=(5, 5))
    a[0][0].imshow(p1)
    a[0][1].imshow(p2)
    a[1][0].imshow(p3)
    a[1][1].imshow(p4)
    plt.draw()
    plt.show()
```
Output:
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\train-images-idx3-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\train-labels-idx1-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\t10k-images-idx3-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\t10k-labels-idx1-ubyte.gz
Padding mode: SAME
p1: ksize=[1,2,2,1], strides=[1,1,1,1] (1, 28, 28, 1)
p2: ksize=[1,2,2,1], strides=[1,2,2,1] (1, 14, 14, 1)
p3: ksize=[1,2,2,1], strides=[1,3,3,1] (1, 10, 10, 1)
p4: ksize=[1,6,6,1], strides=[1,2,2,1] (1, 14, 14, 1)
Pooling results with zero padding:
top-left: p1, top-right: p2
bottom-left: p3, bottom-right: p4
padding='VALID' (no padding)
The symbol $\lfloor\quad\rfloor$ denotes rounding down (floor). The output size is:
$$O = \bigg\lfloor\frac{I - P_S}{S}\bigg\rfloor + 1$$
Example: input [1,28,28,1], ksize=[1,2,2,1], strides=[1,3,3,1] → (1, 9, 9, 1). Calculation:
$$O = \bigg\lfloor\frac{28 - 2}{3}\bigg\rfloor + 1 = \lfloor 8.67 \rfloor + 1 = 8 + 1 = 9$$
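The pooling arithmetic can be verified against all eight results in this post with a small helper (our own illustration; `pool_output_size` is not a TensorFlow function):

```python
import math

def pool_output_size(i, ps, s, padding):
    """Output size of tf.nn.max_pool along one spatial dimension."""
    if padding == 'SAME':
        # The window size does not matter with SAME padding
        return math.ceil(i / s)
    if padding == 'VALID':
        # floor((I - PS) / S) + 1
        return (i - ps) // s + 1
    raise ValueError('padding must be SAME or VALID')

cases = [(2, 1), (2, 2), (2, 3), (6, 2)]  # (ksize, stride) for p1..p4 / p5..p8
print([pool_output_size(28, ps, s, 'SAME') for ps, s in cases])   # [28, 14, 10, 14]
print([pool_output_size(28, ps, s, 'VALID') for ps, s in cases])  # [27, 14, 9, 12]
```

The two printed lists reproduce the SAME and VALID shapes reported by the example runs.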
Max-pooling results in TensorFlow without padding (VALID).
Example code:
```python
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt

# Load the dataset
mnist = input_data.read_data_sets(r'E:\PycharmProjects\TensorflowTest\MNIST_data', one_hot=True)

in_x = tf.placeholder(dtype=tf.float32, shape=[1, 28, 28, 1])
pool5 = tf.nn.max_pool(in_x, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='VALID')
pool6 = tf.nn.max_pool(in_x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
pool7 = tf.nn.max_pool(in_x, ksize=[1, 2, 2, 1], strides=[1, 3, 3, 1], padding='VALID')
pool8 = tf.nn.max_pool(in_x, ksize=[1, 6, 6, 1], strides=[1, 2, 2, 1], padding='VALID')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    xs, ys = mnist.test.next_batch(1)
    test_xs = np.array(xs).reshape((1, 28, 28, 1))
    p5, p6, p7, p8 = sess.run([pool5, pool6, pool7, pool8], feed_dict={in_x: test_xs})
    print("Padding mode: VALID")
    print('p5: ', 'ksize=[1,2,2,1], strides=[1,1,1,1]', np.array(p5).shape)
    print('p6: ', 'ksize=[1,2,2,1], strides=[1,2,2,1]', np.array(p6).shape)
    print('p7: ', 'ksize=[1,2,2,1], strides=[1,3,3,1]', np.array(p7).shape)
    print('p8: ', 'ksize=[1,6,6,1], strides=[1,2,2,1]', np.array(p8).shape)
    # Visualize the four pooled images
    p5 = np.array(p5).reshape((27, 27))
    p6 = np.array(p6).reshape((14, 14))
    p7 = np.array(p7).reshape((9, 9))
    p8 = np.array(p8).reshape((12, 12))
    f, a = plt.subplots(2, 2, figsize=(5, 5))
    a[0][0].imshow(p5)
    a[0][1].imshow(p6)
    a[1][0].imshow(p7)
    a[1][1].imshow(p8)
    plt.draw()
    plt.show()
```
Output:
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\train-images-idx3-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\train-labels-idx1-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\t10k-images-idx3-ubyte.gz
Extracting E:\PycharmProjects\TensorflowTest\MNIST_data\t10k-labels-idx1-ubyte.gz
Padding mode: VALID
p5: ksize=[1,2,2,1], strides=[1,1,1,1] (1, 27, 27, 1)
p6: ksize=[1,2,2,1], strides=[1,2,2,1] (1, 14, 14, 1)
p7: ksize=[1,2,2,1], strides=[1,3,3,1] (1, 9, 9, 1)
p8: ksize=[1,6,6,1], strides=[1,2,2,1] (1, 12, 12, 1)
Pooling results without padding:
top-left: p5, top-right: p6
bottom-left: p7, bottom-right: p8
5. CNN network structure:
5.1 Convolutional layers
| Conv operation | Input shape | Activation | Kernel shape | Stride | Output shape |
|---|---|---|---|---|---|
| Conv layer 1 | [batch,28,28,1] | LeakyReLU | [3,3,1,16] | [1,2,2,1] | [batch,14,14,16] |
| Conv layer 2 | [batch,14,14,16] | LeakyReLU | [3,3,16,32] | [1,2,2,1] | [batch,7,7,32] |
| Conv layer 3 | [batch,7,7,32] | LeakyReLU | [3,3,32,64] | [1,2,2,1] | [batch,4,4,64] |
| Conv layer 4 | [batch,4,4,64] | LeakyReLU | [2,2,64,64] | [1,2,2,1] | [batch,2,2,64] |
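Every layer in this table uses SAME padding with stride 2, so each spatial size is just ⌈I/2⌉ of the previous one; a quick sanity check (an illustrative sketch, not part of the original code):

```python
import math

size = 28
shapes = []
for kernel, out_channels in [(3, 16), (3, 32), (3, 64), (2, 64)]:
    # SAME padding, stride 2: O = ceil(I / 2); the kernel size is irrelevant
    size = math.ceil(size / 2)
    shapes.append((size, out_channels))
print(shapes)  # [(14, 16), (7, 32), (4, 64), (2, 64)], matching the table
```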
5.2 Fully connected layers
| MLP layer | Input shape | Activation | Weight shape | Bias shape | Units |
|---|---|---|---|---|---|
| FC layer 1 | [batch,256] | LeakyReLU | [256,100] | [100] | 100 |
| FC layer 2 | [batch,100] | LeakyReLU | [100,10] | [10] | 10 |
5.3 Main network
| Main network | Step | Details |
|---|---|---|
| Get input | placeholder | dtype: tf.float32, shape: [batch,28,28,1] |
| Get labels | placeholder | dtype: tf.float32, shape: [batch,10] |
| Forward pass | conv output | [batch,2,2,64] |
| | reshape | [batch,2*2*64] |
| | FC output | [batch,10] |
| Backward pass | loss | mean squared error between the FC output and the labels |
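The reshape step in the forward pass flattens the [batch,2,2,64] conv output into a [batch,256] matrix for the fully connected layers; in NumPy terms (a sketch with random stand-in data):

```python
import numpy as np

batch = 5
conv_out = np.random.rand(batch, 2, 2, 64)  # stand-in for the last conv layer's output
flat = conv_out.reshape(batch, 2 * 2 * 64)  # same idea as tf.reshape(conv_layer, [-1, 2*2*64])
print(flat.shape)  # (5, 256)
```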
6. Implementation code:
(1) Network definition
```python
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import matplotlib.pyplot as plt
import time

mnist = input_data.read_data_sets(r'E:\PycharmProjects\TensorflowTest\MNIST_data', one_hot=True)

class Convolution:
    def __init__(self):
        # Output: [batch,14,14,16]
        self.filter1 = tf.Variable(tf.truncated_normal([3, 3, 1, 16], stddev=0.1))
        self.b1 = tf.Variable(tf.zeros([16]))
        # Output: [batch,7,7,32]
        self.filter2 = tf.Variable(tf.truncated_normal([3, 3, 16, 32], stddev=0.1))
        self.b2 = tf.Variable(tf.zeros([32]))
        # Output: [batch,4,4,64]
        self.filter3 = tf.Variable(tf.truncated_normal([3, 3, 32, 64], stddev=0.1))
        self.b3 = tf.Variable(tf.zeros([64]))
        # Output: [batch,2,2,64]
        self.filter4 = tf.Variable(tf.truncated_normal([2, 2, 64, 64], stddev=0.1))
        self.b4 = tf.Variable(tf.zeros([64]))

    def forward(self, in_x):
        conv1 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(in_x, self.filter1,
                                                     [1, 2, 2, 1], padding='SAME'), self.b1))
        conv2 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(conv1, self.filter2,
                                                     [1, 2, 2, 1], padding='SAME'), self.b2))
        conv3 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(conv2, self.filter3,
                                                     [1, 2, 2, 1], padding='SAME'), self.b3))
        conv4 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(conv3, self.filter4,
                                                     [1, 2, 2, 1], padding='SAME'), self.b4))
        return conv4

class MLP:
    def __init__(self):
        self.in_w = tf.Variable(tf.truncated_normal([2 * 2 * 64, 100], stddev=0.1))
        self.in_b = tf.Variable(tf.truncated_normal([100]))
        self.out_w = tf.Variable(tf.truncated_normal([100, 10], stddev=0.1))
        self.out_b = tf.Variable(tf.zeros([10]))

    def forward(self, mlp_in_x):
        mlp_layer = tf.nn.leaky_relu(tf.add(tf.matmul(mlp_in_x, self.in_w), self.in_b))
        out_layer = tf.nn.leaky_relu(tf.add(tf.matmul(mlp_layer, self.out_w), self.out_b))
        return out_layer

class CNNnet:
    def __init__(self):
        self.conv = Convolution()
        self.mlp = MLP()
        self.in_x = tf.placeholder(dtype=tf.float32, shape=[None, 28, 28, 1])
        self.in_y = tf.placeholder(dtype=tf.float32, shape=[None, 10])
        self.forward()
        self.backward()

    def forward(self):
        # Conv output: (batch, 2, 2, 64)
        self.conv_layer = self.conv.forward(self.in_x)
        mlp_in_x = tf.reshape(self.conv_layer, [-1, 2 * 2 * 64])
        self.out_layer = self.mlp.forward(mlp_in_x)

    def backward(self):
        self.loss = tf.reduce_mean((self.out_layer - self.in_y) ** 2)
        self.opt = tf.train.AdamOptimizer().minimize(self.loss)
```
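The loss here is a plain mean squared error between the 10-way output and the one-hot label (softmax cross-entropy is the more common choice for classification, but MSE is what this network uses). Numerically it reduces to the following (a NumPy sketch with made-up values):

```python
import numpy as np

out = np.array([[0.75, 0.25],
                [0.25, 0.75]])    # made-up network outputs
y = np.array([[1.0, 0.0],
              [0.0, 1.0]])        # one-hot labels
loss = np.mean((out - y) ** 2)    # same formula as tf.reduce_mean((out_layer - in_y) ** 2)
print(loss)  # 0.0625
```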
(2) Training code
```python
if __name__ == '__main__':
    cnn = CNNnet()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver()
        loss_sum = []
        time1 = time.time()
        for epoch in range(10000):
            xs, xy = mnist.train.next_batch(100)
            loss, _ = sess.run([cnn.loss, cnn.opt],
                               feed_dict={cnn.in_x: np.reshape(xs, [100, 28, 28, 1]), cnn.in_y: xy})
            if epoch % 200 == 0:
                loss_sum.append(loss)
                saver.save(sess, r'E:\PycharmProjects\TensorflowTest\log\CNNTrain1.ckpt')
                # Quick check on 5 test samples
                test_xs, test_xy = mnist.test.next_batch(5)
                out_layer = sess.run(cnn.out_layer,
                                     feed_dict={cnn.in_x: np.reshape(test_xs, [5, 28, 28, 1])})
                out = np.array(out_layer).reshape((5, 10)).argmax(axis=1)
                test_y = np.array(test_xy).argmax(axis=1)
                accuracy = np.mean(out == test_y)
                # Print the predictions (`out`), not the labels twice
                print('epoch:\t', epoch, 'loss:\t', loss, 'accuracy:\t', accuracy,
                      'labels:', test_y, 'predictions:', out)
        time2 = time.time()
        print('Training time:\t', time2 - time1)
        plt.figure('CNN_Loss')
        plt.plot(loss_sum, label='Loss')
        plt.legend()
        plt.show()
```
Output:
epoch: 0 loss: 0.33637196 accuracy: 0.0 labels: [1 3 6 5 3] predictions: [1 3 6 5 3]
epoch: 200 loss: 0.01811684 accuracy: 1.0 labels: [6 1 7 8 4] predictions: [6 1 7 8 4]
epoch: 400 loss: 0.009946772 accuracy: 1.0 labels: [8 1 3 2 0] predictions: [8 1 3 2 0]
epoch: 600 loss: 0.00800592 accuracy: 1.0 labels: [6 9 2 9 8] predictions: [6 9 2 9 8]
...
Loss curve:
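The accuracy used in both the training and test loops is the fraction of rows where the argmax of the network output matches the argmax of the one-hot label; isolated in NumPy (made-up values for illustration):

```python
import numpy as np

out_layer = np.array([[0.1, 0.8, 0.1],   # predicts class 1
                      [0.7, 0.2, 0.1]])  # predicts class 0
labels = np.array([[0, 1, 0],            # true class 1
                   [0, 0, 1]])           # true class 2

predictions = out_layer.argmax(axis=1)   # [1, 0]
targets = labels.argmax(axis=1)          # [1, 2]
accuracy = np.mean(predictions == targets)
print(accuracy)  # 0.5
```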
(3) Test code: measuring accuracy
```python
if __name__ == '__main__':
    cnn = CNNnet()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver()
        # Restore the trained weights once before the evaluation loop
        saver.restore(sess, r'E:\PycharmProjects\TensorflowTest\log\CNNTrain1.ckpt')
        accuracy_sum = []
        time1 = time.time()
        for epoch in range(1000):
            test_xs, test_xy = mnist.test.next_batch(100)
            out_layer = sess.run(cnn.out_layer,
                                 feed_dict={cnn.in_x: np.reshape(test_xs, [100, 28, 28, 1])})
            out = np.array(out_layer).reshape((100, 10)).argmax(axis=1)
            test_y = np.array(test_xy).argmax(axis=1)
            accuracy = np.mean(out == test_y)
            accuracy_sum.append(accuracy)
            print('epoch:\t', epoch, 'accuracy:\t', accuracy)
        time2 = time.time()
        print('Test time:\t', time2 - time1)
        total_accuracy = sum(accuracy_sum) / len(accuracy_sum)
        print('Overall accuracy:\t', total_accuracy)
        plt.figure('CNN_Accuracy')
        plt.plot(accuracy_sum, 'o', label='Accuracy')
        plt.title('Accuracy: {:.2f}%'.format(total_accuracy * 100))
        plt.legend()
        plt.show()
```
Output:
epoch: 994 accuracy: 0.99
epoch: 995 accuracy: 1.0
epoch: 996 accuracy: 0.99
epoch: 997 accuracy: 0.99
epoch: 998 accuracy: 1.0
epoch: 999 accuracy: 0.99
Test time: 5169.121257305145
Overall accuracy: 0.9907000000004801