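Deep Learning: Activation Functions

Summary of Activation Functions

Purpose: an activation function introduces nonlinearity into each neuron. With it, a neural network can approximate essentially any continuous nonlinear function arbitrarily well, so it can be applied to a wide range of nonlinear models.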
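To see why the nonlinearity matters, here is a minimal sketch showing that two stacked linear layers without an activation collapse into a single linear layer. The weights `W1`, `W2` and the input `x` are hypothetical values chosen only for illustration, and ReLU stands in for any activation.

```python
import numpy as np

# Hypothetical weights and input, chosen only for illustration.
W1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])
W2 = np.array([[1.0, 1.0]])
x = np.array([1.0, 2.0])

# Two linear layers with no activation in between ...
two_linear_layers = W2 @ (W1 @ x)
# ... give the same result as one linear layer with weights W2 @ W1.
one_linear_layer = (W2 @ W1) @ x
layers_collapse = np.allclose(two_linear_layers, one_linear_layer)

# Putting a nonlinearity (ReLU here) between the layers breaks the equivalence.
with_relu = W2 @ np.maximum(0.0, W1 @ x)
relu_changes_output = not np.allclose(with_relu, one_linear_layer)

print(layers_collapse, relu_changes_output)
```

However many linear layers are stacked, the result is still one matrix product, so depth only adds expressive power once an activation function sits between the layers.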

Figure: plots of the activation functions

Sigmoid function (logistic function)

Formula

$f(x)=\cfrac{1}{1+e^{-x}}$
Derivative
$f(x)' = f(x)(1-f(x))$
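The derivative identity follows from the chain rule. A short derivation, taking $f(x)=\cfrac{1}{1+e^{-x}}$ as the definition and using $1-f(x)=\cfrac{e^{-x}}{1+e^{-x}}$:

```latex
\begin{aligned}
f(x)' &= \frac{d}{dx}\left(1+e^{-x}\right)^{-1}
       = \frac{e^{-x}}{\left(1+e^{-x}\right)^{2}} \\
      &= \frac{1}{1+e^{-x}} \cdot \frac{e^{-x}}{1+e^{-x}}
       = f(x)\,\bigl(1-f(x)\bigr)
\end{aligned}
```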
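Advantages:
1. The output lies in (0,1), and the function is monotonic and continuous. The bounded output range keeps optimization stable and makes sigmoid suitable for the output layer.
2. The derivative is easy to compute.
Disadvantages:
1. It saturates softly: for inputs of large magnitude the gradient approaches 0, so gradients easily vanish and training runs into trouble.
2. Its output is not zero-centered, which makes weight updates less efficient.
3. It requires an exponential, which is comparatively slow to compute.
4. Saturation compounds with depth: the function is costly to evaluate, its derivative involves a division, and during backpropagation the gradient shrinks at every layer, so it can easily vanish and make a deep network impossible to train.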
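The vanishing-gradient problem can be made concrete with a few numbers. The sketch below is illustrative only: it ignores the weights and looks just at the activation-derivative factor, which is at most 0.25 per sigmoid layer. The helper names and the list of depths are assumptions, not part of the original notes.

```python
import numpy as np

def sigmoid(x):
    """Logistic function 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Derivative f(x)(1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# The derivative peaks at x = 0 and is tiny once the unit saturates.
peak_grad = sigmoid_grad(0.0)
saturated_grad = sigmoid_grad(10.0)

# Best-case product of activation derivatives through n sigmoid layers
# (weights ignored; the depths are hypothetical).
depths = [1, 5, 10, 20]
best_case = {n: peak_grad ** n for n in depths}

print("peak:", peak_grad, "saturated:", saturated_grad)
for n in depths:
    print(n, "layers ->", best_case[n])
```

Even in the best case the factor shrinks geometrically with depth, and saturated units make it far smaller.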

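tanh function (hyperbolic tangent)

Overview: tanh is the hyperbolic tangent function. Its curve is close to that of sigmoid, so it is worth comparing the two. What they share: when the input is very large or very small, the output flattens out and the gradient becomes tiny, which hinders weight updates. Where they differ is the output range: tanh maps to (-1,1) and is centered at 0, which is an advantage over sigmoid.
Formula
$tanh(x)=\cfrac{sinh(x)}{cosh(x)}=\cfrac{e^x-e^{-x}}{e^x+e^{-x}}$
where:
$sinh(x)=\cfrac{e^x-e^{-x}}{2}$
$cosh(x)=\cfrac{e^x+e^{-x}}{2}$
$sin(i·x)=i·sinh(x)$
$cos(i·x)=cosh(x)$
Derivative
$f(x)'=1-f(x)^2$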
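This derivative follows from the quotient rule together with $sinh(x)'=cosh(x)$ and $cosh(x)'=sinh(x)$. A short derivation for $f(x)=tanh(x)$:

```latex
\begin{aligned}
f(x)' &= \frac{d}{dx}\,\frac{\sinh(x)}{\cosh(x)}
       = \frac{\cosh(x)\cosh(x)-\sinh(x)\sinh(x)}{\cosh^{2}(x)} \\
      &= 1-\frac{\sinh^{2}(x)}{\cosh^{2}(x)}
       = 1-f(x)^{2}
\end{aligned}
```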
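Advantages:

- Converges faster than the sigmoid function
- Unlike sigmoid, its output is zero-centered

Disadvantages:
- It does not fix sigmoid's biggest problem: vanishing gradients caused by saturation
Usage:
- For binary classification, a common choice is tanh in the hidden layers and sigmoid in the output layer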
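The similarity between the two curves can be made exact: tanh is a rescaled and shifted sigmoid, $tanh(x)=2\sigma(2x)-1$, where $\sigma$ denotes the sigmoid function. The sketch below checks this identity numerically and compares the mean output and the peak slope of the two functions. The sample grid and variable names are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    """Logistic function 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + np.exp(-x))

# Symmetric sample grid (an illustrative choice).
x = np.linspace(-5.0, 5.0, 101)

# tanh is sigmoid stretched to (-1, 1) and compressed along x.
identity_holds = np.allclose(np.tanh(x), 2.0 * sigmoid(2.0 * x) - 1.0)

# On a symmetric grid tanh averages to about 0, sigmoid to about 0.5.
tanh_mean = np.tanh(x).mean()
sigmoid_mean = sigmoid(x).mean()

# Slope at the origin: 1 for tanh versus 0.25 for sigmoid.
tanh_peak_grad = 1.0 - np.tanh(0.0) ** 2
sigmoid_peak_grad = sigmoid(0.0) * (1.0 - sigmoid(0.0))

print(identity_holds, tanh_mean, sigmoid_mean)
print(tanh_peak_grad, sigmoid_peak_grad)
```

The larger slope around 0 is one commonly cited reason why tanh tends to train faster than sigmoid, although both still saturate for large inputs.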

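ReLU function (rectified linear unit)

Formula
$f(x)=max(0,x)$
Derivative

$f(x)'=\begin{cases} 0, & x<0 \\ 1, & x>0 \end{cases}$
(The derivative is undefined at $x=0$; implementations pick a value there by convention.)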
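Advantages
1. For positive inputs there is no gradient saturation.
2. It is much faster to compute. ReLU is piecewise linear, so both the forward and the backward pass are much cheaper than for sigmoid and tanh, which have to evaluate exponentials.
Disadvantages
1. For negative inputs ReLU is not activated at all, so a unit whose input stays negative "dies". In the forward pass this is not much of a problem: some regions are sensitive and others are not. In the backward pass, however, the gradient for a negative input is exactly 0, which is the same kind of problem that sigmoid and tanh have.
2. The output of ReLU is either 0 or positive, so ReLU is not zero-centered either.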
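The "dying" behaviour in the first disadvantage can be sketched in a few lines. The pre-activation values `z` and the upstream gradient are hypothetical, and the derivative at exactly 0 is taken as 1, which is an arbitrary convention. Leaky ReLU with slope 0.2 (the variant that appears commented out in the plotting code) is shown only as one common workaround, not as a recommendation from the original notes.

```python
import numpy as np

def relu(x):
    """ReLU: max(0, x)."""
    return np.maximum(0.0, x)

def relu_grad(x):
    """ReLU derivative; taken as 1 at x == 0 by convention."""
    return np.where(x < 0, 0.0, 1.0)

def leaky_relu_grad(x, slope=0.2):
    """Leaky ReLU derivative: a small positive slope for negative inputs."""
    return np.where(x < 0, slope, 1.0)

# Hypothetical pre-activations of one unit over a mini-batch, all negative.
z = np.array([-3.0, -1.5, -0.2, -4.0])
upstream = np.ones_like(z)  # gradient arriving from the next layer

relu_out = relu(z)                              # forward pass: all zeros
relu_backprop = upstream * relu_grad(z)         # backward pass: all zeros
leaky_backprop = upstream * leaky_relu_grad(z)  # a small gradient survives

print(relu_out, relu_backprop, leaky_backprop)
```

With plain ReLU no gradient reaches the weights of this unit, so it cannot recover through gradient updates as long as its inputs stay negative.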

code

import matplotlib.pyplot as plt
import numpy as np

# Sample points on [-10, 10] (np.linspace defaults to 50 points)
x = np.linspace(-10,10)
# Sigmoid and its derivative f(x)(1-f(x))
y_sigmoid = 1/(1+np.exp(-x))
y_sigmoid_d = y_sigmoid*(1-y_sigmoid)
# tanh and its derivative 1-f(x)^2
y_tanh = np.tanh(x)
y_tanh_d = 1-y_tanh*y_tanh
# ReLU and its derivative (taken as 1 at x = 0)
y_relu = np.maximum(0, x)
y_relu_d = np.where(x<0, 0.0, 1.0)

fig = plt.figure()
# plot sigmoid
ax = fig.add_subplot(321)
ax.plot(x,y_sigmoid)
ax.grid()
ax.set_title('Sigmoid')

# plot sigmoid derivative
ax = fig.add_subplot(322)
ax.plot(x,y_sigmoid_d)
ax.grid()
ax.set_title('Sigmoid derivative')

# plot tanh
ax = fig.add_subplot(323)
ax.plot(x,y_tanh)
ax.grid()
ax.set_title('tanh')

# plot tanh derivative
ax = fig.add_subplot(324)
ax.plot(x,y_tanh_d)
ax.grid()
ax.set_title('tanh derivative')

# plot ReLU
ax = fig.add_subplot(325)
ax.plot(x,y_relu)
ax.grid()
ax.set_title('ReLU')

# plot ReLU derivative
ax = fig.add_subplot(326)
ax.plot(x,y_relu_d)
ax.grid()
ax.set_title('ReLU derivative')

# plot leaky ReLU (slope 0.2 for negative inputs)
# ax = fig.add_subplot(121)
# y_leaky_relu = np.where(x<0, 0.2*x, x)
# ax.plot(x,y_leaky_relu)
# ax.grid()
# ax.set_title('Leaky ReLU')
# plot leaky ReLU derivative (0.2 for negative inputs, 1 otherwise)
# ax = fig.add_subplot(122)
# y_leaky_relu_d = np.where(x<0, 0.2, 1.0)
# ax.plot(x,y_leaky_relu_d)
# ax.grid()
# ax.set_title('Leaky ReLU derivative')

plt.tight_layout()
plt.savefig('att_d.jpg')
plt.show()