1. Mind Map

Optimization algorithms --- deeplearning.ai --- Notes (17)

2. Key Formulas

(1) Gradient descent with momentum

$$\begin{array}{l}v_{dW} = \beta v_{dW} + (1 - \beta)\,dW\\[2pt] v_{db} = \beta v_{db} + (1 - \beta)\,db\\[2pt] W = W - \alpha v_{dW},\quad b = b - \alpha v_{db}\end{array}$$

Here α and β are hyperparameters; β is typically set to 0.9, which roughly averages over the last 1/(1 − β) = 10 gradients.
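The update above can be sketched in NumPy as a single step (a minimal sketch; the function name and variable names such as `vdW`, `vdb` are illustrative, not from the course):

```python
import numpy as np

def momentum_update(W, b, dW, db, vdW, vdb, alpha=0.01, beta=0.9):
    """One step of gradient descent with momentum.

    vdW, vdb hold the exponentially weighted averages of past gradients;
    the parameters move along the averaged direction, which damps
    oscillations across steep directions of the loss surface.
    """
    vdW = beta * vdW + (1 - beta) * dW
    vdb = beta * vdb + (1 - beta) * db
    W = W - alpha * vdW
    b = b - alpha * vdb
    return W, b, vdW, vdb
```

The velocities `vdW`, `vdb` are initialized to zero and carried from one iteration to the next, so each call sees the running average, not just the current gradient.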

(2) RMSprop

$$\begin{array}{l}s_{dW} = \beta_2 s_{dW} + (1 - \beta_2)\,dW^2\\[2pt] s_{db} = \beta_2 s_{db} + (1 - \beta_2)\,db^2\\[2pt] W = W - \alpha \frac{dW}{\sqrt{s_{dW} + \varepsilon}},\quad b = b - \alpha \frac{db}{\sqrt{s_{db} + \varepsilon}}\end{array}$$

Here α and β₂ are hyperparameters; the square dW² is taken element-wise, and ε prevents division by zero.
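A corresponding NumPy sketch of one RMSprop step (again, names like `sdW`, `sdb` are illustrative):

```python
import numpy as np

def rmsprop_update(W, b, dW, db, sdW, sdb, alpha=0.01, beta2=0.999, eps=1e-8):
    """One RMSprop step.

    sdW, sdb accumulate exponentially weighted averages of the *squared*
    gradients; dividing by their square root scales the step down in
    directions where gradients have been large.
    """
    sdW = beta2 * sdW + (1 - beta2) * dW ** 2   # element-wise square
    sdb = beta2 * sdb + (1 - beta2) * db ** 2
    W = W - alpha * dW / np.sqrt(sdW + eps)
    b = b - alpha * db / np.sqrt(sdb + eps)
    return W, b, sdW, sdb
```

As with momentum, `sdW` and `sdb` start at zero and are threaded through successive calls.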

(3) Adam

$$\begin{array}{l}v_{dW} = \beta_1 v_{dW} + (1 - \beta_1)\,dW\\[2pt] v_{db} = \beta_1 v_{db} + (1 - \beta_1)\,db\\[2pt] s_{dW} = \beta_2 s_{dW} + (1 - \beta_2)\,dW^2\\[2pt] s_{db} = \beta_2 s_{db} + (1 - \beta_2)\,db^2\\[2pt] v_{dW}^{corrected} = v_{dW}/(1 - \beta_1^t)\\[2pt] v_{db}^{corrected} = v_{db}/(1 - \beta_1^t)\\[2pt] s_{dW}^{corrected} = s_{dW}/(1 - \beta_2^t)\\[2pt] s_{db}^{corrected} = s_{db}/(1 - \beta_2^t)\\[2pt] W = W - \alpha \frac{v_{dW}^{corrected}}{\sqrt{s_{dW}^{corrected} + \varepsilon}},\quad b = b - \alpha \frac{v_{db}^{corrected}}{\sqrt{s_{db}^{corrected} + \varepsilon}}\end{array}$$

Here β₁ = 0.9, β₂ = 0.999, and ε = 10⁻⁸ are the commonly recommended defaults; α is the learning rate, and t is the iteration count, used in the bias-correction terms so that the zero-initialized averages are not underestimated in early iterations.
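Adam combines the two previous ideas: the momentum average v and the RMSprop average s, plus bias correction by t. A minimal sketch of one step (function and variable names are illustrative):

```python
import numpy as np

def adam_update(W, b, dW, db, vdW, vdb, sdW, sdb, t,
                alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step at iteration t (t starts at 1)."""
    # Momentum-style average of gradients
    vdW = beta1 * vdW + (1 - beta1) * dW
    vdb = beta1 * vdb + (1 - beta1) * db
    # RMSprop-style average of squared gradients
    sdW = beta2 * sdW + (1 - beta2) * dW ** 2
    sdb = beta2 * sdb + (1 - beta2) * db ** 2
    # Bias correction: compensates for zero initialization of v and s
    vdW_c = vdW / (1 - beta1 ** t)
    vdb_c = vdb / (1 - beta1 ** t)
    sdW_c = sdW / (1 - beta2 ** t)
    sdb_c = sdb / (1 - beta2 ** t)
    W = W - alpha * vdW_c / np.sqrt(sdW_c + eps)
    b = b - alpha * vdb_c / np.sqrt(sdb_c + eps)
    return W, b, vdW, vdb, sdW, sdb
```

Note that at t = 1 the bias-corrected averages reduce to the raw gradient and its square, so the very first step has magnitude close to α regardless of the gradient's scale.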
