Activation Functions

Sigmoid

Saturated sigmoids kill gradients: when |x| is large, the local gradient is nearly zero, so almost no signal flows backward through the unit (sketched below).

Sigmoid outputs are not zero-centered.

The exponential function is somewhat computationally expensive.
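A minimal NumPy sketch (added for illustration; not from the original notes) of the saturation problem: the local gradient σ(x)(1 − σ(x)) peaks at 0.25 and vanishes at both tails.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # local gradient; maximum 0.25 at x = 0

for x in (-10.0, 0.0, 10.0):
    print(f"x={x:+5.1f}  sigmoid={sigmoid(x):.5f}  grad={sigmoid_grad(x):.5f}")
# x=-10.0  sigmoid=0.00005  grad=0.00005  <- saturated, gradient ~0
# x= +0.0  sigmoid=0.50000  grad=0.25000
# x=+10.0  sigmoid=0.99995  grad=0.00005  <- saturated, gradient ~0
```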

 

Tanh

Still kills gradients when saturated (see the sketch below).

It's zero-centered! : )
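A quick check (again just an illustration) that tanh is zero-centered but still saturating:

```python
import numpy as np

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2  # local gradient of tanh

x = np.array([-10.0, 0.0, 10.0])
print(np.tanh(x))    # ~[-1., 0., 1.]: outputs are centered around zero
print(tanh_grad(x))  # ~[0., 1., 0.]:  but the gradient still dies at the tails
```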

 


ReLU

Does not saturate (in the positive region).

Very computationally efficient.

Converges much faster than sigmoid/tanh in practice (roughly 6×).

Seems more biologically plausible than sigmoid.

BUT!

Not zero-centered.

No gradient when x < 0: a unit whose inputs stay negative receives no updates and can "die".

 

Be careful with the learning rate when using ReLU: a rate that is too high can push units permanently into the negative region, where the gradient is exactly zero (see the sketch below).
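A sketch of the forward pass and local gradient (illustrative only); the gradient is exactly zero for x < 0, which is what makes units die:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # 1 for x > 0, exactly 0 for x < 0: a unit whose input stays
    # negative gets no gradient and never recovers ("dead" ReLU).
    return (x > 0).astype(x.dtype)

x = np.array([-2.0, -0.5, 0.5, 2.0])
print(relu(x))       # [0.   0.   0.5  2. ]
print(relu_grad(x))  # [0.   0.   1.   1. ]
```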

 


Leaky ReLU

Does not saturate.

Very computationally efficient.

Converges much faster than sigmoid/tanh in practice (roughly 6×).

Will not "die": the small slope in the negative region keeps the gradient nonzero everywhere, as the sketch below shows.
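A sketch of Leaky ReLU with the common α = 0.01 (an assumption; the notes do not fix a value):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    # The negative region keeps a small but nonzero gradient,
    # so units cannot get stuck with zero updates.
    return np.where(x > 0, 1.0, alpha)

x = np.array([-2.0, 0.5])
print(leaky_relu(x))       # [-0.02  0.5 ]
print(leaky_relu_grad(x))  # [ 0.01  1.  ]
```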

 

Parametric ReLU

Same form as Leaky ReLU, but the slope of the negative region is a parameter α learned by backpropagation rather than a fixed constant (sketched below).
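A toy sketch (my illustration, using a single scalar α for simplicity; in practice α is often learned per channel):

```python
import numpy as np

def prelu(x, alpha):
    # Same shape as Leaky ReLU, but alpha is a trainable parameter.
    return np.where(x > 0, x, alpha * x)

def prelu_grad_alpha(x):
    # d(output)/d(alpha): 0 in the positive region, x in the negative
    # region. Backpropagation uses this to learn alpha.
    return np.where(x > 0, 0.0, x)

x = np.array([-2.0, 3.0])
print(prelu(x, alpha=0.25))  # [-0.5  3. ]
print(prelu_grad_alpha(x))   # [-2.   0. ]
```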


 

Exponential Linear Unit (ELU)

Keeps the benefits of ReLU but saturates smoothly to a negative value for x < 0, which pushes mean activations closer to zero; the trade-off is that it requires computing exp() (sketched below).
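A sketch of ELU with α = 1 (the common default; an assumption here):

```python
import numpy as np

def elu(x, alpha=1.0):
    # Identity for x > 0; saturates smoothly toward -alpha as x -> -inf.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def elu_grad(x, alpha=1.0):
    # 1 for x > 0; alpha * exp(x) for x <= 0, so the gradient
    # fades out smoothly instead of cutting off abruptly.
    return np.where(x > 0, 1.0, alpha * np.exp(x))

x = np.array([-10.0, -1.0, 0.5])
print(elu(x))       # [-0.99995  -0.63212  0.5 ]
print(elu_grad(x))  # [ 0.00005   0.36788  1.  ]
```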

