Link of the Paper: https://arxiv.org/abs/1411.4389

Main Points:

  1. A novel recurrent convolutional architecture (CNN + LSTM) that is both spatially and temporally deep.
  2. The recurrent long-term models are directly connected to modern visual convnet models and can be jointly trained to simultaneously learn temporal dynamics and convolutional perceptual representations.
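The joint CNN + LSTM structure described above can be sketched in a few lines: a per-frame feature extractor feeds an LSTM over time, and per-frame predictions are averaged into a video-level score (as LRCN does for activity recognition). This is a minimal NumPy sketch, not the authors' implementation; the feature extractor here is a hypothetical stand-in (global average pooling plus a fixed projection) rather than a real convnet, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, Wx, Wh, b):
    """One LSTM step; gates stacked as [input, forget, output, candidate]."""
    z = Wx @ x + Wh @ h + b
    H = h.size
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2*H]), sigmoid(z[2*H:3*H])
    g = np.tanh(z[3*H:])
    c = f * c + i * g          # cell state: gated additive update
    h = o * np.tanh(c)         # hidden state exposed to the next layer
    return h, c

# Illustrative sizes: T frames, feature dim D, hidden H, K activity classes.
T, D, H, K, C = 8, 32, 16, 5, 3
Wx = rng.normal(0, 0.1, (4 * H, D))
Wh = rng.normal(0, 0.1, (4 * H, H))
b  = np.zeros(4 * H)
Wy = rng.normal(0, 0.1, (K, H))
proj = rng.normal(0, 0.1, (D, C))

def cnn_features(frame):
    # Stand-in for a convnet: pool each channel, then project to D dims.
    pooled = frame.mean(axis=(0, 1))   # (C,) per-channel means
    return proj @ pooled               # (D,) frame feature

video = rng.normal(size=(T, 8, 8, C))  # T frames of 8x8 "RGB" input

h, c = np.zeros(H), np.zeros(H)
per_frame_scores = []
for t in range(T):
    x = cnn_features(video[t])          # spatial ("perceptual") features
    h, c = lstm_step(x, h, c, Wx, Wh, b)  # temporal dynamics
    per_frame_scores.append(Wy @ h)

# Average per-frame predictions into one video-level score vector.
video_scores = np.mean(per_frame_scores, axis=0)
print(video_scores.shape)  # (5,)
```

Because the CNN stand-in and the LSTM are composed into one differentiable forward pass, gradients from a loss on `video_scores` would flow through both, which is what "jointly trained" means here.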

Paper Reading - Long-term Recurrent Convolutional Networks for Visual Recognition and Description (CVPR 2015)

Other Key Points:

  1. A significant limitation of simple RNN models that strictly integrate state information over time is the "vanishing gradient" effect: backpropagating an error signal through a long temporal interval becomes increasingly difficult in practice.
  2. The authors show LSTM-type models provide for improved recognition on conventional video activity challenges and enable a novel end-to-end optimizable mapping from image pixels to sentence-level natural language descriptions.
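The vanishing-gradient point above can be seen numerically: in a plain RNN, the gradient of the last hidden state with respect to the first is a product of per-step Jacobians, and with typical small weights its norm shrinks roughly geometrically with sequence length. A small NumPy demonstration (illustrative sizes and weight scale, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
H, T = 16, 50  # hidden size, sequence length

# Plain RNN: h_t = tanh(W h_{t-1}). By the chain rule,
# dh_T/dh_0 = prod_t diag(1 - h_t^2) @ W, a product of T Jacobians.
W = rng.normal(0, 0.3 / np.sqrt(H), (H, H))
h = rng.normal(size=H)
grad = np.eye(H)
norms = []
for t in range(T):
    h = np.tanh(W @ h)
    J = np.diag(1 - h**2) @ W   # Jacobian of this single step
    grad = J @ grad             # accumulate back through time
    norms.append(np.linalg.norm(grad))

# The gradient norm collapses by many orders of magnitude over 50 steps.
print(norms[0], norms[-1])
```

The LSTM's gated, additive cell-state update (`c = f * c + i * g`) avoids this repeated multiplicative squashing, which is why LSTM-type models handle the long-range dependencies needed for video recognition and sentence generation.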
