On Predictive Information Sub-optimality of RNNs

Dong, Zhe, Oktay, Deniz, Poole, Ben, Alemi, Alexander A.

arXiv.org Machine Learning 

Certain biological neurons demonstrate a remarkable capability to optimally compress the history of sensory inputs while remaining maximally informative about the future. In this work, we investigate whether the same can be said of artificial neurons in recurrent neural networks (RNNs) trained with maximum likelihood. In experiments on two datasets, restorative Brownian motion and a hand-drawn sketch dataset, we find that RNNs are sub-optimal in the information plane: instead of optimally compressing past information, they extract additional information that is not relevant for predicting the future. Overcoming this limitation may require alternative training procedures and architectures, or objectives beyond maximum likelihood estimation.

Remembering past events is a critical component of predicting the future and acting in the world. An information-theoretic quantification of how much observing the past helps in predicting the future is given by the predictive information (Bialek et al., 2001). The predictive information is the mutual information (MI) between a finite set of observations (the past of a sequence) and an infinite number of additional draws from the same process (the future of the sequence).
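As a concrete illustration (not taken from the paper), the predictive information has a closed form for stationary Gaussian processes: for jointly Gaussian past and future blocks, I(past; future) = ½ log(det Σ_past · det Σ_future / det Σ_joint). The sketch below, under the assumption that a restorative (mean-reverting) Brownian motion is discretized as a Gaussian AR(1) process with an illustrative coefficient a = 0.9, computes this quantity directly from the autocovariance matrix:

```python
import numpy as np

# Assumed model: mean-reverting process discretized as a stationary
# Gaussian AR(1), x_t = a * x_{t-1} + unit-variance noise.
a = 0.9                # illustrative mean-reversion coefficient
T_past, T_fut = 5, 5   # lengths of the past and future blocks
n = T_past + T_fut

# Stationary autocovariance of AR(1): Cov(x_s, x_t) = a**|s-t| / (1 - a**2).
idx = np.arange(n)
cov = a ** np.abs(idx[:, None] - idx[None, :]) / (1 - a ** 2)

cov_past = cov[:T_past, :T_past]   # covariance of the past block
cov_fut = cov[T_past:, T_past:]    # covariance of the future block

# Gaussian mutual information (in nats):
# I = 0.5 * (log det S_past + log det S_future - log det S_joint)
_, logdet_joint = np.linalg.slogdet(cov)
_, logdet_past = np.linalg.slogdet(cov_past)
_, logdet_fut = np.linalg.slogdet(cov_fut)
pred_info = 0.5 * (logdet_past + logdet_fut - logdet_joint)
print(f"predictive information: {pred_info:.3f} nats")
```

Because an AR(1) process is Markov, the block-level MI collapses to the single-step value -½ log(1 - a²) regardless of block length, which gives a simple sanity check on the determinant computation.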
