Credit Assignment Techniques in Stochastic Computation Graphs
Théophane Weber, Nicolas Heess, Lars Buesing, David Silver
Stochastic computation graphs (SCGs) provide a formalism to represent structured optimization problems arising in artificial intelligence, including supervised, unsupervised, and reinforcement learning. Previous work has shown that an unbiased estimator of the gradient of the expected loss of SCGs can be derived from a single principle. However, this estimator often has high variance and requires a full model evaluation per data point, making this algorithm costly in large graphs. In this work, we address these problems by generalizing concepts from the reinforcement learning literature. We introduce the concepts of value functions, baselines and critics for arbitrary SCGs, and show how to use them to derive lower-variance gradient estimates from partial model evaluations, paving the way towards general and efficient credit assignment for gradient-based optimization. In doing so, we demonstrate how our results unify recent advances in the probabilistic inference and reinforcement learning literature.
Jan-7-2019
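The abstract's "single principle" for unbiased SCG gradients is the score-function (likelihood-ratio, or REINFORCE) estimator, and the baselines it mentions reduce that estimator's variance without introducing bias. As a minimal sketch, assuming a toy one-node SCG where a parameter `theta` sets the mean of a Gaussian sample and the loss is the sample squared (this model and all names are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def score_function_grad(theta, n_samples=100_000, baseline=0.0):
    """Score-function estimate of d/dtheta E[f(x)] for x ~ N(theta, 1),
    f(x) = x**2.  The true gradient is 2*theta, since E[x^2] = theta^2 + 1."""
    x = rng.normal(theta, 1.0, size=n_samples)
    f = x ** 2
    # Score: d/dtheta log N(x; theta, 1) = (x - theta).
    score = x - theta
    # Subtracting a baseline keeps the estimator unbiased
    # (because E[score] = 0) but can shrink its variance.
    return np.mean((f - baseline) * score)

theta = 1.5
g_plain = score_function_grad(theta)                        # no baseline
g_base = score_function_grad(theta, baseline=theta**2 + 1)  # value-style baseline
# Both estimates approximate the true gradient 2 * theta = 3.0;
# the baselined one does so with lower variance.
```

Here the baseline is the expected loss at `theta` (a stand-in for the value functions the paper generalizes); in practice it would be learned rather than computed in closed form.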
- Country:
- North America > United States > California (0.27)
- Genre:
- Research Report (0.84)
- Industry:
- Education (0.68)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning
- Learning Graphical Models
- Directed Networks > Bayesian Learning (0.67)
- Undirected Networks > Markov Models (0.68)
- Neural Networks > Deep Learning (0.45)
- Reinforcement Learning (1.00)
- Statistical Learning > Gradient Descent (0.46)