Distributional Reinforcement Learning for Energy-Based Sequential Models
Tetiana Parshakova, Jean-Marc Andreoli, Marc Dymetman
Global Autoregressive Models (GAMs) are a recent proposal [Parshakova et al., CoNLL 2019] for exploiting global properties of sequences for data-efficient learning of seq2seq models. In the first phase of training, an Energy-Based Model (EBM) over sequences is derived. This EBM has high representational power, but it is unnormalized and cannot be used directly for sampling. To address this issue, [Parshakova et al., CoNLL 2019] proposes a distillation technique, which can only be applied under limited conditions. By relating this problem to Policy Gradient techniques in RL, but from a distributional rather than an optimization perspective, we propose a general approach applicable to any sequential EBM. Its effectiveness is illustrated on GAM-based experiments.
Dec-18-2019
- Country:
  - North America
    - United States
      - New York > New York County
        - New York City (0.04)
      - California
        - San Francisco County > San Francisco (0.14)
        - Santa Clara County > Palo Alto (0.04)
        - Los Angeles County > Long Beach (0.04)
      - Puerto Rico > San Juan
        - San Juan (0.04)
    - Canada > Quebec
      - Montreal (0.04)
  - Europe > Germany
    - Berlin (0.04)
  - Asia > China
    - Hong Kong (0.04)
- Genre:
  - Research Report (0.82)
- Technology: