 Markov Models



Structured Energy Network as a Loss Function Jay-Yoon Lee

Neural Information Processing Systems

Belanger & McCallum (2016) and Gygli et al. (2017) have shown that energy ... In this work, we propose Structured Energy As Loss (SEAL) to take advantage of the expressivity of energy networks without incurring the high inference cost. This raises a question: Can energy networks be used in a way that is as expressive as SPENs, as efficient at inference as feedforward approaches, and also easy to train?




A The Estimator

Neural Information Processing Systems

A.2 Proof of Theorem 1. To prove Theorem 1, we assume that G ... Proof of Lemma 1. Let's first rewrite Equation (4) as ... By Lemma 1, linearity of expectation, and the fact that each RWT is independent of the other tours by the Strong Markov Property, Theorem 1 holds. MHM-GNN can recover edge-based models where representations don't use graph-wide ... However, on Rent the Runway we see the raw features achieving the highest performance. That is, structural information does not seem to be relevant to this specific task. All hyperparameters were chosen to minimize training loss. For k = 5, we used a minibatch of size 5 in all datasets.



Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

Neural Information Processing Systems

There have been many recent advances on provably efficient Reinforcement Learning (RL) in problems with rich observation spaces. However, all these works share a strong realizability assumption about the optimal value function of the true MDP. Such realizability assumptions are often too strong to hold in practice. In this work, we consider the more realistic setting of agnostic RL with rich observation spaces and a fixed class of policies Π that may not contain any near-optimal policy. We provide an algorithm for this setting whose error is bounded in terms of the rank d of the underlying MDP.