Temporal Regularization for Markov Decision Process

Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup

Neural Information Processing Systems

Yet in reinforcement learning, due to the nature of the Bellman equation, there is an opportunity to also exploit temporal regularization based on smoothness in value estimates over trajectories. This paper explores a class of methods for temporal regularization.


Exponentially Weighted Imitation Learning for Batched Historical Data

Qing Wang, Jiechao Xiong, Lei Han, Peng Sun, Han Liu, Tong Zhang

Neural Information Processing Systems

We consider deep policy learning with only batched historical trajectories. The main challenge of this problem is that the learner no longer has a simulator or "environment oracle" as in most reinforcement learning settings.


Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks

Yusuke Tsuzuku, Issei Sato, Masashi Sugiyama

Neural Information Processing Systems

Adversarial training [10, 16, 18], which injects adversarially perturbed data into training data, is a promising approach. Many other heuristics have been developed to make neural networks insensitive to small perturbations on inputs.