 Reinforcement Learning


Average-Reward Learning and Planning with Options
Yi Wan, Abhishek Naik, Richard S. Sutton
{wan6,anaik1,rsutton}@ualberta.ca
University of Alberta, Amii

Neural Information Processing Systems

We extend the options framework for temporal abstraction in reinforcement learning from discounted Markov decision processes (MDPs) to average-reward MDPs. Our contributions include general convergent off-policy inter-option learning algorithms, intra-option algorithms for learning values and models, as well as sample-based planning variants of our learning algorithms. Our algorithms and convergence proofs extend those recently developed by Wan, Naik, and Sutton.




MaskPlace: Fast Chip Placement via Reinforced Visual Representation Learning
Yao Lai, Yao Mu, Ping Luo
Department of Computer Science, The University of Hong Kong
{ylai,ymu,pluo}@cs.hku.hk

Neural Information Processing Systems

MaskPlace recasts placement as a problem of learning pixel-level visual representations that comprehensively describe millions of modules on a chip, enabling placement on a high-resolution canvas with a large action space. It outperforms recent methods that represent a chip as a hypergraph.



Supplementary: Reinforcement Learning Enhanced Explainer for Graph Neural Networks
Caihua Shan

Neural Information Processing Systems

We show our RG-Explainer for graph classification in Alg. 2. The algorithm is similar to the one explaining node classifications, except that we train our seed locator to detect the most influential (line 4). Input: the input graph G = (V, E), node features X, node instances I, and a trained GNN model f(). Check the stopping criteria by Eq. 10.