 Reinforcement Learning





Learning Diverse Policies in MOBA Games via Macro-Goals

Neural Information Processing Systems

Recently, many researchers have made progress in building AI systems that play MOBA games with deep reinforcement learning, for example on Dota 2 and Honor of Kings.





On the Sample Complexity of Stabilizing LTI Systems on a Single Trajectory

Neural Information Processing Systems

Stabilizing an unknown dynamical system is one of the central problems in control theory. In this paper, we study the sample complexity of the learn-to-stabilize problem in Linear Time-Invariant (LTI) systems on a single trajectory. Current state-of-the-art approaches require a sample complexity linear in n, the state dimension, which incurs a state norm that blows up exponentially in n. We propose a novel algorithm based on spectral decomposition that only needs to learn "a small part" of the dynamical matrix acting on its unstable subspace. We show that, under proper assumptions, our algorithm stabilizes an LTI system on a single trajectory with O(k log n) samples, where k is the instability index of the system. This represents the first sub-linear sample complexity result for the stabilization of LTI systems under the regime when k = o(n).
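
To make the role of the unstable subspace concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: it assumes the dynamics are already known and diagonal, that the control enters every coordinate directly (x_{t+1} = A x_t + u_t), and that the instability index counts eigenvalues with modulus strictly greater than 1. The abstract confirms none of these details, and the names `instability_index`, `simulate`, and `norm` are hypothetical.

```python
def instability_index(eigenvalues):
    """Count eigenvalues with modulus > 1 (assumed definition of k)."""
    return sum(1 for lam in eigenvalues if abs(lam) > 1)


def simulate(diag_a, x0, gains, steps):
    """Roll out x_{t+1} = A x_t + u_t with diagonal A and u_t = -K x_t.

    diag_a and gains hold the diagonals of A and K, so each coordinate
    evolves independently as x_i <- (a_i - k_i) * x_i.
    """
    x = list(x0)
    for _ in range(steps):
        x = [(a - g) * xi for a, g, xi in zip(diag_a, gains, x)]
    return x


def norm(x):
    """Euclidean norm of a list of floats."""
    return sum(xi * xi for xi in x) ** 0.5


# Toy system: one unstable mode (1.5) and two stable modes (0.9, 0.5).
diag_a = [1.5, 0.9, 0.5]
k = instability_index(diag_a)

x0 = [1.0, 1.0, 1.0]

# Without control, the unstable mode makes the state norm grow geometrically.
open_loop = simulate(diag_a, x0, [0.0, 0.0, 0.0], steps=50)

# Feedback that cancels only the unstable mode; stable modes are left alone.
gains = [a if abs(a) > 1 else 0.0 for a in diag_a]
closed_loop = simulate(diag_a, x0, gains, steps=50)

print("instability index k:", k)
print("open-loop norm after 50 steps:", norm(open_loop))
print("closed-loop norm after 50 steps:", norm(closed_loop))
```

In this toy, the feedback acts on k = 1 of the n = 3 coordinates, which is the intuition behind learning only the part of the dynamics that acts on the unstable subspace. The hard part the paper addresses, estimating that subspace from a single trajectory, is deliberately left out here.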



When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning

Neural Information Processing Systems

A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world. A critical challenge to such autonomy is the presence of irreversible states which require external assistance to recover from, such as when a robot arm has pushed an object off of a table.