Agent Modelling under Partial Observability for Deep Reinforcement Learning

Neural Information Processing Systems

Existing methods for agent modelling commonly assume knowledge of the local observations and chosen actions of the modelled agents during execution. To eliminate this assumption, we extract representations from the local information of the controlled agent using encoder-decoder architectures.
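The abstract only names encoder-decoder architectures; the paper's actual networks are not given here. As a generic, hypothetical illustration of the idea, a linear encoder can compress the controlled agent's local observations into a latent code, and a decoder (used during training only) can reconstruct information about the modelled agents from that code. All names, dimensions, and the synthetic data below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: the controlled agent's local observations are
# the encoder input; the modelled agents' information is the decoder's
# reconstruction target, available only at training time.
obs_dim, target_dim, latent_dim, n = 8, 6, 3, 256
local_obs = rng.normal(size=(n, obs_dim))
mixing = rng.normal(size=(obs_dim, target_dim))
modelled_info = local_obs @ mixing + 0.01 * rng.normal(size=(n, target_dim))

# Linear encoder and decoder, trained by gradient descent on the
# reconstruction MSE.
W_enc = rng.normal(scale=0.1, size=(obs_dim, latent_dim))
W_dec = rng.normal(scale=0.1, size=(latent_dim, target_dim))

def reconstruction_mse():
    pred = (local_obs @ W_enc) @ W_dec
    return float(np.mean((pred - modelled_info) ** 2))

initial_mse = reconstruction_mse()
lr = 0.01
for _ in range(500):
    z = local_obs @ W_enc                    # latent agent-model representation
    err = (z @ W_dec) - modelled_info        # reconstruction error
    grad_dec = (z.T @ err) / n
    grad_enc = (local_obs.T @ (err @ W_dec.T)) / n
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc
final_mse = reconstruction_mse()
```

At execution time only the encoder would be used, so the latent code serves as the agent model without access to the other agents' observations or actions.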


Towards Playing Full MOBA Games with Deep Reinforcement Learning

Neural Information Processing Systems

As a result, full MOBA games without restrictions are far from being mastered by any existing AI system. In this paper, we propose a MOBA AI learning paradigm that methodologically enables playing full MOBA games with deep reinforcement learning. Specifically, we develop a combination of novel and existing learning techniques, including curriculum self-play learning, policy distillation, off-policy adaption, multi-head value estimation, and Monte-Carlo tree search, for training and playing a large pool of heroes while skillfully addressing the scalability issue.
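Policy distillation is listed among the techniques but not detailed in the abstract. As a generic sketch (not this paper's implementation), a student policy can be distilled from a fixed teacher by minimising the KL divergence between their action distributions; the tabular setting and all sizes below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mean_kl(p, q):
    """Mean KL(p || q) over a batch of categorical distributions."""
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

# Hypothetical tabular setting: 5 states, 4 actions. The teacher policy is
# fixed; the student's logits are trained to match it.
n_states, n_actions = 5, 4
teacher = softmax(rng.normal(size=(n_states, n_actions)))
student_logits = np.zeros((n_states, n_actions))

kl_before = mean_kl(teacher, softmax(student_logits))
lr = 0.5
for _ in range(200):
    # The gradient of KL(teacher || student) w.r.t. the student's logits
    # reduces to (student_probs - teacher_probs).
    student_logits -= lr * (softmax(student_logits) - teacher)
kl_after = mean_kl(teacher, softmax(student_logits))
```

In a MOBA-scale system the "states" would be network inputs and the teacher a previously trained hero policy, but the distillation objective has this same shape.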


Multi-fidelity Reinforcement Learning Control for Complex Dynamical Systems

Sun, Luning, Liu, Xin-Yang, Zhao, Siyan, Grover, Aditya, Wang, Jian-Xun, Thiagarajan, Jayaraman J.

arXiv.org Artificial Intelligence

Controlling instabilities in complex dynamical systems is challenging in scientific and engineering applications. Deep reinforcement learning (DRL) has shown promising results across a range of scientific applications. The many-query nature of control tasks requires multiple interactions with real environments governed by the underlying physics. However, such data are usually sparse to collect from experiments or expensive to simulate for complex dynamics. Alternatively, controlling via surrogate models could mitigate the computational cost, but a fast, offline-trained learning-based model struggles to capture accurate pointwise dynamics when the dynamics are chaotic. To bridge this gap, the current work proposes a multi-fidelity reinforcement learning (MFRL) framework that leverages differentiable hybrid models for control tasks, where a physics-based hybrid model is corrected by limited high-fidelity data. We also propose a spectrum-based reward function for RL training. The effectiveness of the proposed framework is demonstrated on two complex dynamical systems in physics. The statistics of the MFRL control results match those computed from many-query evaluations of the high-fidelity environments and outperform other SOTA baselines.
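The spectrum-based reward is only named in the abstract; its exact form is not given here. One plausible sketch, under the assumption that the reward penalises mismatch between the power spectrum of a controlled state trajectory and a target spectrum (function name and signals below are hypothetical):

```python
import numpy as np

def spectrum_reward(trajectory, target_spectrum):
    """Reward a rollout by how closely the normalised power spectrum of its
    state trajectory matches a target spectrum (negated L2 mismatch)."""
    spec = np.abs(np.fft.rfft(trajectory)) ** 2
    spec /= spec.sum()                      # normalise to compare spectral shape
    return -float(np.linalg.norm(spec - target_spectrum))

# Target: an oscillation at 5 cycles over the window.
t = np.linspace(0.0, 1.0, 128, endpoint=False)
target_traj = np.sin(2 * np.pi * 5 * t)
target = np.abs(np.fft.rfft(target_traj)) ** 2
target /= target.sum()

good = np.sin(2 * np.pi * 5 * t + 0.3)      # same frequency, phase-shifted
bad = np.sin(2 * np.pi * 20 * t)            # wrong frequency
r_good = spectrum_reward(good, target)
r_bad = spectrum_reward(bad, target)
```

Because the power spectrum discards phase, a phase-shifted trajectory at the target frequency scores near zero penalty while a wrong-frequency trajectory is penalised heavily, which matches the idea of rewarding statistical rather than pointwise agreement for chaotic dynamics.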