Goto

Collaborating Authors

 abstract state







SLAP: Shortcut Learning for Abstract Planning

Liu, Y. Isabel, Li, Bowen, Eysenbach, Benjamin, Silver, Tom

arXiv.org Artificial Intelligence

Long-horizon decision-making with sparse rewards and continuous states and actions remains a fundamental challenge in AI and robotics. Task and motion planning (TAMP) is a model-based framework that addresses this challenge by planning hierarchically with abstract actions (options). These options are manually defined, limiting the agent to behaviors that we as human engineers know how to program (pick, place, move). In this work, we propose Shortcut Learning for Abstract Planning (SLAP), a method that leverages existing TAMP options to automatically discover new ones. Our key idea is to use model-free reinforcement learning (RL) to learn shortcuts in the abstract planning graph induced by the existing options in TAMP. Without any additional assumptions or inputs, shortcut learning leads to shorter solutions than pure planning, and higher task success rates than flat and hierarchical RL. Qualitatively, SLAP discovers dynamic physical improvisations (e.g., slap, wiggle, wipe) that differ significantly from the manually-defined ones. In experiments in four simulated robotic environments, we show that SLAP solves and generalizes to a wide range of tasks, reducing overall plan lengths by over 50% and consistently outperforming planning and RL baselines.



Boosting Verification of Deep Reinforcement Learning via Piece-wise Linear Decision Neural Networks

Neural Information Processing Systems

In particular, DNN-specific overestimation is extremely unpredictable due to many factors such as the dimension of system states, the complexity of environment dynamics, and the size, weight, and activation function of a neural network.



Non-Termination Proving: 100 Million LoC and Beyond

Vanegue, Julien, Villard, Jules, O'Hearn, Peter, Raad, Azalea

arXiv.org Artificial Intelligence

We report on our tool, Pulse Infinite, that uses proof techniques to show non-termination (divergence) in large programs. Pulse Infinite works compositionally and under-approximately: the former supports scale, and the latter ensures soundness for proving divergence. Prior work focused on small benchmarks in the tens or hundreds of lines of code (LoC), and scale limits their practicality: a single company may have tens of millions, or even hundreds of millions of LoC or more. We report on applying Pulse Infinite to over a hundred million lines of open-source and proprietary software written in C, C++, and Hack, identifying over 30 previously unknown issues, establishing a new state of the art for detecting divergence in real-world codebases.