Collaborating Authors

 Wurman, Peter


Event Tables for Efficient Experience Replay

arXiv.org Artificial Intelligence

Experience replay (ER) is a crucial component of many deep reinforcement learning (RL) systems. However, uniform sampling from an ER buffer can lead to slow convergence and unstable asymptotic behaviors. This paper introduces Stratified Sampling from Event Tables (SSET), which partitions an ER buffer into Event Tables, each capturing important subsequences of optimal behavior. We prove a theoretical advantage over the traditional monolithic buffer approach and combine SSET with an existing prioritized sampling strategy to further improve learning speed and stability. Empirical results in challenging MiniGrid domains, benchmark RL environments, and a high-fidelity car racing simulator demonstrate the advantages and versatility of SSET over existing ER buffer sampling approaches.
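The core idea of stratified sampling from a partitioned replay buffer can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's actual implementation: the class name, the fixed per-table sampling ratios, and the table names ("default", "goal_reached") are all assumptions made for the example.

```python
import random

class StratifiedReplayBuffer:
    """Hypothetical sketch: a replay buffer split into named tables,
    with each sampled batch drawn from the tables in fixed proportions."""

    def __init__(self, table_ratios):
        # table_ratios: table name -> fraction of each batch; must sum to 1
        assert abs(sum(table_ratios.values()) - 1.0) < 1e-9
        self.ratios = table_ratios
        self.tables = {name: [] for name in table_ratios}

    def add(self, transition, table="default"):
        # Route a transition into the table its trigger condition matched.
        self.tables[table].append(transition)

    def sample(self, batch_size):
        # Draw a stratified batch: each table contributes its fixed share,
        # sampled uniformly with replacement from that table alone.
        batch = []
        for name, ratio in self.ratios.items():
            k = int(round(batch_size * ratio))
            pool = self.tables[name]
            if pool:
                batch.extend(random.choices(pool, k=k))
        return batch

# Usage: half of each batch comes from ordinary transitions, half from
# a table capturing subsequences that led to a (hypothetical) goal event.
buf = StratifiedReplayBuffer({"default": 0.5, "goal_reached": 0.5})
buf.add(("s0", "a0", 0.0, "s1"), table="default")
buf.add(("s1", "a1", 1.0, "s2"), table="goal_reached")
batch = buf.sample(4)
```

Compared to uniform sampling over one monolithic buffer, this guarantees rare but important transitions keep appearing in every batch regardless of how large the buffer grows.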


Reinforcement Learning for Optimization of COVID-19 Mitigation policies

arXiv.org Artificial Intelligence

The year 2020 has seen the COVID-19 virus lead to one of the worst global pandemics in history. As a result, governments around the world are faced with the challenge of protecting public health, while keeping the economy running to the greatest extent possible. Epidemiological models provide insight into the spread of these types of diseases and predict the effects of possible intervention policies. However, to date, even the most data-driven intervention policies rely on heuristics. In this paper, we study how reinforcement learning (RL) can be used to optimize mitigation policies that minimize the economic impact without overwhelming the hospital capacity. Our main contributions are (1) a novel agent-based pandemic simulator which, unlike traditional models, is able to model fine-grained interactions among people at specific locations in a community; and (2) an RL-based methodology for optimizing fine-grained mitigation policies within this simulator. Our results validate both the overall simulator behavior and the learned policies under realistic conditions.


The Amazon Picking Challenge

AI Magazine

The APC's focus is on one core -- but extremely important -- area of robotic competency: manipulating objects in the world. The competition scenario was a Kiva-like warehouse in which the robot had 20 minutes to pick items off a shelf and put them into a plastic tote. The 12 bins on the shelf were stocked with 25 products that posed a range of perception or manipulation challenges. Each bin had one target item. A robot received a base score of 10 points for successfully picking the target item, with bonus points for cluttered bins or difficult items.