Srivastava, Prakhar
Autotelic Reinforcement Learning: Exploring Intrinsic Motivations for Skill Acquisition in Open-Ended Environments
Srivastava, Prakhar, Singh, Jasmeet
Artificial Intelligence (AI) aims to create autonomous agents that can operate across diverse environments and complete a wide range of tasks. Researchers pursue different approaches, each focusing on specific drivers of learning. In Reinforcement Learning (RL) [1], agents learn by exploring their environment and using their experience to solve tasks. Imitation Learning (IL) [2] involves agents learning from expert demonstrations, while Multi-Agent Reinforcement Learning (MARL) [3] emphasizes cooperation among agents to solve collaborative tasks. Recent advances in RL have demonstrated success in varied domains, such as playing Atari games [4], mastering chess and Go [5], and controlling stratospheric balloons [6]. IL, combined with transformers [7], has enabled generalist agents to be trained on diverse datasets and to perform in-context reinforcement learning via algorithm distillation. However, these algorithms remain sample-inefficient and struggle with generalization, creativity, and novel tasks, largely because they rely on isolated learning signals. Inspired by human development, this research explores sociocultural interactions, which leverage social and cultural exchange to enhance open-ended skill acquisition, as a new avenue for AI learning.
Probabilistic Precipitation Downscaling with Optical Flow-Guided Diffusion
Srivastava, Prakhar, Yang, Ruihan, Kerrigan, Gavin, Dresdner, Gideon, McGibbon, Jeremy, Bretherton, Christopher, Mandt, Stephan
In climate science and meteorology, local precipitation predictions are limited by the immense computational cost of the high spatial resolution that simulation methods require. A common workaround is statistical downscaling (also known as super-resolution), where a low-resolution prediction is super-resolved using statistical approaches. While traditional computer vision tasks mainly target human perception or mean squared error, applications in weather and climate require capturing the conditional distribution of high-resolution patterns given low-resolution patterns, so that reliable ensemble averages can be taken. Our approach extends recent video diffusion models to precipitation super-resolution: optical flow on the high-resolution output induces temporally coherent predictions, while a temporally conditioned diffusion model generates residuals that capture the correct noise characteristics and high-frequency patterns. We test our approach on X-SHiELD, an established large-scale climate simulation dataset, and compare against two state-of-the-art baselines, focusing on CRPS, MSE, precipitation distributions, and an illustrative case -- the complex terrain of California. Our approach sets a new standard for data-driven precipitation downscaling.
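The flow-warp-plus-residual construction described above can be illustrated with a toy sketch. This is not the paper's implementation: the nearest-neighbour backward warp, the function names, and the zero-flow example are all illustrative assumptions, and the residual here is a plain array standing in for a sample drawn from the conditional diffusion model.

```python
import numpy as np

def warp_with_flow(frame, flow):
    """Backward-warp a high-resolution frame along an optical-flow field
    (nearest-neighbour lookup; a stand-in for the learned flow)."""
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys - flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs - flow[..., 1]).astype(int), 0, w - 1)
    return frame[src_y, src_x]

def next_frame(prev_hr, flow, residual):
    """Temporally coherent prediction: the flow-warped previous frame
    plus a diffusion-generated residual (here just an array)."""
    return warp_with_flow(prev_hr, flow) + residual

# Toy example: zero flow makes the warp an identity map.
prev = np.arange(16.0).reshape(4, 4)
flow = np.zeros((4, 4, 2))
res = 0.1 * np.ones((4, 4))   # stands in for a sampled residual
out = next_frame(prev, flow, res)
```

The point of the decomposition is that the warp carries the temporally coherent structure forward, so the generative model only has to supply the residual noise characteristics and high-frequency detail.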
Diffusion Probabilistic Modeling for Video Generation
Yang, Ruihan, Srivastava, Prakhar, Mandt, Stephan
Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose an autoregressive, end-to-end optimized video diffusion model inspired by recent advances in neural video compression. The model successively generates future frames by correcting a deterministic next-frame prediction using a stochastic residual generated by an inverse diffusion process. We compare this approach against five baselines on four datasets involving natural and simulation-based videos. We find significant improvements in terms of perceptual quality for all datasets. Furthermore, by introducing a scalable version of the Continuous Ranked Probability Score (CRPS) applicable to video, we show that our model also outperforms existing approaches in their probabilistic frame forecasting ability.
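The CRPS used for probabilistic evaluation can be estimated directly from an ensemble of model samples. The sketch below shows the standard empirical estimator, CRPS ≈ E|X − y| − ½ E|X − X′|, averaged over all pixels and frames; the function name and toy data are illustrative, not the paper's scalable implementation.

```python
import numpy as np

def crps_ensemble(samples, obs):
    """Empirical CRPS averaged over all array elements.

    samples: (n, ...) ensemble of forecasts; obs: (...) ground truth.
    Uses CRPS ~= E|X - y| - 0.5 * E|X - X'| over the ensemble.
    """
    samples = np.asarray(samples, dtype=float)
    obs = np.asarray(obs, dtype=float)
    term1 = np.abs(samples - obs).mean()
    term2 = np.abs(samples[:, None] - samples[None, :]).mean()
    return term1 - 0.5 * term2

# A deterministic ensemble that matches the truth scores a CRPS of 0;
# spread around the truth is penalized less than a biased point forecast.
obs = np.ones((2, 2))
ens = np.ones((5, 2, 2))
```

Lower CRPS is better: the score rewards ensembles whose distribution concentrates around the observation, which is why it complements perceptual metrics when judging probabilistic frame forecasts.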