
Collaborating Authors

 Srivastava, Prakhar


Autotelic Reinforcement Learning: Exploring Intrinsic Motivations for Skill Acquisition in Open-Ended Environments

arXiv.org Artificial Intelligence

Intelligence, which leverages sociocultural interactions to enhance open-ended skill acquisition. Artificial Intelligence (AI) aims to create autonomous agents that can operate across diverse environments and complete a wide range of tasks. Researchers pursue different approaches, each focusing on specific drivers of learning. In Reinforcement Learning (RL) [1], agents learn by exploring their environment and using their experience to solve tasks. Imitation Learning (IL) [2] involves agents learning from expert demonstrations, while Multi-Agent Reinforcement Learning (MARL) [3] emphasizes cooperation among agents to solve collaborative tasks. Recent advancements in RL have demonstrated success in varied domains, such as playing Atari games [4], mastering chess and Go [5], and controlling stratospheric balloons [6]. IL, combined with transformers [7], has enabled generalist agents to be trained on diverse datasets and to perform in-context reinforcement learning via algorithm distillation. However, these algorithms remain sample-inefficient and struggle with generalization, creativity, and tackling novel tasks, largely because they rely on isolated learning signals. This research explores sociocultural interactions as a new avenue for AI learning inspired by human development.


Probabilistic Precipitation Downscaling with Optical Flow-Guided Diffusion

arXiv.org Machine Learning

In climate science and meteorology, local precipitation predictions are limited by the immense computational costs induced by the high spatial resolution that simulation methods require. A common workaround is statistical downscaling (also known as superresolution), where a low-resolution prediction is super-resolved using statistical approaches. While traditional computer vision tasks mainly focus on human perception or mean squared error, applications in weather and climate require capturing the conditional distribution of high-resolution patterns given low-resolution patterns so that reliable ensemble averages can be taken. Our approach relies on extending recent video diffusion models to precipitation superresolution: an optical flow on the high-resolution output induces temporally coherent predictions, whereas a temporally-conditioned diffusion model generates residuals that capture the correct noise characteristics and high-frequency patterns. We test our approach on X-SHiELD, an established large-scale climate simulation dataset, and compare against two state-of-the-art baselines, focusing on CRPS, MSE, precipitation distributions, and an illustrative case: the complex terrain of California. Our approach sets a new standard for data-driven precipitation downscaling.
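The split described in this abstract, a flow-warped previous output plus a generated residual, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `warp_nearest` and `next_frame`, the (dy, dx) flow layout, and the nearest-neighbour sampling are assumptions made for illustration, and the residual is passed in as a plain array where the paper would draw it from a diffusion model.

```python
import numpy as np


def warp_nearest(prev_frame, flow):
    """Backward-warp a 2D frame with a per-pixel flow field.

    prev_frame: array of shape (H, W), the previous high-resolution output.
    flow: array of shape (H, W, 2) holding (dy, dx) displacements in pixels
          (layout is an assumption of this sketch).
    Each output pixel (y, x) is read from prev_frame at (y - dy, x - dx),
    rounded to the nearest pixel and clipped to the image border.
    """
    h, w = prev_frame.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(ys - flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs - flow[..., 1]).astype(int), 0, w - 1)
    return prev_frame[src_y, src_x]


def next_frame(prev_frame, flow, residual):
    """Combine the warped previous frame with an additive residual.

    In the paper the residual comes from a conditional diffusion model;
    here it is just an array supplied by the caller (hypothetical stand-in).
    """
    return warp_nearest(prev_frame, flow) + residual


# Usage: shift a small frame one pixel to the right and add a zero residual.
prev = np.arange(12, dtype=float).reshape(3, 4)
flow = np.zeros((3, 4, 2))
flow[..., 1] = 1.0  # dx = +1 everywhere
pred = next_frame(prev, flow, np.zeros_like(prev))
```

Nearest-neighbour sampling keeps the sketch short; a practical system would more likely use bilinear interpolation so that the warp is differentiable with respect to the flow.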


Diffusion Probabilistic Modeling for Video Generation

arXiv.org Artificial Intelligence

Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose an autoregressive, end-to-end optimized video diffusion model inspired by recent advances in neural video compression. The model successively generates future frames by correcting a deterministic next-frame prediction using a stochastic residual generated by an inverse diffusion process. We compare this approach against five baselines on four datasets involving natural and simulation-based videos. We find significant improvements in terms of perceptual quality for all datasets. Furthermore, by introducing a scalable version of the Continuous Ranked Probability Score (CRPS) applicable to video, we show that our model also outperforms existing approaches in their probabilistic frame forecasting ability.
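For readers unfamiliar with the CRPS mentioned above, the following is a minimal sketch of the standard ensemble estimator, CRPS ≈ E|X − y| − ½ E|X − X′|, applied independently per pixel. It is not the scalable video variant the abstract refers to, whose details are not given here; the function name `ensemble_crps` and the array layout are assumptions for illustration.

```python
import numpy as np


def ensemble_crps(samples, observation):
    """Per-element CRPS estimate from an ensemble of forecasts.

    samples: array of shape (M, ...) with M ensemble members.
    observation: array of shape (...) with the ground truth.
    Returns an array of shape (...); lower values are better.
    """
    samples = np.asarray(samples, dtype=float)
    observation = np.asarray(observation, dtype=float)
    # Mean absolute error between each member and the observation.
    accuracy = np.mean(np.abs(samples - observation), axis=0)
    # Mean absolute difference over all ordered pairs of members.
    spread = np.mean(np.abs(samples[:, None] - samples[None, :]), axis=(0, 1))
    return accuracy - 0.5 * spread


# Usage: score four sampled 2x2 frames against one observed frame.
rng = np.random.default_rng(0)
forecasts = rng.normal(size=(4, 2, 2))
truth = np.zeros((2, 2))
score_map = ensemble_crps(forecasts, truth)  # shape (2, 2)
```

With a single ensemble member the spread term vanishes and the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalisation of MAE. The pairwise term costs O(M²) memory per pixel, so this form suits small ensembles only.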