mcb
Monte Carlo Beam Search for Actor-Critic Reinforcement Learning in Continuous Control
Alzorgan, Hazim, Razi, Abolfazl
Actor-critic methods, like Twin Delayed Deep Deterministic Policy Gradient (TD3), depend on basic noise-based exploration, which can result in less than optimal policy convergence. In this study, we introduce Monte Carlo Beam Search (MCBS), a new hybrid method that combines beam search and Monte Carlo rollouts with TD3 to improve exploration and action selection. MCBS produces several candidate actions around the policy's output and assesses them through short-horizon rollouts, enabling the agent to make better-informed choices. We test MCBS across various continuous-control benchmarks, including HalfCheetah-v4, Walker2d-v5, and Swimmer-v5, showing enhanced sample efficiency and performance compared to standard TD3 and other baseline methods like SAC, PPO, and A2C. Our findings emphasize MCBS's capability to enhance policy learning through structured look-ahead search while ensuring computational efficiency. Additionally, we offer a detailed analysis of crucial hyperparameters, such as beam width and rollout depth, and explore adaptive strategies to optimize MCBS for complex control tasks. Our method shows a higher convergence rate across different environments compared to TD3, SAC, PPO, and A2C. For instance, we achieved 90% of the maximum achievable reward within around 200 thousand timesteps compared to 400 thousand timesteps for the second-best method.
Extracting the Multiscale Causal Backbone of Brain Dynamics
D'Acunto, Gabriele, Bonchi, Francesco, Morales, Gianmarco De Francisci, Petri, Giovanni
The bulk of the research effort on brain connectivity revolves around statistical associations among brain regions, which do not directly relate to the causal mechanisms governing brain dynamics. Here we propose the multiscale causal backbone (MCB) of brain dynamics shared by a set of individuals across multiple temporal scales, and devise a principled methodology to extract it. Our approach leverages recent advances in multiscale causal structure learning and optimizes the trade-off between the model fitting and its complexity. Empirical assessment on synthetic data shows the superiority of our methodology over a baseline based on canonical functional connectivity networks. When applied to resting-state fMRI data, we find sparse MCBs for both the left and right brain hemispheres. Thanks to its multiscale nature, our approach shows that at low-frequency bands, causal dynamics are driven by brain regions associated with high-level cognitive functions; at higher frequencies instead, nodes related to sensory processing play a crucial role. Finally, our analysis of individual multiscale causal structures confirms the existence of a causal fingerprint of brain connectivity, thus supporting from a causal perspective the existing extensive research in brain connectivity fingerprinting.
Climate Intervention Analysis using AI Model Guided by Statistical Physics Principles
Kim, Soo Kyung, Ramea, Kalai, Cachay, Salva Rรผhling, Hirasawa, Haruki, Hazarika, Subhashis, Hingmire, Dipti, Mitra, Peetak, Rasch, Philip J., Singh, Hansi A.
The availability of training data remains a significant obstacle for the implementation of machine learning in scientific applications. In particular, estimating how a system might respond to external forcings or perturbations requires specialized labeled data or targeted simulations, which may be computationally intensive to generate at scale. In this study, we propose a novel solution to this challenge by utilizing a principle from statistical physics known as the Fluctuation-Dissipation Theorem (FDT) to discover knowledge using an AI model that can rapidly produce scenarios for different external forcings. By leveraging FDT, we are able to extract information encoded in a large dataset produced by Earth System Models, which includes 8250 years of internal climate fluctuations, to estimate the climate system's response to forcings. Our model, AiBEDO, is capable of capturing the complex, multi-timescale effects of radiation perturbations on global and regional surface climate, allowing for a substantial acceleration of the exploration of the impacts of spatially-heterogenous climate forcers. To demonstrate the utility of AiBEDO, we use the example of a climate intervention technique called Marine Cloud Brightening, with the ultimate goal of optimizing the spatial pattern of cloud brightening to achieve regional climate targets and prevent known climate tipping points. While we showcase the effectiveness of our approach in the context of climate science, it is generally applicable to other scientific disciplines that are limited by the extensive computational demands of domain simulation models. Source code of AiBEDO framework is made available at https://github.com/kramea/kdd_aibedo. A sample dataset is made available at https://doi.org/10.5281/zenodo.7597027. Additional data available upon request.
Accelerating exploration of Marine Cloud Brightening impacts on tipping points Using an AI Implementation of Fluctuation-Dissipation Theorem
Hirasawa, Haruki, Kim, Sookyung, Mitra, Peetak, Hazarika, Subhashis, Ruhling-Cachay, Salva, Hingmire, Dipti, Ramea, Kalai, Singh, Hansi, Rasch, Philip J.
Marine cloud brightening (MCB) is a proposed climate intervention technology to partially offset greenhouse gas warming and possibly avoid crossing climate tipping points. The impacts of MCB on regional climate are typically estimated using computationally expensive Earth System Model (ESM) simulations, preventing a thorough assessment of the large possibility space of potential MCB interventions. Here, we describe an AI model, named AiBEDO, that can be used to rapidly projects climate responses to forcings via a novel application of the Fluctuation-Dissipation Theorem (FDT). AiBEDO is a Multilayer Perceptron (MLP) model that uses maps monthly-mean radiation anomalies to surface climate anomalies at a range of time lags. By leveraging a large existing dataset of ESM simulations containing internal climate noise, we use AiBEDO to construct an FDT operator that successfully projects climate responses to MCB forcing, when evaluated against ESM simulations. We propose that AiBEDO-FDT can be used to optimize MCB forcing patterns to reduce tipping point risks while minimizing negative side effects in other parts of the climate.
Incremental cycle bases for cycle-based pose graph optimization
Forsgren, Brendon, Brink, Kevin, Ganesh, Prashant, McLain, Timothy
Pose graph optimization is a special case of the simultaneous localization and mapping problem where the only variables to be estimated are pose variables and the only measurements are inter-pose constraints. The vast majority of pose graph optimization techniques are vertex based (variables are robot poses), but recent work has parameterized the pose graph optimization problem in a relative fashion (variables are the transformations between poses) that utilizes a minimum cycle basis to maximize the sparsity of the problem. We explore the construction of a cycle basis in an incremental manner while maximizing the sparsity. We validate an algorithm that constructs a sparse cycle basis incrementally and compare its performance with a minimum cycle basis. Additionally, we present an algorithm to approximate the minimum cycle basis of two graphs that are sparsely connected as is common in multi-agent scenarios. Lastly, the relative parameterization of pose graph optimization has been limited to using rigid body transforms on SE(2) or SE(3) as the constraints between poses. We introduce a methodology to allow for the use of lower-degree-of-freedom measurements in the relative pose graph optimization problem. We provide extensive validation of our algorithms on standard benchmarks, simulated datasets, and custom hardware.
Tractability of Theory Patching
Argamon-Engelson, S., Koppel, M.
In this paper we consider the problem of `theory patching', in which we are given a domain theory, some of whose components are indicated to be possibly flawed, and a set of labeled training examples for the domain concept. The theory patching problem is to revise only the indicated components of the theory, such that the resulting theory correctly classifies all the training examples. Theory patching is thus a type of theory revision in which revisions are made to individual components of the theory. Our concern in this paper is to determine for which classes of logical domain theories the theory patching problem is tractable. We consider both propositional and first-order domain theories, and show that the theory patching problem is equivalent to that of determining what information contained in a theory is `stable' regardless of what revisions might be performed to the theory. We show that determining stability is tractable if the input theory satisfies two conditions: that revisions to each theory component have monotonic effects on the classification of examples, and that theory components act independently in the classification of examples in the theory. We also show how the concepts introduced can be used to determine the soundness and completeness of particular theory patching algorithms.