 Model-Based Reasoning


A Physics-Informed Machine Learning Framework for Safe and Optimal Control of Autonomous Systems

arXiv.org Artificial Intelligence

As autonomous systems become more ubiquitous in daily life, ensuring high performance with guaranteed safety is crucial. However, safety and performance could be competing objectives, which makes their co-optimization difficult. Learning-based methods, such as Constrained Reinforcement Learning (CRL), achieve strong performance but lack formal safety guarantees due to safety being enforced as soft constraints, limiting their use in safety-critical settings. Conversely, formal methods such as Hamilton-Jacobi (HJ) Reachability Analysis and Control Barrier Functions (CBFs) provide rigorous safety assurances but often neglect performance, resulting in overly conservative controllers. To bridge this gap, we formulate the co-optimization of safety and performance as a state-constrained optimal control problem, where performance objectives are encoded via a cost function and safety requirements are imposed as state constraints. We demonstrate that the resultant value function satisfies a Hamilton-Jacobi-Bellman (HJB) equation, which we approximate efficiently using a novel physics-informed machine learning framework. In addition, we introduce a conformal prediction-based verification strategy to quantify the learning errors, recovering a high-confidence safety value function, along with a probabilistic error bound on performance degradation. Through several case studies, we demonstrate the efficacy of the proposed framework in enabling scalable learning of safe and performant controllers for complex, high-dimensional autonomous systems.
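The conformal verification idea in the abstract above can be illustrated with a generic split-conformal quantile. This is only a minimal sketch and not the authors' procedure: the function name `conformal_quantile`, the synthetic `errors` array, and the choice of score (an absolute gap between a learned value and a reference value on held-out states) are assumptions made for illustration.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Split-conformal quantile: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    if k > n:
        # Too few calibration points to certify this confidence level.
        return np.inf
    return scores[k - 1]

# Hypothetical calibration scores: |learned value - reference value| on held-out states.
rng = np.random.default_rng(0)
errors = np.abs(rng.normal(0.0, 0.05, size=500))

# Error margin intended to hold with probability at least 1 - alpha.
delta = conformal_quantile(errors, alpha=0.05)
```

If the calibration scores and a fresh score are exchangeable, the fresh score is at most `delta` with probability at least `1 - alpha`. A margin of this kind is one way a learned value function could be tightened into a high-confidence estimate.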


Ten Challenging Problems in Federated Foundation Models

arXiv.org Artificial Intelligence

Federated Foundation Models (FedFMs) represent a distributed learning paradigm that fuses the general competences of foundation models with the privacy-preserving capabilities of federated learning. This combination allows the large foundation models and the small local domain models at the remote clients to learn from each other in a teacher-student learning setting. This paper provides a comprehensive summary of the ten challenging problems inherent in FedFMs, encompassing foundational theory, utilization of private data, continual learning, unlearning, Non-IID and graph data, bidirectional knowledge transfer, incentive mechanism design, game mechanism design, model watermarking, and efficiency. The ten challenging problems manifest in five pivotal aspects: "Foundational Theory," which aims to establish a coherent and unifying theoretical framework for FedFMs; "Data," addressing the difficulties in leveraging domain-specific knowledge from private data while maintaining privacy; "Heterogeneity," examining variations in data, model, and computational resources across clients; "Security and Privacy," focusing on defenses against malicious attacks and model theft; and "Efficiency," highlighting the need for improvements in training, communication, and parameter efficiency. For each problem, we offer a clear mathematical definition of the objective function, analyze existing methods, and discuss the key challenges and potential solutions. This in-depth exploration aims to advance the theoretical foundations of FedFMs, guide practical implementations, and inspire future research to overcome these obstacles, thereby enabling robust, efficient, and privacy-preserving FedFMs in various real-world applications.


ClimSim: Supplementary Information

Neural Information Processing Systems

Climate models divide the Earth's atmosphere, land surface, and ocean into a 3D grid, creating a discretized representation of the planet. Earth system models are made up of independent component models for the atmosphere, land surface, rivers, ocean, sea ice, and glaciers. When running as a fully coupled system the "component coupler" handles the flow of data between the components. Within each grid cell of the component models, a series of complex calculations are performed to account for various physical processes, such as phase changes of water, radiative heat transfer, and dynamic transport (referred to as "advection"). Each component model uses the discretized values of many quantities (such as temperature, humidity, and wind speed) as inputs to parameterizations and fluid solvers to output those same values for a future point in time. The atmosphere and ocean components are the most expensive pieces of an Earth system model, which is largely due to the computation and inter-process communication associated with their fluid dynamics solvers. Furthermore, a significant portion of the overall cost is attributed to the atmospheric physics calculations that are performed locally within each grid column. It is important to note that atmospheric physics serves as a major source of uncertainty in climate projections, primarily stemming from the challenges associated with accurately representing cloud and aerosol processes. Traditionally, global atmospheric models parameterize clouds and turbulence using crude, low-order models that attempt to represent the aggregate effects of these processes on larger scales. However, the complexity and nonlinearity of cloud and rainfall processes make them particularly challenging to represent accurately with parameterizations. The MMF approach replaces these conventional parameterizations with a cloud resolving model (CRM) in each cell of the global grid, so that cloud and turbulence can be explicitly represented. 
Each of these independent CRMs is spatially fixed and exchanges coupling tendencies with a parent global grid column. This novel approach to representing clouds and turbulence can improve various aspects of the simulated climate, such as rainfall patterns [2].


Review for NeurIPS paper: Discovering Symbolic Models from Deep Learning with Inductive Biases

Neural Information Processing Systems

The paper presents an exciting area of research. All reviewers agree that the paper makes novel contributions. The one weak point of the current submission is that the work is not properly contextualized with prior work. Further, as the authors said in their rebuttal, it would be good to see comparisons with other SR packages and an SR-only baseline.


Sample Complexity of Automated Mechanism Design

Neural Information Processing Systems

The design of revenue-maximizing combinatorial auctions, i.e., multi-item auctions over bundles of goods, is one of the most fundamental problems in computational economics, unsolved even for two bidders and two items for sale. In the traditional economic models, it is assumed that the bidders' valuations are drawn from an underlying distribution and that the auction designer has perfect knowledge of this distribution. Despite this strong and oftentimes unrealistic assumption, it is remarkable that the revenue-maximizing combinatorial auction remains unknown. In recent years, automated mechanism design has emerged as one of the most practical and promising approaches to designing high-revenue combinatorial auctions. The most scalable automated mechanism design algorithms take as input samples from the bidders' valuation distribution and then search for a high-revenue auction in a rich auction class.
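The sample-based search described above can be sketched on a deliberately small case. The example below is not from the paper: it assumes a single item, a hypothetical auction class made of second-price auctions indexed by a reserve price, and synthetic uniform valuations. The names `second_price_revenue` and `best_reserve` are illustrative.

```python
import numpy as np

def second_price_revenue(bids, reserve):
    """Revenue of a single-item second-price auction with a reserve price."""
    bids = np.sort(np.asarray(bids, dtype=float))[::-1]  # highest bid first
    if bids[0] < reserve:
        return 0.0  # no bidder meets the reserve, so the item is unsold
    runner_up = bids[1] if bids.size > 1 else 0.0
    # The winner pays the larger of the reserve and the second-highest bid.
    return float(max(reserve, runner_up))

def best_reserve(samples, candidates):
    """Return the candidate reserve with the highest average revenue on the samples."""
    avg = [np.mean([second_price_revenue(s, r) for s in samples]) for r in candidates]
    i = int(np.argmax(avg))
    return float(candidates[i]), float(avg[i])

# Synthetic valuation samples: 2000 draws for two bidders, uniform on [0, 1].
rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=(2000, 2))
candidates = np.linspace(0.0, 1.0, 21)

reserve, revenue = best_reserve(samples, candidates)
```

Real combinatorial auction classes are far richer than a single reserve price, but the structure is the same: average revenue over the samples stands in for expected revenue, and the search runs over the parameters of the class.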


Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior

Neural Information Processing Systems

Pre-trained machine learning (ML) models have shown great performance for a wide range of applications, in particular in natural language processing (NLP) and computer vision (CV). Here, we study how pre-training could be used for scientific machine learning (SciML) applications, specifically in the context of transfer learning. We study the transfer behavior of these models as (i) the pre-trained model size is scaled, (ii) the downstream training dataset size is scaled, and (iii) the physics parameters are systematically pushed out of distribution, and we examine (iv) how a single model pre-trained on a mixture of different physics problems can be adapted to various downstream applications. We find that--when fine-tuned appropriately--transfer learning can help reach desired accuracy levels with orders of magnitude fewer downstream examples (across different tasks that can even be out-of-distribution) than training from scratch, with consistent behavior across a wide range of downstream examples. We also find that fine-tuning these models yields more performance gains as model size increases, compared to training from scratch on new downstream tasks. These results hold for a broad range of PDE learning tasks. All in all, our results demonstrate the potential of the "pre-train and fine-tune" paradigm for SciML problems, demonstrating a path towards building SciML foundation models. Our code is available as open-source at [1].




PETAL: Physics Emulation Through Averaged Linearizations for Solving Inverse Problems

Neural Information Processing Systems

Inverse problems describe the task of recovering an underlying signal of interest given observables. Typically, the observables are related via some non-linear forward model applied to the underlying unknown signal. Inverting the non-linear forward model can be computationally expensive, as it often involves computing and inverting a linearization at a series of estimates. Rather than inverting the physics-based model, we instead train a surrogate forward model (emulator) and leverage modern auto-grad libraries to solve for the input within a classical optimization framework. Current methods train emulators in a black-box, supervised machine learning fashion and fail to take advantage of any existing knowledge of the forward model. In this article, we propose a simple learned weighted average model that embeds linearizations of the forward model around various reference points into the model itself, explicitly incorporating known physics. Grounding the learned model with physics-based linearizations improves the forward modeling accuracy and provides richer physics-based gradient information during the inversion process, leading to more accurate signal recovery. We demonstrate the efficacy on an ocean acoustic tomography (OAT) example that aims to recover ocean sound speed profile (SSP) variations from acoustic observations (e.g.


Domain Agnostic Fourier Neural Operators

Neural Information Processing Systems

Fourier neural operators (FNOs) can learn highly nonlinear mappings between function spaces, and have recently become a popular tool for learning responses of complex physical systems. However, to achieve good accuracy and efficiency, FNOs rely on the fast Fourier transform (FFT), which is restricted to modeling problems on rectangular domains. To lift this restriction and permit FFT on irregular geometries as well as topology changes, we introduce the domain agnostic Fourier neural operator (DAFNO), a novel neural operator architecture for learning surrogates with irregular geometries and evolving domains. The key idea is to incorporate a smoothed characteristic function in the integral layer architecture of FNOs, and leverage FFT to achieve rapid computations, in such a way that the geometric information is explicitly encoded in the architecture. In our empirical evaluation, DAFNO achieved state-of-the-art accuracy compared to baseline neural operator models on two benchmark datasets of material modeling and airfoil simulation. To further demonstrate the capability and generalizability of DAFNO in handling complex domains with topology changes, we consider a brittle material fracture evolution problem. With only one training crack simulation sample, DAFNO generalized to unseen loading scenarios and to crack patterns substantially different from the trained scenario.
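One way to picture how a characteristic function can enter an FFT-based integral layer is sketched below in NumPy. This is an assumption-laden illustration rather than the DAFNO architecture itself: the abstract does not give the integrand, so the form chi(x) * sum_y kappa(x - y) chi(y) (h(y) - h(x)), the Gaussian smoothing, and the names `smoothed_indicator` and `masked_spectral_conv` are all hypothetical choices. The grid is treated as periodic.

```python
import numpy as np

def smoothed_indicator(mask, width=1.0):
    """Smooth a 0/1 domain mask with a Gaussian filter applied in Fourier space."""
    kx = np.fft.fftfreq(mask.shape[0])[:, None]  # frequencies in cycles per sample
    ky = np.fft.fftfreq(mask.shape[1])[None, :]
    # Fourier transform of a Gaussian with standard deviation `width` (in grid cells).
    gauss = np.exp(-2.0 * (np.pi * width) ** 2 * (kx ** 2 + ky ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(mask) * gauss))

def masked_spectral_conv(h, chi, kernel_hat):
    """Evaluate chi(x) * sum_y kappa(x - y) chi(y) (h(y) - h(x)) with two FFT convolutions.

    `kernel_hat` is the 2D FFT of the kernel kappa on the same periodic grid.
    """
    conv_chi_h = np.fft.ifft2(kernel_hat * np.fft.fft2(chi * h))  # (kappa * (chi h))(x)
    conv_chi = np.fft.ifft2(kernel_hat * np.fft.fft2(chi))        # (kappa * chi)(x)
    return np.real(chi * (conv_chi_h - h * conv_chi))

# Hypothetical usage: a square domain embedded in a 32 x 32 periodic box.
rng = np.random.default_rng(0)
mask = np.zeros((32, 32))
mask[8:24, 8:24] = 1.0
chi = smoothed_indicator(mask, width=1.0)
kernel_hat = np.fft.fft2(rng.normal(size=(32, 32)))  # stand-in for learned weights
h = rng.normal(size=(32, 32))
out = masked_spectral_conv(h, chi, kernel_hat)
```

Because the mask multiplies both the input and the output, the computation stays on a rectangular grid where FFT applies, while points outside the domain contribute little or nothing to the result.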