Model-Based Reasoning
CausalMan: A physics-based simulator for large-scale causality
Tagliapietra, Nicholas, Luettin, Juergen, Halilaj, Lavdim, Willig, Moritz, Pychynski, Tim, Kersting, Kristian
A comprehensive understanding of causality is critical for navigating and operating within today's complex real-world systems. The absence of realistic causal models with known data generating processes complicates fair benchmarking. In this paper, we present the CausalMan simulator, modeled after a real-world production line. The simulator features a diverse range of linear and non-linear mechanisms and challenging-to-predict behaviors, such as discrete mode changes. We demonstrate the inadequacy of many state-of-the-art approaches and analyze the significant differences in their performance and tractability, both in terms of runtime and memory complexity. As a contribution, we will release the CausalMan large-scale simulator. We present two derived datasets, and perform an extensive evaluation of both.
Scientific Machine Learning of Flow Resistance Using Universal Shallow Water Equations with Differentiable Programming
Shallow water equations (SWEs) are the backbone of most hydrodynamics models for flood prediction, river engineering, and many other water resources applications. The estimation of flow resistance, i.e., the Manning's roughness coefficient $n$, is crucial for ensuring model accuracy, and has been previously determined using empirical formulas or tables. To better account for temporal and spatial variability in channel roughness, inverse modeling of $n$ using observed flow data is more reliable and adaptable; however, it is challenging when using traditional SWE solvers. Based on the concept of universal differential equation (UDE), which combines physics-based differential equations with neural networks (NNs), we developed a universal SWEs (USWEs) solver, Hydrograd, for hybrid hydrodynamics modeling. It can do accurate forward simulations, support automatic differentiation (AD) for gradient-based sensitivity analysis and parameter inversion, and perform scientific machine learning for physics discovery. In this work, we first validated the accuracy of its forward modeling, then applied a real-world case to demonstrate the ability of USWEs to capture model sensitivity (gradients) and perform inverse modeling of Manning's $n$. Furthermore, we used an NN to learn a universal relationship between $n$, hydraulic parameters, and flow in a real river channel. Unlike inverse modeling using surrogate models, Hydrograd uses a two-dimensional SWEs solver as its physics backbone, which eliminates the need for data-intensive pretraining and resolves the generalization problem when applied to out-of-sample scenarios. This differentiable modeling approach, with seamless integration with NNs, provides a new pathway for solving complex inverse problems and discovering new physics in hydrodynamics.
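As a toy illustration of gradient-based inversion of Manning's $n$ (this is not Hydrograd and not a two-dimensional SWE solver), the sketch below assumes steady uniform flow in a wide rectangular channel, where Manning's equation in SI units gives the unit discharge $q = h^{5/3} S^{1/2} / n$. The function names, the step size, and the choice to optimize $\log n$ so that $n$ stays positive are all assumptions made for illustration. The gradient is written by hand here; an AD framework would compute it automatically.

```python
import math

def unit_discharge(n, depth, slope):
    """Manning's equation (SI units), wide rectangular channel: q = h^(5/3) * sqrt(S) / n."""
    return depth ** (5.0 / 3.0) * math.sqrt(slope) / n

def grad_log_n(log_n, depths, slope, q_obs):
    """Derivative of the mean squared discharge error with respect to log(n)."""
    n = math.exp(log_n)
    total = 0.0
    for h, q in zip(depths, q_obs):
        q_model = unit_discharge(n, h, slope)
        # q_model is proportional to exp(-log_n), so d q_model / d log_n = -q_model
        total += 2.0 * (q_model - q) * (-q_model)
    return total / len(depths)

def invert_manning_n(depths, slope, q_obs, n0=0.05, lr=0.05, steps=200):
    """Plain gradient descent on log(n), starting from the initial guess n0."""
    log_n = math.log(n0)
    for _ in range(steps):
        log_n -= lr * grad_log_n(log_n, depths, slope, q_obs)
    return math.exp(log_n)

# Hypothetical synthetic "observations" generated with a known roughness of 0.03
depths = [0.5, 1.0, 1.5, 2.0]
q_obs = [unit_discharge(0.03, h, 0.001) for h in depths]
n_est = invert_manning_n(depths, 0.001, q_obs)
```

With noise-free synthetic data the estimate should return to the value used to generate it; real observations would add noise and would need the full solver in place of the closed-form discharge.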
A Physics-Informed Machine Learning Framework for Safe and Optimal Control of Autonomous Systems
Tayal, Manan, Singh, Aditya, Kolathaya, Shishir, Bansal, Somil
As autonomous systems become more ubiquitous in daily life, ensuring high performance with guaranteed safety is crucial. However, safety and performance could be competing objectives, which makes their co-optimization difficult. Learning-based methods, such as Constrained Reinforcement Learning (CRL), achieve strong performance but lack formal safety guarantees due to safety being enforced as soft constraints, limiting their use in safety-critical settings. Conversely, formal methods such as Hamilton-Jacobi (HJ) Reachability Analysis and Control Barrier Functions (CBFs) provide rigorous safety assurances but often neglect performance, resulting in overly conservative controllers. To bridge this gap, we formulate the co-optimization of safety and performance as a state-constrained optimal control problem, where performance objectives are encoded via a cost function and safety requirements are imposed as state constraints. We demonstrate that the resultant value function satisfies a Hamilton-Jacobi-Bellman (HJB) equation, which we approximate efficiently using a novel physics-informed machine learning framework. In addition, we introduce a conformal prediction-based verification strategy to quantify the learning errors, recovering a high-confidence safety value function, along with a probabilistic error bound on performance degradation. Through several case studies, we demonstrate the efficacy of the proposed framework in enabling scalable learning of safe and performant controllers for complex, high-dimensional autonomous systems.
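The conformal-prediction step mentioned above can be illustrated with generic split conformal prediction. This is a minimal sketch of the standard technique, not the authors' verification procedure: it assumes a list of calibration errors (for example, absolute differences between a learned and a reference value function at sampled states, a hypothetical setup) that are exchangeable with the error at a fresh sample. The function name is an assumption.

```python
import math

def conformal_error_bound(calibration_errors, alpha):
    """Split conformal bound: a fresh exchangeable error is <= the returned value
    with probability at least 1 - alpha."""
    n = len(calibration_errors)
    # Rank of the order statistic used by split conformal prediction
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration samples to certify this confidence level
        return math.inf
    return sorted(calibration_errors)[k - 1]

# Hypothetical calibration errors
errors = [3, 1, 4, 10, 5, 9, 2, 6, 8, 7]
bound_50 = conformal_error_bound(errors, alpha=0.5)
bound_90 = conformal_error_bound(errors, alpha=0.1)
```

The bound tightens as more calibration samples are collected, and it is infinite when the sample is too small for the requested confidence.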
Ten Challenging Problems in Federated Foundation Models
Fan, Tao, Gu, Hanlin, Cao, Xuemei, Chan, Chee Seng, Chen, Qian, Chen, Yiqiang, Feng, Yihui, Gu, Yang, Geng, Jiaxiang, Luo, Bing, Liu, Shuoling, Ong, Win Kent, Ren, Chao, Shao, Jiaqi, Sun, Chuan, Tang, Xiaoli, Tae, Hong Xi, Tong, Yongxin, Wei, Shuyue, Wu, Fan, Xi, Wei, Xu, Mingcong, Yang, He, Yang, Xin, Yan, Jiangpeng, Yu, Hao, Yu, Han, Zhang, Teng, Zhang, Yifei, Zhang, Xiaojin, Zheng, Zhenzhe, Fan, Lixin, Yang, Qiang
Federated Foundation Models (FedFMs) represent a distributed learning paradigm that fuses the general competences of foundation models with the privacy-preserving capabilities of federated learning. This combination allows the large foundation models and the small local domain models at the remote clients to learn from each other in a teacher-student learning setting. This paper provides a comprehensive summary of the ten challenging problems inherent in FedFMs, encompassing foundational theory, utilization of private data, continual learning, unlearning, Non-IID and graph data, bidirectional knowledge transfer, incentive mechanism design, game mechanism design, model watermarking, and efficiency. The ten challenging problems manifest in five pivotal aspects: ``Foundational Theory,'' which aims to establish a coherent and unifying theoretical framework for FedFMs; ``Data,'' addressing the difficulties in leveraging domain-specific knowledge from private data while maintaining privacy; ``Heterogeneity,'' examining variations in data, model, and computational resources across clients; ``Security and Privacy,'' focusing on defenses against malicious attacks and model theft; and ``Efficiency,'' highlighting the need for improvements in training, communication, and parameter efficiency. For each problem, we offer a clear mathematical definition of the objective function, analyze existing methods, and discuss the key challenges and potential solutions. This in-depth exploration aims to advance the theoretical foundations of FedFMs, guide practical implementations, and inspire future research to overcome these obstacles, thereby enabling robust, efficient, and privacy-preserving FedFMs in various real-world applications.
Review for NeurIPS paper: Discovering Symbolic Models from Deep Learning with Inductive Biases
The paper presents an exciting area of research. All reviewers agree that the paper makes novel contributions. The one weak point of the current submission is that the work is not properly contextualized with respect to prior work. Further, as the authors said in their rebuttal, it would be good to see comparisons with other SR packages and an SR-only baseline.
Sample Complexity of Automated Mechanism Design
The design of revenue-maximizing combinatorial auctions, i.e., multi-item auctions over bundles of goods, is one of the most fundamental problems in computational economics, unsolved even for two bidders and two items for sale. In the traditional economic models, it is assumed that the bidders' valuations are drawn from an underlying distribution and that the auction designer has perfect knowledge of this distribution. Despite this strong and oftentimes unrealistic assumption, it is remarkable that the revenue-maximizing combinatorial auction remains unknown. In recent years, automated mechanism design has emerged as one of the most practical and promising approaches to designing high-revenue combinatorial auctions. The most scalable automated mechanism design algorithms take as input samples from the bidders' valuation distribution and then search for a high-revenue auction in a rich auction class.
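The sample-then-search pattern described above can be sketched on a deliberately small auction class. The example below assumes single-item second-price auctions parameterized by a reserve price, which is far simpler than the combinatorial classes the abstract refers to; the function names and the sampled bids are hypothetical. It picks the reserve with the highest average revenue over the samples.

```python
def second_price_revenue(bids, reserve):
    """Revenue of a single-item second-price auction with a reserve price."""
    ordered = sorted(bids, reverse=True)
    if ordered[0] < reserve:
        return 0.0  # highest bid is below the reserve, so the item goes unsold
    runner_up = ordered[1] if len(ordered) > 1 else 0.0
    # The winner pays the larger of the second-highest bid and the reserve
    return max(runner_up, reserve)

def empirical_revenue(samples, reserve):
    """Average revenue of the auction with this reserve over the sampled bid profiles."""
    return sum(second_price_revenue(bids, reserve) for bids in samples) / len(samples)

def best_reserve(samples):
    """Search the class: try zero and every sampled value as a candidate reserve."""
    candidates = {0.0} | {value for bids in samples for value in bids}
    return max(sorted(candidates), key=lambda r: empirical_revenue(samples, r))

# Hypothetical samples: each inner list is one draw of the bidders' valuations
samples = [[10, 2], [8, 3], [9, 1], [4, 3]]
reserve = best_reserve(samples)
```

How well a reserve chosen this way performs on the true distribution depends on the number of samples, which is the question sample complexity analysis addresses.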
A physics-based data-driven model for CO$_2$ gas diffusion electrodes to drive automated laboratories
Grega, Ivan, Therrien, Félix, Soni, Abhishek, Ocean, Karry, Dettelbach, Kevan, Ahmadi, Ribwar, Mokhtari, Mehrdad, Berlinguette, Curtis P., Bengio, Yoshua
The electrochemical reduction of atmospheric CO$_2$ into high-energy molecules with renewable energy is a promising avenue for energy storage that can take advantage of existing infrastructure, especially in areas where sustainable alternatives to fossil fuels do not exist. Automated laboratories are currently being developed and used to optimize the composition and operating conditions of gas diffusion electrodes (GDEs), the devices in which this reaction takes place. Improving the efficiency of GDEs is crucial for this technology to become viable. Here we present a modeling framework to efficiently explore the high-dimensional parameter space of GDE designs in an active learning context. At the core of the framework is an uncertainty-aware physics model calibrated with experimental data. The model has the flexibility to capture various input parameter spaces and any carbon products which can be modeled with Tafel kinetics. It is interpretable, and a Gaussian process layer can capture deviations of real data from the function space of the physical model itself. We deploy the model in a simulated active learning setup with real electrochemical data gathered by the AdaCarbon automated laboratory and show that it can be used to efficiently traverse the multi-dimensional parameter space.
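For readers unfamiliar with Tafel kinetics, the sketch below shows the basic relation and a least-squares calibration of its two parameters. It is only an illustration of the Tafel equation $\eta = a + b \log_{10} j$, not the paper's GDE model: it works with magnitudes of overpotential and current density, ignores mass transport and the Gaussian process layer, and uses hypothetical function names and synthetic data.

```python
import math

def tafel_current(overpotential, j0, slope):
    """Tafel kinetics: j = j0 * 10^(eta / b), with b the Tafel slope in volts per decade."""
    return j0 * 10.0 ** (overpotential / slope)

def fit_tafel(overpotentials, currents):
    """Ordinary least-squares fit of eta = a + b * log10(j); returns (j0, b)."""
    xs = [math.log10(j) for j in currents]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(overpotentials) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, overpotentials))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # At eta = 0 the fitted line gives log10(j0) = -a / b
    return 10.0 ** (-a / b), b

# Hypothetical synthetic data generated from known parameters
etas = [0.1, 0.2, 0.3, 0.4]
currents = [tafel_current(eta, 1e-3, 0.12) for eta in etas]
j0_fit, slope_fit = fit_tafel(etas, currents)
```

On noise-free synthetic data the fit should recover the generating parameters; measured data would scatter around the line and carry uncertainty in both parameters.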
Review for NeurIPS paper: Discovering Symbolic Models from Deep Learning with Inductive Biases
Additional Feedback: 0. The notation in the method section, especially Section 2, needs to be specified, even if it is easy to infer from context. For example, L_v, v_i, v_j, etc. need to be explained. Further, in the case study sections, the descriptions are not clear; for example, the system should be explained mathematically from an n-body perspective, clearly denoting the particles as nodes at the GNN equation level for at least one case. The authors should discuss the intuitions behind their specific model decisions; for example, as this is a model discovery task, why haven't the authors used generative model frameworks? 2. The input/output dimensionality for Eureqa fitting should be explained in Section 3. For example, GNs have multiple layers; how does the proposed method fit equations for the edge/node functions at different layers and put them together? From the simulation dataset, the underlying model does not seem to need multiple layers for GNs. 3. The Hamiltonian Dynamics section is very hard to read, especially for a non-physics person; it would be helpful if the authors added a clear description of the input (like position and momentum) and output for the HGN. 4. What is the intuition behind the sum of pairwise and self terms for the HGN? Have the authors compared to a model without this assumption? 5. Does the Bottleneck model perform worse simply because it's a much smaller model than the other models with a large hidden size? 6. Line 170 states that "models are trained to predict acceleration given current state"; do the authors not account for time dependence?
Physically consistent predictive reduced-order modeling by enhancing Operator Inference with state constraints
Numerical simulations of complex multiphysics systems, such as char combustion considered herein, yield numerous state variables that inherently exhibit physical constraints. This paper presents a new approach to augment Operator Inference -- a methodology within scientific machine learning that enables learning from data a low-dimensional representation of a high-dimensional system governed by nonlinear partial differential equations -- by embedding such state constraints in the reduced-order model predictions. In the model learning process, we propose a new way to choose regularization hyperparameters based on a key performance indicator. Since embedding state constraints improves the stability of the Operator Inference reduced-order model, we compare the proposed state constraints-embedded Operator Inference with the standard Operator Inference and other stability-enhancing approaches. For an application to char combustion, we demonstrate that the proposed approach yields state predictions superior to the other methods regarding stability and accuracy. It extrapolates over 200\% past the training regime while being computationally efficient and physically consistent.
Generalized Lie Symmetries in Physics-Informed Neural Operators
Wang, Amy Xiang, Shumaylov, Zakhar, Zaika, Peter, Sherry, Ferdia, Schönlieb, Carola-Bibiane
Physics-informed neural operators (PINOs) have emerged as powerful tools for learning solution operators of partial differential equations (PDEs). Recent research has demonstrated that incorporating Lie point symmetry information can significantly enhance the training efficiency of PINOs, primarily through techniques like data, architecture, and loss augmentation. In this work, we focus on loss augmentation, highlighting that point symmetries oftentimes result in no training signal, limiting their effectiveness in many problems. To address this, we propose a novel loss augmentation strategy that leverages evolutionary representatives of point symmetries, a specific class of generalized symmetries of the underlying PDE. These generalized symmetries provide a richer set of generators compared to standard symmetries, leading to a more informative training signal. We demonstrate that leveraging evolutionary representatives enhances the performance of neural operators, resulting in improved data efficiency and accuracy during training.