Model-Based Reasoning
ChaosBench: A Multi-Channel, Physics-Based Benchmark for Subseasonal-to-Seasonal Climate Prediction Supplementary Material Juan Nathaniel
ChaosBench is published under the open-source GNU General Public License. Furthermore, we are committed to maintaining and preserving the ChaosBench benchmark. User feedback will be closely monitored via the GitHub issue tracker. Dataset: All our datasets, present and future (e.g., with more years, multi-resolution support, etc.), are [...] Here, we provide a detailed description of how to prepare the necessary data, perform training, and benchmark your own model. B.1 Data Preparation: First, navigate to the repository directory and install the necessary dependencies.
BaMANI: Bayesian Multi-Algorithm causal Network Inference
Latifizadeh, Habibolla, Pirkey, Anika C., Gould, Alanna, Klinke, David J. II
Improved computational power has enabled different disciplines to predict causal relationships among modeled variables using Bayesian network inference. While many alternative algorithms have been proposed to improve the efficiency and reliability of network prediction, the predicted causal networks reflect the generative process but also bear an opaque imprint of the specific computational algorithm used. Following a "wisdom of the crowds" strategy, we developed an ensemble learning approach to marginalize the impact of a single algorithm on Bayesian causal network inference. To introduce the approach, we first present the theoretical foundation of this framework. Next, we present a comprehensive implementation of the framework in terms of a new software tool called BaMANI (Bayesian Multi-Algorithm causal Network Inference). Finally, we describe a BaMANI use-case from biology, particularly within human breast cancer studies.
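The "wisdom of the crowds" idea can be illustrated with a simple arc-voting scheme. This is a minimal sketch under stated assumptions, not BaMANI's actual procedure: the function `consensus_edges`, the `min_votes` threshold, and the three example edge sets are all hypothetical, and each algorithm is assumed to return a set of directed (parent, child) arcs.

```python
from collections import Counter


def consensus_edges(edge_sets, min_votes):
    """Keep directed arcs proposed by at least `min_votes` of the algorithms.

    `edge_sets` is a list with one collection of (parent, child) arcs per
    algorithm. Plain majority-style voting is an assumption made for
    illustration, not the rule described by the source.
    """
    votes = Counter(edge for edges in edge_sets for edge in set(edges))
    return {edge for edge, count in votes.items() if count >= min_votes}


# Hypothetical outputs of three structure-learning algorithms
# over made-up variables A, B, C.
algo_a = {("A", "B"), ("B", "C")}
algo_b = {("A", "B"), ("C", "B")}
algo_c = {("A", "B"), ("B", "C"), ("A", "C")}

# Arcs supported by at least two of the three algorithms.
consensus = consensus_edges([algo_a, algo_b, algo_c], min_votes=2)
```

Arcs that only one algorithm proposes, such as the reversed arc from C to B, are dropped, which is one way to reduce the imprint of any single algorithm on the final network.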
Visual Perception Engine: Fast and Flexible Multi-Head Inference for Robotic Vision Tasks
Łucki, Jakub, Becktor, Jonathan, Georgakis, Georgios, Royce, Rob, Khattak, Shehryar
Deploying multiple machine learning models on resource-constrained robotic platforms for different perception tasks often results in redundant computations, large memory footprints, and complex integration challenges. In response, this work presents Visual Perception Engine (VPEngine), a modular framework designed to enable efficient GPU usage for visual multitasking while maintaining extensibility and developer accessibility. Our framework architecture leverages a shared foundation model backbone that extracts image representations, which are shared across multiple specialized task-specific model heads running in parallel without any unnecessary GPU-CPU memory transfers. This design eliminates the computational redundancy inherent in the feature extraction component when deploying traditional sequential models, while enabling dynamic task prioritization based on application demands. We demonstrate our framework's capabilities through an example implementation using DINOv2 as the foundation model with multiple task heads (depth, object detection, and semantic segmentation), achieving up to 3x speedup compared to sequential execution. Building on CUDA Multi-Process Service (MPS), VPEngine offers efficient GPU utilization and maintains a constant memory footprint while allowing per-task inference frequencies to be adjusted dynamically during runtime. The framework is written in Python and is open source with ROS2 C++ (Humble) bindings for ease of use by the robotics community across diverse robotic platforms. Our example implementation demonstrates end-to-end real-time performance at $\geq$50 Hz on NVIDIA Jetson Orin AGX for TensorRT optimized models.
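The shared-backbone pattern described above can be sketched in a few lines. This is a toy, CPU-only illustration and not VPEngine's API: the class `SharedBackbonePipeline`, the stand-in backbone, and the two head functions are hypothetical, and Python threads stand in for the parallel GPU processes that the framework runs via CUDA MPS.

```python
from concurrent.futures import ThreadPoolExecutor


class SharedBackbonePipeline:
    """Run one feature extractor per input and fan the result out to heads."""

    def __init__(self, backbone, heads):
        self.backbone = backbone      # callable: image -> features
        self.heads = heads            # dict: task name -> callable(features)
        self.backbone_calls = 0       # counts feature extractions

    def infer(self, image):
        # Features are computed once, regardless of the number of heads.
        features = self.backbone(image)
        self.backbone_calls += 1
        # Heads consume the same features concurrently.
        with ThreadPoolExecutor(max_workers=len(self.heads)) as pool:
            futures = {
                name: pool.submit(head, features)
                for name, head in self.heads.items()
            }
            return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for a foundation model and two task heads.
def toy_backbone(image):
    return {"mean": sum(image) / len(image), "max": max(image)}


toy_heads = {
    "depth": lambda feats: feats["mean"] * 2.0,
    "segmentation": lambda feats: feats["max"] > 4,
}

pipeline = SharedBackbonePipeline(toy_backbone, toy_heads)
outputs = pipeline.infer([1, 2, 3, 6])
```

Running the same two tasks as separate sequential models would call the feature extractor twice per image; here `backbone_calls` stays at one per input, which is the redundancy the abstract refers to.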
InfoCausalQA: Can Models Perform Non-explicit Causal Reasoning Based on Infographic?
Ka, Keummin, Park, Junhyeong, Jeon, Jaehyun, Yu, Youngjae
Recent advances in Vision-Language Models (VLMs) have demonstrated impressive capabilities in perception and reasoning. However, the ability to perform causal inference -- a core aspect of human cognition -- remains underexplored, particularly in multimodal settings. In this study, we introduce InfoCausalQA, a novel benchmark designed to evaluate causal reasoning grounded in infographics that combine structured visual data with textual context. The benchmark comprises two tasks: Task 1 focuses on quantitative causal reasoning based on inferred numerical trends, while Task 2 targets semantic causal reasoning involving five types of causal relations: cause, effect, intervention, counterfactual, and temporal. We manually collected 494 infographic-text pairs from four public sources and used GPT-4o to generate 1,482 high-quality multiple-choice QA pairs. These questions were then carefully revised by humans to ensure they cannot be answered based on surface-level cues alone but instead require genuine visual grounding. Our experimental results reveal that current VLMs exhibit limited capability in computational reasoning and even more pronounced limitations in semantic causal reasoning. Their significantly lower performance compared to humans indicates a substantial gap in leveraging infographic-based information for causal inference. Through InfoCausalQA, we highlight the need for advancing the causal reasoning abilities of multimodal AI systems.
Extracting Complex Topology from Multivariate Functional Approximation: Contours, Jacobi Sets, and Ridge-Valley Graphs
Ma, Guanqun, Lenz, David, Guo, Hanqi, Peterka, Tom, Wang, Bei
Implicit continuous models, such as functional models and implicit neural networks, are an increasingly popular method for replacing discrete data representations with continuous, high-order, and differentiable surrogates. These models offer new perspectives on the storage, transfer, and analysis of scientific data. In this paper, we introduce the first framework to directly extract complex topological features -- contours, Jacobi sets, and ridge-valley graphs -- from a type of continuous implicit model known as multivariate functional approximation (MFA). MFA replaces discrete data with continuous piecewise smooth functions. Given an MFA model as the input, our approach enables direct extraction of complex topological features from the model, without reverting to a discrete representation of the model. Our work is easily generalizable to any continuous implicit model that supports the queries of function values and high-order derivatives. Our work establishes the building blocks for performing topological data analysis and visualization on implicit continuous models.
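To make the role of derivative queries concrete, the sketch below locates Jacobi-set points of two bivariate functions, using the standard definition that the Jacobi set is where the two gradients are linearly dependent, i.e. where the determinant of the two gradient vectors vanishes. It is an illustration only and not the paper's extraction algorithm: the analytic gradients stand in for derivative queries on a continuous model, the probe grid and function names are hypothetical, and only sign changes along the x direction are examined.

```python
def jacobi_determinant(grad_f, grad_g, x, y):
    """det[grad f, grad g]; zero where the two gradients are parallel."""
    fx, fy = grad_f(x, y)
    gx, gy = grad_g(x, y)
    return fx * gy - fy * gx


def jacobi_crossings(grad_f, grad_g, xs, ys):
    """Find points where the determinant changes sign between x probes.

    Zeros are located by linear interpolation between neighboring probes.
    Zeros lying exactly on the last probe of a row are not reported.
    """
    points = []
    for y in ys:
        for x0, x1 in zip(xs, xs[1:]):
            d0 = jacobi_determinant(grad_f, grad_g, x0, y)
            d1 = jacobi_determinant(grad_f, grad_g, x1, y)
            if d0 == 0.0:
                points.append((x0, y))
            elif d0 * d1 < 0:
                t = d0 / (d0 - d1)  # fraction of the way from x0 to x1
                points.append((x0 + t * (x1 - x0), y))
    return points


# Example pair (an assumption for illustration):
# f(x, y) = x^2 + y^2 and g(x, y) = x + y, whose Jacobi set is the line y = x.
grad_f = lambda x, y: (2.0 * x, 2.0 * y)
grad_g = lambda x, y: (1.0, 1.0)

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [-0.75, -0.25, 0.25, 0.75]
crossings = jacobi_crossings(grad_f, grad_g, xs, ys)
```

Any model that can answer gradient queries could be substituted for `grad_f` and `grad_g`, which is the sense in which the approach generalizes beyond MFA.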
Time Marching Neural Operator FE Coupling: AI Accelerated Physics Modeling
Wang, Wei, Hakimzadeh, Maryam, Ruan, Haihui, Goswami, Somdatta
Numerical solvers for PDEs often struggle to balance computational cost with accuracy, especially in multiscale and time-dependent systems. Neural operators offer a promising way to accelerate simulations, but their practical deployment is hindered by several challenges: they typically require large volumes of training data generated from high-fidelity solvers, tend to accumulate errors over time in dynamical settings, and often exhibit poor generalization in multiphysics scenarios. This work introduces a novel hybrid framework that integrates a physics-informed deep operator network (DeepONet) with the finite element method (FEM) through domain decomposition and leverages numerical analysis for time marching. Our innovation lies in efficiently coupling FE and DeepONet subdomains via a Schwarz method, with the aim of solving complex and nonlinear regions with a pretrained DeepONet while the remainder is handled by conventional FE. To address the challenges of dynamic systems, we embed a time-stepping scheme directly into the DeepONet, substantially reducing long-term error propagation. Furthermore, an adaptive subdomain evolution strategy enables the ML-resolved region to expand dynamically, capturing fine-scale features without remeshing. Our framework shows accelerated convergence (up to 20% improvement in convergence rate compared to conventional FE coupling approaches) while preserving solution fidelity with error margins consistently below 3%. Our study shows that our proposed hybrid solver: (1) reduces computational costs by eliminating fine mesh requirements, (2) mitigates error accumulation in time-dependent simulations, and (3) enables automatic adaptation to evolving physical phenomena. This work establishes a new paradigm for coupling state-of-the-art physics-based and machine learning solvers in a unified framework, offering a robust, reliable, and scalable pathway for high-fidelity multiscale simulations.
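The Schwarz coupling idea can be shown on a one-dimensional toy problem. The sketch below is an assumption-laden illustration rather than the paper's solver: it applies the classical alternating Schwarz iteration to u'' = 0 on [0, 1] with u(0) = 0 and u(1) = 1, using two overlapping subdomains [0, b] and [a, 1]. Both subdomain solvers are exact closed-form stand-ins (in the hybrid framework one would be an FE solve and the other a DeepONet evaluation), and the overlap bounds `a`, `b` and all function names are hypothetical.

```python
def solve_left(right_value, b):
    """Exact solution of u'' = 0 on [0, b] with u(0) = 0, u(b) = right_value."""
    return lambda x: right_value * x / b


def solve_right(left_value, a):
    """Exact solution of u'' = 0 on [a, 1] with u(a) = left_value, u(1) = 1."""
    return lambda x: left_value + (1.0 - left_value) * (x - a) / (1.0 - a)


def alternating_schwarz(a=0.4, b=0.6, iterations=30):
    """Exchange interface values between overlapping subdomains until they agree."""
    interface_b = 0.0  # initial guess for u(b)
    history = []
    for _ in range(iterations):
        # Left solve uses the latest value from the right subdomain at x = b.
        u_left = solve_left(interface_b, b)
        # Right solve uses the freshly computed left solution at x = a.
        u_right = solve_right(u_left(a), a)
        interface_b = u_right(b)
        # The exact global solution is u(x) = x, so the error at b is known.
        history.append(abs(interface_b - b))
    return u_left, u_right, history


u_left, u_right, history = alternating_schwarz()
```

For this toy problem each sweep shrinks the interface error by the factor (a/b)·((1−b)/(1−a)), so a wider overlap converges faster at the cost of more duplicated work.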
MMET: A Multi-Input and Multi-Scale Transformer for Efficient PDEs Solving
Luo, Yichen, Wang, Jia, Lan, Dapeng, Liu, Yu, Pang, Zhibo
Partial Differential Equations (PDEs) are fundamental for modeling physical systems, yet solving them in a generic and efficient manner using machine learning-based approaches remains challenging due to limited multi-input and multi-scale generalization capabilities, as well as high computational costs. This paper proposes the Multi-input and Multi-scale Efficient Transformer (MMET), a novel framework designed to address the above challenges. MMET decouples mesh and query points as two sequences and feeds them into the encoder and decoder, respectively, and uses a Gated Condition Embedding (GCE) layer to embed input variables or functions with varying dimensions, enabling effective solutions for multi-scale and multi-input problems. Additionally, a Hilbert curve-based reserialization and patch embedding mechanism decreases the input length, which significantly reduces the computational cost when dealing with large-scale geometric models. These innovations enable efficient representations and support multi-scale resolution queries for large-scale and multi-input PDE problems. Experimental evaluations on diverse benchmarks spanning different physical fields demonstrate that MMET outperforms SOTA methods in both accuracy and computational efficiency. This work highlights the potential of MMET as a robust and scalable solution for real-time PDE solving in engineering and physics-based applications, paving the way for future explorations into pre-trained large-scale models in specific domains. This work is open-sourced at https://github.com/YichenLuo-0/MMET.
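The effect of Hilbert-curve serialization followed by patching can be sketched as follows. This is a simplified illustration, not MMET's implementation: it assumes points already snapped to an n x n integer grid with n a power of two, uses the standard iterative coordinate-to-index conversion for the Hilbert curve, and replaces the learned patch embedding with plain mean pooling. The function names are hypothetical.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n x n grid (n a power of two) to its curve index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def serialize_and_patch(cells, values, n, patch_size):
    """Order values along the Hilbert curve, then pool consecutive groups.

    Mean pooling is a stand-in for a learned patch embedding.
    """
    order = sorted(range(len(cells)), key=lambda i: hilbert_index(n, *cells[i]))
    ordered = [values[i] for i in order]
    patches = [ordered[i:i + patch_size] for i in range(0, len(ordered), patch_size)]
    return [sum(patch) / len(patch) for patch in patches]


# Four cells of a 2 x 2 grid given in row-major order, with made-up values.
cells = [(0, 0), (1, 0), (0, 1), (1, 1)]
values = [0.0, 3.0, 1.0, 2.0]
tokens = serialize_and_patch(cells, values, n=2, patch_size=2)
```

Because consecutive indices on the Hilbert curve are spatial neighbors, each patch groups nearby points, and a patch size of p shortens the sequence fed to attention by roughly a factor of p.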