subdomain
Supplementary Materials: A composable machine-learning approach for steady-state simulations on high-resolution grids
Finally, we expand on the computational performance of CoMLSim in Section E and provide details of reproducibility in Section F. In this section, we will provide details about the typical network architectures used in CoMLSim, followed by the training mechanics. CNN-based encoders and decoders are employed here to achieve this compression because subdomains consist of structured data representations. In the encoder network, we use a series of convolution and max-pooling layers to extract global features from the solution. If the PDE conditions are uniform, the magnitude can simply be considered as an encoding for a given subdomain. Since latent vectors don't have a spatial representation, DNN-based encoders and decoders are employed to compress them. The domain is discretized into a finite number of computational elements, using techniques such as the Finite Difference Method (FDM), Finite Volume Method (FVM), and Finite Element Method (FEM). Similar to traditional PDE solvers, the first step in CoMLSim is to decompose the computational domain into smaller subdomains.
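As a rough illustration of this two-part compression, the sketch below pairs a CNN encoder for the structured subdomain solution with a small DNN encoder for the PDE conditions. It assumes PyTorch, and the layer counts, grid size (32x32), and latent dimensions are hypothetical choices, not the architecture described in the paper; matching decoders would mirror these networks (e.g., with transposed convolutions).

```python
# Minimal sketch of CoMLSim-style subdomain compression (illustrative, not the
# paper's exact architecture). Assumes PyTorch; shapes and sizes are hypothetical.
import torch
import torch.nn as nn

class SolutionEncoder(nn.Module):
    """CNN encoder: compresses a structured subdomain solution field to a latent vector."""
    def __init__(self, channels=1, latent_dim=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(channels, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),        # 16 -> 8
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),       # 8 -> 4
        )
        self.to_latent = nn.Linear(32 * 4 * 4, latent_dim)

    def forward(self, u):                       # u: (batch, channels, 32, 32)
        return self.to_latent(self.features(u).flatten(1))

class ConditionEncoder(nn.Module):
    """DNN encoder: PDE conditions have no spatial layout, so an MLP suffices."""
    def __init__(self, cond_dim=4, latent_dim=8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(cond_dim, 32), nn.ReLU(),
                                 nn.Linear(32, latent_dim))

    def forward(self, c):                       # c: (batch, cond_dim)
        return self.net(c)

u = torch.randn(2, 1, 32, 32)                   # two subdomain solution fields
c = torch.randn(2, 4)                           # their (non-uniform) PDE conditions
z_u, z_c = SolutionEncoder()(u), ConditionEncoder()(c)
print(z_u.shape, z_c.shape)                     # torch.Size([2, 16]) torch.Size([2, 8])
```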
Domain-Decomposed Graph Neural Network Surrogate Modeling for Ice Sheets
Propp, Adrienne M., Perego, Mauro, Cyr, Eric C., Gruber, Anthony, Howard, Amanda A., Heinlein, Alexander, Stinis, Panos, Tartakovsky, Daniel M.
Accurate yet efficient surrogate models are essential for large-scale simulations of partial differential equations (PDEs), particularly for uncertainty quantification (UQ) tasks that demand hundreds or thousands of evaluations. We develop a physics-inspired graph neural network (GNN) surrogate that operates directly on unstructured meshes and leverages the flexibility of graph attention. To improve both training efficiency and generalization properties of the model, we introduce a domain decomposition (DD) strategy that partitions the mesh into subdomains, trains local GNN surrogates in parallel, and aggregates their predictions. We then employ transfer learning to fine-tune models across subdomains, accelerating training and improving accuracy in data-limited settings. Applied to ice sheet simulations, our approach accurately predicts full-field velocities on high-resolution meshes, substantially reduces training time relative to training a single global surrogate model, and provides a ripe foundation for UQ objectives. Our results demonstrate that graph-based DD, combined with transfer learning, provides a scalable and reliable pathway for training GNN surrogates on massive PDE-governed systems, with broad potential for application beyond ice sheet dynamics.
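The sketch below illustrates the partition-train-aggregate workflow under simplifying assumptions: a synthetic 2D point cloud stands in for an ice sheet mesh, a coordinate split replaces a proper graph partitioner, and a small MLP replaces the paper's graph attention network. Overlapping predictions are averaged to assemble the full field.

```python
# Hedged sketch of the domain-decomposition workflow: partition mesh nodes,
# train one local surrogate per subdomain (a placeholder MLP stands in for the
# paper's graph attention network), then aggregate overlapping predictions.
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
coords = rng.uniform(size=(2000, 2))            # hypothetical mesh node coordinates
velocity = np.sin(coords @ np.array([3.0, 1.0]))[:, None]  # synthetic target field

# Partition by x-coordinate into two overlapping subdomains (real meshes would
# use a graph partitioner); the overlap lets predictions be blended at the seam.
masks = [coords[:, 0] < 0.55, coords[:, 0] > 0.45]

preds, counts = np.zeros_like(velocity), np.zeros((len(velocity), 1))
for mask in masks:                              # local models could train in parallel
    x = torch.tensor(coords[mask], dtype=torch.float32)
    y = torch.tensor(velocity[mask], dtype=torch.float32)
    model = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(500):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    preds[mask] += model(x).detach().numpy()
    counts[mask] += 1

full_field = preds / counts                     # average where subdomains overlap
print("RMSE:", float(np.sqrt(np.mean((full_field - velocity) ** 2))))
```

The paper's transfer-learning step would initialize each local model from an already-trained neighboring subdomain rather than from scratch.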
WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance
Liu, Genglin, Geng, Shijie, Li, Sha, Cui, Hejie, Zhang, Sarah, Liu, Xin, Liu, Tianyi
Multimodal LLM-powered agents have recently demonstrated impressive capabilities in web navigation, enabling agents to complete complex browsing tasks across diverse domains. However, current agents struggle with repetitive errors and lack the ability to learn from past experiences across sessions, limiting their long-term robustness and sample efficiency. We introduce WebCoach, a model-agnostic self-evolving framework that equips web browsing agents with persistent cross-session memory, enabling improved long-term planning, reflection, and continual learning without retraining. WebCoach consists of three key components: (1) a WebCondenser, which standardizes raw navigation logs into concise summaries; (2) an External Memory Store, which organizes complete trajectories as episodic experiences; and (3) a Coach, which retrieves relevant experiences based on similarity and recency, and decides whether to inject task-specific advice into the agent via runtime hooks. This design empowers web agents to access long-term memory beyond their native context window, improving robustness in complex browsing tasks. Moreover, WebCoach achieves self-evolution by continuously curating episodic memory from new navigation trajectories, enabling agents to improve over time without retraining. Evaluations on the WebVoyager benchmark demonstrate that WebCoach consistently improves the performance of browser-use agents across three different LLM backbones. With a 38B model, it increases task success rates from 47% to 61% while reducing or maintaining the average number of steps. Notably, smaller base models with WebCoach achieve performance comparable to the same web agent using GPT-4o.
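A toy sketch of the three components is given below; the class names, bag-of-words similarity, and exponential recency decay are illustrative assumptions standing in for WebCoach's actual embeddings and LLM-backed Condenser.

```python
# Illustrative sketch of WebCoach's three components, with toy stand-ins:
# bag-of-words cosine replaces the real embedding similarity, and the
# condenser/advice logic is a placeholder for LLM calls. Names are hypothetical.
import math, time
from collections import Counter
from dataclasses import dataclass, field

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

@dataclass
class Episode:                                  # one condensed navigation trajectory
    summary: str                                # produced by the Condenser
    timestamp: float
    outcome: str                                # "success" or a failure reason

@dataclass
class MemoryStore:                              # episodic cross-session memory
    episodes: list = field(default_factory=list)
    def add(self, ep: Episode): self.episodes.append(ep)

class Coach:
    def __init__(self, store, half_life_s=86_400.0):
        self.store, self.half_life = store, half_life_s
    def retrieve(self, task: str, k=3, threshold=0.1):
        q, now, scored = Counter(task.lower().split()), time.time(), []
        for ep in self.store.episodes:
            sim = cosine(q, Counter(ep.summary.lower().split()))
            recency = 0.5 ** ((now - ep.timestamp) / self.half_life)
            scored.append((sim * recency, ep))
        top = sorted(scored, key=lambda s: -s[0])[:k]
        # Inject advice only when a sufficiently relevant experience exists.
        return [ep for score, ep in top if score > threshold]

store = MemoryStore()
store.add(Episode("booked flight on airline site; date picker needs ISO format",
                  time.time() - 3600, "success"))
print(Coach(store).retrieve("book a flight for next Tuesday"))
```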
Domain decomposition architectures and Gauss-Newton training for physics-informed neural networks
Heinlein, Alexander, Kapoor, Taniya
Approximating the solutions of boundary value problems governed by partial differential equations with neural networks is challenging, largely due to the difficult training process. This difficulty can be partly explained by the spectral bias, that is, the slower convergence of high-frequency components, and can be mitigated by localizing neural networks via (overlapping) domain decomposition. We combine this localization with the Gauss-Newton method as the optimizer to obtain faster convergence than gradient-based schemes such as Adam; this comes at the cost of solving an ill-conditioned linear system in each iteration. Domain decomposition induces a block-sparse structure in the otherwise dense Gauss-Newton system, reducing the computational cost per iteration. Our numerical results indicate that combining localization and Gauss-Newton optimization is promising for neural network-based solvers for partial differential equations.
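The sketch below runs damped Gauss-Newton on a tiny one-dimensional PINN for u'' = f with homogeneous Dirichlet conditions; the finite-difference Jacobian, network size, and damping value are simplifications for illustration. In a domain-decomposed model, parameters couple only to residuals within their own subdomain, so J^T J becomes block-sparse and the per-iteration solve gets cheaper, as the abstract notes.

```python
# Hedged sketch of Gauss-Newton training for a tiny PINN on u'' = f, u(0)=u(1)=0
# (exact solution sin(pi x)). The Jacobian is built by finite differences for
# brevity; damping (lam) stabilizes the ill-conditioned linear solve.
import numpy as np

xs = np.linspace(0.0, 1.0, 40)                  # collocation points
f = -np.pi**2 * np.sin(np.pi * xs)              # source term

def unpack(theta, m=8):
    return theta[:m], theta[m:2*m], theta[2*m:]  # weights, biases, coefficients

def residual(theta):
    w, b, c = unpack(theta)
    s = np.tanh(np.outer(xs, w) + b)            # (n_pts, m)
    u_xx = (s * (s**2 - 1) * 2) @ (c * w**2)    # d2/dx2 tanh(wx+b) = 2 w^2 t (t^2 - 1)
    u0 = np.tanh(0 * w + b) @ c                 # boundary values u(0), u(1)
    u1 = np.tanh(w + b) @ c
    return np.concatenate([u_xx - f, [10 * u0, 10 * u1]])   # weighted BC residuals

def gauss_newton(theta, iters=50, lam=1e-3, eps=1e-6):
    for _ in range(iters):
        r = residual(theta)
        J = np.stack([(residual(theta + eps * e) - r) / eps  # finite-difference Jacobian
                      for e in np.eye(len(theta))], axis=1)
        delta = np.linalg.solve(J.T @ J + lam * np.eye(len(theta)), -J.T @ r)
        theta = theta + delta
    return theta

rng = np.random.default_rng(1)
theta = gauss_newton(rng.normal(scale=0.5, size=3 * 8))
print("max |r|:", np.abs(residual(theta)).max())
```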
BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains
Devane, Vijay, Nauman, Mohd, Patel, Bhargav, Wakchoure, Aniket Mahendra, Sant, Yogeshkumar, Pawar, Shyam, Thakur, Viraj, Godse, Ananya, Patra, Sunil, Maurya, Neha, Racha, Suraj, Singh, Nitish Kamal, Nagpal, Ajay, Sawarkar, Piyush, Pundalik, Kundeshwar Vijayrao, Saluja, Rohit, Ramakrishnan, Ganesh
The rapid advancement of large language models (LLMs) has intensified the need for domain- and culture-specific evaluation. Existing benchmarks are largely Anglocentric and domain-agnostic, limiting their applicability to India-centric contexts. To address this gap, we introduce BhashaBench V1, the first domain-specific, multi-task, bilingual benchmark focusing on critical Indic knowledge systems. BhashaBench V1 contains 74,166 meticulously curated question-answer pairs, with 52,494 in English and 21,672 in Hindi, sourced from authentic government and domain-specific exams. It spans four major domains: Agriculture, Legal, Finance, and Ayurveda, comprising 90+ subdomains and covering 500+ topics, enabling fine-grained evaluation. Evaluation of 29+ LLMs reveals significant domain- and language-specific performance gaps, with especially large disparities in low-resource domains. For instance, GPT-4o achieves 76.49% overall accuracy in Legal but only 59.74% in Ayurveda. Models consistently perform better on English content than on Hindi across all domains. Subdomain-level analysis shows that areas such as Cyber Law and International Finance perform relatively well, while Panchakarma, Seed Science, and Human Rights remain notably weak. BhashaBench V1 provides a comprehensive dataset for evaluating large language models across India's diverse knowledge domains. It enables assessment of models' ability to integrate domain-specific knowledge with bilingual understanding. All code, benchmarks, and resources are publicly available to support open research.
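As a sketch of the fine-grained evaluation this enables, the snippet below tallies accuracy per (domain, language) cell; the record layout and the toy records are assumptions, not BhashaBench's actual schema.

```python
# Hypothetical sketch of a per-domain / per-language accuracy breakdown; the
# record fields and sample rows are illustrative, not BhashaBench's schema.
from collections import defaultdict

records = [  # (domain, language, model_answer, gold_answer)
    ("Legal", "en", "B", "B"), ("Legal", "hi", "C", "A"),
    ("Ayurveda", "en", "A", "A"), ("Ayurveda", "hi", "D", "B"),
]

tally = defaultdict(lambda: [0, 0])             # (domain, lang) -> [correct, total]
for domain, lang, pred, gold in records:
    tally[(domain, lang)][0] += pred == gold
    tally[(domain, lang)][1] += 1

for (domain, lang), (correct, total) in sorted(tally.items()):
    print(f"{domain:10s} {lang}  {100 * correct / total:5.1f}%  (n={total})")
```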
AstroMMBench: A Benchmark for Evaluating Multimodal Large Language Models Capabilities in Astronomy
Shi, Jinghang, Tang, Xiaoyu, Huang, Yang, Li, Yuyang, Kong, Xiao, Zhang, Yanxia, Yue, Caizhan
Astronomical image interpretation presents a significant challenge for applying multimodal large language models (MLLMs) to specialized scientific tasks. Existing benchmarks focus on general multimodal capabilities but fail to capture the complexity of astronomical data. To bridge this gap, we introduce AstroMMBench, the first comprehensive benchmark designed to evaluate MLLMs in astronomical image understanding. AstroMMBench comprises 621 multiple-choice questions across six astrophysical subfields, curated and reviewed by 15 domain experts for quality and relevance. We conducted an extensive evaluation of 25 diverse MLLMs, including 22 open-source and 3 closed-source models, using AstroMMBench. The results show that Ovis2-34B achieved the highest overall accuracy (70.5%), demonstrating leading capabilities even compared to strong closed-source models. Performance varied across the six astrophysical subfields: domains such as cosmology and high-energy astrophysics proved particularly challenging, while models performed relatively better in others, such as instrumentation and solar astrophysics. These findings underscore the vital role of domain-specific benchmarks like AstroMMBench in critically evaluating MLLM performance and guiding their targeted development for scientific applications. AstroMMBench provides a foundational resource and a dynamic tool to catalyze advancements at the intersection of AI and astronomy.
O-Forge: An LLM + Computer Algebra Framework for Asymptotic Analysis
Large language models have recently demonstrated advanced capabilities in solving IMO and Putnam problems; yet their role in research mathematics has remained fairly limited. The key difficulty is verification: suggested proofs may look plausible, but cannot be trusted without rigorous checking. We present a framework, called LLM+CAS, and an associated tool, O-Forge, that couples frontier LLMs with a computer algebra system (CAS) in an In-Context Symbolic Feedback loop to produce proofs that are both creative and symbolically verified. Our focus is on asymptotic inequalities, a topic that often involves difficult proofs and appropriate decomposition of the domain into the "right" subdomains. Many mathematicians, including Terry Tao, have suggested that using AI tools to find the right decompositions can be very useful for research-level asymptotic analysis. In this paper, we show that our framework LLM+CAS turns out to be remarkably effective at proposing such decompositions via a combination of a frontier LLM and a CAS. More precisely, we use an LLM to suggest domain decompositions, and a CAS (such as Mathematica) to verify each piece axiomatically. Using this loop, we answer a question posed by Terence Tao: whether LLMs coupled with a verifier can be used to help prove intricate asymptotic inequalities. More broadly, we show how AI can move beyond contest math towards research-level tools for professional mathematicians.
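The snippet below sketches the verification half of such a loop on a toy inequality, with SymPy standing in for the paper's CAS (Mathematica) and the decomposition hard-coded where the LLM would propose it.

```python
# Illustrative sketch: prove x**2 <= 1 + x**3 for x >= 0 by splitting the domain
# at x = 1 and verifying each piece with a CAS (SymPy here; the paper uses
# Mathematica). The hard-coded subdomains stand in for the LLM's proposal.
import sympy as sp
from sympy.calculus.util import minimum

x = sp.Symbol("x", real=True)
lhs, rhs = x**2, 1 + x**3

# The "LLM step" (hard-coded here): decompose [0, oo) into the right subdomains.
subdomains = [sp.Interval(0, 1), sp.Interval(1, sp.oo)]

# The "CAS step": on each piece, verify rhs - lhs >= 0 by minimizing it.
for dom in subdomains:
    m = minimum(rhs - lhs, x, dom)
    assert m >= 0, f"inequality fails on {dom}"
    print(f"on {dom}: min(rhs - lhs) = {m} >= 0")

print("verified: x**2 <= 1 + x**3 for all x >= 0")
```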
AB-PINNs: Adaptive-Basis Physics-Informed Neural Networks for Residual-Driven Domain Decomposition
Botvinick-Greenhouse, Jonah, Ali, Wael H., Benosman, Mouhacine, Mowlavi, Saviz
We introduce adaptive-basis physics-informed neural networks (AB-PINNs), a novel approach to domain decomposition for training PINNs in which existing subdomains dynamically adapt to the intrinsic features of the unknown solution. Drawing inspiration from classical mesh refinement techniques, we also modify the domain decomposition on-the-fly throughout training by introducing new subdomains in regions of high residual loss, thereby providing additional expressive power where the solution of the differential equation is challenging to represent. Our flexible approach to domain decomposition is well-suited for multiscale problems, as different subdomains can learn to capture different scales of the underlying solution. Moreover, the ability to introduce new subdomains during training helps prevent convergence to unwanted local minima and can reduce the need for extensive hyperparameter tuning compared to static domain decomposition approaches. Throughout, we present comprehensive numerical results which demonstrate the effectiveness of AB-PINNs at solving a variety of complex multiscale partial differential equations.
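The snippet below sketches the residual-driven refinement idea in one dimension: when the residual stays high in some region, a new, narrower subdomain window is added there. The Gaussian windows, threshold, and partition-of-unity blending are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal numpy sketch of residual-driven subdomain insertion: spawn a new
# subdomain window (which would gate a fresh basis network) where the PDE
# residual peaks. Window form and threshold are illustrative assumptions.
import numpy as np

def gaussian_window(xs, center, width):
    return np.exp(-((xs - center) / width) ** 2)

xs = np.linspace(0.0, 1.0, 200)                 # collocation points
subdomains = [(0.25, 0.3), (0.75, 0.3)]         # (center, width) of initial windows

# Stand-in residual: a sharp feature near x = 0.5 that the initial windows miss.
residual = 0.05 + np.exp(-((xs - 0.5) / 0.02) ** 2)

threshold = 0.5
if residual.max() > threshold:
    worst = xs[np.argmax(residual)]
    subdomains.append((worst, 0.05))            # new, narrower subdomain there
    print(f"added subdomain at x = {worst:.3f}")

# Partition-of-unity style blending: each subdomain's basis network would be
# weighted by its normalized window when assembling the global solution.
windows = np.stack([gaussian_window(xs, c, w) for c, w in subdomains])
weights = windows / windows.sum(axis=0)
print("weights sum to 1:", np.allclose(weights.sum(axis=0), 1.0))
```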