 subdomain


Clip-and-Verify: Linear Constraint-Driven Domain Clipping for Accelerating Neural Network Verification

Neural Information Processing Systems

State-of-the-art neural network (NN) verifiers demonstrate that applying the branch-and-bound (BaB) procedure with fast bounding techniques plays a key role in tackling many challenging verification properties. In this work, we introduce the linear constraint-driven clipping framework, a class of scalable and efficient methods designed to enhance the efficacy of NN verifiers. Under this framework, we develop two novel algorithms that efficiently utilize linear constraints to 1) reduce portions of the input space that are either verified or irrelevant to a subproblem in the context of branch-and-bound, and 2) directly improve intermediate bounds throughout the network. The process novelly leverages linear constraints that often arise from bound propagation methods and is general enough to also incorporate constraints from other sources. It efficiently handles linear constraints using a specialized GPU procedure that can scale to large neural networks without the use of expensive external solvers. Our verification procedure, Clip-and-Verify, consistently tightens bounds across multiple benchmarks and can significantly reduce the number of subproblems handled during BaB. We show that our clipping algorithms can be integrated with BaB-based verifiers such as α,β-CROWN, utilizing either the split constraints in activation-space BaB or the output constraints that denote the unverified input space. We demonstrate the effectiveness of our procedure on a broad range of benchmarks where, in some instances, we witness a 96% reduction in the number of subproblems during branch-and-bound, and also achieve state-of-the-art verified accuracy across multiple benchmarks. Clip-and-Verify is part of the α,β-CROWN verifier, the VNN-COMP 2025 winner.
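To make the idea of shrinking an input region with a linear constraint concrete, here is a minimal sketch of standard interval tightening for a box under a single constraint a·x + b <= 0. It is not the paper's GPU procedure; the function name `clip_box` and its signature are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

def clip_box(
    lower: List[float], upper: List[float], a: List[float], b: float
) -> Optional[Tuple[List[float], List[float]]]:
    """Tighten the box [lower, upper] using the constraint sum(a[i]*x[i]) + b <= 0.

    Hypothetical helper: returns the clipped box, or None when no point of
    the box can satisfy the constraint.
    """
    n = len(a)
    # Smallest value each term a[i]*x[i] can take inside the box.
    min_terms = [min(a[i] * lower[i], a[i] * upper[i]) for i in range(n)]
    total_min = sum(min_terms)
    if total_min + b > 0:
        return None  # the constraint excludes the whole box

    new_lower, new_upper = list(lower), list(upper)
    for i in range(n):
        if a[i] == 0:
            continue  # this coordinate does not appear in the constraint
        # With all other terms at their minimum: a[i]*x[i] <= -b - rest.
        rest = total_min - min_terms[i]
        bound = (-b - rest) / a[i]
        if a[i] > 0:
            new_upper[i] = min(upper[i], bound)
        else:
            new_lower[i] = max(lower[i], bound)
    return new_lower, new_upper

# Unit square clipped by x0 + x1 - 0.5 <= 0.
clipped = clip_box([0.0, 0.0], [1.0, 1.0], [1.0, 1.0], -0.5)
```

In this example both upper bounds shrink from 1.0 to 0.5, because neither coordinate can exceed 0.5 while the other stays nonnegative.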


caSub Pair xt .

Neural Information Processing Systems

Omit references to the index or number of the sub-images, such as (xx), left, right, etc. 3. There might be a common prefix or suffix caption shared among all sub-images at the beginning, end, or within the caption. Please incorporate the prefix or suffix into each sub-image's caption. If one subcaption contains context for multiple other subcaptions, add that context to each of the relevant subcaptions. 4. The final output should be in JSON format, with an outer field 'subcaptions', with a value that is a list of 'subfigure' and 'subcaption' dictionaries. 5. If a subfigure contains more nested figures, i.e. subfigure (A) contains references to (left) and (right), add a field called "location" that stores the "left" or "right". 6. If there are no references to sub-images, give a single subcaption with label "A". User Prompt: You are a research paper processor which splits the captions of figures into sub-captions that correspond with subfigures. System Prompt: "(a) H&E image of a breast tumor tissue. Fluorescently labeled markers superimposed as green color on the H&E image, (b) \u03b2-catenin, (c) pan-keratin, and (d) smooth muscle \u03b1-actin, markers.":{"subcaptions":


WritingBench: A Comprehensive Benchmark for Generative Writing

Neural Information Processing Systems

Recent advancements in large language models (LLMs) have significantly enhanced text generation capabilities, yet evaluating their performance in generative writing remains a challenge. Existing benchmarks primarily focus on generic text generation or are limited in their writing tasks, failing to capture the diverse requirements of high-quality written content across various domains. To bridge this gap, we present WritingBench, a comprehensive benchmark designed to evaluate LLMs across 6 core writing domains and 100 subdomains. We further propose a query-dependent evaluation framework that empowers LLMs to dynamically generate instance-specific assessment criteria. This framework is complemented by a fine-tuned critic model for criteria-aware scoring, enabling evaluations in style, format and length. The framework's validity is further demonstrated by its data curation capability, which enables a 7B-parameter model to outperform GPT-4o in writing. We open-source the benchmark, along with evaluation tools and modular framework components, to advance the development of LLMs in writing.



paper-oras-neurips

Neural Information Processing Systems

Domain decomposition methods are widely used and effective in the approximation of solutions to partial differential equations. Yet the optimal construction of these methods requires tedious analysis and is often available only in simplified, structured-grid settings, limiting their use for more complex problems. In this work, we generalize optimized Schwarz domain decomposition methods to unstructured-grid problems, using Graph Convolutional Neural Networks (GCNNs) and unsupervised learning to learn optimal modifications at subdomain interfaces. A key ingredient in our approach is an improved loss function, enabling effective training on relatively small problems, but robust performance on arbitrarily large problems, with computational cost linear in problem size. The performance of the learned linear solvers is compared with both classical and optimized domain decomposition algorithms, for both structured- and unstructured-grid problems.
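For readers unfamiliar with Schwarz methods, the sketch below shows the classical alternating Schwarz iteration on a 1D Poisson problem, -u'' = 1 on (0, 1) with u(0) = u(1) = 0, using two overlapping subdomains and plain Dirichlet data at the interfaces. It illustrates only the classical baseline, not the optimized or learned interface conditions described above; all function names, the grid size, and the overlap indices are assumptions chosen for the example.

```python
from typing import List

def solve_poisson_dirichlet(
    f_vals: List[float], h: float, left: float, right: float
) -> List[float]:
    """Solve (2u_i - u_{i-1} - u_{i+1}) / h^2 = f_i on interior nodes
    with Dirichlet values left/right, using the Thomas algorithm."""
    m = len(f_vals)
    rhs = [h * h * f for f in f_vals]
    rhs[0] += left    # boundary value moves to the right-hand side
    rhs[-1] += right
    c_mod = [0.0] * m  # modified upper diagonal
    d_mod = [0.0] * m  # modified right-hand side
    c_mod[0] = -0.5
    d_mod[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c_mod[i - 1]
        c_mod[i] = -1.0 / denom
        d_mod[i] = (rhs[i] + d_mod[i - 1]) / denom
    u = [0.0] * m
    u[-1] = d_mod[-1]
    for i in range(m - 2, -1, -1):
        u[i] = d_mod[i] - c_mod[i] * u[i + 1]
    return u

def alternating_schwarz(
    n: int = 21, split_lo: int = 8, split_hi: int = 12, iters: int = 30
) -> List[float]:
    """Two overlapping subdomains: nodes 0..split_hi and split_lo..n-1."""
    h = 1.0 / (n - 1)
    f = [1.0] * n
    u = [0.0] * n  # initial guess; also fixes u(0) = u(1) = 0
    for _ in range(iters):
        # Left solve: interface value taken from the current iterate at split_hi.
        u[1:split_hi] = solve_poisson_dirichlet(f[1:split_hi], h, u[0], u[split_hi])
        # Right solve: interface value taken from the fresh left solution at split_lo.
        u[split_lo + 1:n - 1] = solve_poisson_dirichlet(
            f[split_lo + 1:n - 1], h, u[split_lo], u[n - 1]
        )
    return u

u = alternating_schwarz()
```

With Dirichlet interface data the convergence rate depends on the overlap width. Optimized Schwarz methods replace that interface condition with a better-tuned one, which is the part the abstract proposes to learn.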


Supplementary Materials: A composable machine-learning approach for steady-state simulations on high-resolution grids

Neural Information Processing Systems

Finally, we expand on the computational performance of CoMLSim in Section E and provide details of reproducibility in Section F. In this section, we will provide details about the typical network architectures used in CoMLSim followed by the training mechanics. CNN-based encoders and decoders are employed here to achieve this compression because subdomains consist of structured data representations. In the encoder network, we use a series of convolution and max-pooling layers to extract global features from the solution. If the PDE conditions are uniform, the magnitude can simply be considered as an encoding for a given subdomain. Since latent vectors don't have a spatial representation, DNN-based encoders and decoders are employed to compress them. The domain is discretized into a finite number of computational elements, using techniques such as Finite Difference Method (FDM), Finite Volume Method (FVM) and Finite Element Method (FEM). Similar to traditional PDE solvers, the first step in CoMLSim is to decompose the computational domain into smaller subdomains.
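The decomposition step mentioned above can be pictured with a small sketch that tiles a structured 2D grid into equally sized, non-overlapping subdomains. The helper `decompose_grid`, the block shape, and the dictionary layout are hypothetical; the excerpt does not specify how CoMLSim actually stores or sizes its subdomains.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]

def decompose_grid(
    grid: Grid, block_rows: int, block_cols: int
) -> Dict[Tuple[int, int], Grid]:
    """Split a structured grid into non-overlapping blocks.

    Hypothetical helper: keys are (block_row, block_col) indices and
    values are the corresponding sub-grids.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % block_rows or cols % block_cols:
        raise ValueError("grid shape must be divisible by the block shape")
    subdomains: Dict[Tuple[int, int], Grid] = {}
    for r0 in range(0, rows, block_rows):
        for c0 in range(0, cols, block_cols):
            key = (r0 // block_rows, c0 // block_cols)
            subdomains[key] = [
                row[c0:c0 + block_cols] for row in grid[r0:r0 + block_rows]
            ]
    return subdomains

# A 4x4 field split into four 2x2 subdomains.
field = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
blocks = decompose_grid(field, 2, 2)
```

Each block could then be passed to a per-subdomain encoder. The fixed block size is a simplification made to keep the example short.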


Domain-Decomposed Graph Neural Network Surrogate Modeling for Ice Sheets

arXiv.org Artificial Intelligence

Accurate yet efficient surrogate models are essential for large-scale simulations of partial differential equations (PDEs), particularly for uncertainty quantification (UQ) tasks that demand hundreds or thousands of evaluations. We develop a physics-inspired graph neural network (GNN) surrogate that operates directly on unstructured meshes and leverages the flexibility of graph attention. To improve both training efficiency and generalization properties of the model, we introduce a domain decomposition (DD) strategy that partitions the mesh into subdomains, trains local GNN surrogates in parallel, and aggregates their predictions. We then employ transfer learning to fine-tune models across subdomains, accelerating training and improving accuracy in data-limited settings. Applied to ice sheet simulations, our approach accurately predicts full-field velocities on high-resolution meshes, substantially reduces training time relative to training a single global surrogate model, and provides a ripe foundation for UQ objectives. Our results demonstrate that graph-based DD, combined with transfer learning, provides a scalable and reliable pathway for training GNN surrogates on massive PDE-governed systems, with broad potential for application beyond ice sheet dynamics.



Neural network-driven domain decomposition for efficient solutions to the Helmholtz equation

arXiv.org Artificial Intelligence

Accurately simulating wave propagation is crucial in fields such as acoustics, electromagnetism, and seismic analysis. Traditional numerical methods, like finite difference and finite element approaches, are widely used to solve governing partial differential equations (PDEs) such as the Helmholtz equation. However, these methods face significant computational challenges when applied to high-frequency wave problems in complex two-dimensional domains. This work investigates Finite Basis Physics-Informed Neural Networks (FBPINNs) and their multilevel extensions as a promising alternative. These methods leverage domain decomposition, partitioning the computational domain into overlapping sub-domains, each governed by a local neural network. We assess their accuracy and computational efficiency in solving the Helmholtz equation for the homogeneous case, demonstrating their potential to mitigate the limitations of traditional approaches.


WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance

arXiv.org Artificial Intelligence

Multimodal LLM-powered agents have recently demonstrated impressive capabilities in web navigation, enabling agents to complete complex browsing tasks across diverse domains. However, current agents struggle with repetitive errors and lack the ability to learn from past experiences across sessions, limiting their long-term robustness and sample efficiency. We introduce WebCoach, a model-agnostic self-evolving framework that equips web browsing agents with persistent cross-session memory, enabling improved long-term planning, reflection, and continual learning without retraining. WebCoach consists of three key components: (1) a WebCondenser, which standardizes raw navigation logs into concise summaries; (2) an External Memory Store, which organizes complete trajectories as episodic experiences; and (3) a Coach, which retrieves relevant experiences based on similarity and recency, and decides whether to inject task-specific advice into the agent via runtime hooks. This design empowers web agents to access long-term memory beyond their native context window, improving robustness in complex browsing tasks. Moreover, WebCoach achieves self-evolution by continuously curating episodic memory from new navigation trajectories, enabling agents to improve over time without retraining. Evaluations on the WebVoyager benchmark demonstrate that WebCoach consistently improves the performance of browser-use agents across three different LLM backbones. With a 38B model, it increases task success rates from 47% to 61% while reducing or maintaining the average number of steps. Notably, smaller base models with WebCoach achieve performance comparable to the same web agent using GPT-4o.
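As an illustration of retrieving past experiences by similarity and recency, the sketch below ranks stored episodes by cosine similarity multiplied by an exponential recency decay. The scoring formula, the `Episode` fields, and the `half_life` parameter are assumptions; the abstract does not say how WebCoach combines the two signals.

```python
import math
from dataclasses import dataclass
from typing import List

@dataclass
class Episode:
    summary: str              # condensed description of a past trajectory
    embedding: List[float]    # hypothetical vector representation
    age_in_sessions: int      # 0 means the most recent session

def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def retrieve(
    query: List[float], episodes: List[Episode], k: int = 2, half_life: float = 5.0
) -> List[Episode]:
    """Return the k episodes with the highest similarity-times-recency score."""
    def score(ep: Episode) -> float:
        recency = 0.5 ** (ep.age_in_sessions / half_life)  # halves every half_life sessions
        return cosine(query, ep.embedding) * recency
    return sorted(episodes, key=score, reverse=True)[:k]

memory = [
    Episode("old but relevant", [1.0, 0.0], age_in_sessions=10),
    Episode("recent and relevant", [1.0, 0.0], age_in_sessions=0),
    Episode("recent but unrelated", [0.0, 1.0], age_in_sessions=0),
]
top = retrieve([1.0, 0.0], memory)
```

Here the recent, relevant episode ranks first, the older relevant one second, and the unrelated one is dropped. A multiplicative score is only one option; a weighted sum would be an equally plausible choice.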