AITopics | Genre

RBench-V: A Primary Assessment for Visual Reasoning Models with Multi-modal Outputs

Neural Information Processing Systems | Jun-23-2026, 02:12:37 GMT

The rapid advancement of native multi-modal models and omni-models, exemplified by GPT-4o, Gemini and o3, with their capability to process and generate content across modalities such as text and images, marks a significant milestone in the evolution of intelligence. Systematic evaluation of their multi-modal output capabilities in the visual thinking process (a.k.a. multi-modal chain of thought, M-CoT) becomes critically important. However, existing benchmarks for evaluating multi-modal models primarily focus on assessing multi-modal inputs and text-only reasoning processes while neglecting the importance of reasoning through multi-modal outputs. In this paper, we present a benchmark, dubbed RBench-V, designed to assess models' vision-indispensable reasoning. To construct RBench-V, we carefully hand-pick 803 questions covering math, physics, counting and games. Unlike problems in previous benchmarks, which typically specify certain input modalities, RBench-V presents problems centered on multi-modal outputs, which require image manipulation, such as generating novel images and constructing auxiliary lines, to support the reasoning process. We evaluate numerous open- and closed-source models on RBench-V, including o3, Gemini 2.5 pro, Qwen2.5VL,

large language model, machine learning, natural language, (20 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)
Overview (0.67)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Effective Policy Learning for Multi-Agent Online Coordination Beyond Submodular Objectives

Neural Information Processing Systems | Jun-23-2026, 02:12:26 GMT

The first one, MA-SPL, not only achieves the optimal (1 − c/e)-approximation guarantee for the MA-OC problem with submodular objectives but also handles the unexplored α-weakly DR-submodular and (γ,β)-weakly submodular scenarios, where c is the curvature of the investigated submodular functions, α denotes the diminishing-return (DR) ratio and the tuple (γ,β) represents the submodularity ratios. Subsequently, in order to reduce the reliance on the unknown parameters α, γ, β inherent in the MA-SPL algorithm, we further introduce a second online algorithm named MA-MPL. This MA-MPL algorithm is entirely parameter-free and maintains the same approximation ratio as the first MA-SPL algorithm. The core of our MA-SPL and MA-MPL algorithms is a novel continuous-relaxation technique termed policy-based continuous extension. Compared with the well-established multi-linear extension, a notable advantage of this new policy-based continuous extension is its ability to provide a lossless rounding scheme for any set function, thereby enabling us to tackle the challenging weakly submodular objectives. Finally, extensive simulations are conducted to validate the effectiveness of our proposed algorithms.

artificial intelligence, machine learning, natural language, (17 more...)

Country:

Asia (0.28)
Europe > Austria (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)

Industry: Information Technology (0.45)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(4 more...)

Bridging Arbitrary and Tree Metrics via Differentiable Gromov Hyperbolicity

Neural Information Processing Systems | Jun-23-2026, 02:12:15 GMT

Trees and the associated shortest-path tree metrics provide a powerful framework for representing hierarchical and combinatorial structures in data. Given an arbitrary metric space, its deviation from a tree metric can be quantified by Gromov's δ-hyperbolicity. Nonetheless, designing algorithms that bridge an arbitrary metric to its closest tree metric is still an active subject of interest, as most common approaches are either heuristic and lack guarantees, or perform only moderately well. In this work, we introduce a novel differentiable optimization framework, coined DELTAZERO, that solves this problem. Our method leverages a smooth surrogate for Gromov's δ-hyperbolicity, which enables gradient-based optimization with tractable complexity. The corresponding optimization procedure is derived from a problem with better worst-case guarantees than existing bounds, and is justified statistically. Experiments on synthetic and real-world datasets demonstrate that our method consistently achieves state-of-the-art distortion.
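
As an illustration only, and not the DELTAZERO implementation, the quantity this abstract refers to can be sketched in Python. The four-point definition of Gromov's δ via Gromov products is standard; the log-sum-exp smoothing, the temperature `lam`, and all function names below are assumptions chosen for this sketch, and the brute-force O(n^4) loop is only meant for tiny inputs.

```python
import math
from itertools import product


def gromov_product(D, x, y, w):
    """Gromov product (x|y)_w for a distance matrix D (list of lists)."""
    return 0.5 * (D[x][w] + D[y][w] - D[x][y])


def gromov_delta(D):
    """Exact four-point delta: max over x, y, z, w of
    min((x|y)_w, (y|z)_w) - (x|z)_w. Tree metrics give 0."""
    n = len(D)
    best = 0.0
    for x, y, z, w in product(range(n), repeat=4):
        gap = min(gromov_product(D, x, y, w),
                  gromov_product(D, y, z, w)) - gromov_product(D, x, z, w)
        best = max(best, gap)
    return best


def _lse(values, lam):
    """Numerically stable (1/lam) * log(sum(exp(lam * v))) for lam > 0."""
    m = max(lam * v for v in values)
    return (m + math.log(sum(math.exp(lam * v - m) for v in values))) / lam


def smooth_gromov_delta(D, lam=50.0):
    """Hypothetical smooth surrogate: every max/min in gromov_delta is
    replaced by a log-sum-exp soft max/min with temperature lam."""
    n = len(D)
    gaps = []
    for x, y, z, w in product(range(n), repeat=4):
        a = gromov_product(D, x, y, w)
        b = gromov_product(D, y, z, w)
        soft_min = -_lse([-a, -b], lam)  # soft minimum of a and b
        gaps.append(soft_min - gromov_product(D, x, z, w))
    return _lse(gaps, lam)  # soft maximum over all quadruples


# Path graph on 4 nodes (a tree) and the unit 4-cycle (not a tree).
path = [[abs(i - j) for j in range(4)] for i in range(4)]
cycle = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
print(gromov_delta(path), gromov_delta(cycle), smooth_gromov_delta(cycle))
```

On these inputs the exact value is 0.0 for the path metric and 1.0 for the 4-cycle, while the smooth value for the cycle approaches 1.0 as `lam` grows. Because the surrogate is built only from sums, exponentials and logarithms, it is differentiable in the entries of `D`, which is the property a gradient-based method would need.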

artificial intelligence, machine learning, optimization problem, (15 more...)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

f0156a82b6af6a4e838923ce9c124424-Paper-Conference.pdf

Neural Information Processing Systems | Jun-23-2026, 02:12:07 GMT

Structure-agnostic causal inference studies how well one can estimate a treatment effect given black-box machine learning estimates of nuisance functions (like the impact of confounders on treatment and outcomes). Here, we find that the answer depends in a surprising way on the distribution of the treatment noise. Focusing on the partially linear model of Robinson [1988], we first show that the widely adopted double machine learning (DML) estimator is minimax rate-optimal for Gaussian treatment noise, resolving an open problem of Mackey et al. [2018]. Meanwhile, for independent non-Gaussian treatment noise, we show that DML is always suboptimal by constructing new practical procedures with higher-order robustness to nuisance errors. These ACE procedures use structure-agnostic cumulant estimators to achieve r-th order insensitivity to nuisance errors whenever the (r + 1)-st treatment cumulant is non-zero. We complement these core results with novel minimax guarantees for binary treatments in the partially linear model. Finally, using synthetic demand estimation experiments, we demonstrate the practical benefits of our higher-order robust estimators.

artificial intelligence, assumption, machine learning, (17 more...)

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

4KAgent: Agentic Any Image to 4K Super-Resolution

Neural Information Processing Systems | Jun-23-2026, 02:12:02 GMT

We present 4KAgent, a unified agentic super-resolution generalist system designed to universally upscale any image to 4K resolution (and even higher, if applied iteratively). Our system can transform images from extremely low resolutions with severe degradations, for example, highly distorted inputs at 256 × 256, into crystal-clear, photorealistic 4K outputs.

large language model, machine learning, natural language, (17 more...)

Country:

Europe (1.00)
Asia (0.92)
North America > United States (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Government (1.00)
(7 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(11 more...)

LODGE: Level-of-Detail Large-Scale Gaussian Splatting with Efficient Rendering

Neural Information Processing Systems | Jun-23-2026, 02:11:39 GMT

In this work, we present a novel level-of-detail (LOD) method for 3D Gaussian Splatting that enables real-time rendering of large-scale scenes on memory-constrained devices. Our approach introduces a hierarchical LOD representation that iteratively selects optimal subsets of Gaussians based on camera distance, thus largely reducing both rendering time and GPU memory usage. We construct each LOD level by applying a depth-aware 3D smoothing filter, followed by importance-based pruning and fine-tuning to maintain visual fidelity. To further reduce memory overhead, we partition the scene into spatial chunks and dynamically load only relevant Gaussians during rendering, employing an opacity-blending mechanism to avoid visual artifacts at chunk boundaries. Our method achieves state-of-the-art performance on both outdoor (Hierarchical 3DGS) and indoor (Zip-NeRF) datasets, delivering high-quality renderings with reduced latency and memory requirements.

artificial intelligence, machine learning, natural language, (19 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Partial Correlation Network Estimation by Semismooth Newton Methods

Neural Information Processing Systems | Jun-23-2026, 02:11:26 GMT

We develop a scalable second-order algorithm for a recently proposed ℓ1-regularized pseudolikelihood-based partial correlation network estimation framework. While the latter method admits statistical guarantees and is inherently scalable compared to likelihood-based methods such as graphical lasso, the currently available implementations rely only on first-order information and require thousands of iterations to obtain reliable estimates even on high-performance supercomputers. In this paper, we further investigate the inherent scalability of the framework and propose locally and globally convergent semismooth Newton methods. Despite the nonsmoothness of the problem, these second-order algorithms converge at a locally quadratic rate and require only a few tens of iterations in practice. Each iteration reduces to solving linear systems of small dimensions or linear complementarity problems of smaller dimensions, making the computation also suitable for less powerful computing environments. Experiments on both simulated and real-world genomic datasets demonstrate the superior convergence behavior and computational efficiency of the proposed algorithm, which position our method as a promising tool for the massive-scale network analysis sought in, e.g., modern multi-omics research.

artificial intelligence, iteration, machine learning, (18 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Single-Step Operator Learning for Conditioned Time-Series Diffusion Models

Neural Information Processing Systems | Jun-23-2026, 02:11:19 GMT

Diffusion models have achieved significant success, yet their application to time-series data, particularly with regard to efficient sampling, remains an active area of research. We describe an operator-learning approach for conditioned time-series diffusion models that gives efficient single-step generation by leveraging insights from the frequency-domain characteristics of both the time-series data and the diffusion process itself. The forward diffusion process induces a structured, frequency-dependent smoothing of the data's probability density function. However, this frequency smoothing is related (e.g., via the likelihood function) to easily accessible frequency components of time-series data. This suggests that a module operating in the frequency space of the time series can, potentially, more effectively learn to reverse the frequency-dependent smoothing of the data distribution induced by the diffusion process. We set up an operator-learning task, based on frequency-aware building blocks, which satisfies semigroup properties while exploiting the structure of time-series data. Evaluations on multiple datasets show that our single-step generation proposal achieves forecasting/imputation results comparable (or superior) to many multi-step diffusion schemes while significantly reducing inference costs.

artificial intelligence, international conference, machine learning, (15 more...)

Country:

North America > United States > New York (0.28)
North America > United States > California (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Energy (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Energy Loss Functions for Physical Systems

Neural Information Processing Systems | Jun-23-2026, 02:04:42 GMT

Effectively leveraging prior knowledge of a system's physics is crucial for applications of machine learning to scientific domains. Previous approaches have mostly focused on incorporating physical insights at the architectural level. In this paper, we propose a framework that incorporates physical information directly into the loss function for prediction and generative modeling tasks on systems like molecules and spins. We derive energy loss functions assuming that each data sample is in thermal equilibrium with respect to an approximate energy landscape. By using the reverse KL divergence with a Boltzmann distribution around the data, we obtain the loss as an energy difference between the data and the model predictions.

artificial intelligence, loss function, machine learning, (19 more...)

Country:

North America > Canada (0.28)
Europe (0.28)

Genre:

Research Report > Experimental Study (1.00)
Instructional Material (0.67)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Mitigating Occlusions in Virtual Try-On via A Simple-Yet-Effective Mask-Free Framework

Neural Information Processing Systems | Jun-23-2026, 02:04:36 GMT

This paper investigates the occlusion problems in virtual try-on (VTON) tasks. According to how they affect the try-on results, the occlusion issues of existing VTON methods can be grouped into two categories: (1) Inherent Occlusions, which are the ghosts of the clothing from reference input images that exist in the try-on results.

artificial intelligence, machine learning, natural language, (16 more...)

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.30)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)
