AITopics | Europe

Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning

Neural Information Processing Systems, Jun-19-2026, 14:52:12 GMT

This paper provides the first expert sample complexity characterization for learning a Nash equilibrium from expert data in Markov Games. We show that a new quantity named the all policy deviation concentrability coefficient is unavoidable in the non-interactive imitation learning setting, and we provide an upper bound for behavioral cloning (BC) featuring this coefficient. BC exhibits substantial regret in games with a high concentrability coefficient, leading us to utilize expert queries to develop and introduce two novel solution algorithms: MAIL-BRO and MURMAIL. The former employs a best response oracle and learns an ε-Nash equilibrium with O(ε^{-4}) expert and oracle queries.
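As a rough illustration of the non-interactive setting described above, behavioral cloning in a tabular game can be sketched as estimating one player's policy from empirical action frequencies in the expert dataset. The function name, the data layout (a list of (state, action) pairs for a single player), and the uniform fallback for unvisited states are assumptions made for this sketch, not details taken from the paper.

```python
from collections import Counter, defaultdict

def behavioral_cloning(expert_pairs, actions):
    """Estimate a tabular policy from (state, action) pairs by counting.

    Hypothetical helper: returns a function policy(state) -> {action: prob}.
    States never seen in the data fall back to a uniform distribution.
    """
    counts = defaultdict(Counter)
    for state, action in expert_pairs:
        counts[state][action] += 1

    def policy(state):
        seen = counts.get(state)
        if not seen:
            # No expert data for this state: nothing to imitate.
            return {a: 1.0 / len(actions) for a in actions}
        total = sum(seen.values())
        return {a: seen[a] / total for a in actions}

    return policy

# One player's expert data in a toy game with two visited states.
data = [("s0", "left"), ("s0", "left"), ("s0", "right"), ("s1", "right")]
pi = behavioral_cloning(data, actions=["left", "right"])
print(pi("s0"))  # empirical frequencies at a visited state
print(pi("s2"))  # uniform fallback at an unvisited state
```

The unvisited state "s2" hints at why coverage of the expert data matters: the cloned policy has no information there, which is loosely the kind of gap a concentrability coefficient is meant to quantify.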

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country:

Europe (0.45)
North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Multimodal Bandits: Regret Lower Bounds and Optimal Algorithms

Neural Information Processing Systems, Jun-19-2026, 14:46:10 GMT

We consider a stochastic multi-armed bandit problem with i.i.d.

data mining, machine learning, natural language, (21 more...)

Country: Europe (0.45)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Natural Language (0.69)

UGoDIT: Unsupervised Group Deep Image Prior Via Transferable Weights

Neural Information Processing Systems, Jun-19-2026, 14:37:03 GMT

Recent advances in data-centric deep generative models have led to significant progress in solving inverse imaging problems. However, these models (e.g., diffusion models) typically require large amounts of fully sampled (clean) training data, which is often impractical in medical and scientific settings. Training-data-free approaches like Deep Image Prior (DIP) do not require clean images but suffer from noise overfitting and can be computationally expensive as the network parameters need to be optimized for each measurement vector independently. Moreover, DIP-based methods often overlook the potential of learning a prior using a small number of sub-sampled measurements (or degraded images) available during training. In this paper, we propose UGoDIT--an Unsupervised Group DIP via Transferable weights--designed for the low-data regime where only a very small number, M, of sub-sampled measurement vectors are available during training.

artificial intelligence, machine learning, ugodit, (16 more...)

Country:

Europe (0.46)
North America > United States (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.67)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Pareto Optimal Risk Measure Agnostic Distributional Bandits with Heavy-Tail Rewards

Neural Information Processing Systems, Jun-19-2026, 14:32:19 GMT

This paper addresses the problem of multi-risk measure agnostic multi-armed bandits in heavy-tailed reward settings. We propose a framework that leverages novel deviation inequalities for the 1-Wasserstein distance to construct confidence intervals for Lipschitz risk measures. The distributional LCB (DistLCB) algorithm is introduced, which achieves asymptotic optimality by deriving the first lower bounds for risk measure aware bandits with explicit sub-optimality gap dependencies. The DistLCB is further extended to multi-risk objectives, which enables Pareto-optimal solutions that consider multiple aspects of reward distributions. Additionally, we provide a regret analysis that includes both gap-dependent and gap-independent bounds for multi-risk settings. Experiments validate the effectiveness of the proposed methods in synthetic and real-world applications.
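The role of the 1-Wasserstein distance in the abstract above can be illustrated with a small sketch. For two equal-size samples, the empirical 1-Wasserstein distance is the mean absolute difference of the sorted values, and any 1-Lipschitz risk measure (the mean is used here as the simplest case) differs between the two samples by at most that distance. The function names and reward values are made up for illustration; this is not the DistLCB algorithm.

```python
def wasserstein_1(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension the optimal coupling matches sorted values.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def mean(xs):
    return sum(xs) / len(xs)

rewards_a = [0.1, 0.4, 0.5, 9.0]  # hypothetical arm with one heavy-tailed draw
rewards_b = [0.2, 0.3, 0.6, 1.0]  # hypothetical arm with light tails

w1 = wasserstein_1(rewards_a, rewards_b)
gap = abs(mean(rewards_a) - mean(rewards_b))
print(w1, gap)  # the gap in means never exceeds the W1 distance
```

This is the sense in which a deviation bound on the 1-Wasserstein distance can be turned into a confidence interval for a Lipschitz risk measure: bounding the distance bounds the risk measure's error, up to the Lipschitz constant.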

artificial intelligence, data mining, machine learning, (23 more...)

Country:

Europe (1.00)
North America > United States (0.93)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Banking & Finance > Trading (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Game Theory (0.85)
Information Technology > Data Science > Data Mining > Big Data (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Continuous Diffusion Model for Language Modeling

Neural Information Processing Systems, Jun-19-2026, 14:22:45 GMT

Diffusion models have emerged as a promising alternative to autoregressive models in modeling discrete categorical data. However, diffusion models that directly work on discrete data space fail to fully exploit the power of iterative refinement, as the signals are lost during transitions between discrete states. Existing continuous diffusion models for discrete data underperform compared to discrete methods, and the lack of a clear connection between the two approaches hinders the development of effective diffusion models for discrete data. In this work, we propose a continuous diffusion model for language modeling that incorporates the geometry of the underlying categorical distribution. We establish a connection between the discrete diffusion and continuous flow on the statistical manifold, and building on this analogy, introduce a simple diffusion process that generalizes existing discrete diffusion models. We further propose a simulation-free training framework based on radial symmetry, along with a simple technique to address the high dimensionality of the manifold. Comprehensive experiments on language modeling benchmarks and other modalities show that our method outperforms existing discrete diffusion models and approaches the performance of autoregressive models. The code is available at https://github.com/harryjo97/RDLM.

diffusion model, machine learning, natural language, (18 more...)

Country:

North America > United States (1.00)
Europe (0.92)

Genre: Research Report > Experimental Study (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

[Figure residue in place of title: models 4o, Mistral, LLaMA, Qwen; safety scores 5/5 and 2/5]

Neural Information Processing Systems, Jun-19-2026, 14:10:31 GMT

Large language models (LLMs) typically generate identical or similar responses for all users given the same prompt, posing serious safety risks in high-stakes applications where user vulnerabilities differ widely. Existing safety evaluations primarily rely on context-independent metrics--such as factuality, bias, or toxicity--overlooking the fact that the same response may carry divergent risks depending on the user's background or condition. We introduce "personalized safety" to fill this gap and present PENGUIN--a benchmark comprising 14,000 scenarios across seven sensitive domains with both context-rich and context-free variants. Evaluating six leading LLMs, we demonstrate that personalized user information significantly improves safety scores by 43.2%, confirming the effectiveness of personalization in safety alignment. However, not all context attributes contribute equally to safety enhancement. To address this, we develop RAISE--a training-free, two-stage agent framework that strategically acquires user-specific background. RAISE improves safety scores by up to 31.6% over six vanilla LLMs, while maintaining a low interaction cost of just 2.7 user queries on average. Our findings highlight the importance of selective information gathering in safety-critical domains and offer a practical solution for personalizing LLM responses without model retraining. This work establishes a foundation for safety research that adapts to individual user contexts rather than assuming a universal harm standard.

large language model, machine learning, natural language, (17 more...)

Country:

Europe (1.00)
Asia (0.67)
North America > United States > California (0.45)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.92)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Consumer Health (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CPO: Condition Preference Optimization for Controllable Image Generation

Neural Information Processing Systems, Jun-19-2026, 14:01:33 GMT

To enhance controllability in text-to-image generation, ControlNet introduces image-based control signals, while ControlNet++ improves pixel-level cycle consistency between generated images and the input control signal. To avoid the prohibitive cost of back-propagating through the sampling process, ControlNet++ optimizes only low-noise timesteps (e.g., t < 200) using a single-step approximation, which not only ignores the contribution of high-noise timesteps but also introduces additional approximation errors. A straightforward alternative for optimizing controllability across all timesteps is Direct Preference Optimization (DPO), a fine-tuning method that increases model preference for more controllable images (Iw) over less controllable ones (Il). However, due to uncertainty in generative models, it is difficult to ensure that win-lose image pairs differ only in controllability while keeping other factors, such as image quality, fixed. To address this, we propose performing preference learning over control conditions rather than generated images.
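For reference, the standard DPO objective mentioned in the abstract above can be sketched for a single preference pair. The sketch works on generic log-likelihoods of a preferred and a rejected sample under the fine-tuned model and a frozen reference model; for diffusion models these log-likelihoods would need to be approximated, and the proposed CPO applies preference learning to control conditions rather than images, which this sketch does not implement. The function name, argument names, and the default beta are assumptions for illustration.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair of log-likelihoods."""
    # How much more the model favors the preferred sample than the reference does.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)) equals softplus(-margin), computed in a stable form.
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

# Model identical to the reference: the loss is log(2).
print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
# Model favors the preferred sample more than the reference does: lower loss.
print(dpo_loss(-0.5, -2.0, -1.0, -1.0))
```

The loss only depends on the difference between the two samples' log-ratio terms, which is why the pair needs to differ in the attribute of interest (here, controllability) and not in unrelated factors such as image quality.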

artificial intelligence, machine learning, natural language, (16 more...)

Country:

North America > United States (1.00)
Europe (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Food & Agriculture (0.93)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Stable Matching with Ties: Approximation Ratios and Learning

Neural Information Processing Systems, Jun-19-2026, 13:14:47 GMT

We study matching markets with ties, where workers on one side of the market may have tied preferences over jobs, determined by their matching utilities. Unlike classical two-sided markets with strict preferences, no single stable matching exists that is utility-maximizing for all workers. To address this challenge, we introduce the Optimal Stable Share (OSS)-ratio, which measures the ratio of a worker's maximum achievable utility in any stable matching to their utility in a given matching. We prove that distributions over only stable matchings can incur linear utility losses, i.e., an Ω(N) OSS-ratio, where N is the number of workers. To overcome this, we design an algorithm that efficiently computes a distribution over (possibly non-stable) matchings, achieving an asymptotically tight O(log N) OSS-ratio. When exact utilities are unknown, our second algorithm guarantees workers a logarithmic approximation of their optimal utility under bounded instability. Finally, we extend our offline approximation results to a bandit learning setting where utilities are only observed for matched pairs. In this setting, we consider worker-optimal stable regret, design an adaptive algorithm that smoothly interpolates between markets with strict preferences and those with statistical ties, and establish a lower bound revealing the fundamental trade-off between strict and tied preference regimes.
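The OSS-ratio defined in the abstract above can be made concrete on a tiny market by brute force. The sketch below enumerates all matchings, keeps the weakly stable ones (no worker-job pair strictly prefers each other to their current partners), and divides each worker's best stable utility by the utility in a given matching. The weak-stability notion, the separate job-side utility table, and all names and numbers are assumptions for illustration; this is not the paper's algorithm and it does not scale beyond a handful of agents.

```python
from itertools import permutations

def is_weakly_stable(match, worker_u, job_u):
    """match[w] is the job of worker w; no pair may strictly prefer each other."""
    n = len(match)
    worker_of = {j: w for w, j in enumerate(match)}
    for w in range(n):
        for j in range(n):
            if (worker_u[w][j] > worker_u[w][match[w]]
                    and job_u[j][w] > job_u[j][worker_of[j]]):
                return False  # (w, j) is a blocking pair
    return True

def oss_ratios(match, worker_u, job_u):
    """Per-worker ratio: best utility over stable matchings / utility in match."""
    n = len(match)
    stable = [m for m in permutations(range(n))
              if is_weakly_stable(m, worker_u, job_u)]
    best = [max(worker_u[w][m[w]] for m in stable) for w in range(n)]
    return [best[w] / worker_u[w][match[w]] for w in range(n)]

# Both workers prefer job 0; both jobs are indifferent between the workers.
worker_u = [[2, 1], [2, 1]]  # worker_u[w][j]: utility of worker w at job j
job_u = [[1, 1], [1, 1]]     # job_u[j][w]: utility of job j for worker w

print(oss_ratios((0, 1), worker_u, job_u))  # → [1.0, 2.0]
print(oss_ratios((1, 0), worker_u, job_u))  # → [2.0, 1.0]
```

In this toy market both matchings are stable, yet each one leaves some worker at ratio 2, which matches the observation that no single stable matching is best for every worker. If the ratio is taken against expected utility, mixing the two matchings uniformly gives each worker expected utility 1.5 and a ratio of 4/3.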

artificial intelligence, data mining, machine learning, (21 more...)

Country:

Europe (0.28)
North America (0.27)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Education > Educational Setting (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.93)
Information Technology > Game Theory (0.67)
(3 more...)

FOCUS: Internal MLLM Representations for Efficient Fine-Grained Visual Question Answering

Neural Information Processing Systems, Jun-19-2026, 13:01:02 GMT

While Multimodal Large Language Models (MLLMs) offer strong perception and reasoning capabilities for image-text input, Visual Question Answering (VQA) focusing on small image details remains a challenge. Although visual cropping techniques seem promising, recent approaches have several limitations: the need for task-specific fine-tuning, low efficiency due to uninformed exhaustive search, or incompatibility with efficient attention implementations. We address these shortcomings by proposing a training-free visual cropping method, dubbed FOCUS, that leverages MLLM-internal representations to guide the search for the most relevant image region. This is accomplished in four steps: first, we identify the target object(s) in the VQA prompt; second, we compute an object relevance map using the key-value (KV) cache; third, we propose and rank relevant image regions based on the map; and finally, we perform the fine-grained VQA task using the top-ranked region.
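The third step above (proposing and ranking image regions from a relevance map) can be sketched generically. The sketch assumes the relevance map is already available as a small 2D grid of scores and proposes regions with a fixed-size sliding window ranked by mean relevance. The sliding-window proposal, the function name, and the example values are assumptions for illustration; the abstract does not specify how FOCUS proposes or scores regions.

```python
def rank_regions(relevance, window, top_k=1):
    """Rank fixed-size windows of a 2D relevance map by mean relevance.

    Hypothetical helper: returns up to top_k entries of the form
    (mean_score, (row, col, height, width)), best first.
    """
    h, w = len(relevance), len(relevance[0])
    wh, ww = window
    scored = []
    for r in range(h - wh + 1):
        for c in range(w - ww + 1):
            total = sum(relevance[r + i][c + j]
                        for i in range(wh) for j in range(ww))
            scored.append((total / (wh * ww), (r, c, wh, ww)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]

# Made-up relevance map over a 3x4 grid of image patches.
relevance_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.2, 0.9, 0.8],
    [0.0, 0.1, 0.7, 0.9],
]
best_score, best_box = rank_regions(relevance_map, window=(2, 2))[0]
print(best_box)  # → (1, 2, 2, 2)
```

The top-ranked box would then be cropped from the image and passed back to the model for the final question-answering step.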

llava-1, natural language, question answering, (20 more...)

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.92)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.70)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)

PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis

Neural Information Processing Systems, Jun-19-2026, 12:52:57 GMT

We introduce a comprehensive framework for modeling single-cell transcriptomic responses to perturbations, aimed at standardizing benchmarking in this rapidly evolving field. Our approach includes a modular and user-friendly model development and evaluation platform, a collection of diverse perturbational datasets, and a set of metrics designed to fairly compare models and dissect their performance. Through extensive evaluation of both published and baseline models across diverse datasets, we highlight the limitations of widely used models, such as mode collapse. We also demonstrate the importance of rank metrics, which complement traditional model fit measures such as RMSE, for validating model effectiveness. Notably, our results show that while no single model architecture clearly outperforms others, simpler architectures are generally competitive and scale well with larger datasets. Overall, this benchmarking exercise sets new standards for model evaluation, supports robust model development, and furthers the use of these models to simulate genetic and chemical screens for therapeutic discovery.
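Why a rank metric can complement RMSE is easiest to see on a toy case. In the sketch below, a "collapsed" model predicts the same profile for every perturbation, while an "overshooting" model predicts the right direction with the wrong magnitude. Both have the same RMSE, but only the rank score separates them. The rank score used here (the average fraction of other perturbations whose observed profile lies strictly closer to the prediction than the true one) is an assumed definition for illustration and may differ from the metrics in the benchmark; all data values are made up.

```python
import math

def rmse(pred, obs):
    """Root mean squared error over all perturbations and genes."""
    sq = [(p - o) ** 2 for k in obs for p, o in zip(pred[k], obs[k])]
    return math.sqrt(sum(sq) / len(sq))

def mean_rank(pred, obs):
    """Average fraction of other perturbations whose observed profile is
    strictly closer to the prediction than the true one (0 is best)."""
    scores = []
    for k in obs:
        d_true = math.dist(pred[k], obs[k])
        closer = sum(math.dist(pred[k], obs[q]) < d_true
                     for q in obs if q != k)
        scores.append(closer / (len(obs) - 1))
    return sum(scores) / len(scores)

# Observed two-gene responses for three hypothetical perturbations.
observed = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [-1.0, -1.0]}
collapsed = {k: [0.0, 0.0] for k in observed}                 # same output everywhere
overshoot = {k: [2 * v for v in observed[k]] for k in observed}  # right direction, 2x magnitude

print(rmse(collapsed, observed), rmse(overshoot, observed))           # identical RMSE
print(mean_rank(collapsed, observed), mean_rank(overshoot, observed))  # rank score differs
```

On this data the collapsed model cannot tell perturbation C apart from A and B, which the rank score penalizes even though the fit error is unchanged.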

artificial intelligence, machine learning, perturbation, (19 more...)

Country: Europe > Spain (0.28)

Genre: Research Report > New Finding (1.00)

Industry: