AITopics | Europe

Collaborating Authors

Europe

Dynamic Configuration for Cutting Plane Separators via Reinforcement Learning on Incremental Graph

Neural Information Processing SystemsJun-22-2026, 17:57:44 GMT

Cutting planes (cuts) are essential for solving mixed-integer linear programming (MILP) problems, as they tighten the feasible solution space and accelerate the solving process. Modern MILP solvers offer diverse cutting plane separators to generate cuts, enabling users to leverage their potential complementary strengths to tackle problems with different structures. Recent machine learning approaches learn to configure separators based on problem-specific features, selecting effective separators and deactivating ineffective ones to save unnecessary computing time. However, they ignore the dynamics of separator efficacy at different stages of cut generation and struggle to adapt the configurations for the evolving problems after multiple rounds of cut generation. To address this challenge, we propose a novel dynamic separator configuration (DynSep) method that models separator configuration in different rounds as a reinforcement learning task, making decisions based on an incremental triplet graph updated by iteratively added cuts. Specifically, we tokenize the incremental subgraphs and utilize a decoder-only Transformer as our policy to autoregressively predict when to halt separation and which separators to activate at each round. Evaluated on synthetic and large-scale real-world MILP problems, DynSep speeds up average solving time by 64% on easy and medium datasets, and reduces primal-dual gap integral within the given time limit by 16% on hard datasets. Moreover, experiments demonstrate that DynSep well generalizes to MILP instances of significantly larger sizes than those seen during training.

machine learning, reinforcement learning, separator, (21 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States (0.93)
North America > Canada (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Banking & Finance (0.46)
Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

On the Robustness of Verbal Confidence of LLMs in Adversarial Attacks

Neural Information Processing SystemsJun-22-2026, 17:56:36 GMT

Robust verbal confidence generated by large language models (LLMs) is crucial for the deployment of LLMs to help ensure transparency, trust, and safety in many applications, including those involving human-AI interactions. In this paper, we present the first comprehensive study on the robustness of verbal confidence under adversarial attacks. We introduce attack frameworks targeting verbal confidence scores through both perturbation and jailbreak-based methods, and demonstrate that these attacks can significantly impair verbal confidence estimates and lead to frequent answer changes. We examine a variety of prompting strategies, model sizes, and application domains, revealing that current verbal confidence is vulnerable and that commonly used defence techniques are largely ineffective or counterproductive. Our findings underscore the need to design robust mechanisms for confidence expression in LLMs, as even subtle semantic-preserving modifications can lead to misleading confidence in responses.

confidence score, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia (1.00)
North America > United States (0.92)
Europe (0.92)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Normalization in Attention Dynamics

Neural Information Processing SystemsJun-22-2026, 17:48:10 GMT

We study the effect of normalization schemes on token representations in deep transformers. Modeling their evolution as interacting particles on the sphere, we show that normalization acts as a form of speed regulation. This perspective enables a unified analysis of several schemes--including Post-LN, Pre-LN, MixLN, Peri-LN, nGPT--revealing how they influence clustering dynamics and representation collapse. Our framework clarifies how different schemes shape token representations across layers and provides a principled basis for comparing them, identifying Peri-LN as a particularly effective choice.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

DeCaFlow: A deconfounding causal generative model

Neural Information Processing SystemsJun-22-2026, 17:42:08 GMT

We introduce DeCaFlow, a deconfounding causal generative model. Training once per dataset using just observational data and the underlying causal graph, DeCaFlow enables accurate causal inference on continuous variables under the presence of hidden confounders. Specifically, we extend previous results on causal estimation under hidden confounding to show that a single instance of DeCaFlow provides correct estimates for all causal queries identifiable with do-calculus, leveraging proxy variables to adjust for the causal effects when do-calculus alone is insufficient. Moreover, we show that counterfactual queries are identifiable as long as their interventional counterparts are identifiable, and thus are also correctly estimated by DeCaFlow. Our empirical results on diverse settings--including the Ecoli70 dataset, with 3 independent hidden confounders, tens of observed variables and hundreds of causal queries--show that DeCaFlow outperforms existing approaches, while demonstrating its out-of-the-box applicability to any given causal graph.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (1.00)
North America > Canada > British Columbia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Education > Educational Setting (0.46)
Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Add feedback

Training Language Models to Generate Quality Code with Program Analysis Feedback

Neural Information Processing SystemsJun-22-2026, 17:27:27 GMT

Code generation with large language models (LLMs), often termed vibe coding, is increasingly adopted in production but fails to ensure code quality, particularly in security (e.g., SQL injection vulnerabilities) and maintainability (e.g., missing type annotations). Existing methods, such as supervised fine-tuning and rule-based post-processing, rely on labor-intensive annotations or brittle heuristics, limiting their scalability and effectiveness. We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code using program analysis-guided feedback. Specifically, REAL integrates two automated signals: (1) program analysis detecting security or maintainability defects and (2) unit tests ensuring functional correctness. Unlike prior work, our framework is prompt-agnostic and reference-free, enabling scalable supervision without manual intervention. Experiments across multiple datasets and model scales demonstrate that REAL outperforms state-of-the-art methods in simultaneous assessments of functionality and code quality. Our work bridges the gap between rapid prototyping and production-ready code, enabling LLMs to deliver both speed and quality.

large language model, natural language, vulnerability, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > Austria (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Training-Free Safe Denoisers for Safe Use of Diffusion Models

Neural Information Processing SystemsJun-22-2026, 17:27:06 GMT

Many existing methods tackle these issues by heavily relying on text-based negative prompts or retraining the model to eliminate certain features or samples.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Law (0.68)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

How Ensembles of Distilled Policies Improve Generalisation in Reinforcement Learning

Neural Information Processing SystemsJun-22-2026, 17:26:36 GMT

In the zero-shot policy transfer setting in reinforcement learning, the goal is to train an agent on a fixed set of training environments so that it can generalise to similar, but unseen, testing environments. Previous work has shown that policy distillation after training can sometimes produce a policy that outperforms the original in the testing environments. However, it is not yet entirely clear why that is, or what data should be used to distil the policy. In this paper, we prove, under certain assumptions, a generalisation bound for policy distillation after training. The theory provides two practical insights: for improved generalisation, you should 1) train an ensemble of distilled policies, and 2) distil it on as much data from the training environments as possible. We empirically verify that these insights hold in more general settings, when the assumptions required for the theory no longer hold. Finally, we demonstrate that an ensemble of policies distilled on a diverse dataset can generalise significantly better than the original agent.

machine learning, natural language, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia (1.00)
Oceania > Australia > New South Wales (0.27)
North America > United States > California (0.27)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction

Neural Information Processing SystemsJun-22-2026, 17:17:35 GMT

We present an efficient algorithm for linear contextual bandits with adversarial losses and stochastic action sets. Our approach reduces this setting to misspecification-robust adversarial linear bandits with fixed action sets. Without knowledge of the context distribution or access to a context simulator, the algorithm achieves eO(min{d2 T, p d3T logK})regret and runs in poly(d,C,T) time, where d is the feature dimension, C is an upper bound on the number of linear constraints defining the action set in each round, K is an upper bound on the number of actions in each round, and T is number of rounds. This resolves the open question by Liu et al. (2023) on whether one can obtain poly(d) T regret in polynomial time independent of the number of actions. For the important class of combinatorial bandits with adversarial losses and stochastic action sets where the action sets can be described by a polynomial number of linear constraints, our algorithm is the first to achieve poly(d) T regret in polynomial time, while no prior algorithm achieves even o(T) regret in polynomial time to our knowledge. When a simulator is available, the regret bound can be improved to eO(d L), where L is the cumulative loss of the best policy.

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: Europe > Netherlands (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers

Neural Information Processing SystemsJun-22-2026, 17:16:20 GMT

Large language models have been widely integrated into information retrieval to advance traditional techniques. However, effectively enabling LLMs to seek accurate knowledge in complex tasks remains a challenge due to the complexity of multi-hop queries as well as the irrelevant retrieved content. To address these limitations, we propose EXSEARCH, an agentic search framework, where the LLM learns to retrieve useful information as the reasoning unfolds through a selfincentivized process. At each step, the LLM decides what to retrieve (thinking), triggers an external retriever (search), and extracts fine-grained evidence (recording) to support next-step reasoning. To enable LLM with this capability, EXSEARCH adopts a generalized expectation-maximization algorithm.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (1.00)
North America > Canada > Ontario (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.92)
Research Report > New Finding (0.92)
Workflow (0.66)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

How Benchmark Prediction from Fewer Data Misses the Mark

Neural Information Processing SystemsJun-22-2026, 17:16:00 GMT

Large language model (LLM) evaluation is increasingly costly, prompting interest in methods that speed up evaluation by shrinking benchmark datasets. Benchmark prediction (also called efficient LLM evaluation) aims to select a small subset of evaluation points and predict overall benchmark performance from that subset. In this paper, we systematically assess the strengths and limitations of 11 benchmark prediction methods across 19 diverse benchmarks. First, we identify a highly competitive baseline: Take a random sample and fit a regression model on the sample to predict missing entries. Outperforming most existing methods, this baseline challenges the assumption that careful subset selection is necessary for benchmark prediction.

large language model, machine learning, target model, (18 more...)

Neural Information Processing Systems

Country: Europe (0.67)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

Add feedback