AITopics | Europe

Collaborating Authors

Europe

Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs

Neural Information Processing SystemsJun-16-2026, 09:25:50 GMT

The selection of hyperparameters, such as prompt templates in large language models (LLMs), must often strike a balance between reliability and cost. In many cases, structural relationships between the expected reliability levels of the hyperparameters can be inferred from prior information and held-out data - e.g., longer prompt templates may be more detailed and thus more reliable. However, existing hyperparameter selection methods either do not provide formal reliability guarantees or are unable to incorporate structured knowledge in the hyperparameter space. This paper introduces reliability graph-based Pareto testing (RG-PT), a novel multi-objective hyperparameter selection framework that maintains formal reliability guarantees in terms of false discovery rate (FDR), while accounting for known relationships among hyperparameters via a directed acyclic graph. Edges in the graph reflect expected reliability and cost trade-offs among hyperparameters, which are inferred via the Bradley-Terry (BT) ranking model from prior information and held-out data. Experimental evaluations demonstrate that RG-PT significantly outperforms existing methods such as learn-then-test (LTT) and Pareto testing (PT) through a more efficient exploration of the hyperparameter space.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

FSNet: Feasibility-Seeking Neural Network for Constrained Optimization with Guarantees

Neural Information Processing SystemsJun-16-2026, 09:24:26 GMT

Efficiently solving constrained optimization problems is crucial for numerous realworld applications, yet traditional solvers are often computationally prohibitive for real-time use. Machine learning-based approaches have emerged as a promising alternative to provide approximate solutions at faster speeds, but they struggle to strictly enforce constraints, leading to infeasible solutions in practice. To address this, we propose the Feasibility-Seeking Neural Network (FSNet), which integrates a feasibility-seeking step directly into its solution procedure to ensure constraint satisfaction. This feasibility-seeking step solves an unconstrained optimization problem that minimizes constraint violations in a differentiable manner, enabling end-to-end training and providing guarantees on feasibility and convergence. Our experiments across a range of different optimization problems, including both smooth/nonsmooth and convex/nonconvex problems, demonstrate that FSNet can provide feasible solutions with solution quality comparable to (or in some cases better than) traditional solvers, at significantly faster speeds.1

artificial intelligence, fsnet, machine learning, (15 more...)

Neural Information Processing Systems

Country:

Europe (0.45)
North America > United States > Massachusetts (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality

Neural Information Processing SystemsJun-16-2026, 08:48:55 GMT

This has led to increasing interest in LLM unlearning: the task of selectively removing specific information from a model without retraining from scratch or degrading overall utility. However, existing methods often rely on large-scale forget and retain datasets, and suffer from unnatural responses, poor generalization, or catastrophic utility loss. In this work, we propose Reinforcement UnLEarning (RULE), an efficient framework that formulates unlearning as a refusal boundary optimization problem. RULE is trained with a small portion of forget set and synthesized boundary queries, using a verifiable reward function that encourages safe refusal on forget-related queries while preserving helpful responses on permissible inputs. We provide both theoretical and empirical evidence demonstrating the effectiveness of RULE in achieving targeted unlearning without compromising model utility. Experimental results show that, with only 12% forget set and 8% synthesized boundary data, RULE outperforms existing baselines by up to 17.5% forget quality and 16.3% naturalness response while maintaining general utility, achieving forget-retain Pareto optimality. Remarkably, we further observe that RULE improves the naturalness of model outputs, enhances training efficiency, and exhibits strong generalization ability, generalizing refusal behavior to semantically related but unseen queries.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States (0.67)
Asia > China (0.46)
Europe > Austria > Vienna (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Nonlinearly Preconditioned Gradient Methods: Momentum and Stochastic Analysis

Neural Information Processing SystemsJun-16-2026, 07:48:25 GMT

We study nonlinearly preconditioned gradient methods for smooth nonconvex optimization problems, focusing on sigmoid preconditioners that inherently perform a form of gradient clipping akin to the widely used gradient clipping technique. Building upon this idea, we introduce a novel heavy ball-type algorithm and provide convergence guarantees under a generalized smoothness condition that is less restrictive than traditional Lipschitz smoothness, thus covering a broader class of functions. Additionally, we develop a stochastic variant of the base method and study its convergence properties under different noise assumptions. We compare the proposed algorithms with baseline methods on diverse tasks from machine learning including neural network training.

artificial intelligence, inequality, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

ClusterFusion: Expanding Operator Fusion Scope for LLMInference via Cluster-Level Collective Primitive

Neural Information Processing SystemsJun-16-2026, 07:41:13 GMT

Large language model (LLM) decoding suffers from high latency due to fragmented execution across operators and heavy reliance on off-chip memory for data exchange and reduction. This execution model limits opportunities for fusion and incurs significant memory traffic and kernel launch overhead. While modern architectures such as NVIDIA Hopper provide distributed shared memory and low-latency intra-cluster interconnects, they expose only low-level data movement instructions, lacking structured abstractions for collective on-chip communication. To bridge this software-hardware gap, we introduce two cluster-level communication primitives, ClusterReduce and ClusterGather, which abstract common communication patterns and enable structured, high-speed data exchange and reduction between thread blocks within a cluster, allowing intermediate results to be on-chip without involving off-chip memory. Building on these abstractions, we design ClusterFusion, an execution framework that schedules communication and computation jointly to expand operator fusion scope by composing decoding stages such as QKVProjection, Attention, and Output Projection into a single fused kernels. Evaluations on H100 GPUs show that ClusterFusion outperforms state-of-the-art inference frameworks by 1.61 on average in end-to-end latency across different models and configurations.

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > California (0.68)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Adversary Aware Optimization for Robust Defense

Neural Information Processing SystemsJun-16-2026, 07:00:01 GMT

Deep neural networks remain highly susceptible to adversarial attacks, where small, subtle perturbations to input images may induce misclassification. We propose a novel optimization-based purification framework that directly removes these perturbations by maximizing a Bayesian-inspired objective combining a pretrained diffusion prior with a likelihood term tailored to the adversarial perturbation space. Our method iteratively refines a given input through gradient-based updates of a combined score-based loss to guide the purification process. Unlike existing optimization-based defenses that treat adversarial noise as generic corruption, our approach explicitly integrates the adversarial landscape into the objective. Experiments performed on CIFAR-10 and CIFAR-100 demonstrate strong robust accuracy against a range of common adversarial attacks. Our work offers a principled testtime defense grounded in probabilistic inference using score-based generative models.

artificial intelligence, diffusion model, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > Austria (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (0.68)
Government (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.87)

Add feedback

Localized Data Shapley: Accelerating Valuation for Nearest Neighbor Algorithms

Neural Information Processing SystemsJun-16-2026, 06:26:27 GMT

Data Shapley values provide a principled approach for quantifying the contribution of individual training examples to machine learning models. However, computing these values often requires computational complexity that is exponential in the data size, and this has led researchers to pursue efficient algorithms tailored to specific machine learning models. Building on the prior success of the Shapley valuation for K-nearest neighbor (KNN) models, in this paper, we introduce a localized data Shapley framework that significantly accelerates the valuation of data points.

artificial intelligence, machine learning, ztest, (19 more...)

Neural Information Processing Systems

Country:

Asia > China (0.46)
North America > United States (0.46)
Europe (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology (0.67)
Leisure & Entertainment > Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Add feedback

ModHiFi: Identifying High Fidelity predictive components for Model Modification

Neural Information Processing SystemsJun-16-2026, 06:09:19 GMT

Open weight models, which are ubiquitous, rarely provide access to their training data or loss function. This makes modifying such models for tasks such as pruning or unlearning, which are constrained by this unavailability, an active area of research. Existing techniques typically require gradients or ground-truth labels, rendering them infeasible in settings with limited computational resources. In this work, we investigate the fundamental question of identifying components that are critical to the model's predictive performance, without access to either gradients or the loss function, and with only distributional access such as synthetic data. We theoretically demonstrate that the global error is linearly bounded by local reconstruction errors for Lipschitz-continuous networks such as CNNs and well-trained Transformers (which, contrary to existing literature, we find exhibit Lipschitz continuity). This motivates using the locally reconstructive behavior of component subsets to quantify their global importance, via a metric that we term Subset Fidelity. In the uncorrelated features setting, selecting individual components based on their Subset Fidelity scores is optimal, which we utilize to propose ModHiFi, an algorithm for model modification that requires neither training data nor access to a loss function. ModHiFi-P, for structured pruning, achieves an 11% speedup over the current state of the art on ImageNet models and competitive performance on language models. ModHiFi-U, for classwise unlearning, achieves complete unlearning on CIFAR-10 without fine-tuning and demonstrates competitive performance on Swin Transformers.2

large language model, machine learning, pruning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > Austria (0.27)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.92)

Industry:

Information Technology > Security & Privacy (0.92)
Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)

Add feedback

Graph Your Own Prompt

Neural Information Processing SystemsJun-16-2026, 06:08:02 GMT

We propose Graph Consistency Regularization (GCR), a novel framework that injects relational graph structures, derived from model predictions, into the learning process to promote class-aware, semantically meaningful feature representations. Functioning as a form of self-prompting, GCR enables the model to refine its internal structure using its own outputs. While deep networks learn rich representations, these often capture noisy inter-class similarities that contradict the model's predicted semantics.

data mining, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Europe (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Transportation > Ground > Road (0.67)
Health & Medicine > Diagnostic Medicine (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)

Add feedback

AI could help win 'race against extinction' of vital plants, say botanists

The GuardianJun-16-2026, 06:00:48 GMT

A botanist at Kew's Madagascar research site scans a plant for digitisation. A botanist at Kew's Madagascar research site scans a plant for digitisation. AI could help win'race against extinction' of vital plants, say botanists Tech is helping to identify and save new specimens and could open'genomic goldmine' of fungi data The rise of AI and digitisation could be a turning point in the "race against extinction" faced by botanists trying to identify and save vital plants before they vanish, according to a major report from Royal Botanic Gardens, Kew. New technology is enabling scientists to track how flowering times have shifted by weeks around the world, rapidly identify new specimens and even get crucial genetic data from 180-year-old fungus specimens, potentially opening a "genomic goldmine". Digitisation and online access to millions of specimens that were until now only accessible in archives is also producing new insights, especially in the global south.

artificial intelligence, fungus, social media, (15 more...)

The Guardian

Country: