AITopics | Europe

Collaborating Authors

Europe

GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs

Neural Information Processing SystemsJun-22-2026, 22:11:44 GMT

LLMs have shown impressive capabilities across various natural language processing tasks, yet remain vulnerable to input prompts, known as jailbreak attacks, carefully designed to bypass safety guardrails and elicit harmful responses. Traditional methods rely on manual heuristics but suffer from limited generalizability. Despite being automatic, optimization-based attacks often produce unnatural prompts that can be easily detected by safety filters or require high computational costs due to discrete token optimization. In this paper, we introduce Generative Adversarial Suffix Prompter (GASP), a novel automated framework that can efficiently generate human-readable jailbreak prompts in a fully black-box setting. In particular, GASP leverages latent Bayesian optimization to craft adversarial suffixes by efficiently exploring continuous latent embedding spaces, gradually optimizing the suffix prompter to improve attack efficacy while balancing prompt coherence via a targeted iterative refinement procedure. Through comprehensive experiments, we show that GASP can produce natural adversarial prompts, significantly improving jailbreak success over baselines, reducing training times, and accelerating inference speed, thus making it an efficient and scalable solution for red-teaming LLMs. Warning: This paper contains text and examples that may be considered offensive or harmful.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Europe > Austria (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)
Instructional Material (0.67)
Research Report > New Finding (0.67)

Industry:

Transportation (1.00)
Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Mask Image Watermarking

Neural Information Processing SystemsJun-22-2026, 22:11:14 GMT

MaskWM has two variants: (1) MaskWM-D, which supports global watermark embedding, watermark localization, and local watermark extraction for applications such as tamper detection; (2) MaskWM-ED, which focuses on local watermark embedding and extraction, offering enhanced robustness in small regions to support fine-grined image protection. MaskWM-D builds on the classical encoder-distortion layer-decoder training paradigm. In MaskWM-D, we introduce a simple masking mechanism during the decoding stage that enables both global and local watermark extraction. During training, the decoder is guided by various types of masks applied to watermarked images before extraction, helping it learn to localize watermarks and extract them from the corresponding local areas. MaskWM-ED extends this design by incorporating the mask into the encoding stage as well, guiding the encoder to embed the watermark in designated local regions, which improves robustness under regional attacks.

artificial intelligence, machine learning, watermark, (18 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Pay Attention to Small Weights

Neural Information Processing SystemsJun-22-2026, 22:04:24 GMT

Finetuning large pretrained neural networks is known to be resource-intensive, both in terms of memory and computational cost. To mitigate this, a common approach is to restrict training to a subset of the model parameters. By analyzing the relationship between gradients and weights during finetuning, we observe a notable pattern: large gradients are often associated with small-magnitude weights. This correlation is more pronounced in finetuning settings than in training from scratch. Motivated by this observation, we propose NANOADAM, which dynamically updates only the small-magnitude weights during finetuning and offers several practical advantages: first, the criterion is gradient-free--the parameter subset can be determined without gradient computation; second, it preserves large-magnitude weights, which are likely to encode critical features learned during pretraining, thereby reducing the risk of catastrophic forgetting; thirdly, it permits the use of larger learning rates and consistently leads to better generalization performance in experiments. We demonstrate this for both NLP and vision tasks.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)
Overview (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
(2 more...)

Add feedback

On the Value of Cross-Modal Misalignment in Multimodal Representation Learning

Neural Information Processing SystemsJun-22-2026, 22:02:59 GMT

Multimodal representation learning, exemplified by multimodal contrastive learning (MMCL) using image-text pairs, aims to learn powerful representations by aligning cues across modalities. This approach relies on the core assumption that the exemplar image-text pairs constitute two representations of an identical concept. However, recent research has revealed that real-world datasets often exhibit cross-modal misalignment. There are two distinct viewpoints on how to address this issue: one suggests mitigating the misalignment, and the other leveraging it. We seek here to reconcile these seemingly opposing perspectives, and to provide a practical guide for practitioners. Using latent variable models we thus formalize cross-modal misalignment by introducing two specific mechanisms: Selection bias, where some semantic variables are absent in the text, and perturbation bias, where semantic variables are altered--both leading to misalignment in data pairs. Our theoretical analysis demonstrates that, under mild assumptions, the representations learned by MMCL capture exactly the information related to the subset of the semantic variables invariant to selection and perturbation biases. This provides a unified perspective for understanding misalignment. Based on this, we further offer actionable insights into how misalignment should inform the design of real-world ML systems.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.45)
Oceania > Australia (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)
Overview (0.67)

Industry: Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

Efficient Kernelized Learning in Polyhedral Games beyond Full Information: From Colonel Blotto to Congestion Games

Neural Information Processing SystemsJun-22-2026, 22:02:51 GMT

We examine the problem of efficiently learning coarse correlated equilibria (CCE) in polyhedral games, that is, normal-form games with an exponentially large number of actions per player and an underlying combinatorial structure--such as the classic Colonel Blotto game or congestion games. Achieving computational efficiency in this setting requires learning algorithms whose regret and per-iteration complexity scale at most polylogarithmically with the size of the players' action sets. This challenge has recently been addressed in the full-information setting, primarily through the use of kernelization; however, in the more realistic partial information setting, the situation is much more challenging, and existing approaches result in suboptimal and impractical runtime complexity to learn CCE. We address this gap via a novel kernelization-based framework for payoff-based learning in polyhedral games, which we then apply to certain key classes of polyhedral games--namely Colonel Blotto, graphic matroid and network congestion games. In so doing, we obtain a range of computationally efficient payoff-based learning algorithms which significantly improve upon prior work in terms of the runtime for learning CCE.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > California (0.27)

Genre: Research Report > Experimental Study (1.00)

Industry:

Government (0.94)
Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

ALIGNVLM: Bridging Vision and Language Latent Spaces for Multimodal Document Understanding

Neural Information Processing SystemsJun-22-2026, 22:02:13 GMT

Aligning visual features with language embeddings is a key challenge in visionlanguage models (VLMs). The performance of such models hinges on having a good connector that maps visual features generated by a vision encoder to a shared embedding space with the LLM while preserving semantic similarity. Existing connectors, such as multilayer perceptrons (MLPs), lack inductive bias to constrain visual features within the linguistic structure of the LLM's embedding space, making them data-hungry and prone to cross-modal misalignment. In this work, we propose a novel vision-text alignment method, ALIGNVLM, that maps visual features to a weighted average of LLM text embeddings. Our approach leverages the linguistic priors encoded by the LLM to ensure that visual features are mapped to regions of the space that the LLM can effectively interpret. ALIGNVLM is particularly effective for document understanding tasks, where visual and textual modalities are highly correlated. Our extensive experiments show that ALIGNVLM achieves state-of-the-art performance compared to prior alignment methods, with larger gains on document understanding tasks and under low-resource setups. We provide further analysis demonstrating its efficiency and robustness to noise.

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country:

Europe (0.28)
North America > Canada (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

Steering Generative Models with Experimental Data for Protein Fitness Optimization

Neural Information Processing SystemsJun-22-2026, 21:58:30 GMT

Protein fitness optimization involves finding a protein sequence that maximizes desired quantitative properties in a combinatorially large design space of possible sequences. Recent advances in steering protein generative models (e.g., diffusion models and language models) with labeled data offer a promising approach. However, most previous studies have optimized surrogate rewards and/or utilized large amounts of labeled data for steering, making it unclear how well existing methods perform and compare to each other in real-world optimization campaigns where fitness is measured through low-throughput wet-lab assays. In this study, we explore fitness optimization using small amounts (hundreds) of labeled sequencefitness pairs and comprehensively evaluate strategies such as classifier guidance and posterior sampling for guiding generation from different discrete diffusion models of protein sequences. We also demonstrate how guidance can be integrated into adaptive sequence selection akin to Thompson sampling in Bayesian optimization, showing that plug-and-play guidance strategies offer advantages over alternatives such as reinforcement learning with protein language models. Overall, we provide practical insights into how to effectively steer modern generative models for next-generation protein fitness optimization.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (0.93)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.66)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Add feedback

Class-wise Balancing Data Replay for Federated Class-Incremental Learning

Neural Information Processing SystemsJun-22-2026, 21:58:03 GMT

Federated Class Incremental Learning (FCIL) aims to collaboratively process continuously increasing incoming tasks across multiple clients. Among various approaches, data replay has become a promising solution, which can alleviate forgetting by reintroducing representative samples from previous tasks. However, their performance is typically limited by class imbalance, both within the replay buffer due to limited global awareness and between replayed and newly arrived classes. To address this issue, we propose a class-wise balancing data replay method for FCIL (FedCBDR), which employs a global coordination mechanism for class-level memory construction and reweights the learning objective to alleviate the aforementioned imbalances. Specifically, FedCBDRhas two key components: 1) the global-perspective data replay module reconstructs global representations of prior task in a privacy-preserving manner, which then guides a class-aware and importance-sensitive sampling strategy to achieve balanced replay; 2) Subsequently, to handle class imbalance across tasks, the task-aware temperature scaling module adaptively adjusts the temperature of logits at both class and instance levels based on task dynamics, which reduces the model's overconfidence in majority classes while enhancing its sensitivity to minority classes. Experimental results verified that FedCBDR achieves balanced class-wise sampling under heterogeneous data distributions and improves generalization under task imbalance between earlier and recent tasks, yielding a 2%-15% Top-1 accuracy improvement over six stateof-the-art methods.

artificial intelligence, learning, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.28)
Asia (0.28)
Europe > Austria (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Credal Prediction based on Relative Likelihood

Neural Information Processing SystemsJun-22-2026, 21:57:06 GMT

Predictions in the form of sets of probability distributions, so-called credal sets, provide a suitable means to represent a learner's epistemic uncertainty. In this paper, we propose a theoretically grounded approach to credal prediction based on the statistical notion of relative likelihood: The target of prediction is the set of all (conditional) probability distributions produced by the collection of plausible models, namely those models whose relative likelihood exceeds a specified threshold. This threshold has an intuitive interpretation and allows for controlling the trade-off between correctness and precision of credal predictions. We tackle the problem of approximating credal sets defined in this way by means of suitably modified ensemble learning techniques. To validate our approach, we illustrate its effectiveness by experiments on benchmark datasets demonstrating superior uncertainty representation without compromising predictive performance. We also compare our method against several state-of-the-art baselines in credal prediction.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > New York (0.28)
North America > Canada > Alberta (0.28)
North America > United States > California (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

SONAR: Long-Range Graph Propagation Through Information Waves

Neural Information Processing SystemsJun-22-2026, 21:56:28 GMT

Capturing effective long-range information propagation remains a fundamental yet challenging problem in graph representation learning. Motivated by this, we introduce SONAR, a novel GNN architecture inspired by the dynamics of wave propagation in continuous media. SONAR models information flow on graphs as oscillations governed by the wave equation, allowing it to maintain effective propagation dynamics over long distances. By integrating adaptive edge resistances and state-dependent external forces, our method balances conservative and nonconservative behaviors, improving the ability to learn more complex dynamics. We provide a rigorous theoretical analysis of SONAR's energy conservation and information propagation properties, demonstrating its capacity to address the long-range propagation problem. Extensive experiments on synthetic and realworld benchmarks confirm that SONAR achieves state-of-the-art performance, particularly on tasks requiring long-range information exchange.

data mining, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: