AITopics | continuous relaxation

MixAT: Combining Continuous and Discrete Adversarial Training for LLMs

Neural Information Processing Systems | Jun-12-2026, 23:18:54 GMT

Despite recent efforts in Large Language Model (LLM) safety and alignment, current adversarial attacks on frontier LLMs can still consistently force harmful generations. Although adversarial training has been widely studied and shown to significantly improve the robustness of traditional machine learning models, its strengths and weaknesses in the context of LLMs are less understood. Specifically, while existing discrete adversarial attacks are effective at producing harmful content, training LLMs with concrete adversarial prompts is often computationally expensive, leading to reliance on continuous relaxations. At the same time, despite their effectiveness and generalization capabilities, training with continuous perturbations does not always capture the full spectrum of vulnerabilities exploited by discrete attacks. In this work, we aim to bridge this gap by introducing MixAT, a novel method that combines stronger discrete and faster continuous attacks during training. We rigorously evaluate MixAT across a wide spectrum of state-of-the-art attacks, proposing the *At Least One Attack Success Rate* (ALO-ASR) metric to capture the worst-case vulnerability of models. We show MixAT achieves substantially better robustness (ALO-ASR $< 20\%$) compared to prior defenses (ALO-ASR $> 50\%$), while maintaining a runtime comparable to methods based on continuous relaxations. We further analyze MixAT in realistic deployment settings, exploring how chat templates, quantization, low-rank adapters, and temperature affect both adversarial training and evaluation, revealing additional blind spots in current methodologies. Our results demonstrate that MixAT's discrete-continuous defense offers a principled and superior robustness-accuracy tradeoff with minimal computational overhead, highlighting its promise for building safer LLMs.
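As a rough illustration of the ALO-ASR metric named above, the sketch below assumes each attack yields one boolean outcome per prompt (True if the attack elicited a harmful generation) and counts a prompt as broken when at least one attack succeeds on it. The function name `alo_asr`, the data layout, and the toy attack labels are assumptions made for this example, not the authors' evaluation code.

```python
from typing import Dict, List


def alo_asr(results: Dict[str, List[bool]]) -> float:
    """At-Least-One Attack Success Rate over a shared prompt set.

    `results` maps an attack name to per-prompt success flags
    (an assumed layout; every list must cover the same prompts in order).
    """
    outcomes = list(results.values())
    if not outcomes:
        raise ValueError("need at least one attack")
    n_prompts = len(outcomes[0])
    if n_prompts == 0 or any(len(o) != n_prompts for o in outcomes):
        raise ValueError("every attack must cover the same non-empty prompt set")
    # A prompt counts as broken if any single attack succeeded on it.
    broken = sum(any(per_prompt) for per_prompt in zip(*outcomes))
    return broken / n_prompts


# Toy data: three hypothetical attacks evaluated on four prompts.
toy = {
    "attack_a": [True, False, False, False],
    "attack_b": [False, False, True, False],
    "attack_c": [True, False, False, False],
}
print(alo_asr(toy))
```

In this toy data each attack alone succeeds on one prompt in four, yet two of the four prompts are broken by some attack, so the at-least-one rate can never be lower than the best single-attack rate.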

artificial intelligence, large language model, natural language, (11 more...)

Genre: Research Report > New Finding (0.59)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Coupled Gradient Estimators for Discrete Latent Variables

Neural Information Processing Systems | Apr-27-2026, 03:17:05 GMT

Training models with discrete latent variables is challenging due to the high variance of unbiased gradient estimators. While low-variance reparameterization gradients of a continuous relaxation can provide an effective solution, a continuous relaxation is not always available or tractable. Dong et al. (2020) and Yin et al. (2020) introduced a performant estimator that does not rely on continuous relaxations; however, it is limited to binary random variables. We introduce a novel derivation of their estimator based on importance sampling and statistical couplings, which we extend to the categorical setting. Motivated by the construction of a stick-breaking coupling, we introduce gradient estimators based on reparameterizing categorical variables as sequences of binary variables and Rao-Blackwellization. In systematic experiments, we show that our proposed categorical gradient estimators provide state-of-the-art performance, whereas even with additional Rao-Blackwellization, previous estimators (Yin et al., 2019) underperform a simpler REINFORCE with a leave-one-out-baseline estimator (Kool et al., 2019).
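To make the baseline mentioned at the end of the abstract concrete, here is a minimal sketch of REINFORCE with a leave-one-out baseline for a single categorical variable parameterized by logits. It is not the coupled estimator proposed in the paper; the function names, the toy payoff table, and the sample count are assumptions chosen for illustration.

```python
import math
import random
from typing import Callable, List


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_loo_grad(
    logits: List[float],
    f: Callable[[int], float],
    num_samples: int,
    rng: random.Random,
) -> List[float]:
    """Estimate d/d(logits) of E[f(z)] for z ~ Categorical(softmax(logits))."""
    if num_samples < 2:
        raise ValueError("the leave-one-out baseline needs at least two samples")
    probs = softmax(logits)
    n = len(logits)
    samples = rng.choices(range(n), weights=probs, k=num_samples)
    values = [f(z) for z in samples]
    total = sum(values)
    grad = [0.0] * n
    for z, fz in zip(samples, values):
        # Baseline for this sample: mean of f over all the *other* samples.
        baseline = (total - fz) / (num_samples - 1)
        weight = (fz - baseline) / num_samples
        for i in range(n):
            # Score function of a softmax categorical: onehot(z) - probs.
            score_i = (1.0 if i == z else 0.0) - probs[i]
            grad[i] += weight * score_i
    return grad


def exact_grad(logits: List[float], f: Callable[[int], float]) -> List[float]:
    """Closed-form gradient, usable only when the categories can be enumerated."""
    probs = softmax(logits)
    mean = sum(p * f(i) for i, p in enumerate(probs))
    return [p * (f(i) - mean) for i, p in enumerate(probs)]


payoff = [1.0, 2.0, 4.0]  # hypothetical objective values per category
estimate = reinforce_loo_grad([0.0, 0.0, 0.0], lambda z: payoff[z], 20000, random.Random(0))
exact = exact_grad([0.0, 0.0, 0.0], lambda z: payoff[z])
print(estimate)
print(exact)
```

Because each sample's baseline is computed only from the other samples, it is independent of that sample and leaves the estimator unbiased while reducing its variance.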

artificial intelligence, estimator, machine learning, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Recurrent Kernel Networks

Neural Information Processing Systems | Feb-14-2026, 10:29:17 GMT

However, when large amounts of annotated data are available, models that allow end-to-end training such as neural networks are often preferred.

artificial intelligence, kernel, machine learning, (19 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

d827f12e35eae370ba9c65b7f6026695-Paper.pdf

Neural Information Processing Systems | Feb-11-2026, 10:07:08 GMT

algorithm, map inference problem, objective function value, (14 more...)

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Industry: Information Technology (0.48)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)

Coupled Gradient Estimators for Discrete Latent Variables

Neural Information Processing Systems | Feb-11-2026, 05:32:41 GMT

While low-variance reparameterization gradients of a continuous relaxation can provide an effective solution, a continuous relaxation is not always available or tractable.

artificial intelligence, estimator, machine learning, (15 more...)

Country: Asia > Middle East > Jordan (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

ac10ec1ace51b2d973cd87973a98d3ab-Paper.pdf

Neural Information Processing Systems | Feb-9-2026, 19:02:39 GMT

algorithm, dasgupta, yp hc, (15 more...)

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada (0.04)
(3 more...)

Industry:

Information Technology (0.93)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

90c34175923a36ab7a5de4b981c1972f-Paper.pdf

Neural Information Processing Systems | Feb-9-2026, 08:35:25 GMT

discrete distribution, relaxation, reparameterization trick, (12 more...)

Country: North America > Canada (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

A Proofs and Derivations

Neural Information Processing Systems | Feb-9-2026, 05:45:59 GMT

B.7 Additional Details on the Running Time. In this section, we provide additional details on the running time of the algorithms.

artificial intelligence, equation, optimization problem, (17 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.31)

Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks

Neural Information Processing Systems | Feb-8-2026, 15:45:09 GMT

Graph neural networks (GNNs) have emerged as a powerful tool for learning software engineering tasks including code completion, bug finding, and program repair. The IPA-GNN can be seen either as a continuous relaxation of the RNN model or as a GNN variant more tailored to execution.

artificial intelligence, branch decision, machine learning, (19 more...)

Country:

North America > United States (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Latent Template Induction with Gumbel-CRFs

Neural Information Processing Systems | Dec-24-2025, 20:14:20 GMT

Learning to control the structure of sentences is a challenging problem in text generation. Existing work either relies on simple deterministic approaches or RL-based hard structures. We explore the use of structured variational autoencoders to infer latent templates for sentence generation using a soft, continuous relaxation in order to utilize reparameterization for training. Specifically, we propose a Gumbel-CRF, a continuous relaxation of the CRF sampling algorithm using a relaxed Forward-Filtering Backward-Sampling (FFBS) approach. As a reparameterized gradient estimator, the Gumbel-CRF gives more stable gradients than score-function based estimators. As a structured inference network, we show that it learns interpretable templates during training, which allows us to control the decoder during testing. We demonstrate the effectiveness of our methods with experiments on data-to-text generation and unsupervised paraphrase generation.
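The abstract's notion of a soft, continuous relaxation that admits reparameterization can be illustrated on a single categorical draw. The sketch below shows the standard Gumbel-softmax relaxation only; it is not the Gumbel-CRF or its relaxed FFBS procedure, and the function name, logits, and temperatures are assumptions chosen for the example.

```python
import math
import random
from typing import List


def gumbel_softmax_sample(
    logits: List[float], temperature: float, rng: random.Random
) -> List[float]:
    """Draw one relaxed (soft) one-hot sample from a categorical distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    perturbed = []
    for logit in logits:
        # Gumbel(0, 1) noise by inverse transform; clamp u to avoid log(0).
        u = max(rng.random(), 1e-12)
        perturbed.append(logit - math.log(-math.log(u)))
    # Tempered softmax over the perturbed logits (max-subtracted for stability).
    scaled = [x / temperature for x in perturbed]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [1.0, 0.5, -1.0]
# Reusing the seed keeps the noise fixed, so only the temperature differs.
soft = gumbel_softmax_sample(logits, 5.0, random.Random(0))
sharp = gumbel_softmax_sample(logits, 0.5, random.Random(0))
print(soft)
print(sharp)
```

With the noise held fixed, lowering the temperature concentrates mass on the largest perturbed logit, approaching a discrete one-hot sample, while higher temperatures give smoother vectors; the output stays a differentiable function of the logits throughout.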

gumbel-crf, latent template induction, name change, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)
