AITopics

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Enhancing Adversarial Contrastive Learning via Adversarial Invariant Regularization

Neural Information Processing Systems, May-28-2025, 21:17:18 GMT

Adversarial contrastive learning (ACL) is a technique that enhances standard contrastive learning (SCL) by incorporating adversarial data to learn a robust representation that can withstand adversarial attacks and common corruptions without requiring costly annotations. To improve transferability, existing work introduced standard invariant regularization (SIR) to impose a style-independence property on SCL, which removes the impact of nuisance style factors from the standard representation. However, it is unclear how the style-independence property benefits ACL-learned robust representations. In this paper, we leverage the technique of causal reasoning to interpret ACL and propose adversarial invariant regularization (AIR) to enforce independence from style factors. We regularize ACL using both SIR and AIR to output the robust representation. Theoretically, we show that AIR implicitly encourages the representational distance between different views of natural data and their adversarial variants to be independent of style factors. Empirically, our experimental results show that invariant regularization significantly improves the performance of state-of-the-art ACL methods in terms of both standard generalization and robustness on downstream tasks. To the best of our knowledge, we are the first to apply causal reasoning to interpret ACL and develop AIR for enhancing ACL-learned robust representations.

artificial intelligence, machine learning, representation, (17 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

2e255d2d6bf9bb33030246d31f1a79ca-AuthorFeedback.pdf

Neural Information Processing Systems, May-28-2025, 21:16:41 GMT

Reviewer 1: We thank the reviewer for their helpful comments. The reviewer's primary concern is this work's relationship with existing literature. Our paper includes a detailed comparison to Finlay et al. 2020 that addresses the reviewer's concerns about our [...] Our regularizer, like theirs, augments the loss with an extra penalty term. Empirically, we extensively compare to Finlay et al. 2020 in our experiments on [...] As we explain in Section 6.2, we agree that K = 3 with a 3rd order solver shows a marginal improvement over other [...]

Reviewer 2: We will remove the qualifiers on l. 30 and add more detail on l. 47. We will include the detail on l. 64.

artificial intelligence, reviewer, trajectory, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.49)

Detecting Brittle Decisions for Free: Leveraging Margin Consistency in Deep Robust Classifiers

Neural Information Processing Systems, May-28-2025, 21:16:22 GMT

Despite extensive research on adversarial training strategies to improve robustness, the decisions of even the most robust deep learning models can still be quite sensitive to imperceptible perturbations, creating serious risks when deploying them for high-stakes real-world applications. While detecting such cases may be critical, evaluating a model's vulnerability at a per-instance level using adversarial attacks is too computationally intensive and unsuitable for real-time deployment scenarios. The input space margin is the exact score for detecting non-robust samples, but it is intractable for deep neural networks. This paper introduces the concept of margin consistency, a property that links the input space margins and the logit margins in robust models, for efficient detection of vulnerable samples. First, we establish that margin consistency is a necessary and sufficient condition to use a model's logit margin as a score for identifying non-robust samples. Next, through comprehensive empirical analysis of various robustly trained models on the CIFAR10 and CIFAR100 datasets, we show that they exhibit high margin consistency, with a strong correlation between their input space margins and the logit margins. Then, we show that we can effectively and confidently use the logit margin to detect brittle decisions with such models. Finally, we address cases where the model is not sufficiently margin-consistent by learning a pseudo-margin from the feature representation. Our findings highlight the potential of leveraging deep representations to assess adversarial vulnerability in deployment scenarios efficiently.
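As a rough illustration of the score discussed in this abstract, the sketch below computes a logit margin (the gap between the largest and second-largest logit) and flags samples whose margin falls below a cutoff. The function names `logit_margin` and `flag_brittle`, the example logits, and the threshold value are assumptions made for illustration; the abstract does not specify how a threshold is chosen.

```python
def logit_margin(logits):
    """Gap between the largest and second-largest logit of one sample."""
    if len(logits) < 2:
        raise ValueError("need at least two classes to define a margin")
    top, runner_up = sorted(logits, reverse=True)[:2]
    return top - runner_up


def flag_brittle(batch_logits, threshold):
    """Indices of samples whose logit margin is below `threshold`.

    `threshold` is a hypothetical cutoff; in practice it would have to be
    calibrated on held-out data for the model at hand.
    """
    return [i for i, logits in enumerate(batch_logits)
            if logit_margin(logits) < threshold]


# Made-up logits for three samples of a three-class model.
batch = [
    [4.0, 0.5, -1.0],  # large gap between the top two classes
    [1.2, 1.0, 0.3],   # small gap
    [0.1, 2.0, 1.9],   # small gap
]
print(flag_brittle(batch, threshold=0.5))
```

The point of such a score is that it costs one forward pass per sample, whereas estimating the input space margin would require running an adversarial attack.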

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario > Toronto (0.14)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

On conditional diffusion models for PDE simulations

Neural Information Processing Systems, May-28-2025, 21:16:10 GMT

Modelling partial differential equations (PDEs) is of crucial importance in science and engineering, and it includes tasks ranging from forecasting to inverse problems, such as data assimilation. However, most previous numerical and machine learning approaches that target forecasting cannot be applied out-of-the-box for data assimilation. Recently, diffusion models have emerged as a powerful tool for conditional generation, being able to flexibly incorporate observations without retraining. In this work, we perform a comparative study of score-based diffusion models for forecasting and assimilation of sparse observations. In particular, we focus on diffusion models that are either trained in a conditional manner, or conditioned after unconditional training. We address the shortcomings of existing models by proposing 1) an autoregressive sampling approach that significantly improves performance in forecasting, 2) a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths, and 3) a hybrid model which employs flexible pre-training conditioning on initial conditions and flexible post-training conditioning to handle data assimilation. We empirically show that these modifications are crucial for successfully tackling the combination of forecasting and data assimilation, a task commonly encountered in real-world scenarios.

artificial intelligence, machine learning, trajectory, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Energy (1.00)
Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Fast and Regret Optimal Best Arm Identification: Fundamental Limits and Low-Complexity Algorithms

Neural Information Processing Systems, May-28-2025, 21:15:57 GMT

This paper considers a stochastic Multi-Armed Bandit (MAB) problem with dual objectives: (i) quick identification and commitment to the optimal arm, and (ii) reward maximization throughout a sequence of T consecutive rounds. Though each objective has been individually well-studied, i.e., best arm identification for (i) and regret minimization for (ii), the simultaneous realization of both objectives remains an open problem, despite its practical importance. This paper introduces Regret Optimal Best Arm Identification (ROBAI) which aims to achieve these dual objectives.
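To make the two objectives concrete, here is a minimal sketch of the classic explore-then-commit baseline, which samples every arm a fixed number of times, commits to the empirically best arm, and accumulates reward over all T rounds. This is not the ROBAI algorithm, whose details the abstract does not give; the function name, the `pull` callback, and the Bernoulli arm means are assumptions chosen for illustration.

```python
import random


def explore_then_commit(pull, n_arms, horizon, pulls_per_arm):
    """Pull each arm `pulls_per_arm` times, then commit to the best one.

    `pull(arm)` returns one reward for the given arm. Returns the
    committed arm and the total reward collected over `horizon` rounds.
    """
    explore_rounds = n_arms * pulls_per_arm
    if explore_rounds > horizon:
        raise ValueError("exploration budget exceeds the horizon")

    totals = [0.0] * n_arms
    reward = 0.0
    for arm in range(n_arms):
        for _ in range(pulls_per_arm):
            r = pull(arm)
            totals[arm] += r
            reward += r

    # Every arm was pulled equally often, so comparing sums compares means.
    best = max(range(n_arms), key=lambda a: totals[a])

    for _ in range(horizon - explore_rounds):
        reward += pull(best)
    return best, reward


# Hypothetical Bernoulli arms with made-up success probabilities.
rng = random.Random(0)
means = [0.2, 0.8, 0.5]
best_arm, total = explore_then_commit(
    lambda a: 1.0 if rng.random() < means[a] else 0.0,
    n_arms=len(means), horizon=1000, pulls_per_arm=50,
)
print(best_arm, total)
```

The fixed exploration budget shows the tension between the two goals: a longer exploration phase makes the commitment more reliable but spends more rounds on suboptimal arms.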

data mining, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.28)
North America > United States > Michigan (0.14)

Genre: Research Report > New Finding (0.45)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.51)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Data Science > Data Mining > Big Data (0.66)

Structural Pruning for Diffusion Models -- Supplementary Materials -- Gongfan Fang, Xinyin Ma, Xinchao Wang

Neural Information Processing Systems, May-28-2025, 21:09:36 GMT

In this document, we provide supplementary materials that we could not fit into the main manuscript due to the page limit. It includes detailed explanations, visualization results, and several quantitative experiments. This section provides further insights into the coupled structures present in U-Net, which functions as the denoiser in diffusion models. In the context of structural pruning, it is crucial to prune layers with interdependencies simultaneously to avoid any potential structural issues [3]. To address these dependencies within U-Net, we use DepGraph [1], which effectively handles most of the interdependencies.

artificial intelligence, machine learning, pre-trained model, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Structural Pruning for Diffusion Models

Neural Information Processing Systems, May-28-2025, 21:09:34 GMT

Generative modeling has recently undergone remarkable advancements, primarily propelled by the transformative implications of Diffusion Probabilistic Models (DPMs). The impressive capability of these models, however, often entails significant computational overhead during both training and inference. To tackle this challenge, we present Diff-Pruning, an efficient compression method tailored for learning lightweight diffusion models from pre-existing ones, without the need for extensive re-training. The essence of Diff-Pruning is encapsulated in a Taylor expansion over pruned timesteps, a process that disregards non-contributory diffusion steps and ensembles informative gradients to identify important weights. Our empirical assessment, undertaken across several datasets, highlights two primary benefits of our proposed method: 1) Efficiency: it enables approximately a 50% reduction in FLOPs at a mere 10% to 20% of the original training expenditure; 2) Consistency: the pruned diffusion models inherently preserve generative behavior congruent with their pre-trained models.
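The idea of scoring weights with a Taylor expansion over a subset of timesteps can be sketched as follows. This uses a generic first-order Taylor criterion, |w * dL/dw|, with gradients summed over the timesteps that are kept. The rule used here for dropping timesteps (a relative-loss cutoff), the function name, and all numbers are assumptions for illustration and are not taken from the abstract.

```python
def taylor_importance(weights, grads_per_step, losses_per_step, rel_threshold):
    """First-order Taylor importance aggregated over selected timesteps.

    weights:          list of scalar weights
    grads_per_step:   grads_per_step[t][i] is dL_t/dw_i at timestep t
    losses_per_step:  loss value L_t at each timestep
    rel_threshold:    hypothetical cutoff; steps with L_t / max(L) below
                      it are treated as non-contributory and skipped
    """
    max_loss = max(losses_per_step)
    if max_loss <= 0:
        raise ValueError("expected at least one positive loss value")

    kept = [t for t, loss in enumerate(losses_per_step)
            if loss / max_loss >= rel_threshold]

    scores = []
    for i, w in enumerate(weights):
        grad_sum = sum(grads_per_step[t][i] for t in kept)
        scores.append(abs(w * grad_sum))
    return scores, kept


# Toy values: three weights, three timesteps; the last step has a tiny loss.
weights = [1.0, -2.0, 0.5]
grads = [
    [0.1, 0.2, -0.4],
    [0.3, -0.1, 0.4],
    [5.0, 5.0, 5.0],
]
losses = [1.0, 0.8, 0.01]
scores, kept = taylor_importance(weights, grads, losses, rel_threshold=0.05)
print(kept, scores)
```

Weights with the lowest scores would be the first candidates for removal. In a real network the score would be aggregated per structural group (for example per channel) rather than per scalar weight.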

artificial intelligence, diffusion model, machine learning, (12 more...)

Neural Information Processing Systems

Country:

Europe > Germany (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization

Neural Information Processing Systems, May-28-2025, 21:09:22 GMT

Recent research indicates that large language models (LLMs) are susceptible to jailbreaking attacks that can generate harmful content. This paper introduces a novel token-level attack method, Adaptive Dense-to-Sparse Constrained Optimization (ADC), which has been shown to successfully jailbreak multiple open-source LLMs.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

29496c942ed6e08ecc469f4521ebfff0-Paper-Conference.pdf

Neural Information Processing Systems, May-28-2025, 21:09:00 GMT

Weight decay is a broadly used technique for training state-of-the-art deep networks from image classification to large language models. Despite its widespread usage and being extensively studied in the classical literature, its role remains poorly understood for deep learning. In this work, we highlight that the role of weight decay in modern deep learning is different from its regularization effect studied in classical learning theory. For deep networks on vision tasks trained with multipass SGD, we show how weight decay modifies the optimization dynamics enhancing the ever-present implicit regularization of SGD via the loss stabilization mechanism. In contrast, for large language models trained with nearly one-epoch training, we describe how weight decay balances the bias-variance tradeoff in stochastic optimization leading to lower training loss and improved training stability. Overall, we present a unifying perspective from ResNets on vision tasks to LLMs: weight decay is never useful as an explicit regularizer but instead changes the training dynamics in a desirable way. The code is available at https://github.com/
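For reference, the standard form of weight decay in plain SGD adds a term proportional to each weight to its gradient before the update, so every step also shrinks the weights toward zero. The sketch below shows that textbook update only; it does not reproduce the paper's experiments or analysis, and the function name and numeric values are assumptions.

```python
def sgd_step(weights, grads, lr, weight_decay):
    """One SGD update with L2-style weight decay.

    Each weight moves by -lr * (grad + weight_decay * weight), which is
    the gradient step on loss + (weight_decay / 2) * ||w||^2.
    """
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


# With zero gradients, the update reduces to a pure multiplicative shrink
# of each weight by the factor (1 - lr * weight_decay).
w = [1.0, -2.0]
for _ in range(3):
    w = sgd_step(w, grads=[0.0, 0.0], lr=0.1, weight_decay=0.5)
print(w)
```

Because the shrink factor depends on the product of the learning rate and the decay coefficient, the two hyperparameters interact, which is one reason the effect of weight decay shows up in the training dynamics and not only in the final norm of the weights.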

large language model, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: Europe > Switzerland (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology: