AITopics | cpe

Collaborating Authors

cpe

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

6a12d7ebc27cae44623468302c47ad74-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 04:52:32 GMT

cpe, data augmentation, posterior, (12 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.47)

Add feedback

Disentangling the Roles of Curation, Data-Augmentation and the Prior in the Cold Posterior Effect

Neural Information Processing SystemsDec-24-2025, 06:12:06 GMT

The "cold posterior effect" (CPE) in Bayesian deep learning describes the disturbing observation that the predictive performance of Bayesian neural networks can be significantly improved if the Bayes posterior is artificially sharpened using a temperature parameter T <1. The CPE is problematic in theory and practice and since the effect was identified many researchers have proposed hypotheses to explain the phenomenon. However, despite this intensive research effort the effect remains poorly understood. In this work we provide novel and nuanced evidence relevant to existing explanations for the cold posterior effect, disentangling three hypotheses: 1. The dataset curation hypothesis of Aitchison (2020): we show empirically that the CPE does not arise in a real curated data set but can be produced in a controlled experiment with varying curation strength.

disentangling, hypothesis, name change, (8 more...)

Neural Information Processing Systems

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)

Add feedback

Amortized Active Generation of Pareto Sets

Steinberg, Daniel M., Wijesinghe, Asiri, Oliveira, Rafael, Koniusz, Piotr, Ong, Cheng Soon, Bonilla, Edwin V.

arXiv.org Machine LearningNov-4-2025

We introduce active generation of Pareto sets (A-GPS), a new framework for online discrete black-box multi-objective optimization (MOO). A-GPS learns a generative model of the Pareto set that supports a-posteriori conditioning on user preferences. The method employs a class probability estimator (CPE) to predict non-dominance relations and to condition the generative model toward high-performing regions of the search space. We also show that this non-dominance CPE implicitly estimates the probability of hypervolume improvement (PHVI). To incorporate subjective trade-offs, A-GPS introduces preference direction vectors that encode user-specified preferences in objective space. At each iteration, the model is updated using both Pareto membership and alignment with these preference directions, producing an amortized generative model capable of sampling across the Pareto front without retraining. The result is a simple yet powerful approach that achieves high-quality Pareto set approximations, avoids explicit hypervolume computation, and flexibly captures user preferences. Empirical results on synthetic benchmarks and protein design tasks demonstrate strong sample efficiency and effective preference incorporation.

artificial intelligence, machine learning, optimization problem, (20 more...)

arXiv.org Machine Learning

2510.21052

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Oceania > Australia > New South Wales (0.04)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Disentangling the Roles of Curation, Data-Augmentation and the Prior in the Cold Posterior Effect Lorenzo Noci Dept of Computer Science ETH Zürich lorenzo.noci@inf.ethz.ch Kevin Roth

Neural Information Processing SystemsOct-9-2025, 15:49:25 GMT

The " cold posterior effect " (CPE) in Bayesian deep learning describes the uncom-forting observation that the predictive performance of Bayesian neural networks

cpe, data augmentation, posterior, (12 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.41)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.47)

Add feedback

Centralized Permutation Equivariant Policy for Cooperative Multi-Agent Reinforcement Learning

Xu, Zhuofan, Bollig, Benedikt, Függer, Matthias, Nowak, Thomas, Dréau, Vincent Le

arXiv.org Artificial IntelligenceAug-19-2025

The Centralized Training with Decentralized Execution (CTDE) paradigm has gained significant attention in multi-agent reinforcement learning (MARL) and is the foundation of many recent algorithms. However, decentralized policies operate under partial observability and often yield suboptimal performance compared to centralized policies, while fully centralized approaches typically face scalability challenges as the number of agents increases. We propose Centralized Permutation Equivariant (CPE) learning, a centralized training and execution framework that employs a fully centralized policy to overcome these limitations. Our approach leverages a novel permutation equivariant architecture, Global-Local Permutation Equivariant (GLPE) networks, that is lightweight, scalable, and easy to implement. Experiments show that CPE integrates seamlessly with both value decomposition and actor-critic methods, substantially improving the performance of standard CTDE algorithms across cooperative benchmarks including MPE, SMAC, and RWARE, and matching the performance of state-of-the-art RWARE implementations.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2508.11706

Country:

North America > United States (1.00)
Europe (0.68)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer

Yining Ma, Jingwen Li, Zhiguang Cao, Wen Song, Le Zhang, Zhenghua Chen, Jing Tang

Neural Information Processing SystemsAug-14-2025, 17:12:38 GMT

correlation, dact, transformer, (12 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.46)

Industry: Transportation (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)

Add feedback

Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Gate

Lee, Byung Hyun, Lim, Sungjin, Lee, Seunggyu, Kang, Dong Un, Chun, Se Young

arXiv.org Artificial IntelligenceJul-1-2025

Remarkable progress in text-to-image diffusion models has brought a major concern about potentially generating images on inappropriate or trademarked concepts. Concept erasing has been investigated with the goals of deleting target concepts in diffusion models while preserving other concepts with minimal distortion. To achieve these goals, recent concept erasing methods usually fine-tune the cross-attention layers of diffusion models. In this work, we first show that merely updating the cross-attention layers in diffusion models, which is mathematically equivalent to adding \emph{linear} modules to weights, may not be able to preserve diverse remaining concepts. Then, we propose a novel framework, dubbed Concept Pinpoint Eraser (CPE), by adding \emph{nonlinear} Residual Attention Gates (ResAGs) that selectively erase (or cut) target concepts while safeguarding remaining concepts from broad distributions by employing an attention anchoring loss to prevent the forgetting. Moreover, we adversarially train CPE with ResAG and learnable text embeddings in an iterative manner to maximize erasing performance and enhance robustness against adversarial attacks. Extensive experiments on the erasure of celebrities, artistic styles, and explicit contents demonstrated that the proposed CPE outperforms prior arts by keeping diverse remaining concepts while deleting the target concepts with robustness against attack prompts. Code is available at https://github.com/Hyun1A/CPE

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2506.22806

Country:

North America > United States (0.92)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report > New Finding (0.45)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Government > Regional Government > North America Government > United States Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

CoE: Chain-of-Explanation via Automatic Visual Concept Circuit Description and Polysemanticity Quantification

Yu, Wenlong, Wang, Qilong, Liu, Chuang, Li, Dong, Hu, Qinghua

arXiv.org Artificial IntelligenceMar-19-2025

Explainability is a critical factor influencing the wide deployment of deep vision models (DVMs). Concept-based post-hoc explanation methods can provide both global and local insights into model decisions. However, current methods in this field face challenges in that they are inflexible to automatically construct accurate and sufficient linguistic explanations for global concepts and local circuits. Particularly, the intrinsic polysemanticity in semantic Visual Concepts (VCs) impedes the interpretability of concepts and DVMs, which is underestimated severely. In this paper, we propose a Chain-of-Explanation (CoE) approach to address these issues. Specifically, CoE automates the decoding and description of VCs to construct global concept explanation datasets. Further, to alleviate the effect of polysemanticity on model explainability, we design a concept polysemanticity disentanglement and filtering mechanism to distinguish the most contextually relevant concept atoms. Besides, a Concept Polysemanticity Entropy (CPE), as a measure of model interpretability, is formulated to quantify the degree of concept uncertainty. The modeling of deterministic concepts is upgraded to uncertain concept atom distributions. Finally, CoE automatically enables linguistic local explanations of the decision-making process of DVMs by tracing the concept circuit. GPT-4o and human-based experiments demonstrate the effectiveness of CPE and the superiority of CoE, achieving an average absolute improvement of 36% in terms of explainability scores.

large language model, machine learning, relevance value, (20 more...)

arXiv.org Artificial Intelligence

2503.15234

Country:

North America > United States (0.14)
Asia > China > Tianjin Province > Tianjin (0.04)
Africa > Rwanda > Kigali > Kigali (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
(2 more...)

Add feedback

On Cold Posteriors of Probabilistic Neural Networks: Understanding the Cold Posterior Effect and A New Way to Learn Cold Posteriors with Tight Generalization Guarantees

Zhang, Yijie

arXiv.org Machine LearningOct-20-2024

Bayesian inference provides a principled probabilistic framework for quantifying uncertainty by updating beliefs based on prior knowledge and observed data through Bayes' theorem. In Bayesian deep learning, neural network weights are treated as random variables with prior distributions, allowing for a probabilistic interpretation and quantification of predictive uncertainty. However, Bayesian methods lack theoretical generalization guarantees for unseen data. PAC-Bayesian analysis addresses this limitation by offering a frequentist framework to derive generalization bounds for randomized predictors, thereby certifying the reliability of Bayesian methods in machine learning. Temperature $T$, or inverse-temperature $\lambda = \frac{1}{T}$, originally from statistical mechanics in physics, naturally arises in various areas of statistical inference, including Bayesian inference and PAC-Bayesian analysis. In Bayesian inference, when $T < 1$ (``cold'' posteriors), the likelihood is up-weighted, resulting in a sharper posterior distribution. Conversely, when $T > 1$ (``warm'' posteriors), the likelihood is down-weighted, leading to a more diffuse posterior distribution. By balancing the influence of observed data and prior regularization, temperature adjustments can address issues of underfitting or overfitting in Bayesian models, bringing improved predictive performance.

artificial intelligence, machine learning, posterior, (15 more...)

arXiv.org Machine Learning

2410.1531

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Disentangling the Roles of Curation, Data-Augmentation and the Prior in the Cold Posterior Effect

Neural Information Processing SystemsOct-10-2024, 23:13:50 GMT

The "cold posterior effect" (CPE) in Bayesian deep learning describes the disturbing observation that the predictive performance of Bayesian neural networks can be significantly improved if the Bayes posterior is artificially sharpened using a temperature parameter T 1. The CPE is problematic in theory and practice and since the effect was identified many researchers have proposed hypotheses to explain the phenomenon. However, despite this intensive research effort the effect remains poorly understood. In this work we provide novel and nuanced evidence relevant to existing explanations for the cold posterior effect, disentangling three hypotheses: 1. The dataset curation hypothesis of Aitchison (2020): we show empirically that the CPE does not arise in a real curated data set but can be produced in a controlled experiment with varying curation strength. Our results demonstrate how the CPE can arise in isolation from synthetic curation, data augmentation, and bad priors.

cold posterior effect, disentangling, hypothesis, (4 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)

Add feedback