A Novel Approach for Effective Multi-View Clustering with Information-Theoretic Perspective
Multi-view clustering (MVC) is a popular technique for improving clustering performance by exploiting multiple data sources. However, existing methods primarily focus on acquiring consistent information while often neglecting redundancy across views. This study presents a new approach called Sufficient Multi-View Clustering (SUMVC) that examines the multi-view clustering framework from an information-theoretic standpoint. Our proposed method consists of two parts. First, we develop a simple and reliable multi-view clustering method, SCMVC (simple consistent multi-view clustering), that employs variational analysis to generate consistent information. Second, we propose a sufficient representation lower bound that enhances consistent information and minimizes unnecessary information among views. The proposed SUMVC method offers a promising solution to the problem of multi-view clustering and provides a new perspective for analyzing multi-view data. To verify the effectiveness of our model, we conduct a theoretical analysis based on the Bayes error rate, and experiments on multiple multi-view datasets demonstrate the superior performance of SUMVC.
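The "consistent information" across views that this abstract refers to can be made concrete with a toy computation: cluster each view separately and measure the empirical mutual information between the two cluster assignments. This is only an illustrative sketch of the underlying quantity, not the SUMVC algorithm itself; the function name is ours.

```python
from collections import Counter
from math import log

def mutual_information(labels_a, labels_b):
    """Empirical mutual information (in nats) between two label sequences."""
    n = len(labels_a)
    pa = Counter(labels_a)
    pb = Counter(labels_b)
    pab = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), c in pab.items():
        p_ab = c / n
        # p_ab * log( p_ab / (p_a * p_b) ), with counts folded in.
        mi += p_ab * log(p_ab * n * n / (pa[a] * pb[b]))
    return mi

# Two views that induce the same partition (up to relabeling) share
# maximal information, i.e. the full entropy of the partition.
view1 = [0, 0, 1, 1, 2, 2]
view2 = [1, 1, 0, 0, 2, 2]   # same partition, labels permuted
print(mutual_information(view1, view2))  # = log(3) ≈ 1.0986
```

High mutual information between per-view cluster assignments indicates consistent information; redundancy reduction, as in SUMVC, additionally discards view-specific information not shared by the others.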
Fairness Overfitting in Machine Learning: An Information-Theoretic Perspective
Laakom, Firas, Chen, Haobo, Schmidhuber, Jürgen, Bu, Yuheng
Despite substantial progress in promoting fairness in high-stakes applications of machine learning models, existing methods often modify the training process, for example through regularizers or other interventions, but lack formal guarantees that fairness achieved during training will generalize to unseen data. Although overfitting with respect to prediction performance has been extensively studied, overfitting in terms of fairness loss has received far less attention. This paper proposes a theoretical framework for analyzing fairness generalization error through an information-theoretic lens. Our novel bounding technique is based on the Efron-Stein inequality, which allows us to derive tight information-theoretic fairness generalization bounds with both Mutual Information (MI) and Conditional Mutual Information (CMI). Our empirical results validate the tightness and practical relevance of these bounds across diverse fairness-aware learning algorithms. Our framework offers valuable insights to guide the design of algorithms improving fairness generalization.
- North America > United States > Florida > Alachua County > Gainesville (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Bristol (0.04)
- (2 more...)
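The fairness generalization error that this abstract bounds can be illustrated with a toy empirical estimate: evaluate a fairness loss, such as the demographic-parity difference, on training data and on held-out data, and take the difference. The fairness metric is a standard one, but the function names and toy numbers here are ours, not from the paper.

```python
def demographic_parity_gap(preds, groups):
    """|P(yhat=1 | group=0) - P(yhat=1 | group=1)| over a sample of
    binary predictions `preds` with binary group labels `groups`."""
    rates = {}
    for g in (0, 1):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        rates[g] = sum(preds[i] for i in idx) / len(idx)
    return abs(rates[0] - rates[1])

# Fairness overfitting: the model looks fair on the data it was tuned on...
train_gap = demographic_parity_gap([1, 0, 1, 0], [0, 0, 1, 1])  # 0.0
# ...but less so on unseen data.
test_gap = demographic_parity_gap([1, 1, 1, 0], [0, 0, 1, 1])   # 0.5
fairness_generalization_error = test_gap - train_gap
print(fairness_generalization_error)  # 0.5
```

The paper's MI/CMI bounds upper-bound the expectation of exactly this kind of train-test discrepancy in fairness loss.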
Information-Theoretic Perspectives on Optimizers
The interplay between optimizers and architectures in neural networks is complicated, and it is hard to understand why some optimizers work better on certain architectures. In this paper, we find that the traditionally used sharpness metric does not fully explain this intricate interplay, and we introduce an information-theoretic metric called the entropy gap to aid the analysis. We find that both sharpness and the entropy gap affect performance, including optimization dynamics and generalization. We further use information-theoretic tools to understand the recently proposed Lion optimizer and find ways to improve it.
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization
Shwartz-Ziv, Ravid, Balestriero, Randall, Kawaguchi, Kenji, Rudner, Tim G. J., LeCun, Yann
In this paper, we provide an information-theoretic perspective on Variance-Invariance-Covariance Regularization (VICReg) for self-supervised learning. To do so, we first demonstrate how information-theoretic quantities can be obtained for deterministic networks, as an alternative to the commonly used but unrealistic assumption of stochastic networks. Next, we relate the VICReg objective to mutual information maximization and use it to highlight the underlying assumptions of the objective. Based on this relationship, we derive a generalization bound for VICReg, providing generalization guarantees for downstream supervised learning tasks, and we present new self-supervised learning methods, derived from a mutual information maximization objective, that outperform existing methods.
- North America > United States > New York (0.04)
- North America > United States > Massachusetts (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Singapore (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
An information-theoretic perspective on intrinsic motivation in reinforcement learning: a survey
Aubret, Arthur, Matignon, Laetitia, Hassas, Salima
Traditionally, an agent maximizes a reward defined according to the task to perform: it may be a score when the agent learns to solve a game, or a distance function when the agent learns to reach a goal. The reward is then considered extrinsic (or a feedback) because the reward function is provided expertly and specifically for the task. With an extrinsic reward, many spectacular results have been obtained on Atari games [Bellemare et al. 2015] with the Deep Q-Network (DQN) [Mnih et al. 2015] through the integration of deep learning into RL, leading to deep reinforcement learning (DRL). However, despite recent improvements, DRL approaches usually fail when rewards are sparse in the environment, as the agent is then unable to learn the desired behavior for the targeted task [Francois-Lavet et al. 2018]. Moreover, the behaviors learned by the agent are hardly reusable, both within the same task and across many different tasks [Francois-Lavet et al. 2018]. It is difficult for an agent to generalize learned skills into high-level decisions in the environment. For example, such a skill could be to go to the door using primitive actions consisting of moves in the four cardinal directions, or to move forward by controlling the joints of a humanoid robot, as in the MuJoCo robotic simulator [Todorov et al. 2012]. In contrast, developmental learning [Cangelosi and Schlesinger 2018; Oudeyer and Smith 2016; Piaget and Cook 1952] builds on the observation that babies, and more broadly organisms, acquire new skills while spontaneously exploring their environment [Barto 2013; Gopnik et al. 1999].
- Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
- North America > United States > New York (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (9 more...)
- Research Report (1.00)
- Overview (1.00)
- Education (1.00)
- Leisure & Entertainment > Games > Computer Games (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
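The sparse-reward problem described in this survey excerpt is what intrinsic motivation addresses: the agent rewards itself for novelty. One classic information-theoretic flavor is a count-based exploration bonus, r_int(s) = 1/sqrt(N(s)). This is a generic textbook sketch of one family the survey covers, not a specific method proposed in it.

```python
from collections import defaultdict
from math import sqrt

class CountBonus:
    """Count-based exploration bonus: rarely visited states earn more
    intrinsic reward, so the agent is pushed toward novel states even
    when the extrinsic reward is sparse or absent."""

    def __init__(self):
        self.visits = defaultdict(int)

    def intrinsic_reward(self, state):
        self.visits[state] += 1
        return 1.0 / sqrt(self.visits[state])

bonus = CountBonus()
print(bonus.intrinsic_reward("door"))  # 1.0    (first visit: maximally novel)
print(bonus.intrinsic_reward("door"))  # ~0.707 (novelty decays with revisits)
print(bonus.intrinsic_reward("hall"))  # 1.0    (a new state is novel again)
```

In practice this bonus is added to the extrinsic reward, so the agent still optimizes the task while being steered toward unexplored regions of the state space.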
An Information-Theoretic Perspective on Overfitting and Underfitting
Bashir, Daniel, Montanez, George D., Sehra, Sonia, Segura, Pedro Sandoval, Lauw, Julius
We present an information-theoretic framework for understanding overfitting and underfitting in machine learning and prove the formal undecidability of determining whether an arbitrary classification algorithm will overfit a dataset. Measuring algorithm capacity via the information transferred from datasets to models, we consider mismatches between algorithm capacities and datasets to provide a signature for when a model can overfit or underfit a dataset. We present results upper-bounding algorithm capacity, establish its relationship to quantities in the algorithmic search framework for machine learning, and relate our work to recent information-theoretic approaches to generalization.
Information-Theoretic Perspective of Federated Learning
Adilova, Linara, Rosenzweig, Julia, Kamp, Michael
An approach to distributed machine learning is to train models on local datasets and aggregate these models into a single, stronger model. A popular instance of this form of parallelization is federated learning, where the nodes periodically send their local models to a coordinator that aggregates them and redistributes the aggregation back to continue training with it. The most frequently used form of aggregation is averaging the model parameters, e.g., the weights of a neural network. However, due to the non-convexity of the loss surface of neural networks, averaging can lead to detrimental effects and it remains an open question under which conditions averaging is beneficial. In this paper, we study this problem from the perspective of information theory: We measure the mutual information between representation and inputs as well as representation and labels in local models and compare it to the respective information contained in the representation of the averaged model. Our empirical results confirm previous observations about the practical usefulness of averaging for neural networks, even if local dataset distributions vary strongly. Furthermore, we obtain more insights about the impact of the aggregation frequency on the information flow and thus on the success of distributed learning. These insights will be helpful both in improving the current synchronization process and in further understanding the effects of model aggregation.
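The parameter averaging this abstract studies can be sketched in a few lines: each node trains locally, and the coordinator averages the weights elementwise and redistributes the result. This is a minimal sketch over plain lists of NumPy arrays; real federated systems additionally weight by local dataset size, handle stragglers, and may use secure aggregation.

```python
import numpy as np

def federated_average(local_models):
    """Elementwise average of per-layer weight arrays from each node.

    `local_models` is a list of models, each a list of layer arrays
    with matching shapes across nodes."""
    return [np.mean(np.stack(layers), axis=0) for layers in zip(*local_models)]

# Two nodes, each holding a tiny two-layer model (weight matrix + bias).
node_a = [np.array([[1.0, 2.0]]), np.array([0.0])]
node_b = [np.array([[3.0, 4.0]]), np.array([2.0])]
global_model = federated_average([node_a, node_b])
print(global_model[0])  # [[2. 3.]]
print(global_model[1])  # [1.]
```

The paper's question is precisely when this elementwise average preserves the mutual information between representations and inputs/labels that the local models had accumulated, given the non-convex loss surface.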
Learning to Explain: An Information-Theoretic Perspective on Model Interpretation
Chen, Jianbo, Song, Le, Wainwright, Martin J., Jordan, Michael I.
Interpretability is an extremely important criterion when a machine learning model is applied in areas such as medicine, financial markets, and criminal justice (e.g., see the discussion paper by Lipton [18], as well as references therein). Many complex models, such as random forests, kernel methods, and deep neural networks, have been developed and employed to optimize prediction accuracy, which can compromise their ease of interpretation. In this paper, we focus on instancewise feature selection as a specific approach to model interpretation. Given a machine learning model, instancewise feature selection asks for the importance score of each feature on the prediction of a given instance, and the relative importance of each feature is allowed to vary across instances. Thus, the importance scores can act as an explanation for the specific instance, indicating which features are key for the model to make its prediction on that instance.
- Asia > Middle East > Jordan (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
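Instancewise feature selection, as described in the abstract above, assigns per-instance importance scores. A generic occlusion-style sketch conveys the idea: mask each feature in turn and score it by the change in the model's output. Note this is an illustrative baseline of our own, not the paper's mutual-information-based explainer.

```python
def instancewise_scores(model, x, baseline=0.0):
    """Score feature i of instance x by |f(x) - f(x with feature i masked)|.

    Scores are specific to this instance: the same feature can matter
    for one input and be irrelevant for another."""
    base_pred = model(x)
    scores = []
    for i in range(len(x)):
        masked = list(x)
        masked[i] = baseline
        scores.append(abs(base_pred - model(masked)))
    return scores

# Toy model: a fixed linear function of three features.
model = lambda x: 2.0 * x[0] - 1.0 * x[1] + 0.0 * x[2]
print(instancewise_scores(model, [1.0, 1.0, 5.0]))  # [2.0, 1.0, 0.0]
```

The paper instead learns an explainer that selects, per instance, the feature subset maximizing mutual information with the model's prediction, which avoids the cost of re-evaluating the model once per feature at explanation time.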