AITopics | mi estimate

Collaborating Authors

mi estimate

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Mutual information and task-relevant latent dimensionality

Gulati, Paarth, Abdelaleem, Eslam, Sederberg, Audrey, Nemenman, Ilya

arXiv.org Machine LearningFeb-10-2026

Estimating the dimensionality of the latent representation needed for prediction-- the task-relevant dimension--is a difficult, largely unsolved problem with broad scientific applications. We cast it as an Information Bottleneck question: what embedding bottleneck dimension is sufficient to compress predictor and predicted views while preserving their mutual information (MI). We show that standard neural estimators with separable/bilinear critics systematically inflate the inferred dimension, and we address this by introducing a hybrid critic that retains an explicit dimensional bottleneck while allowing flexible nonlinear cross-view interactions, thereby preserving the latent geometry. We further propose a one-shot protocol that reads off the effective dimension from a single over-parameterized hybrid model, without sweeping over bottleneck sizes. We validate the approach on synthetic problems with known task-relevant dimension. We extend the approach to intrinsic dimensionality by constructing paired views of a single dataset, enabling comparison with classical geometric dimension estimators. In noisy regimes where those estimators degrade, our approach remains reliable. Finally, we demonstrate the utility of the method on multiple physics datasets. Before "low-dimensional latent embeddings" became a rallying cry of AI, they were already a basic aim of science: identify a low-dimensional state--a small set of degrees of freedom constructed from observations--that suffices to predict the quantities of interest. The long road from Aristotelian to Newtonian mechanics illustrates that determining the number of such state variables--the relevant latent dimensionality--can be hard, even before one argues about the right variables or the laws that relate them.

artificial intelligence, dimensionality, machine learning, (16 more...)

arXiv.org Machine Learning

2602.08105

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Data Science (0.93)

Add feedback

Accurate Estimation of Mutual Information in High Dimensional Data

Abdelaleem, Eslam, Martini, K. Michael, Nemenman, Ilya

arXiv.org Machine LearningJun-3-2025

Mutual information (MI) is a measure of statistical dependencies between two variables, widely used in data analysis. Thus, accurate methods for estimating MI from empirical data are crucial. Such estimation is a hard problem, and there are provably no estimators that are universally good for finite datasets. Common estimators struggle with high-dimensional data, which is a staple of modern experiments. Recently, promising machine learning-based MI estimation methods have emerged. Yet it remains unclear if and when they produce accurate results, depending on dataset sizes, statistical structure of the data, and hyperparameters of the estimators, such as the embedding dimensionality or the duration of training. There are also no accepted tests to signal when the estimators are inaccurate. Here, we systematically explore these gaps. We propose and validate a protocol for MI estimation that includes explicit checks ensuring reliability and statistical consistency. Contrary to accepted wisdom, we demonstrate that reliable MI estimation is achievable even with severely undersampled, high-dimensional datasets, provided these data admit accurate low-dimensional representations. These findings broaden the potential use of machine learning-based MI estimation methods in real-world data analysis and provide new insights into when and why modern high-dimensional, self-supervised algorithms perform effectively.

artificial intelligence, estimator, machine learning, (16 more...)

arXiv.org Machine Learning

2506.0033

Country:

Oceania > Australia (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(3 more...)

Genre:

Research Report (1.00)
Workflow (0.67)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Understanding Probe Behaviors through Variational Bounds of Mutual Information

Choi, Kwanghee, Jung, Jee-weon, Watanabe, Shinji

arXiv.org Artificial IntelligenceDec-15-2023

With the success of self-supervised representations, researchers seek a better understanding of the information encapsulated within a representation. Among various interpretability methods, we focus on classification-based linear probing. We aim to foster a solid understanding and provide guidelines for linear probing by constructing a novel mathematical framework leveraging information theory. First, we connect probing with the variational bounds of mutual information (MI) to relax the probe design, equating linear probing with fine-tuning. Then, we investigate empirical behaviors and practices of probing through our mathematical framework. We analyze the layer-wise performance curve being convex, which seemingly violates the data processing inequality. However, we show that the intermediate representations can have the biggest MI estimate because of the tradeoff between better separability and decreasing MI. We further suggest that the margin of linearly separable representations can be a criterion for measuring the "goodness of representation." We also compare accuracy with MI as the measuring criteria. Finally, we empirically validate our claims by observing the self-supervised speech models on retaining word and phoneme information.

information, mi estimate, representation, (16 more...)

arXiv.org Artificial Intelligence

2312.10019

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

A robust estimator of mutual information for deep learning interpretability

Piras, Davide, Peiris, Hiranya V., Pontzen, Andrew, Lucie-Smith, Luisa, Guo, Ningyuan, Nord, Brian

arXiv.org Artificial IntelligenceMar-23-2023

We develop the use of mutual information (MI), a well-established metric in information theory, to interpret the inner workings of deep learning models. To accurately estimate MI from a finite number of samples, we present GMM-MI (pronounced $``$Jimmie$"$), an algorithm based on Gaussian mixture models that can be applied to both discrete and continuous settings. GMM-MI is computationally efficient, robust to the choice of hyperparameters and provides the uncertainty on the MI estimate due to the finite sample size. We extensively validate GMM-MI on toy data for which the ground truth MI is known, comparing its performance against established mutual information estimators. We then demonstrate the use of our MI estimator in the context of representation learning, working with synthetic data and physical datasets describing highly non-linear processes. We train deep learning models to encode high-dimensional data within a meaningful compressed (latent) representation, and use GMM-MI to quantify both the level of disentanglement between the latent variables, and their association with relevant physical quantities, thus unlocking the interpretability of the latent representation. We make GMM-MI publicly available.

artificial intelligence, estimator, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1088/2632-2153/acc444

2211.00024

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(10 more...)

Genre: Research Report (0.82)

Industry: Government > Regional Government (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback