AITopics | information imbalance

Collaborating Authors

information imbalance

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A quantitative analysis of semantic information in deep representations of text and images

Acevedo, Santiago, Mascaretti, Andrea, Rende, Riccardo, Mahaut, Matéo, Baroni, Marco, Laio, Alessandro

arXiv.org Artificial IntelligenceDec-8-2025

Deep neural networks are known to develop similar representations for semantically related data, even when they belong to different domains, such as an image and its description, or the same text in different languages. We present a method for quantitatively investigating this phenomenon by measuring the relative information content of the representations of semantically related data and probing how it is encoded into multiple tokens of large language models (LLMs) and vision transformers. Looking first at how LLMs process pairs of translated sentences, we identify inner ``semantic'' layers containing the most language-transferable information. We find moreover that, on these layers, a larger LLM (DeepSeek-V3) extracts significantly more general information than a smaller one (Llama3.1-8B). Semantic information of English text is spread across many tokens and it is characterized by long-distance correlations between tokens and by a causal left-to-right (i.e., past-future) asymmetry. We also identify layers encoding semantic information within visual transformers. We show that caption representations in the semantic layers of LLMs predict visual representations of the corresponding images. We observe significant and model-dependent information asymmetries between image and text representations.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.17101

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.40)
Europe > Finland > Uusimaa > Helsinki (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Effect of Label Noise on the Information Content of Neural Representations

Umar, Ali Hussaini, Tezoh, Franky Kevin Nando, Barbier, Jean, Acevedo, Santiago, Laio, Alessandro

arXiv.org Machine LearningOct-9-2025

In supervised classification tasks, models are trained to predict a label for each data point. In real-world datasets, these labels are often noisy due to annotation errors. While the impact of label noise on the performance of deep learning models has been widely studied, its effects on the networks' hidden representations remain poorly understood. We address this gap by systematically comparing hidden representations using the Information Imbalance, a computationally efficient proxy of conditional mutual information. Through this analysis, we observe that the information content of the hidden representations follows a double descent as a function of the number of network parameters, akin to the behavior of the test error. We further demonstrate that in the underparameterized regime, representations learned with noisy labels are more informative than those learned with clean labels, while in the overparameterized regime, these representations are equally informative. Our results indicate that the representations of overparameterized networks are robust to label noise. We also found that the information imbalance between the penultimate and pre-softmax layers decreases with cross-entropy loss in the overparameterized regime. This offers a new perspective on understanding generalization in classification tasks. Extending our analysis to representations learned from random labels, we show that these perform worse than random features. This indicates that training on random labels drives networks much beyond lazy learning, as weights adapt to encode labels information.

information content, label noise, representation, (14 more...)

arXiv.org Machine Learning

2510.06401

Country:

Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

Understanding Variational Autoencoders with Intrinsic Dimension and Information Imbalance

Camboulin, Charles, Doimo, Diego, Glielmo, Aldo

arXiv.org Machine LearningNov-4-2024

This work presents an analysis of the hidden representations of Variational Autoencoders (VAEs) using the Intrinsic Dimension (ID) and the Information Imbalance (II). We show that VAEs undergo a transition in behaviour once the bottleneck size is larger than the ID of the data, manifesting in a double hunchback ID profile and a qualitative shift in information processing as captured by the II. Our results also highlight two distinct training phases for architectures with sufficiently large bottleneck sizes, consisting of a rapid fit and a slower generalisation, as assessed by a differentiated behaviour of ID, II, and KL loss. These insights demonstrate that II and ID could be valuable tools for aiding architecture search, for diagnosing underfitting in VAEs, and, more broadly, they contribute to advancing a unified understanding of deep generative models through geometric analysis.

architecture, bottleneck size, transition, (10 more...)

arXiv.org Machine Learning

2411.01978

Country:

Europe > Italy (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

VPOcc: Exploiting Vanishing Point for Monocular 3D Semantic Occupancy Prediction

Kim, Junsu, Lee, Junhee, Shin, Ukcheol, Oh, Jean, Joo, Kyungdon

arXiv.org Artificial IntelligenceAug-7-2024

Monocular 3D semantic occupancy prediction is becoming important in robot vision due to the compactness of using a single RGB camera. However, existing methods often do not adequately account for camera perspective geometry, resulting in information imbalance along the depth range of the image. To address this issue, we propose a vanishing point (VP) guided monocular 3D semantic occupancy prediction framework named VPOcc. Our framework consists of three novel modules utilizing VP. First, in the VPZoomer module, we initially utilize VP in feature extraction to achieve information balanced feature extraction across the scene by generating a zoom-in image based on VP. Second, we perform perspective geometry-aware feature aggregation by sampling points towards VP using a VP-guided cross-attention (VPCA) module. Finally, we create an information-balanced feature volume by effectively fusing original and zoom-in voxel feature volumes with a balanced feature volume fusion (BVFV) module. Experiments demonstrate that our method achieves state-of-the-art performance for both IoU and mIoU on SemanticKITTI and SSCBench-KITTI360. These results are obtained by effectively addressing the information imbalance in images through the utilization of VP. Our code will be available at www.github.com/anonymous.

module, occupancy prediction, semantic occupancy prediction, (11 more...)

arXiv.org Artificial Intelligence

2408.03551

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > South Korea > Ulsan > Ulsan (0.04)

Genre: Research Report (0.40)

Industry: Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Representation Learning

Schrodi, Simon, Hoffmann, David T., Argus, Max, Fischer, Volker, Brox, Thomas

arXiv.org Artificial IntelligenceApr-11-2024

Contrastive vision-language models like CLIP have gained popularity for their versatile applicable learned representations in various downstream tasks. Despite their successes in some tasks, like zero-shot image recognition, they also perform surprisingly poor on other tasks, like attribute detection. Previous work has attributed these challenges to the modality gap, a separation of image and text in the shared representation space, and a bias towards objects over other factors, such as attributes. In this work we investigate both phenomena. We find that only a few embedding dimensions drive the modality gap. Further, we propose a measure for object bias and find that object bias does not lead to worse performance on other concepts, such as attributes. But what leads to the emergence of the modality gap and object bias? To answer this question we carefully designed an experimental setting which allows us to control the amount of shared information between the modalities. This revealed that the driving factor behind both, the modality gap and the object bias, is the information imbalance between images and captions.

caption, information imbalance, modality gap, (14 more...)

arXiv.org Artificial Intelligence

2404.07983

Country:

Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Europe > Poland (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Tensor-reduced atomic density representations

Darby, James P., Kovács, Dávid P., Batatia, Ilyes, Caro, Miguel A., Hart, Gus L. W., Ortner, Christoph, Csányi, Gábor

arXiv.org Artificial IntelligenceDec-6-2022

Density based representations of atomic environments that are invariant under Euclidean symmetries have become a widely used tool in the machine learning of interatomic potentials, broader data-driven atomistic modelling and the visualisation and analysis of materials datasets.The standard mechanism used to incorporate chemical element information is to create separate densities for each element and form tensor products between them. This leads to a steep scaling in the size of the representation as the number of elements increases. Graph neural networks, which do not explicitly use density representations, escape this scaling by mapping the chemical element information into a fixed dimensional space in a learnable way. We recast this approach as tensor factorisation by exploiting the tensor structure of standard neighbour density based descriptors. In doing so, we form compact tensor-reduced representations whose size does not depend on the number of chemical elements, but remain systematically convergeable and are therefore applicable to a wide range of data analysis and regression tasks.

artificial intelligence, basis function, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1103/PhysRevLett.131.028001

2210.01705

Country:

North America > United States > Utah > Utah County > Provo (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Materials > Chemicals (0.76)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Add feedback

DADApy: Distance-based Analysis of DAta-manifolds in Python

Glielmo, Aldo, Macocco, Iuri, Doimo, Diego, Carli, Matteo, Zeni, Claudio, Wild, Romina, d'Errico, Maria, Rodriguez, Alex, Laio, Alessandro

arXiv.org Artificial IntelligenceSep-19-2022

DADApy is a python software package for analysing and characterising high-dimensional data manifolds. It provides methods for estimating the intrinsic dimension and the probability density, for performing density-based clustering and for comparing different distance metrics. We review the main functionalities of the package and exemplify its usage in toy cases and in a real-world application. DADApy is freely available under the open-source Apache 2.0 license.

artificial intelligence, dadapy, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.patter.2022.100589

2205.03373

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Add feedback

Ranking the information content of distance measures

Glielmo, Aldo, Zeni, Claudio, Cheng, Bingqing, Csanyi, Gabor, Laio, Alessandro

arXiv.org Machine LearningApr-30-2021

Real-world data typically contain a large number of features that are often heterogeneous in nature, relevance, and also units of measure. When assessing the similarity between data points, one can build various distance measures using subsets of these features. Using the fewest features but still retaining sufficient information about the system is crucial in many statistical learning approaches, particularly when data are sparse. We introduce a statistical test that can assess the relative information retained when using two different distance measures, and determine if they are equivalent, independent, or if one is more informative than the other. This in turn allows finding the most informative distance measure out of a pool of candidates. The approach is applied to find the most relevant policy variables for controlling the Covid-19 epidemic and to find compact yet informative representations of atomic structures, but its potential applications are wide ranging in many branches of science.

distance measure, information imbalance, latexit sha1, (14 more...)

arXiv.org Machine Learning

2104.15079

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area (0.60)
Health & Medicine > Epidemiology (0.37)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback