On the Optimal Representation Efficiency of Barlow Twins: An Information-Geometric Interpretation

Zhang, Di

arXiv.org Machine Learning 

Self-supervised learning (SSL) has emerged as a dominant paradigm for learning representations from unlabeled data [5]. Among SSL approaches, methods based on redundancy reduction, such as Barlow Twins [7], have demonstrated exceptional performance. These methods operate on the principle of driving the cross-correlation matrix between the embeddings of two distorted views of the data toward the identity matrix. While these methods are empirically successful, a deep theoretical explanation of why this objective yields high-quality representations is still developing.

A key desirable property of a good representation space is efficiency--the degree to which it utilizes its available dimensions to capture semantically meaningful, non-redundant information. An inefficient representation may suffer from dimensional collapse [4], in which many dimensions are redundant or encode correlated information, limiting the representation's expressivity and its suitability for downstream tasks.

In this paper, we address this gap by proposing a novel information-geometric framework [1] for quantifying representation efficiency. Our core contributions are threefold:

1. We formally define the statistical manifold of representations and introduce a measure of representation efficiency η based on the spectrum of the average Fisher Information Matrix (FIM).
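To make the two objects discussed above concrete, the sketch below implements the Barlow Twins redundancy-reduction loss (cross-correlation matrix pushed toward the identity) together with an illustrative spectral efficiency score. The loss follows the published Barlow Twins formulation [7]; the `spectral_efficiency` function is NOT the paper's definition of η (which is not reproduced in this excerpt) but a common entropy-based effective-rank ratio, used here only as a stand-in to show what a spectrum-based efficiency measure looks like. The trade-off weight `lam` and all variable names are illustrative.

```python
import numpy as np

def barlow_twins_loss(z1, z2, lam=5e-3):
    """Barlow Twins objective [7] (NumPy sketch).

    z1, z2: (N, D) embeddings of two distorted views of the same batch.
    lam: weight trading off invariance vs. redundancy reduction.
    """
    N, D = z1.shape
    # Standardize each embedding dimension across the batch.
    z1n = (z1 - z1.mean(0)) / z1.std(0)
    z2n = (z2 - z2.mean(0)) / z2.std(0)
    # D x D cross-correlation matrix between the two views.
    c = z1n.T @ z2n / N
    # Diagonal pushed toward 1 (invariance to distortions);
    # off-diagonal pushed toward 0 (redundancy reduction).
    invariance = np.sum((np.diag(c) - 1.0) ** 2)
    redundancy = np.sum((c - np.diag(np.diag(c))) ** 2)
    return invariance + lam * redundancy

def spectral_efficiency(M):
    """Illustrative efficiency score in (0, 1] from the spectrum of a
    symmetric PSD matrix M (e.g. an average FIM): the exponential of the
    entropy of the normalized eigenvalues, divided by the dimension.
    A hypothetical stand-in, not the paper's definition of eta.
    """
    eigvals = np.linalg.eigvalsh(M)
    p = np.clip(eigvals, 1e-12, None)
    p = p / p.sum()
    effective_rank = np.exp(-np.sum(p * np.log(p)))
    return effective_rank / M.shape[0]
```

Under this stand-in, a matrix with a uniform spectrum (all dimensions equally informative) scores 1, while a collapsed spectrum dominated by a few eigenvalues scores near 1/D, mirroring the dimensional-collapse failure mode described above.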