Goto

Collaborating Authors

Barlow Twins: Self-Supervised Learning via Redundancy Reduction

arXiv.org Artificial Intelligence

Self-supervised learning (SSL) is rapidly closing the gap with supervised methods on large computer vision benchmarks. A successful approach to SSL is to learn representations which are invariant to distortions of the input sample. However, a recurring issue with this approach is the existence of trivial constant representations. Most current methods avoid such collapsed solutions by careful implementation details. We propose an objective function that naturally avoids such collapse by measuring the cross-correlation matrix between the outputs of two identical networks fed with distorted versions of a sample, and making it as close to the identity matrix as possible. This causes the representation vectors of distorted versions of a sample to be similar, while minimizing the redundancy between the components of these vectors. The method is called Barlow Twins, owing to neuroscientist H. Barlow's redundancy-reduction principle applied to a pair of identical networks. Barlow Twins does not require large batches nor asymmetry between the network twins such as a predictor network, gradient stopping, or a moving average on the weight updates. It allows the use of very high-dimensional output vectors. Barlow Twins outperforms previous methods on ImageNet for semi-supervised classification in the low-data regime, and is on par with current state of the art for ImageNet classification with a linear classifier head, and for transfer tasks of classification and object detection.


VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning

arXiv.org Artificial Intelligence

Recent self-supervised methods for image representation learning are based on maximizing the agreement between embedding vectors from different views of the same image. A trivial solution is obtained when the encoder outputs constant vectors. This collapse problem is often avoided through implicit biases in the learning architecture, that often lack a clear justification or interpretation. In this paper, we introduce VICReg (Variance-Invariance-Covariance Regularization), a method that explicitly avoids the collapse problem with a simple regularization term on the variance of the embeddings along each dimension individually. VICReg combines the variance term with a decorrelation mechanism based on redundancy reduction and covariance regularization, and achieves results on par with the state of the art on several downstream tasks. In addition, we show that incorporating our new variance term into other methods helps stabilize the training and leads to performance improvements.


On Feature Decorrelation in Self-Supervised Learning

arXiv.org Artificial Intelligence

In self-supervised representation learning, a common idea behind most of the state-of-the-art approaches is to enforce the robustness of the representations to predefined augmentations. A potential issue of this idea is the existence of completely collapsed solutions (i.e., constant features), which are typically avoided implicitly by carefully chosen implementation details. In this work, we study a relatively concise framework containing the most common components from recent approaches. We verify the existence of complete collapse and discover another reachable collapse pattern that is usually overlooked, namely dimensional collapse. We connect dimensional collapse with strong correlations between axes and consider such connection as a strong motivation for feature decorrelation (i.e., standardizing the covariance matrix). The capability of correlation as an unsupervised metric and the gains from feature decorrelation are verified empirically to highlight the importance and the potential of this insight.


Graph Self Supervised Learning: the BT, the HSIC, and the VICReg

arXiv.org Machine Learning

Self-supervised learning and pre-training strategies have developed over the last few years especially for Convolutional Neural Networks (CNNs). Recently application of such methods can also be noticed for Graph Neural Networks (GNNs). In this paper, we have used a graph based self-supervised learning strategy with different loss functions (Barlow Twins[ 7], HSIC[ 4], VICReg[ 1]) which have shown promising results when applied with CNNs previously. We have also proposed a hybrid loss function combining the advantages of VICReg and HSIC and called it as VICRegHSIC. The performance of these aforementioned methods have been compared when applied to two different datasets namely MUTAG and PROTEINS. Moreover, the impact of different batch sizes, projector dimensions and data augmentation strategies have also been explored. The results are preliminary and we will be continuing to explore with other datasets.


Supervised Dimensionality Reduction and Visualization using Centroid-encoder

arXiv.org Machine Learning

Visualizing high-dimensional data is an essential task in Data Science and Machine Learning. The Centroid-Encoder (CE) method is similar to the autoencoder but incorporates label information to keep objects of a class close together in the reduced visualization space. CE exploits nonlinearity and labels to encode high variance in low dimensions while capturing the global structure of the data. We present a detailed analysis of the method using a wide variety of data sets and compare it with other supervised dimension reduction techniques, including NCA, nonlinear NCA, t-distributed NCA, t-distributed MCML, supervised UMAP, supervised PCA, Colored Maximum Variance Unfolding, supervised Isomap, Parametric Embedding, supervised Neighbor Retrieval Visualizer, and Multiple Relational Embedding. We empirically show that centroid-encoder outperforms most of these techniques. We also show that when the data variance is spread across multiple modalities, centroid-encoder extracts a significant amount of information from the data in low dimensional space. This key feature establishes its value to use it as a tool for data visualization.