AITopics | Fleissner, Maximilian

Plotting

Fleissner, Maximilian

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Probabilistic Model for Self-Supervised Learning

Fleissner, Maximilian, Esser, Pascal, Ghoshdastidar, Debarghya

arXiv.org Artificial IntelligenceJan-22-2025

Self-supervised learning (SSL) aims to find meaningful representations from unlabeled data by encoding semantic similarities through data augmentations. Despite its current popularity, theoretical insights about SSL are still scarce. For example, it is not yet known whether commonly used SSL loss functions can be related to a statistical model, much in the same as OLS, generalized linear models or PCA naturally emerge as maximum likelihood estimates of an underlying generative process. In this short paper, we consider a latent variable statistical model for SSL that exhibits an interesting property: Depending on the informativeness of the data augmentations, the MLE of the model either reduces to PCA, or approaches a simple non-contrastive loss. We analyze the model and also empirically illustrate our findings.

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2501.13031

Country: Europe > Germany (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.62)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Data Augmentations Go Beyond Encoding Invariances: A Theoretical Study on Self-Supervised Learning

Feigin, Shlomo Libo, Fleissner, Maximilian, Ghoshdastidar, Debarghya

arXiv.org Artificial IntelligenceDec-15-2024

Understanding the role of data augmentations is critical for applying Self-Supervised Learning (SSL) methods in new domains. Data augmentations are commonly understood as encoding invariances into the learned representations. This interpretation suggests that SSL would require diverse augmentations that resemble the original data. However, in practice, augmentations do not need to be similar to the original data nor be diverse, and can be neither at the same time. We provide a theoretical insight into this phenomenon. We show that for different SSL losses, any non-redundant representation can be learned with a single suitable augmentation. We provide an algorithm to reconstruct such augmentations and give insights into augmentation choices in SSL.

artificial intelligence, inductive learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2411.01767

Country:

Europe > Germany (0.14)
North America > United States (0.14)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.61)

Add feedback

Infinite Width Limits of Self Supervised Neural Networks

Fleissner, Maximilian, Anil, Gautham Govind, Ghoshdastidar, Debarghya

arXiv.org Artificial IntelligenceNov-17-2024

The NTK is a widely used tool in the theoretical analysis of deep learning, allowing us to look at supervised deep neural networks through the lenses of kernel regression. Recently, several works have investigated kernel models for self-supervised learning, hypothesizing that these also shed light on the behaviour of wide neural networks by virtue of the NTK. However, it remains an open question to what extent this connection is mathematically sound -- it is a commonly encountered misbelief that the kernel behaviour of wide neural networks emerges irrespective of the loss function it is trained on. In this paper, we bridge the gap between the NTK and self-supervised learning, focusing on two-layer neural networks trained under the Barlow Twins loss. We prove that the NTK of Barlow Twins indeed becomes constant as the width of the network approaches infinity. Our analysis technique is different from previous works on the NTK and may be of independent interest. Overall, our work provides a first rigorous justification for the use of classic kernel theory to understand self-supervised learning of wide neural networks. Building on this result, we derive generalization error bounds for kernelized Barlow Twins and connect them to neural networks of finite width.

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

2411.11176

Country: Europe (0.14)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback

Decision Trees for Interpretable Clusters in Mixture Models and Deep Representations

Fleissner, Maximilian, Zarvandi, Maedeh, Ghoshdastidar, Debarghya

arXiv.org Artificial IntelligenceNov-3-2024

Decision Trees are one of the backbones of explainable machine learning, and often serve as interpretable alternatives to black-box models. Traditionally utilized in the supervised setting, there has recently also been a surge of interest in decision trees for unsupervised learning. While several works with worst-case guarantees on the clustering cost have appeared, these results are distribution-agnostic, and do not give insight into when decision trees can actually recover the underlying distribution of the data (up to some small error). In this paper, we therefore introduce the notion of an explainability-to-noise ratio for mixture models, formalizing the intuition that well-clustered data can indeed be explained well using a decision tree. We propose an algorithm that takes as input a mixture model and constructs a suitable tree in data-independent time. Assuming sub-Gaussianity of the mixture components, we prove upper and lower bounds on the error rate of the resulting decision tree. In addition, we demonstrate how concept activation vectors can be used to extend explainable clustering to neural networks. We empirically demonstrate the efficacy of our approach on standard tabular and image datasets.

artificial intelligence, decision tree learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2411.01576

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Explaining Kernel Clustering via Decision Trees

Fleissner, Maximilian, Vankadara, Leena Chennuru, Ghoshdastidar, Debarghya

arXiv.org Artificial IntelligenceFeb-15-2024

Despite the growing popularity of explainable and interpretable machine learning, there is still surprisingly limited work on inherently interpretable clustering methods. Recently, there has been a surge of interest in explaining the classic k-means algorithm, leading to efficient algorithms that approximate k-means clusters using axis-aligned decision trees. However, interpretable variants of k-means have limited applicability in practice, where more flexible clustering methods are often needed to obtain useful partitions of the data. In this work, we investigate interpretable kernel clustering, and propose algorithms that construct decision trees to approximate the partitions induced by kernel k-means, a nonlinear extension of k-means. We further build on previous work on explainable k-means and demonstrate how a suitable choice of features allows preserving interpretability without sacrificing approximation guarantees on the interpretable model. The increasing predictive power of machine learning has made it a popular tool in many scientific fields. Sensitive applications such as healthcare or autonomous driving however require more than just good accuracy--it is also crucial for a model's decisions to be interpretable (Tjoa & Guan, 2020; Varshney & Alemzadeh, 2017). Unfortunately, popular machine learning models are not transparent and are often referred to as "black box" approaches. The demand for explainable machine learning has led to the development of several tools over the last few years, albeit mostly for supervised learning. Methods such as LIME or Shapley values (Ribeiro et al., 2016; Lundberg & Lee, 2017) are designed to explain the prediction of any given machine learning model.

artificial intelligence, kernel, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2402.09881

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Transportation (0.54)
Health & Medicine (0.48)
Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.86)

Add feedback

Non-Parametric Representation Learning with Kernels

Esser, Pascal, Fleissner, Maximilian, Ghoshdastidar, Debarghya

arXiv.org Artificial IntelligenceSep-5-2023

Unsupervised and self-supervised representation learning has become popular in recent years for learning useful features from unlabelled data. Representation learning has been mostly developed in the neural network literature, and other models for representation learning are surprisingly unexplored. In this work, we introduce and analyze several kernel-based representation learning approaches: Firstly, we define two kernel Self-Supervised Learning (SSL) models using contrastive loss functions and secondly, a Kernel Autoencoder (AE) model based on the idea of embedding and reconstructing data. We argue that the classical representer theorems for supervised kernel machines are not always applicable for (self-supervised) representation learning, and present new representer theorems, which show that the representations learned by our kernel models can be expressed in terms of kernel matrices. We further derive generalisation error bounds for representation learning with kernel SSL and AE, and empirically evaluate the performance of these methods in both small data regimes as well as in comparison with neural network based models.

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Artificial Intelligence

2309.02028

Country:

North America > United States (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback