Inductive Learning
A Supervised Learning Approach to Rankability
McJames, Nathan, Malone, David, Mason, Oliver
The rankability of data is a recently proposed problem that considers the ability of a dataset, represented as a graph, to produce a meaningful ranking of the items it contains. To study this concept, a number of rankability measures have recently been proposed, based on comparisons to a complete dominance graph via combinatorial and linear algebraic methods. In this paper, we review these measures and highlight some questions to which they give rise before going on to propose new methods to assess rankability, which are amenable to efficient estimation. Finally, we compare these measures by applying them to both synthetic and real-life sports data.
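One combinatorial way to make the "comparison to a complete dominance graph" concrete is sketched below. It is an illustrative brute-force only, not necessarily any of the measures reviewed or proposed in the paper: for every possible ranking it counts how many directed edges would have to be added or removed to turn the data into the dominance graph of that ranking, then reports the smallest count and how many rankings achieve it. The function name and the 0/1 adjacency-matrix encoding are assumptions made for the example, and the enumeration is only feasible for a handful of items.

```python
from itertools import permutations

def dominance_distance(adj):
    """Brute-force comparison of a directed graph to complete dominance graphs.

    adj[i][j] == 1 means item i beat item j (hypothetical encoding).
    Returns (k, p): k is the minimum number of edge additions/removals needed
    to reach the dominance graph of some ranking, p is the number of rankings
    that attain this minimum.
    """
    n = len(adj)
    best_k, count = None, 0
    for order in permutations(range(n)):
        # position[item] = rank of the item in this candidate ranking
        position = {item: rank for rank, item in enumerate(order)}
        k = 0
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                # In a dominance graph, i -> j exists iff i is ranked above j.
                target = 1 if position[i] < position[j] else 0
                if adj[i][j] != target:
                    k += 1
        if best_k is None or k < best_k:
            best_k, count = k, 1
        elif k == best_k:
            count += 1
    return best_k, count

perfect = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]   # already a dominance graph
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]     # 0 beats 1, 1 beats 2, 2 beats 0
print(dominance_distance(perfect), dominance_distance(cycle))
```

A small k with a single optimal ranking suggests the data is close to an unambiguous ordering; a large k or many tied rankings suggests the opposite.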
Semi-supervised Learning on Large Graphs: is Poisson Learning a Game-Changer?
We examine Poisson learning for graph-based semi-supervised learning to see whether it avoids the global information loss that affects Laplace-based learning methods on large graphs. Our analysis indicates that Poisson learning is simply Laplace regularization with thresholding and therefore cannot overcome the problem.
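For readers unfamiliar with the Laplace-based baseline mentioned here, the following is a minimal sketch of Laplace learning (harmonic label propagation) on a tiny graph. It does not implement Poisson learning or the paper's analysis. The function name, the dictionary-based graph encoding, and the fixed iteration count are assumptions made for the example.

```python
def laplace_learning(adj, labels, iters=500):
    """Harmonic extension of known labels over a weighted graph.

    adj:    dict mapping node -> {neighbor: edge weight} (hypothetical encoding)
    labels: dict mapping labeled node -> real-valued label
    Labeled nodes stay fixed; every unlabeled node is repeatedly replaced by
    the weighted average of its neighbors (Jacobi iteration).
    """
    u = {v: labels.get(v, 0.0) for v in adj}
    for _ in range(iters):
        new_u = dict(u)
        for v, nbrs in adj.items():
            if v in labels:
                continue
            total = sum(nbrs.values())
            new_u[v] = sum(w * u[n] for n, w in nbrs.items()) / total
        u = new_u
    return u

# Path graph 0 - 1 - 2 - 3 - 4 with the two end nodes labeled.
path = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0, 4: 1.0},
    4: {3: 1.0},
}
scores = laplace_learning(path, {0: 0.0, 4: 1.0})
print(scores)
```

On this path the unlabeled scores interpolate linearly between the two labeled ends.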
Meta's Yann LeCun on his vision for human-level AI
What is the next step toward bridging the gap between natural and artificial intelligence? Scientists and researchers are divided on the answer. Yann LeCun, Chief AI Scientist at Meta and the recipient of the 2018 Turing Award, is betting on self-supervised learning, machine learning models that can be trained without the need for human-labeled examples. LeCun has been thinking and talking about self-supervised and unsupervised learning for years. But as his research and the fields of AI and neuroscience have progressed, his vision has converged around several promising concepts and trends.
A Review of Emerging Research Directions in Abstract Visual Reasoning
Małkiński, Mikołaj, Mańdziuk, Jacek
Abstract Visual Reasoning (AVR) problems are commonly used to approximate human intelligence. They test the ability to apply previously gained knowledge, experience and skills in a completely new setting, which makes them particularly well-suited for this task. Recently, the AVR problems have become popular as a proxy to study machine intelligence, which has led to the emergence of new distinct types of problems and multiple benchmark sets. In this work we review this emerging AVR research and propose a taxonomy to categorise the AVR tasks along 5 dimensions: input shapes, hidden rules, target task, cognitive function, and specific challenge. The perspective taken in this survey makes it possible to characterise AVR problems with respect to their shared and distinct properties, provides a unified view on the existing approaches to solving AVR tasks, shows how the AVR problems relate to practical applications, and outlines promising directions for future work. One of them refers to the observation that in the machine learning literature different tasks are considered in isolation, which is in stark contrast with the way the AVR tasks are used to measure human intelligence, where multiple types of problems are combined within a single IQ test.
Resolving label uncertainty with implicit posterior models
We propose a method for jointly inferring labels across a collection of data samples, where each sample consists of an observation and a prior belief about the label. By implicitly assuming the existence of a generative model for which a differentiable predictor is the posterior, we derive a training objective that allows learning under weak beliefs. This formulation unifies various machine learning settings; the weak beliefs can come in the form of noisy or incomplete labels, likelihoods given by a different prediction mechanism on auxiliary input, or common-sense priors reflecting knowledge about the structure of the problem at hand. We demonstrate the proposed algorithms on diverse problems: classification with negative training examples, learning from rankings, weakly and self-supervised aerial imagery segmentation, co-segmentation of video frames, and coarsely supervised text classification.
An Information-theoretical Approach to Semi-supervised Learning under Covariate-shift
Aminian, Gholamali, Abroshan, Mahed, Khalili, Mohammad Mahdi, Toni, Laura, Rodrigues, Miguel R. D.
A common assumption in semi-supervised learning is that the labeled, unlabeled, and test data are drawn from the same distribution. However, this assumption is not satisfied in many applications. In many scenarios, the data is collected sequentially (e.g., healthcare) and the distribution of the data may change over time, often exhibiting so-called covariate shifts. In this paper, we propose an approach for semi-supervised learning algorithms that is capable of addressing this issue. Our framework also recovers some popular methods, including entropy minimization and pseudo-labeling. We provide new information-theoretic generalization error upper bounds inspired by our novel framework. Our bounds are applicable to both general semi-supervised learning and the covariate-shift scenario. Finally, we show numerically that our method outperforms previous approaches proposed for semi-supervised learning under covariate shift.
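The two popular methods named above, entropy minimization and pseudo-labeling, can be illustrated in their generic textbook form as follows. This is a sketch of those standard ideas only, not of the paper's framework or its bounds. The function names and the 0.9 confidence threshold are assumptions chosen for the example.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mean_entropy(batch):
    """Entropy-minimization penalty: average predictive entropy over unlabeled points."""
    return sum(entropy(p) for p in batch) / len(batch)

def pseudo_label(probs, threshold=0.9):
    """Return the most likely class if the model is confident enough, else None."""
    best = max(range(len(probs)), key=lambda k: probs[k])
    return best if probs[best] >= threshold else None

# Predicted class probabilities for three unlabeled examples (made-up numbers).
unlabeled_predictions = [[0.95, 0.05], [0.60, 0.40], [0.10, 0.90]]
print(mean_entropy(unlabeled_predictions))
print([pseudo_label(p) for p in unlabeled_predictions])
```

In training, the entropy term would typically be added to the supervised loss, while pseudo-labeled points would be treated as if they carried true labels.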
Learning for Structured Prediction
Structured prediction is an umbrella term for supervised machine learning techniques that involve predicting structured objects rather than scalar discrete or real values. Structured prediction models are typically trained on observed data, in which the true value is used to adjust model parameters, as in commonly used supervised learning techniques. Both prediction with a trained model and training itself are frequently computationally infeasible.
Masked prediction tasks: a parameter identifiability view
Liu, Bingbin, Hsu, Daniel, Ravikumar, Pradeep, Risteski, Andrej
The vast majority of work in self-supervised learning, both theoretical and empirical (though mostly the latter), has largely focused on recovering good features for downstream tasks, with the definition of "good" often being intricately tied to the downstream task itself. This lens is undoubtedly very interesting, but suffers from the problem that there isn't a "canonical" set of downstream tasks to focus on -- in practice, this problem is usually resolved by competing on the benchmark dataset du jour. In this paper, we present an alternative lens: one of parameter identifiability. More precisely, we consider data coming from a parametric probabilistic model, and train a self-supervised learning predictor with a suitably chosen parametric form. Then, we ask whether we can read off the ground truth parameters of the probabilistic model from the optimal predictor. We focus on the widely used self-supervised learning method of predicting masked tokens, which is popular for both natural languages and visual data. While incarnations of this approach have already been successfully used for simpler probabilistic models (e.g. learning fully-observed undirected graphical models), we focus instead on latent-variable models capturing sequential structures -- namely Hidden Markov Models with both discrete and conditionally Gaussian observations. We show that there is a rich landscape of possibilities, out of which some prediction tasks yield identifiability, while others do not. Our results, born of a theoretical grounding of self-supervised learning, could thus potentially beneficially inform practice. Moreover, we uncover close connections with uniqueness of tensor rank decompositions -- a widely used tool in studying identifiability through the lens of the method of moments.
Graph Self-supervised Learning with Accurate Discrepancy Learning
Kim, Dongki, Baek, Jinheon, Hwang, Sung Ju
Self-supervised learning of graph neural networks (GNNs) aims to learn an accurate representation of the graphs in an unsupervised manner, to obtain transferable representations of them for diverse downstream tasks. Predictive learning and contrastive learning are the two most prevalent approaches for graph self-supervised learning. However, they have their own drawbacks. While the predictive learning methods can learn the contextual relationships between neighboring nodes and edges, they cannot learn global graph-level similarities. While contrastive learning can learn global graph-level similarities, its objective of maximizing the similarity between two differently perturbed graphs may result in representations that cannot discriminate two similar graphs with different properties. To tackle such limitations, we propose a framework that aims to learn the exact discrepancy between the original and the perturbed graphs, coined as Discrepancy-based Self-supervised LeArning (D-SLA). Specifically, we create multiple perturbations of the given graph with varying degrees of similarity and train the model to predict whether each graph is the original graph or a perturbed one. Moreover, we further aim to accurately capture the amount of discrepancy for each perturbed graph using the graph edit distance. We validate our method on various graph-related downstream tasks, including molecular property prediction, protein function prediction, and link prediction tasks, on which our model largely outperforms relevant baselines.
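The idea of creating perturbed graphs with a known amount of discrepancy can be sketched as follows. This is an assumption-laden illustration rather than the D-SLA implementation: edges are flipped (added if absent, removed if present) on a fixed node set, and the discrepancy is measured as the number of differing edges under the identity node correspondence, which upper-bounds the true graph edit distance. The function names and the edge-set encoding are hypothetical.

```python
import random
from itertools import combinations

def perturb_edges(edges, num_nodes, n_changes, rng):
    """Flip n_changes distinct node pairs of an undirected graph.

    edges: set of (i, j) tuples with i < j (hypothetical encoding).
    A flipped pair is removed if it was an edge and added otherwise.
    """
    all_pairs = list(combinations(range(num_nodes), 2))
    perturbed = set(edges)
    for pair in rng.sample(all_pairs, n_changes):
        if pair in perturbed:
            perturbed.remove(pair)
        else:
            perturbed.add(pair)
    return perturbed

def edit_cost(edges_a, edges_b):
    """Number of edge insertions/deletions separating two graphs on the same nodes."""
    return len(set(edges_a) ^ set(edges_b))

rng = random.Random(0)
original = {(0, 1), (1, 2), (2, 3)}
# Perturbations with increasing discrepancy from the original graph.
views = [perturb_edges(original, 4, n, rng) for n in (1, 2, 3)]
print([edit_cost(original, v) for v in views])
```

Because each flipped pair differs from the original in exactly one edge, the cost of every perturbed view is known by construction and could serve as a regression target for a discrepancy-aware model.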
Phase Aberration Robust Beamformer for Planewave US Using Self-Supervised Learning
Khan, Shujaat, Huh, Jaeyoung, Ye, Jong Chul
Ultrasound (US) is widely used for clinical imaging applications thanks to its real-time and non-invasive nature. However, its lesion detectability is often limited in many applications due to the phase aberration artifact caused by variations in the speed of sound (SoS) within body parts. To address this, here we propose a novel self-supervised 3D CNN that enables phase aberration robust plane-wave imaging. Instead of aiming at estimating the SoS distribution as in conventional methods, our approach is unique in that the network is trained in a self-supervised manner to robustly generate a high-quality image from various phase aberrated images by modeling the variation in the speed of sound as stochastic. Experimental results using real measurements from tissue-mimicking phantom and in vivo scans confirmed that the proposed method can significantly reduce the phase aberration artifacts and improve the visual quality of deep scans.