Inductive Learning
Backdoor Attacks on Federated Meta-Learning
Chen, Chien-Lun, Golubchik, Leana, Paolieri, Marco
Federated learning allows multiple users to collaboratively train a shared classification model while preserving data privacy. This approach, where model updates are aggregated by a central server, was shown to be vulnerable to backdoor attacks: a malicious user can alter the shared model to arbitrarily classify specific inputs from a given class. In this paper, we analyze the effects of backdoor attacks in federated meta-learning, where users train a model that can be adapted to different sets of output classes using only a few training examples. While the ability to adapt could, in principle, make federated learning more robust to backdoor attacks when new training examples are benign, we find that even 1-shot poisoning attacks can be very successful and persist after additional training. To address these vulnerabilities, we propose a defense mechanism inspired by matching networks, where the class of an input is predicted from the cosine similarity of its features with a support set of labeled examples. By removing the decision logic from the model shared with the federation, success and persistence of backdoor attacks are greatly reduced.
Learning to Play by Imitating Humans
Dinyari, Rostam, Sermanet, Pierre, Lynch, Corey
Acquiring multiple skills has commonly involved collecting a large number of expert demonstrations per task or engineering custom reward functions. Recently it has been shown that it is possible to acquire a diverse set of skills by self-supervising control on top of human teleoperated play data. Play is rich in state space coverage and a policy trained on this data can generalize to specific tasks at test time outperforming policies trained on individual expert task demonstrations. In this work, we explore the question of whether robots can learn to play to autonomously generate play data that can ultimately enhance performance. By training a behavioral cloning policy on a relatively small quantity of human play, we autonomously generate a large quantity of cloned play data that can be used as additional training. We demonstrate that a general purpose goal-conditioned policy trained on this augmented dataset substantially outperforms one trained only with the original human data on 18 difficult user-specified manipulation tasks in a simulated robotic tabletop environment. A video example of a robot imitating human play can be seen here: https://learning-to-play.github.io/videos/undirected_play1.mp4
LSSL: Longitudinal Self-Supervised Learning
Zhao, Qingyu, Liu, Zixuan, Adeli, Ehsan, Pohl, Kilian M.
Longitudinal neuroimaging or biomedical studies often acquire multiple observations from each individual over time, which entails repeated measures with highly interdependent variables. In this paper, we discuss the implication of repeated measures design on unsupervised learning by showing its tight conceptual connection to self-supervised learning and factor disentanglement. Leveraging the ability for `self-comparison' through repeated measures, we explicitly separate the definition of the factor space and the representation space enabling an exact disentanglement of time-related factors from the representations of the images. By formulating deterministic multivariate mapping functions between the two spaces, our model, named Longitudinal Self-Supervised Learning (LSSL), uses a standard autoencoding structure with a cosine loss to estimate the direction linked to the disentangled factor. We apply LSSL to two longitudinal neuroimaging studies to show its unique advantage in extracting the `brain-age' information from the data and in revealing informative characteristics associated with neurodegenerative and neuropsychological disorders. For a downstream task of supervised diagnosis classification, the representations learned by LSSL permit faster convergence and higher (or similar) prediction accuracy compared to several other representation learning techniques.
List Learning with Attribute Noise
Cheraghchi, Mahdi, Grigorescu, Elena, Juba, Brendan, Wimmer, Karl, Xie, Ning
We introduce and study the model of list learning with attribute noise. Learning with attribute noise was introduced by Shackelford and Volper (COLT 1988) as a variant of PAC learning, in which the algorithm has access to noisy examples and uncorrupted labels, and the goal is to recover an accurate hypothesis. Sloan (COLT 1988) and Goldman and Sloan (Algorithmica 1995) discovered information-theoretic limits to learning in this model, which have impeded further progress. In this article we extend the model to that of list learning, drawing inspiration from the list-decoding model in coding theory, and its recent variant studied in the context of learning. On the positive side, we show that sparse conjunctions can be efficiently list learned under some assumptions on the underlying ground-truth distribution. On the negative side, our results show that even in the list-learning model, efficient learning of parities and majorities is not possible regardless of the representation used.
Learning compositional models of robot skills for task and motion planning
Wang, Zi, Garrett, Caelan Reed, Kaelbling, Leslie Pack, Lozano-Pérez, Tomás
The objective of this work is to augment the basic abilities of a robot by learning to use new sensorimotor primitives to solve complex long-horizon manipulation problems. This requires flexible generative planning that can combine primitive abilities in novel combinations and thus generalize across a wide variety of problems. In order to plan with primitive actions, we must have models of the preconditions and effects of those actions: under what circumstances will executing this primitive successfully achieve some particular effect in the world? We use, and develop novel improvements on, state-of-the-art methods for active learning and sampling. We use Gaussian process methods for learning the conditions of operator effectiveness from small numbers of expensive training examples. We develop adaptive sampling methods for generating a comprehensive and diverse sequence of continuous parameter values (such as pouring waypoints for a cup) configurations and during planning for solving a new task, so that a complete robot plan can be found as efficiently as possible. We demonstrate our approach in an integrated system, combining traditional robotics primitives with our newly learned models using an efficient robot task and motion planner. We evaluate our approach both in simulation and in the real world through measuring the quality of the selected pours and scoops. Finally, we apply our integrated system to a variety of long-horizon simulated and real-world manipulation problems.
Propositionalization and Embeddings: Two Sides of the Same Coin
Lavrač, Nada, Škrlj, Blaž, Robnik-Šikonja, Marko
Data preprocessing is an important component of machine learning pipelines, which requires ample time and resources. An integral part of preprocessing is data transformation into the format required by a given learning algorithm. This paper outlines some of the modern data processing techniques used in relational learning that enable data fusion from different input data types and formats into a single table data representation, focusing on the propositionalization and embedding data transformation approaches. While both approaches aim at transforming data into tabular data format, they use different terminology and task definitions, are perceived to address different goals, and are used in different contexts. This paper contributes a unifying framework that allows for improved understanding of these two data transformation techniques by presenting their unified definitions, and by explaining the similarities and differences between the two approaches as variants of a unified complex data transformation task. In addition to the unifying framework, the novelty of this paper is a unifying methodology combining propositionalization and embeddings, which benefits from the advantages of both in solving complex data transformation and learning tasks. We present two efficient implementations of the unifying methodology: an instance-based PropDRM approach, and a feature-based PropStar approach to data transformation and learning, together with their empirical evaluation on several relational problems. The results show that the new algorithms can outperform existing relational learners and can solve much larger problems.
Optimally Combining Classifiers for Semi-Supervised Learning
Wang, Zhiguo, Yang, Liusha, Yin, Feng, Lin, Ke, Shi, Qingjiang, Luo, Zhi-Quan
This paper considers semi-supervised learning for tabular data. It is widely known that Xgboost based on tree model works well on the heterogeneous features while transductive support vector machine can exploit the low density separation assumption. However, little work has been done to combine them together for the end-to-end semi-supervised learning. In this paper, we find these two methods have complementary properties and larger diversity, which motivates us to propose a new semi-supervised learning method that is able to adaptively combine the strengths of Xgboost and transductive support vector machine. Instead of the majority vote rule, an optimization problem in terms of ensemble weight is established, which helps to obtain more accurate pseudo labels for unlabeled data. The experimental results on the UCI data sets and real commercial data set demonstrate the superior classification performance of our method over the five state-of-the-art algorithms improving test accuracy by about $3\%-4\%$. The partial code can be found at https://github.com/hav-cam-mit/CTO.
On Mutual Information in Contrastive Learning for Visual Representations
Wu, Mike, Zhuang, Chengxu, Mosse, Milan, Yamins, Daniel, Goodman, Noah
In recent years, several unsupervised, "contrastive" learning algorithms in vision have been shown to learn representations that perform remarkably well on transfer tasks. We show that this family of algorithms maximizes a lower bound on the mutual information between two or more "views" of an image where typical views come from a composition of image augmentations. Our bound generalizes the InfoNCE objective to support negative sampling from a restricted region of "difficult" contrasts. We find that the choice of negative samples and views are critical to the success of these algorithms. Reformulating previous learning objectives in terms of mutual information also simplifies and stabilizes them. In practice, our new objectives yield representations that outperform those learned with previous approaches for transfer to classification, bounding box detection, instance segmentation, and keypoint detection. The mutual information framework provides a unifying comparison of approaches to contrastive learning and uncovers the choices that impact representation learning.
Analyzing Differentiable Fuzzy Implications
van Krieken, Emile, Acar, Erman, van Harmelen, Frank
Combining symbolic and neural approaches has gained considerable attention in the AI community, as it is often argued that the strengths and weaknesses of these approaches are complementary. One such trend in the literature are weakly supervised learning techniques that employ operators from fuzzy logics. In particular, they use prior background knowledge described in such logics to help the training of a neural network from unlabeled and noisy data. By interpreting logical symbols using neural networks (or grounding them), this background knowledge can be added to regular loss functions, hence making reasoning a part of learning. In this paper, we investigate how implications from the fuzzy logic literature behave in a differentiable setting. In such a setting, we analyze the differences between the formal properties of these fuzzy implications. It turns out that various fuzzy implications, including some of the most well-known, are highly unsuitable for use in a differentiable learning setting. A further finding shows a strong imbalance between gradients driven by the antecedent and the consequent of the implication. Furthermore, we introduce a new family of fuzzy implications (called sigmoidal implications) to tackle this phenomenon. Finally, we empirically show that it is possible to use Differentiable Fuzzy Logics for semi-supervised learning, and show that sigmoidal implications outperform other choices of fuzzy implications.
Rates of Convergence for Laplacian Semi-Supervised Learning with Low Labeling Rates
Calder, Jeff, Slepčev, Dejan, Thorpe, Matthew
We study graph-based Laplacian semi-supervised learning at low labeling rates. Laplacian learning uses harmonic extension on a graph to propagate labels. At very low label rates, Laplacian learning becomes degenerate and the solution is roughly constant with spikes at each labeled data point. Previous work has shown that this degeneracy occurs when the number of labeled data points is finite while the number of unlabeled data points tends to infinity. In this work we allow the number of labeled data points to grow to infinity with the number of labels. Our results show that for a random geometric graph with length scale $\varepsilon>0$ and labeling rate $\beta>0$, if $\beta \ll\varepsilon^2$ then the solution becomes degenerate and spikes form, and if $\beta\gg \varepsilon^2$ then Laplacian learning is well-posed and consistent with a continuum Laplace equation. Furthermore, in the well-posed setting we prove quantitative error estimates of $O(\varepsilon\beta^{-1/2})$ for the difference between the solutions of the discrete problem and continuum PDE, up to logarithmic factors. We also study $p$-Laplacian regularization and show the same degeneracy result when $\beta \ll \varepsilon^p$. The proofs of our well-posedness results use the random walk interpretation of Laplacian learning and PDE arguments, while the proofs of the ill-posedness results use $\Gamma$-convergence tools from the calculus of variations. We also present numerical results on synthetic and real data to illustrate our results.