 Inductive Learning


Federated Self-Supervised Learning of Multi-Sensor Representations for Embedded Intelligence

arXiv.org Machine Learning

Smartphones, wearables, and Internet of Things (IoT) devices produce a wealth of data that cannot be accumulated in a centralized repository for learning supervised models due to privacy, bandwidth limitations, and the prohibitive cost of annotations. Federated learning provides a compelling framework for learning models from decentralized data, but conventionally, it assumes the availability of labeled samples, whereas on-device data are generally either unlabeled or cannot be annotated readily through user interaction. To address these issues, we propose a self-supervised approach termed scalogram-signal correspondence learning, based on the wavelet transform, to learn useful representations from unlabeled sensor inputs, such as electroencephalography, blood volume pulse, accelerometer, and WiFi channel state information. Our auxiliary task requires a deep temporal neural network to determine, by optimizing a contrastive objective, whether a given pair of a signal and its complementary viewpoint (i.e., a scalogram generated with a wavelet transform) align with each other. We extensively assess the quality of the features learned with our multi-view strategy on diverse public datasets, achieving strong performance in all domains. We demonstrate the effectiveness of representations learned from an unlabeled input collection on downstream tasks by training a linear classifier over the pretrained network, and we show their usefulness in the low-data regime, transfer learning, and cross-validation. Our methodology achieves competitive performance with fully-supervised networks, and it outperforms pre-training with autoencoders in both central and federated contexts. Notably, it improves generalization in a semi-supervised setting, as it reduces the volume of labeled data required by leveraging self-supervised learning.
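The pairing step behind a signal/scalogram correspondence task can be sketched in a few lines. This is only an illustration under stated assumptions: the Ricker ("Mexican hat") wavelet, the scale values, and the function names `ricker`, `scalogram`, and `correspondence_pairs` are hypothetical choices, not the authors' implementation, and the neural network and contrastive loss are omitted.

```python
import math

def ricker(t):
    """Unnormalized Ricker (Mexican hat) wavelet; an assumed wavelet choice."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def scalogram(signal, scales):
    """Magnitude of a naive continuous wavelet transform.

    Returns one row per scale and one column per time shift.
    """
    n = len(signal)
    rows = []
    for s in scales:
        row = []
        for shift in range(n):
            coeff = sum(signal[k] * ricker((k - shift) / s) for k in range(n))
            row.append(abs(coeff) / math.sqrt(s))
        rows.append(row)
    return rows

def correspondence_pairs(signals, scales):
    """Build (signal, scalogram, label) triples for the auxiliary task.

    Label 1: the scalogram was computed from that signal (aligned).
    Label 0: the scalogram comes from the next signal in the batch (misaligned).
    Assumes at least two distinct signals so that negatives are truly misaligned.
    """
    scalos = [scalogram(x, scales) for x in signals]
    n = len(signals)
    pairs = []
    for i, x in enumerate(signals):
        pairs.append((x, scalos[i], 1))
        pairs.append((x, scalos[(i + 1) % n], 0))
    return pairs

if __name__ == "__main__":
    batch = [
        [math.sin(0.3 * k) for k in range(16)],
        [math.sin(1.1 * k) for k in range(16)],
    ]
    for signal, scalo, label in correspondence_pairs(batch, scales=[1.0, 2.0, 4.0]):
        print(len(signal), len(scalo), len(scalo[0]), label)
```

A model trained on such triples would take the signal and the scalogram as two views and predict the label; how the views are encoded and which contrastive loss is used are left open here.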


Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning

arXiv.org Artificial Intelligence

Intrinsically motivated spontaneous exploration is a key enabler of autonomous lifelong learning in human children. It enables the discovery and acquisition of large repertoires of skills through self-generation, self-selection, self-ordering and self-experimentation of learning goals. We present an algorithmic approach called Intrinsically Motivated Goal Exploration Processes (IMGEP) to enable similar properties of autonomous or self-supervised learning in machines. The IMGEP algorithmic architecture relies on several principles: 1) self-generation of goals, generalized as fitness functions; 2) selection of goals based on intrinsic rewards; 3) exploration with incremental goal-parameterized policy search and exploitation of the gathered data with a batch learning algorithm; 4) systematic reuse of information acquired when targeting a goal for improving towards other goals. We present a particularly efficient form of IMGEP, called Modular Population-Based IMGEP, that uses a population-based policy and an object-centered modularity in goals and mutations. We provide several implementations of this architecture and demonstrate their ability to automatically generate a learning curriculum within several experimental setups, including a real humanoid robot that can explore multiple spaces of goals with several hundred continuous dimensions. While no particular target goal is provided to the system, this curriculum allows the discovery of skills that act as stepping stones for learning more complex skills, e.g. nested tool use. We show that learning diverse spaces of goals with intrinsic motivations is more efficient for learning complex skills than only trying to directly learn these complex skills.


Grale: Designing Networks for Graph Learning

arXiv.org Machine Learning

How can we find the right graph for semi-supervised learning? In real-world applications, the choice of which edges to use for computation is the first step in any graph learning process. Interestingly, there are often many types of similarity available to choose as the edges between nodes, and the choice of edges can drastically affect the performance of downstream semi-supervised learning systems. However, despite the importance of graph design, most of the literature assumes that the graph is static. In this work, we present Grale, a scalable method we have developed to address the problem of graph design for graphs with billions of nodes. Grale operates by fusing together different measures of (potentially weak) similarity to create a graph which exhibits high task-specific homophily between its nodes. Grale is designed for running on large datasets. We have deployed Grale in more than 20 different industrial settings at Google, including datasets which have tens of billions of nodes, and hundreds of trillions of potential edges to score. By employing locality sensitive hashing techniques, we greatly reduce the number of pairs that need to be scored, allowing us to learn a task-specific model and build the associated nearest neighbor graph for such datasets in hours, rather than the days or even weeks that might be required otherwise. We illustrate this through a case study where we examine the application of Grale to an abuse classification problem on YouTube with hundreds of millions of items. In this application, we find that Grale detects a large number of malicious actors on top of hard-coded rules and content classifiers, increasing the total recall by 89% over those approaches alone.
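The idea of using locality sensitive hashing to cut down the number of pairs to score can be illustrated with a toy sketch. The abstract does not say which hashing scheme Grale uses, so random-hyperplane hashing is swapped in here as an assumption, and the names `make_hyperplanes`, `signature`, and `candidate_pairs` are hypothetical. Only items that share a bucket become candidate pairs, so a scoring model never sees most of the quadratic pair space.

```python
import random
from collections import defaultdict
from itertools import combinations

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )

def candidate_pairs(vectors, hyperplanes):
    """Return index pairs whose vectors hash to the same bucket."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[signature(vec, hyperplanes)].append(idx)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(members, 2))
    return pairs

if __name__ == "__main__":
    items = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [0.9, 0.1]]
    planes = make_hyperplanes(dim=2, n_bits=4)
    print(sorted(candidate_pairs(items, planes)))
```

In practice several independent hash tables are usually combined so that similar items missed by one table are caught by another; that refinement is left out of this sketch.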


Evolving Multi-label Classification Rules by Exploiting High-order Label Correlation

arXiv.org Machine Learning

In multi-label classification tasks, each problem instance is associated with multiple classes simultaneously. In such settings, the correlation between labels contains valuable information that can be used to obtain more accurate classification models. The correlation between labels can be exploited at different levels such as capturing the pair-wise correlation or exploiting the higher-order correlations. Even though the high-order approach is more capable of modeling the correlation, it is computationally more demanding and has scalability issues. This paper aims at exploiting the high-order label correlation within subsets of labels using a supervised learning classifier system (UCS). For this purpose, the label powerset (LP) strategy is employed and a prediction aggregation within the set of the relevant labels to an unseen instance is utilized to increase the prediction capability of the LP method in the presence of unseen labelsets. Exact match ratio and Hamming loss measures are considered to evaluate the rule performance and the expected fitness value of a classifier is investigated for both metrics. Also, a computational complexity analysis is provided for the proposed algorithm. The experimental results of the proposed method are compared with other well-known LP-based methods on multiple benchmark datasets and confirm the competitive performance of this method.
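The label powerset encoding and the two evaluation measures named above can be made concrete with a short sketch. It assumes labelsets are represented as Python sets of integer label indices; the function names are hypothetical, and nothing here reproduces the paper's classifier system.

```python
def label_powerset_encode(labelsets):
    """Map each distinct labelset to a single class id (label powerset)."""
    classes = {}
    ids = []
    for labels in labelsets:
        key = frozenset(labels)
        # A labelset seen for the first time gets the next free class id.
        ids.append(classes.setdefault(key, len(classes)))
    return ids, classes

def exact_match_ratio(y_true, y_pred):
    """Fraction of instances whose predicted labelset equals the true one."""
    matches = sum(set(t) == set(p) for t, p in zip(y_true, y_pred))
    return matches / len(y_true)

def hamming_loss(y_true, y_pred, n_labels):
    """Fraction of individual label decisions that are wrong."""
    wrong = sum(len(set(t) ^ set(p)) for t, p in zip(y_true, y_pred))
    return wrong / (len(y_true) * n_labels)

if __name__ == "__main__":
    y_true = [{0, 1}, {2}, {0}, {1, 2}]
    y_pred = [{0, 1}, {0, 2}, {1}, {1, 2}]
    print(label_powerset_encode(y_true)[0])        # [0, 1, 2, 3]
    print(exact_match_ratio(y_true, y_pred))       # 0.5
    print(hamming_loss(y_true, y_pred, n_labels=3))  # 0.25
```

The example shows why the two measures can disagree: the second prediction is wrong under exact match but costs only one of its three label decisions under Hamming loss.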


Supervised learning on heterogeneous, attributed entities interacting over time

arXiv.org Machine Learning

Most physical or social phenomena can be represented by ontologies where the constituent entities are interacting in various ways with each other and with their environment. Furthermore, those entities are likely heterogeneous and attributed with features that evolve dynamically in time as a response to their successive interactions. In order to apply machine learning on such entities, e.g., for classification purposes, one therefore needs to integrate the interactions into the feature engineering in a systematic way. This proposal shows how, to this end, the current state of graph machine learning remains inadequate and needs to be augmented with a comprehensive feature engineering paradigm in space and time.


MetAL: Active Semi-Supervised Learning on Graphs via Meta Learning

arXiv.org Machine Learning

The objective of active learning (AL) is to train classification models with fewer labeled instances by selecting only the most informative instances for labeling. The AL algorithms designed for other data types such as images and text do not perform well on graph-structured data. Although a few heuristics-based AL algorithms have been proposed for graphs, a principled approach is lacking. In this paper, we propose MetAL, an AL approach that selects unlabeled instances that directly improve the future performance of a classification model. For a semi-supervised learning problem, we formulate the AL task as a bilevel optimization problem. Based on recent work in meta-learning, we use the meta-gradients to approximate the impact of retraining the model with any unlabeled instance on the model performance. Using multiple graph datasets belonging to different domains, we demonstrate that MetAL efficiently outperforms existing state-of-the-art AL algorithms.


How does Artificial Intelligence Contribute to Robotic System Design?

#artificialintelligence

Artificial intelligence is en route to changing all industries, and the robotics industry is no exception. Presently, the innovative combination of AI and robotics has created a number of futuristic possibilities in all industry domains. While most of us will agree that most robots will be humanoids 10 years from now, in many environments robots are designed to emulate a range of behaviors, and their physical abilities will reflect a best fit for those characteristics. An exception will likely be robots that provide medical or other care or companionship for humans, and perhaps service robots that are meant to establish a more personal and 'humanized' relationship. Though related, some would argue that the correct term is machine vision or robot vision rather than computer vision, because "robots seeing" involves more than just computer algorithms; engineers and roboticists also have to account for camera hardware that allows robots to process physical data.


What is important about the No Free Lunch theorems?

arXiv.org Machine Learning

The No Free Lunch theorems prove that under a uniform distribution over induction problems (search problems or learning problems), all induction algorithms perform equally. As I discuss in this chapter, the importance of the theorems arises by using them to analyze scenarios involving non-uniform distributions, and to compare different algorithms, without any assumption about the distribution over problems at all. In particular, the theorems prove that anti-cross-validation (choosing among a set of candidate algorithms based on which has the worst out-of-sample behavior) performs as well as cross-validation, unless one makes an assumption -- which has never been formalized -- about how the distribution over induction problems, on the one hand, is related to the set of algorithms one is choosing among using (anti-)cross-validation, on the other. In addition, they establish strong caveats concerning the significance of the many results in the literature which establish the strength of a particular algorithm without assuming a particular distribution. They also motivate a "dictionary" between supervised learning and blackbox optimization, which allows one to "translate" techniques from supervised learning into the domain of blackbox optimization, thereby strengthening blackbox optimization algorithms. In addition to these topics, I also briefly discuss their implications for philosophy of science.
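The uniform-distribution statement can be checked by brute force on a tiny problem. The sketch below is a toy illustration, not the theorems' proof: it assumes a four-point input domain, binary targets, a fixed two-point training set, and two hypothetical learners (`majority_learner` and its deliberate opposite). Averaged over all 16 possible target functions, both learners have the same off-training-set error.

```python
from itertools import product

DOMAIN = [0, 1, 2, 3]   # toy input space (assumption)
TRAIN_X = [0, 1]        # points whose labels the learner sees
TEST_X = [2, 3]         # off-training-set points used for evaluation

def majority_learner(train):
    """Predict the most common training label everywhere (ties go to 1)."""
    ones = sum(y for _, y in train)
    label = 1 if 2 * ones >= len(train) else 0
    return lambda x: label

def anti_majority_learner(train):
    """Predict the opposite of what majority_learner would predict."""
    ones = sum(y for _, y in train)
    label = 0 if 2 * ones >= len(train) else 1
    return lambda x: label

def mean_ots_error(learner):
    """Average off-training-set error over every binary target on DOMAIN."""
    errors = []
    for values in product([0, 1], repeat=len(DOMAIN)):
        target = dict(zip(DOMAIN, values))
        train = [(x, target[x]) for x in TRAIN_X]
        hypothesis = learner(train)
        wrong = sum(hypothesis(x) != target[x] for x in TEST_X)
        errors.append(wrong / len(TEST_X))
    return sum(errors) / len(errors)

if __name__ == "__main__":
    print(mean_ots_error(majority_learner))       # 0.5
    print(mean_ots_error(anti_majority_learner))  # 0.5
```

The equality holds because, once the training labels are fixed, the unseen labels still range uniformly over every combination, so any fixed prediction is wrong on exactly half of them. A non-uniform weighting over targets would break the tie, which is the regime the chapter emphasizes.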


Self-supervised Learning: Generative or Contrastive

arXiv.org Machine Learning

Deep supervised learning has achieved great success in the last decade. However, its deficiencies of dependence on manual labels and vulnerability to attacks have driven people to explore a better solution. As an alternative, self-supervised learning attracts many researchers for its soaring performance on representation learning in the last several years. Self-supervised representation learning leverages input data itself as supervision and benefits almost all types of downstream tasks. In this survey, we take a look into new self-supervised learning methods for representation in computer vision, natural language processing, and graph learning. We comprehensively review the existing empirical methods and summarize them into three main categories according to their objectives: generative, contrastive, and generative-contrastive (adversarial). We further investigate related theoretical analysis work to provide deeper thoughts on how self-supervised learning works. Finally, we briefly discuss open problems and future directions for self-supervised learning. An outline slide for the survey is provided.


Mapping the world to help aid workers, with weakly, semi-supervised learning

#artificialintelligence

When disaster or disease strikes, relief agencies respond more effectively when they have detailed mapping tools to know exactly where to deliver assistance. But extremely reliable and precise maps often are not available. So, our team, composed of artificial intelligence researchers and data scientists in Facebook's Boston office, used our computer vision expertise to create and share population density maps that are more accurate and higher resolution than any of their predecessors. Building on our previous publication of similar high-resolution population maps for 22 countries, we're now releasing new maps of the majority of the African continent, and the project will eventually map nearly the whole world's population. When it is completed, humanitarian agencies will be able to determine how populations are distributed even in remote areas, so that health care workers can better reach households and relief workers can better distribute aid.