Inductive Learning
Generalized Matrix Means for Semi-Supervised Learning with Multilayer Graphs
Mercado, Pedro, Tudisco, Francesco, Hein, Matthias
We study the task of semi-supervised learning on multilayer graphs by taking into account both labeled and unlabeled observations together with the information encoded by each individual graph layer. We propose a regularizer based on the generalized matrix mean, which is a one-parameter family of matrix means that includes the arithmetic, geometric and harmonic means as particular cases. We analyze it in expectation under a Multilayer Stochastic Block Model and verify numerically that it outperforms state of the art methods. Moreover, we introduce a matrix-free numerical scheme based on contour integral quadratures and Krylov subspace solvers that scales to large sparse multilayer graphs.
Learning from Label Proportions with Consistency Regularization
Tsai, Kuen-Han, Lin, Hsuan-Tien
The problem of learning from label proportions (LLP) involves training classifiers with weak labels on bags of instances, rather than strong labels on individual instances. The weak labels only contain the label proportion of each bag. The LLP problem is important for many practical applications that only allow label proportions to be collected because of data privacy or annotation cost, and has recently received lots of research attention. Most existing works focus on extending supervised learning models to solve the LLP problem, but the weak learning nature makes it hard to further improve LLP performance with a supervised angle. In this paper, we take a different angle from semi-supervised learning. In particular, we propose a novel model inspired by consistency regularization, a popular concept in semi-supervised learning that encourages the model to produce a decision boundary that better describes the data manifold. With the introduction of consistency regularization, we further extend our study to non-uniform bag-generation and validation-based parameter-selection procedures that better match practical needs. Experiments not only justify that LLP with consistency regularization achieves superior performance, but also demonstrate the practical usability of the proposed procedures.
Distribution Density, Tails, and Outliers in Machine Learning: Metrics and Applications
Carlini, Nicholas, Erlingsson, รlfar, Papernot, Nicolas
We develop techniques to quantify the degree to which a given (training or testing) example is an outlier in the underlying distribution. We evaluate five methods to score examples in a dataset by how well-represented the examples are, for different plausible definitions of "well-represented", and apply these to four common datasets: MNIST, Fashion-MNIST, CIFAR-10, and ImageNet. Despite being independent approaches, we find all five are highly correlated, suggesting that the notion of being well-represented can be quantified. Among other uses, we find these methods can be combined to identify (a) prototypical examples (that match human expectations); (b) memorized training examples; and, (c) uncommon submodes of the dataset. Further, we show how we can utilize our metrics to determine an improved ordering for curriculum learning, and impact adversarial robustness. We release all metric values on training and test sets we studied.
Model enhancement and personalization using weakly supervised learning for multi-modal mobile sensing
Teng, Diyan, Kulkarni, Rashmi, McGloin, Justin
Always-on sensing of mobile device user's contextual information is critical to many intelligent use cases nowadays such as healthcare, drive assistance, voice UI. State-of-the-art approaches for predicting user context have proved the value to leverage multiple sensing modalities for better accuracy. However, those context inference algorithms that run on application processor nowadays tend to drain heavy amount of power, making them not suitable for an always-on implementation. We claim that not every sensing modality is suitable to be activated all the time and it remains challenging to build an inference engine using power friendly sensing modalities. Meanwhile, due to the diverse population, we find it challenging to learn a context inference model that generalizes well, with limited training data, especially when only using always-on low power sensors. In this work, we propose an approach to leverage the opportunistically-on counterparts in device to improve the always-on prediction model, leading to a personalized solution. We model this problem using a weakly supervised learning framework and provide both theoretical and experimental results to validate our design. The proposed framework achieves satisfying result in the IMU based activity recognition application we considered.
Gait Event Detection in Tibial Acceleration Profiles: a Structured Learning Approach
Robberechts, Pieter, Derie, Rud, Berghe, Pieter Van den, Gerlo, Joeri, De Clercq, Dirk, Segers, Veerle, Davis, Jesse
Analysis of runner's data will often examine gait variables with reference to one or more gait events. Two such representative events are the initial contact and toe off events. These correspond respectively to the moments in time when the foot makes the initial contact with the ground and when the foot leaves the ground again. These variables are traditionally measured with a force plate or motion capture system in a lab setting. However, thanks to recent evolutions in wearable technology, the use of accelerometers has become commonplace for prolonged outdoor measurements. Previous research has developed heuristic methods to identify the initial contact and toe off timings based on minima, maxima and thresholds in the acceleration profiles. A significant flaw of these heuristic-based methods is that they are tailored to very specific acceleration profiles, providing no guidelines on how to handle deviant profiles. Therefore, we frame the problem as a structured prediction task and propose a machine learning approach for determining initial foot contact and toe off events from 3D tibial acceleration profiles. With mean absolute errors of 2 ms and 4 ms for respectively the initial contact and toe-off events, our method significantly outperforms the existing heuristic approaches.
PRNet: Self-Supervised Learning for Partial-to-Partial Registration
We present a simple, flexible, and general framework titled Partial Registration Network (PRNet), for partial-to-partial point cloud registration. Inspired by recently-proposed learning-based methods for registration, we use deep networks to tackle non-convexity of the alignment and partial correspondence problems. While previous learning-based methods assume the entire shape is visible, PRNet is suitable for partial-to-partial registration, outperforming PointNetLK, DCP, and non-learning methods on synthetic data. PRNet is self-supervised, jointly learning an appropriate geometric representation, a keypoint detector that finds points in common between partial views, and keypoint-to-keypoint correspondences. We show PRNet predicts keypoints and correspondences consistently across views and objects. Furthermore, the learned representation is transferable to classification.
AMP0: Species-Specific Prediction of Anti-microbial Peptides using Zero and Few Shot Learning
The evolution of drug-resistant microbial species is one of the major challenges to global health. The development of new antimicrobial treatments such as antimicrobial peptides needs to be accelerated to combat this threat. However, the discovery of novel antimicrobial peptides is hampered by low-throughput biochemical assays. Computational techniques can be used for rapid screening of promising antimicrobial peptide candidates prior to testing in the wet lab. The vast majority of existing antimicrobial peptide predictors are non-targeted in nature, i.e., they can predict whether a given peptide sequence is antimicrobial, but they are unable to predict whether the sequence can target a particular microbial species. In this work, we have developed a targeted antimicrobial peptide activity predictor that can predict whether a peptide is effective against a given microbial species or not. This has been made possible through zero-shot and few-shot machine learning. The proposed predictor called AMP0 takes in the peptide amino acid sequence and any N/C-termini modifications together with the genomic sequence of a target microbial species to generate targeted predictions. It is important to note that the proposed method can generate predictions for species that are not part of its training set. The accuracy of predictions for novel test species can be further improved by providing a few example peptides for that species. Our computational cross-validation results show that the pro-posed scheme is particularly effective for targeted antimicrobial prediction in comparison to existing approaches and can be used for screening potential antimicrobial peptides in a targeted manner especially for cases in which the number of training examples is small. The webserver of the method is available at http://ampzero.pythonanywhere.com.
Learning Data Manipulation for Augmentation and Weighting
Hu, Zhiting, Tan, Bowen, Salakhutdinov, Ruslan, Mitchell, Tom, Xing, Eric P.
Manipulating data, such as weighting data examples or augmenting with new instances, has been increasingly used to improve model training. Previous work has studied various rule- or learning-based approaches designed for specific types of data manipulation. In this work, we propose a new method that supports learning different manipulation schemes with the same gradient-based algorithm. Our approach builds upon a recent connection of supervised learning and reinforcement learning (RL), and adapts an off-the-shelf reward learning algorithm from RL for joint data manipulation learning and model training. Different parameterization of the "data reward" function instantiates different manipulation schemes. We showcase data augmentation that learns a text transformation network, and data weighting that dynamically adapts the data sample importance. Experiments show the resulting algorithms significantly improve the image and text classification performance in low data regime and class-imbalance problems.
SUPER Learning: A Supervised-Unsupervised Framework for Low-Dose CT Image Reconstruction
Li, Zhipeng, Ye, Siqi, Long, Yong, Ravishankar, Saiprasad
Recent years have witnessed growing interest in machine learning-based models and techniques for low-dose X-ray CT (LDCT) imaging tasks. The methods can typically be categorized into supervised learning methods and unsupervised or model-based learning methods. Supervised learning methods have recently shown success in image restoration tasks. However, they often rely on large training sets. Model-based learning methods such as dictionary or transform learning do not require large or paired training sets and often have good generalization properties, since they learn general properties of CT image sets. Recent works have shown the promising reconstruction performance of methods such as PWLS-ULTRA that rely on clustering the underlying (reconstructed) image patches into a learned union of transforms. In this paper, we propose a new Supervised-UnsuPERvised (SUPER) reconstruction framework for LDCT image reconstruction that combines the benefits of supervised learning methods and (unsupervised) transform learning-based methods such as PWLS-ULTRA that involve highly image-adaptive clustering. The SUPER model consists of several layers, each of which includes a deep network learned in a supervised manner and an unsupervised iterative method that involves image-adaptive components. The SUPER reconstruction algorithms are learned in a greedy manner from training data. The proposed SUPER learning methods dramatically outperform both the constituent supervised learning-based networks and iterative algorithms for LDCT, and use much fewer iterations in the iterative reconstruction modules.
Classifying Rare Events Using Five Machine Learning Techniques
Supervised learning is the crown jewel of Machine Learning. Supervised learning is the machine learning task or process of producing a function that predicts output variables. It has been adopted widely in the industry. For example, banks apply supervised models to detect credit card fraud. Quantitative traders make purchase decisions based on ML model predictions.