Performance Analysis
Matrix Variate Gaussian Mixture Distribution Steered Robust Metric Learning
Luo, Lei (University of Pittsburgh) | Huang, Heng (University of Pittsburgh)
Mahalanobis Metric Learning (MML) has been actively studied recently in the machine learning community. Most existing MML methods aim to learn a powerful Mahalanobis distance for computing the similarity of two objects. More recently, multiple methods use matrix norm regularizers to constrain the learned distance matrix M to improve performance. However, in real applications, the structure of the distance matrix M is complicated and cannot be characterized well by a simple matrix norm. In this paper, we propose a novel robust metric learning method that learns the structure of the distance matrix in a new and natural way. We partition M into blocks and consider each block as a random matrix variate, which is fitted by a matrix variate Gaussian mixture distribution. Different from existing methods, our model makes no assumptions about M and automatically learns the structure of M from real data, where the distance matrix M is often neither sparse nor low-rank. We design an effective algorithm to optimize the proposed model and establish the corresponding theoretical guarantee. We conduct extensive evaluations on real-world data. Experimental results show that our method consistently outperforms the related state-of-the-art methods.
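For readers unfamiliar with the quantity being learned, the Mahalanobis distance between x and y under a positive semidefinite matrix M is sqrt((x - y)^T M (x - y)). The sketch below only evaluates that standard formula in plain Python; it does not implement the block-wise Gaussian mixture model of the paper, and the function name `mahalanobis` and the example matrices are illustrative assumptions.

```python
import math

def mahalanobis(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for vectors x, y and a square matrix M.

    M is assumed to be symmetric positive semidefinite, so the quadratic
    form under the square root is non-negative.
    """
    diff = [a - b for a, b in zip(x, y)]
    # Matrix-vector product M @ diff, computed row by row.
    m_diff = [sum(m_ij * d_j for m_ij, d_j in zip(row, diff)) for row in M]
    quad = sum(d_i * md_i for d_i, md_i in zip(diff, m_diff))
    return math.sqrt(quad)

# Illustrative matrices (not from the paper).
identity = [[1.0, 0.0], [0.0, 1.0]]   # reduces to the Euclidean distance
stretched = [[4.0, 0.0], [0.0, 1.0]]  # weights the first coordinate more heavily

d_euclid = mahalanobis([0.0, 0.0], [3.0, 4.0], identity)
d_stretch = mahalanobis([0.0, 0.0], [3.0, 4.0], stretched)
```

With the identity matrix the distance is the ordinary Euclidean one; a learned M changes which directions in feature space count as "far".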
A Probabilistic Hierarchical Model for Multi-View and Multi-Feature Classification
Li, Jinxing (The Hong Kong Polytechnic University) | Yong, Hongwei (The Hong Kong Polytechnic University) | Zhang, Bob (University of Macau) | Li, Mu (The Hong Kong Polytechnic University) | Zhang, Lei (The Hong Kong Polytechnic University) | Zhang, David (The Hong Kong Polytechnic University)
Some recent works in classification show that data obtained from various views of an object with different sensors contributes to achieving remarkable performance. In many real-world applications, each view often contains multiple features, which means that this type of data has a hierarchical structure, yet most existing works do not take this multi-layer structure into consideration. In this paper, a probabilistic hierarchical model is proposed to address this issue and applied to classification. In our model, a latent variable is first learned to fuse the multiple features obtained from the same view, sensor or modality. In particular, mapping matrices corresponding to a certain view are estimated to project the latent variable from a shared space to the multiple observations. Since this method is designed for supervised learning, we assume that the latent variables associated with different views are influenced by their ground-truth label. To fit the proposed model effectively, the Expectation-Maximization (EM) algorithm is applied to estimate the parameters and latent variables. Experimental results on extensive synthetic data and two real-world datasets substantiate the effectiveness and superiority of our approach compared with state-of-the-art methods.
Batchwise Patching of Classifiers
Kauschke, Sebastian (TU Darmstadt) | Fürnkranz, Johannes (TU Darmstadt)
In this work we present classifier patching, an approach for adapting an existing black-box classification model to new data. Instead of creating a new model, patching infers regions in the instance space where the existing model is error-prone by training a classifier on the previously misclassified data. It then learns a specific model to determine the error regions, which allows it to patch the old model's predictions for them. Patching relies on a strong, albeit unchangeable, existing base classifier, and the idea that the true labels of seen instances will be available in batches at some point in time after the original classification. We experimentally evaluate our approach, and show that it meets the original design goals. Moreover, we compare our approach to existing methods from the domain of ensemble stream classification in both concept drift and transfer learning situations. Patching adapts quickly and achieves high classification accuracy, outperforming state-of-the-art competitors in either adaptation speed or accuracy in many scenarios.
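The patching idea described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the abstract does not say which learners are used, so a 1-nearest-neighbour lookup stands in for both the error-region detector and the patch model, and the names `nearest_label`, `fit_patch` and `base_predict` are hypothetical.

```python
def nearest_label(x, points, labels):
    """Return the label of the point closest to x (squared Euclidean distance)."""
    best = min(
        range(len(points)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(x, points[i])),
    )
    return labels[best]

def fit_patch(base_predict, batch_x, batch_y):
    """Build a patched predictor from a labelled batch without changing the base model."""
    # Step 1: flag the batch instances that the black-box model got wrong.
    is_error = [base_predict(x) != y for x, y in zip(batch_x, batch_y)]
    # Step 2: keep the misclassified instances to train the patch model.
    err_x = [x for x, e in zip(batch_x, is_error) if e]
    err_y = [y for y, e in zip(batch_y, is_error) if e]

    def patched_predict(x):
        # Error-region detector (assumed 1-NN): is x near a known mistake?
        if err_x and nearest_label(x, batch_x, is_error):
            # Patch model (assumed 1-NN over the misclassified instances).
            return nearest_label(x, err_x, err_y)
        # Outside the error regions the base prediction is kept as is.
        return base_predict(x)

    return patched_predict

# Toy black-box model: predicts class 1 whenever the single feature is positive.
def base_predict(x):
    return 1 if x[0] > 0 else 0

# Later batch with true labels; the concept has drifted for large feature values.
batch_x = [(-1.0,), (1.0,), (3.0,), (4.0,)]
batch_y = [0, 1, 0, 0]

patched = fit_patch(base_predict, batch_x, batch_y)
```

In this toy setting `patched((3.5,))` falls inside the detected error region and is overridden, while `patched((0.5,))` still returns the base model's answer.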
Non-Discriminatory Machine Learning Through Convex Fairness Criteria
Goel, Naman (EPFL, Lausanne) | Yaghini, Mohammad (EPFL, Lausanne) | Faltings, Boi (EPFL, Lausanne)
Biased decision making by machine learning systems is increasingly recognized as an important issue. Recently, techniques have been proposed to learn non-discriminatory classifiers by enforcing constraints in the training phase. Such constraints are either non-convex in nature (posing computational difficulties) or don't have a clear probabilistic interpretation. Moreover, the techniques offer little understanding of the more subjective notion of fairness. In this paper, we introduce a novel technique to achieve non-discrimination without sacrificing convexity and probabilistic interpretation. Our experimental analysis demonstrates the success of the method on popular real datasets including ProPublica's COMPAS dataset. We also propose a new notion of fairness for machine learning and show that our technique satisfies this subjective fairness criterion.
Fair Inference on Outcomes
Nabi, Razieh (Johns Hopkins University) | Shpitser, Ilya (Johns Hopkins University)
In this paper, we consider the problem of fair statistical inference involving outcome variables. Examples include classification and regression problems, and estimating treatment effects in randomized trials or observational data. The issue of fairness arises in such problems where some covariates or treatments are "sensitive," in the sense of having potential of creating discrimination. In this paper, we argue that the presence of discrimination can be formalized in a sensible way as the presence of an effect of a sensitive covariate on the outcome along certain causal pathways, a view which generalizes (Pearl 2009). A fair outcome model can then be learned by solving a constrained optimization problem. We discuss a number of complications that arise in classical statistical inference due to this view and provide workarounds based on recent work in causal and semi-parametric inference.
AdaFlock: Adaptive Feature Discovery for Human-in-the-loop Predictive Modeling
Takahama, Ryusuke (scouty Inc.) | Baba, Yukino (Kyoto University) | Shimizu, Nobuyuki (Yahoo Japan Corporation) | Fujita, Sumio (Yahoo Japan Corporation) | Kashima, Hisashi (Kyoto University)
Feature engineering is the key to successful application of machine learning algorithms to real-world data. The discovery of informative features often requires domain knowledge or human inspiration, and data scientists expend a certain amount of effort into exploring feature spaces. Crowdsourcing is considered a promising approach for allowing many people to be involved in feature engineering; however, there is a demand for a sophisticated strategy that enables us to acquire good features at a reasonable crowdsourcing cost. In this paper, we present a novel algorithm called AdaFlock to efficiently obtain informative features through crowdsourcing. AdaFlock is inspired by AdaBoost, which iteratively trains classifiers by increasing the weights of samples misclassified by previous classifiers. AdaFlock iteratively generates informative features; at each iteration of AdaFlock, crowdsourcing workers are shown samples selected according to the classification errors of the current classifiers and are asked to generate new features that are helpful for correctly classifying the given examples. The results of our experiments conducted using real datasets indicate that AdaFlock successfully discovers informative features with fewer iterations and achieves high classification accuracy.
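The AdaBoost reweighting step that AdaFlock takes as its inspiration can be sketched as below. This is the textbook AdaBoost update for labels in {-1, +1}, not AdaFlock itself; the final step, which picks the highest-weight sample as the one to show to crowd workers, is an assumption made for illustration, as are the names `adaboost_reweight` and `hardest`.

```python
import math

def adaboost_reweight(weights, y_true, y_pred):
    """One textbook AdaBoost step: return (new normalized weights, classifier weight alpha)."""
    total = sum(weights)
    # Weighted error rate of the current classifier.
    eps = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    # Clamp to avoid division by zero or log(0) for perfect or useless classifiers.
    eps = min(max(eps, 1e-12), 1.0 - 1e-12)
    alpha = 0.5 * math.log((1.0 - eps) / eps)
    # Increase the weight of misclassified samples, decrease the others.
    raw = [
        w * math.exp(alpha if t != p else -alpha)
        for w, t, p in zip(weights, y_true, y_pred)
    ]
    norm = sum(raw)
    return [w / norm for w in raw], alpha

# Four samples with uniform weights; the classifier misclassifies sample 1.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 1, -1, -1]
y_pred = [1, -1, -1, -1]

new_weights, alpha = adaboost_reweight(weights, y_true, y_pred)

# Assumed selection rule: the sample with the largest weight is the one
# that would be shown to workers when asking for a new feature.
hardest = max(range(len(new_weights)), key=lambda i: new_weights[i])
```

After the update the misclassified sample carries half of the total weight, which is the standard AdaBoost property that focuses the next round on current mistakes.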
PVL: A Framework for Navigating the Precision-Variety Trade-Off in Automated Animation of Smiles
Sohre, Nicholas (University of Minnesota) | Adeagbo, Moses (University of Minnesota) | Helwig, Nathaniel (University of Minnesota) | Lyford-Pike, Sofia (University of Minnesota) | Guy, Stephen J. (University of Minnesota)
Animating digital characters has an important role in computer assisted experiences, from video games to movies to interactive robotics. A critical challenge in the field is to generate animations which accurately reflect the state of the animated characters, without looking repetitive or unnatural. In this work, we investigate the problem of procedurally generating a diverse variety of facial animations that express a given semantic quality (e.g., very happy). To that end, we introduce a new learning heuristic called Precision Variety Learning (PVL) which actively identifies and exploits the fundamental trade-off between precision (how accurate positive labels are) and variety (how diverse the set of positive labels is). We both identify conditions where important theoretical properties can be guaranteed, and show good empirical performance in a variety of conditions. Lastly, we apply our PVL heuristic to our motivating problem of generating smile animations, and perform several user studies to validate the ability of our method to produce a perceptually diverse variety of smiles for different target intensities.
On Validation and Predictability of Digital Badges' Influence on Individual Users
Kuśmierczyk, Tomasz (Norwegian University of Science and Technology) | Nørvåg, Kjetil (Norwegian University of Science and Technology)
Badges are a common, and sometimes the only, method of incentivizing users to perform certain actions on online sites. However, due to many competing factors influencing user temporal dynamics, it is difficult to determine whether the badge had (or will have) the intended effect or not. In this paper, we introduce two complementary approaches for determining badge influence on users. In the first one, we cluster users' temporal traces (represented with Poisson processes) and apply covariates (user features) to regularize results. In the second approach, we first classify users' temporal traces with a novel statistical framework, and then we refine the classification results with a semi-supervised clustering of covariates. Outcomes obtained from an evaluation on synthetic datasets and experiments on two badges from a popular Q&A platform confirm that it is possible to validate, characterize and to some extent predict users affected by the badge.
Neural Link Prediction over Aligned Networks
Cao, Xuezhi (Shanghai Jiao Tong University) | Chen, Haokun (Shanghai Jiao Tong University) | Wang, Xuejian (Shanghai Jiao Tong University) | Zhang, Weinan (Shanghai Jiao Tong University) | Yu, Yong (Shanghai Jiao Tong University)
Link prediction is a fundamental problem with a wide range of applications in various domains; it predicts links that are not yet observed or that may appear in the future. Most existing works in this field focus only on modeling a single network, while real-world networks are actually aligned with each other. Network alignments contain valuable additional information for understanding the networks, and provide a new direction for addressing data insufficiency and alleviating the cold start problem. However, few works leverage network alignments for better link prediction. Besides, neural networks are widely employed in various domains, while their capability of capturing high-level patterns and correlations for the link prediction problem has not been adequately researched yet. Hence, in this paper we target link prediction over aligned networks using neural networks. The major challenge is the heterogeneity of the considered networks, as the networks may have different characteristics, link purposes, etc. To overcome this, we propose a novel multi-neural-network framework MNN, where we have one individual neural network for each heterogeneous target or feature while the vertex representations are shared. We further discuss training methods for the multi-neural-network framework. Extensive experiments demonstrate that MNN outperforms the state-of-the-art methods and achieves 3% to 5% relative improvement of AUC score across different settings, particularly over 8% for cold start scenarios.
Automated Segmentation of Overlapping Cytoplasm in Cervical Smear Images via Contour Fragments
Song, Youyi (The Hong Kong Polytechnic University) | Qin, Jing (The Hong Kong Polytechnic University) | Lei, Baiying (Shenzhen University) | Choi, Kup-Sze (The Hong Kong Polytechnic University)
We present a novel method for automated segmentation of overlapping cytoplasm in cervical smear images based on contour fragments. We formulate the segmentation problem as a graphical model, and employ the contour fragments generated from a cytoplasm clump to construct the graph. Compared with traditional methods that are based on pixels, our contour fragment-based solution can take more geometric information into account and hence generate more accurate predictions of the overlapping boundaries. We further design a novel energy function for the graph, and by minimizing the energy function, fragments that come from the same cytoplasm are selected into the same set. To construct the energy function, our fragment-based data term and pairwise term are measured from the spatial relation and shape prior, which offer more geometric information for the occluded boundary inference. Afterwards, occluded boundaries are inferred using the minimal path model, in which the shape of each individual cytoplasm is reconstructed on the selected fragment set. The constructed shape is used as a constraint to locate the search area, and curvature regularization is enforced to promote the smoothness of the inference result. The inference result, in turn, is used as the shape prior to construct a high-level shape regularization energy term of the built graph, and then the graph energy is updated. In other words, fragment selection and occluded boundary inference are processed iteratively; this interaction makes more potential shape information accessible. Using two cervical smear datasets, the performance of our method is extensively evaluated and compared with that of the state-of-the-art approaches; the results show the superiority of the proposed method.