Goto

Collaborating Authors

 Accuracy


Sparse Modeling-Based Sequential Ensemble Learning for Effective Outlier Detection in High-Dimensional Numeric Data

AAAI Conferences

The large proportion of irrelevant or noisy features in real-life high-dimensional data presents a significant challenge to subspace/feature selection-based high-dimensional outlier detection (a.k.a. outlier scoring) methods. These methods often perform the two dependent tasks: relevant feature subset search and outlier scoring independently, consequently retaining features/subspaces irrelevant to the scoring method and downgrading the detection performance. This paper introduces a novel sequential ensemble-based framework SEMSE and its instance CINFO to address this issue. SEMSE learns the sequential ensembles to mutually refine feature selection and outlier scoring by iterative sparse modeling with outlier scores as the pseudo target feature. CINFO instantiates SEMSE by using three successive recurrent components to build such sequential ensembles. Given outlier scores output by an existing outlier scoring method on a feature subset, CINFO first defines a Cantelli's inequality-based outlier thresholding function to select outlier candidates with a false positive upper bound. It then performs lasso-based sparse regression by treating the outlier scores as the target feature and the original features as predictors on the outlier candidate set to obtain a feature subset that is tailored for the outlier scoring method. Our experiments show that two different outlier scoring methods enabled by CINFO (i) perform significantly better on 11 real-life high-dimensional data sets, and (ii) have much better resilience to noisy features, compared to their bare versions and three state-of-the-art competitors. The source code of CINFO is available at https://sites.google.com/site/gspangsite/sourcecode.


Matrix Variate Gaussian Mixture Distribution Steered Robust Metric Learning

AAAI Conferences

Mahalanobis Metric Learning (MML) has been actively studied recently in machine learning community. Most of existing MML methods aim to learn a powerful Mahalanobis distance for computing similarity of two objects. More recently, multiple methods use matrix norm regularizers to constrain the learned distance matrixMto improve the performance. However, in real applications, the structure of the distance matrix M is complicated and cannot be characterized well by the simple matrix norm. In this paper, we propose a novel robust metric learning method with learning the structure of the distance matrix in a new and natural way. We partition M into blocks and consider each block as a random matrix variate, which is fitted by matrix variate Gaussian mixture distribution. Different from existing methods, our model has no any assumption on M and automatically learns the structure of M from the real data, where the distance matrix M often is neither sparse nor low-rank. We design an effective algorithm to optimize the proposed model and establish the corresponding theoretical guarantee. We conduct extensive evaluations on the real-world data. Experimental results show our method consistently outperforms the related state-of-the-art methods.


Batchwise Patching of Classifiers

AAAI Conferences

In this work we present classifier patching, an approach for adapting an existing black-box classification model to new data. Instead of creating a new model, patching infers regions in the instance space where the existing model is error-prone by training a classifier on the previously misclassified data. It then learns a specific model to determine the error regions, which allows to patch the old model’s predictions for them. Patching relies on a strong, albeit unchangeable, existing base classifier, and the idea that the true labels of seen instances will be available in batches at some point in time after the original classification. We experimentally evaluate our approach, and show that it meets the original design goals. Moreover, we compare our approach to existing methods from the domain of ensemble stream classification in both concept drift and transfer learning situations. Patching adapts quickly and achieves high classification accuracy, outperforming state-of-the-art competitors in either adaptation speed or accuracy in many scenarios.


Non-Discriminatory Machine Learning Through Convex Fairness Criteria

AAAI Conferences

Biased decision making by machine learning systems is increasingly recognized as an important issue. Recently, techniques have been proposed to learn non-discriminatory clas- sifiers by enforcing constraints in the training phase. Such constraints are either non-convex in nature (posing computational difficulties) or don’t have a clear probabilistic interpretation. Moreover, the techniques offer little understanding of the more subjective notion of fairness. In this paper, we introduce a novel technique to achieve non-discrimination without sacrificing convexity and probabilistic interpretation. Our experimental analysis demonstrates the success of the method on popular real datasets including ProPublica’s COMPAS dataset. We also propose a new notion of fairness for machine learning and show that our technique satisfies this subjective fairness criterion.


Fair Inference on Outcomes

AAAI Conferences

In this paper, we consider the problem of fair statistical inference involving outcome variables. Examples include classification and regression problems, and estimating treatment effects in randomized trials or observational data. The issue of fairness arises in such problems where some covariates or treatments are "sensitive," in the sense of having potential of creating discrimination. In this paper, we argue that the presence of discrimination can be formalized in a sensible way as the presence of an effect of a sensitive covariate on the outcome along certain causal pathways, a view which generalizes (Pearl 2009). A fair outcome model can then be learned by solving a constrained optimization problem. We discuss a number of complications that arise in classical statistical inference due to this view and provide workarounds based on recent work in causal and semi-parametric inference.


AdaFlock: Adaptive Feature Discovery for Human-in-the-loop Predictive Modeling

AAAI Conferences

Feature engineering is the key to successful application of machine learning algorithms to real-world data. The discovery of informative features often requires domain knowledge or human inspiration, and data scientists expend a certain amount of effort into exploring feature spaces. Crowdsourcing is considered a promising approach for allowing many people to be involved in feature engineering; however, there is a demand for a sophisticated strategy that enables us to acquire good features at a reasonable crowdsourcing cost. In this paper, we present a novel algorithm called AdaFlock to efficiently obtain informative features through crowdsourcing. AdaFlock is inspired by AdaBoost, which iteratively trains classifiers by increasing the weights of samples misclassified by previous classifiers. AdaFlock iteratively generates informative features; at each iteration of AdaFlock, crowdsourcing workers are shown samples selected according to the classification errors of the current classifiers and are asked to generate new features that are helpful for correctly classifying the given examples. The results of our experiments conducted using real datasets indicate that AdaFlock successfully discovers informative features with fewer iterations and achieves high classification accuracy.


PVL: A Framework for Navigating the Precision-Variety Trade-Off in Automated Animation of Smiles

AAAI Conferences

Animating digital characters has an important role in computer assisted experiences, from video games to movies to interactive robotics. A critical challenge in the field is to generate animations which accurately reflect the state of the animated characters, without looking repetitive or unnatural. In this work, we investigate the problem of procedurally generating a diverse variety of facial animations that express a given semantic quality (e.g., very happy). To that end, we introduce a new learning heuristic called Precision Variety Learning (PVL) which actively identifies and exploits the fundamental trade-off between precision (how accurate positive labels are) and variety (how diverse the set of positive labels is). We both identify conditions where important theoretical properties can be guaranteed, and show good empirical performance in variety of conditions. Lastly, we apply our PVL heuristic to our motivating problem of generating smile animations, and perform several user studies to validate the ability of our method to produce a perceptually diverse variety of smiles for different target intensities.


On Validation and Predictability of Digital Badges’ Influence on Individual Users

AAAI Conferences

Badges are a common, and sometimes the only, method of incentivizing users to perform certain actions on on- line sites. However, due to many competing factors influencing user temporal dynamics, it is difficult to determine whether the badge had (or will have) the intended effect or not. In this paper, we introduce two complementary approaches for determining badge influence on users. In the first one, we cluster users’ temporal traces (represented with Poisson processes) and apply covariates (user features) to regularize results. In the second approach, we first classify users’ temporal traces with a novel statistical framework, and then we refine the classification results with a semi-supervised clustering of covariates. Outcomes obtained from an evaluation on synthetic datasets and experiments on two badges from a pop- ular Q&A platform confirm that it is possible to validate, characterize and to some extent predict users affected by the badge.


Neural Link Prediction over Aligned Networks

AAAI Conferences

Link prediction is a fundamental problem with a wide range of applications in various domains, which predicts the links that are not yet observed or the links that may appear in the future. Most existing works in this field only focus on modeling a single network, while real-world networks are actually aligned with each other. Network alignments contain valuable additional information for understanding the networks, and provide a new direction for addressing data insufficiency and alleviating cold start problem. However, there are rare works leveraging network alignments for better link prediction. Besides, neural network is widely employed in various domains while its capability of capturing high-level patterns and correlations for link prediction problem has not been adequately researched yet. Hence, in this paper we target atlink prediction over aligned networks using neural networks. The major challenge is the heterogeneousness of the considered networks, as the networks may have different characteristics, link purposes, etc. To overcome this, we propose a novel multi-neural-network framework MNN, where we have one individual neural network for each heterogeneous target or feature while the vertex representations are shared. We further discuss training methods for the multi-neural-network framework. Extensive experiments demonstrate that MNN outperforms the state-of-the-art methods and achieves 3% to 5% relative improvement of AUC score across different settings, particularly over 8% for cold start scenarios.


Beyond Distributive Fairness in Algorithmic Decision Making: Feature Selection for Procedurally Fair Learning

AAAI Conferences

With widespread use of machine learning methods in numerous domains involving humans, several studies have raised questions about the potential for unfairness towards certain individuals or groups. A number of recent works have proposed methods to measure and eliminate unfairness from machine learning models. However, most of this work has focused on only one dimension of fair decision making: distributive fairness, i.e., the fairness of the decision outcomes. In this work, we leverage the rich literature on organizational justice and focus on another dimension of fair decision making: procedural fairness, i.e., the fairness of the decision making process. We propose measures for procedural fairness that consider the input features used in the decision process, and evaluate the moral judgments of humans regarding the use of these features. We operationalize these measures on two real world datasets using human surveys on the Amazon Mechanical Turk (AMT) platform, demonstrating that our measures capture important properties of procedurally fair decision making. We provide fast submodular mechanisms to optimize the tradeoff between procedural fairness and prediction accuracy. On our datasets, we observe empirically that procedural fairness may be achieved with little cost to outcome fairness, but that some loss of accuracy is unavoidable.