Country
Compositional Hierarchical Tensor Factorization: Representing Hierarchical Intrinsic and Extrinsic Causal Factors
Visual objects are composed of a recursive hierarchy of perceptual wholes and parts, whose properties, such as shape, reflectance, and color, constitute a hierarchy of intrinsic causal factors of object appearance. However, object appearance is the compositional consequence of both an object's intrinsic and extrinsic causal factors, where the extrinsic causal factors are related to illumination, and imaging conditions. Therefore, this paper proposes a unified tensor model of wholes and parts, and introduces a compositional hierarchical tensor factorization that disentangles the hierarchical causal structure of object image formation, and subsumes multilinear block tensor decomposition as a special case. The resulting object representation is an interpretable combinatorial choice of wholes' and parts' representations that renders object recognition robust to occlusion and reduces training data requirements. We demonstrate ourapproach in the context of face recognition by training on an extremely reduced dataset of synthetic images, and report encouragingface verification results on two datasets - the Freiburg dataset, andthe Labeled Face in the Wild (LFW) dataset consisting of real world images, thus, substantiating the suitability of our approach for data starved domains.
Time2Graph: Revisiting Time Series Modeling with Dynamic Shapelets
Cheng, Ziqiang, Yang, Yang, Wang, Wei, Hu, Wenjie, Zhuang, Yueting, Song, Guojie
Time series modeling has attracted extensive research efforts; however, achieving both reliable efficiency and interpretability from a unified model still remains a challenging problem. Among the literature, shapelets offer interpretable and explanatory insights in the classification tasks, while most existing works ignore the differing representative power at different time slices, as well as (more importantly) the evolution pattern of shapelets. In this paper, we propose to extract time-aware shapelets by designing a two-level timing factor. Moreover, we define and construct the shapelet evolution graph, which captures how shapelets evolve over time and can be incorporated into the time series embeddings by graph embedding algorithms. To validate whether the representations obtained in this way can be applied effectively in various scenarios, we conduct experiments based on three public time series datasets, and two real-world datasets from different domains. Experimental results clearly show the improvements achieved by our approach compared with 17 state-of-the-art baselines.
Higher-order Weighted Graph Convolutional Networks
Liu, Songtao, Chen, Lingwei, Dong, Hanze, Wang, Zihao, Wu, Dinghao, Huang, Zengfeng
Graph Convolution Network (GCN) has been recognized as one of the most effective graph models for semi-supervised learning, but it extracts merely the first-order or few-order neighborhood information through information propagation, which suffers performance drop-off for deeper structure. Existing approaches that deal with the higher-order neighbors tend to take advantage of adjacency matrix power. In this paper, we assume a seemly trivial condition that the higher-order neighborhood information may be similar to that of the first-order neighbors. Accordingly, we present an unsupervised approach to describe such similarities and learn the weight matrices of higher-order neighbors automatically through Lasso that minimizes the feature loss between the first-order and higher-order neighbors, based on which we formulate the new convolutional filter for GCN to learn the better node representations. Our model, called higher-order weighted GCN(HWGCN), has achieved the state-of-the-art results on a number of node classification tasks over Cora, Citeseer and Pubmed datasets.
An empirical study of the relation between network architecture and complexity
Although we lack a rigorous definition of complexity, many agree that certain factors contribute to the complexity in a classification problem including: the dataset size in relation to the dimensionality of the data, the intrinsic ambiguity of the classes, and how compactly the decision boundary can be expressed [1]. The capacity of a network describes the complexity of the functions it can potentially model. Naturally, data with high complexity require a model with high effective capacity. In recent findings counter to the conventional wisdom, Zhang et al. [2] showed that high effective capacity models can both memorize and generalize well whereas Neyshabur et al. [3] showed that networks with higher capacity also generalize better. 1 In this paper, we perform an empirical study to characterize how generalization error relates to network capacity when the complexity of the data changes . Our aim is to improve our understanding of the capacity of various deep neural architectures and potentially help guide the design process. Instead of utilizing theoretical bounds to calculate an architecture's capacity [4] or measures based on the 1 We are concerned with a low generalization error, not a small generalization gap (the difference between test and training error).
Subspace Clustering with Active Learning
Peng, Hankui, Pavlidis, Nicos G.
Nicos G. Pavlidis Department of Management Science Lancaster University Lancaster, UK n.pavlidis@lancaster.ac.uk Abstract --Subspace clustering is a growing field of unsupervised learning that has gained much popularity in the computer vision community. Applications can be found in areas such as motion segmentation and face clustering. It assumes that data originate from a union of subspaces, and clusters the data depending on the corresponding subspace. In practice, it is reasonable to assume that a limited amount of labels can be obtained, potentially at a cost. Therefore, algorithms that can effectively and efficiently incorporate this information to improve the clustering model are desirable. In this paper, we propose an active learning framework for subspace clustering that sequentially queries informative points and updates the subspace model. The query stage of the proposed framework relies on results from the perturbation theory of principal component analysis, to identify influential and potentially misclassified points. A constrained subspace clustering algorithm is proposed that monotonically decreases the objective function subject to the constraints imposed by the labelled data. We show that our proposed framework is suitable for subspace clustering algorithms including iterative methods and spectral methods. Experiments on synthetic data sets, motion segmentation data sets, and Y ale Faces data sets demonstrate the advantage of our proposed active strategy over state-of-the-art. Index T erms --high dimensionality; active learning; subspace clustering; constrained clustering I.
Certified Data Removal from Machine Learning Models
Guo, Chuan, Goldstein, Tom, Hannun, Awni, van der Maaten, Laurens
Good data stewardship requires removal of data at the request of the data's owner. This raises the question if and how a trained machine-learning model, which implicitly stores information about its training data, should be affected by such a removal request. Is it possible to "remove" data from a machine-learning model? We study this problem by defining certified removal: a very strong theoretical guarantee that a model from which data is removed cannot be distinguished from a model that never observed the data to begin with. We develop a certified-removal mechanism for linear classifiers and empirically study learning settings in which this mechanism is practical.
Secure Federated Submodel Learning
Niu, Chaoyue, Wu, Fan, Tang, Shaojie, Hua, Lifeng, Jia, Rongfei, Lv, Chengfei, Wu, Zhihua, Chen, Guihai
Federated learning was proposed with an intriguing vision of achieving collaborative machine learning among numerous clients without uploading their private data to a cloud server. However, the conventional framework requires each client to leverage the full model for learning, which can be prohibitively inefficient for resource-constrained clients and large-scale deep learning tasks. We thus propose a new framework, called federated submodel learning, where clients download only the needed parts of the full model, namely submodels, and then upload the submodel updates. Nevertheless, the "position" of a client's truly required submodel corresponds to her private data, and its disclosure to the cloud server during interactions inevitably breaks the tenet of federated learning. To integrate efficiency and privacy, we have designed a secure federated submodel learning scheme coupled with a private set union protocol as a cornerstone. Our secure scheme features the properties of randomized response, secure aggregation, and Bloom filter, and endows each client with a customized plausible deniability, in terms of local differential privacy, against the position of her desired submodel, thus protecting her private data. We further instantiated our scheme with the e-commerce recommendation scenario in Alibaba, implemented a prototype system, and extensively evaluated its performance over 30-day Taobao user data. The analysis and evaluation results demonstrate the feasibility and scalability of our scheme from model accuracy and convergency, practical communication, computation, and storage overheads, as well as manifest its remarkable advantages over the conventional federated learning framework.
Adaptive Policies for Perimeter Surveillance Problems
Grant, James A., Leslie, David S., Glazebrook, Kevin, Szechtman, Roberto, Letchford, Adam N.
Maximising the detection of intrusions is a fundamental and often critical aim of perimeter surveillance. Commonly, this requires a decision-maker to optimally allocate multiple searchers to segments of the perimeter. We consider a scenario where the decision-maker may sequentially update the searchers' allocation, learning from the observed data to improve decisions over time. In this work we propose a formal model and solution methods for this sequential perimeter surveillance problem. Our model is a combinatorial multi-armed bandit (CMAB) with Poisson rewards and a novel filtered feedback mechanism - arising from the failure to detect certain intrusions. Our solution method is an upper confidence bound approach and we derive upper and lower bounds on its expected performance. We prove that the gap between these bounds is of constant order, and demonstrate empirically that our approach is more reliable in simulated problems than competing algorithms.
Fairness through Equality of Effort
Huang, Wen, Wu, Yongkai, Zhang, Lu, Wu, Xintao
Fair machine learning is receiving an increasing attention in machine learning fields. Researchers in fair learning have developed correlation or association-based measures such as demographic disparity, mistreatment disparity, calibration, causal-based measures such as total effect, direct and indirect discrimination, and counterfactual fairness, and fairness notions such as equality of opportunity and equal odds that consider both decisions in the training data and decisions made by predictive models. In this paper, we develop a new causal-based fairness notation, called equality of effort. Different from existing fairness notions which mainly focus on discovering the disparity of decisions between two groups of individuals, the proposed equality of effort notation helps answer questions like to what extend a legitimate variable should change to make a particular individual achieve a certain outcome level and addresses the concerns whether the efforts made to achieve the same outcome level for individuals from the protected group and that from the unprotected group are different. We develop algorithms for determining whether an individual or a group of individuals is discriminated in terms of equality of effort. We also develop an optimization-based method for removing discriminatory effects from the data if discrimination is detected. We conduct empirical evaluations to compare the equality of effort and existing fairness notion and show the effectiveness of our proposed algorithms. Introduction Fair machine learning is receiving an increasing attention in machine learning fields. Discrimination is unfair treatment towards individuals based on the group to which they are perceived to belong. The first endeavor of the research community to achieve fairness is developing correlation or association-based measures, including demographic disparity (e.g., risk difference), mistreatment disparity, calibration, etc. (Romei and Ruggieri 2014; Luong, Ruggieri, and Turini 2011; ˇ Zliobaite, Kamiran, and Calders 2011; Dwork et al. 2012; Feldman et al. 2015), which mainly focus on discovering the disparity of certain statistical metrics between two groups of individuals. However, as paid increasing attention recently (Zhang, Wu, and Wu 2017b; Kilbertus et al. 2017; Nabi and Shpitser 2018), unlawful discrimination is a causal connection between the challenged decision and a protected characteristic, which cannot be captured by simple correlation or association concepts.
Learning From Brains How to Regularize Machines
Li, Zhe, Brendel, Wieland, Walker, Edgar Y., Cobos, Erick, Muhammad, Taliah, Reimer, Jacob, Bethge, Matthias, Sinz, Fabian H., Pitkow, Xaq, Tolias, Andreas S.
Despite impressive performance on numerous visual tasks, Convolutional Neural Networks (CNNs) --- unlike brains --- are often highly sensitive to small perturbations of their input, e.g. adversarial noise leading to erroneous decisions. We propose to regularize CNNs using large-scale neuroscience data to learn more robust neural features in terms of representational similarity. We presented natural images to mice and measured the responses of thousands of neurons from cortical visual areas. Next, we denoised the notoriously variable neural activity using strong predictive models trained on this large corpus of responses from the mouse visual system, and calculated the representational similarity for millions of pairs of images from the model's predictions. We then used the neural representation similarity to regularize CNNs trained on image classification by penalizing intermediate representations that deviated from neural ones. This preserved performance of baseline models when classifying images under standard benchmarks, while maintaining substantially higher performance compared to baseline or control models when classifying noisy images. Moreover, the models regularized with cortical representations also improved model robustness in terms of adversarial attacks. This demonstrates that regularizing with neural data can be an effective tool to create an inductive bias towards more robust inference.