Goto

Collaborating Authors

 Accuracy


Bootstrapping Upper Confidence Bound

arXiv.org Machine Learning

Upper Confidence Bound (UCB) method is arguably the most celebrated one used in online decision making with partial information feedback. Existing techniques for constructing confidence bounds are typically built upon various concentration inequalities, which thus lead to over-exploration. In this paper, we propose a non-parametric and data-dependent UCB algorithm based on the multiplier bootstrap. To improve its finite sample performance, we further incorporate second-order correction into the above construction. In theory, we derive both problem-dependent and problem-independent regret bounds for multi-armed bandits under a much weaker tail assumption than the standard sub-Gaussianity. Numerical results demonstrate significant regret reductions by our method, in comparison with several baselines in a range of multi-armed and linear bandit problems.


Manifold Graph with Learned Prototypes for Semi-Supervised Image Classification

arXiv.org Machine Learning

Recent advances in semi-supervised learning methods rely on estimating the categories of unlabeled data using a model trained on the labeled data (pseudo-labeling) and using the unlabeled data for various consistency-based regularization. In this work, we propose to explicitly leverage the structure of the data manifold based on a Manifold Graph constructed over the image instances within the feature space. Specifically, we propose an architecture based on graph networks that jointly optimizes feature extraction, graph connectivity, and feature propagation and aggregation to unlabeled data in an end-to-end manner. Further, we present a novel Prototype Generator for producing a diverse set of prototypes that compactly represent each category, which supports feature propagation. To evaluate our method, we first contribute a strong baseline that combines two consistency-based regularizers that already achieves state-of-the-art results especially with fewer labels. We then show that when combined with these regularizers, the proposed method facilitates the propagation of information from generated prototypes to image data to further improve results. We provide extensive qualitative and quantitative experimental results on semi-supervised benchmarks demonstrating the improvements arising from our design and show that our method achieves state-of-the-art performance when compared with existing methods using a single model and comparable with ensemble methods. Specifically, we achieve error rates of 3.35% on SVHN, 8.27% on CIFAR-10, and 33.83% on CIFAR-100. With much fewer labels, we surpass the state of the arts by significant margins of 41% relative error decrease on average.


A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers

arXiv.org Machine Learning

For many machine learning algorithms, predictive performance is critically affected by the hyperparameter values used to train them. However, tuning these hyperparameters can come at a high computational cost, especially on larger datasets, while the tuned settings do not always significantly outperform the default values. This paper proposes a recommender system based on meta-learning to identify exactly when it is better to use default values and when to tune hyperparameters for each new dataset. Besides, an in-depth analysis is performed to understand what they take into account for their decisions, providing useful insights. An extensive analysis of different categories of meta-features, meta-learners, and setups across 156 datasets is performed. Results show that it is possible to accurately predict when tuning will significantly improve the performance of the induced models. The proposed system reduces the time spent on optimization processes, without reducing the predictive performance of the induced models (when compared with the ones obtained using tuned hyperparameters). We also explain the decision-making process of the meta-learners in terms of linear separability-based hypotheses. Although this analysis is focused on the tuning of Support Vector Machines, it can also be applied to other algorithms, as shown in experiments performed with decision trees.


Understanding artificial intelligence ethics and safety

arXiv.org Artificial Intelligence

A remarkable time of human promise has been ushered in by the convergence of the ever-expanding availability of big data, the soaring speed and stretch of cloud computing platforms, and the advancement of increasingly sophisticated machine learning algorithms. Innovations in AI are already leaving a mark on government by improving the provision of essential social goods and services from healthcare, education, and transportation to food supply, energy, and environmental management. These bounties are likely just the start. The prospect that progress in AI will help government to confront some of its most urgent challenges is exciting, but legitimate worries abound. As with any new and rapidly evolving technology, a steep learning curve means that mistakes and miscalculations will be made and that both unanticipated and harmful impacts will occur. This guide, written for department and delivery leads in the UK public sector and adopted by the British Government in its publication, 'Using AI in the Public Sector,' identifies the potential harms caused by AI systems and proposes concrete, operationalisable measures to counteract them. It stresses that public sector organisations can anticipate and prevent these potential harms by stewarding a culture of responsible innovation and by putting in place governance processes that support the design and implementation of ethical, fair, and safe AI systems. It also highlights the need for algorithmically supported outcomes to be interpretable by their users and made understandable to decision subjects in clear, non-technical, and accessible ways. Finally, it builds out a vision of human-centred and context-sensitive implementation that gives a central role to communication, evidence-based reasoning, situational awareness, and moral justifiability.


Evolutionary Trigger Set Generation for DNN Black-Box Watermarking

arXiv.org Machine Learning

The commercialization of deep learning creates a compelling need for intellectual property (IP) protection. Deep neural network (DNN) watermarking has been proposed as a promising tool to help model owners prove ownership and fight piracy. A popular approach of watermarking is to train a DNN to recognize images with certain \textit{trigger} patterns. In this paper, we propose a novel evolutionary algorithm-based method to generate and optimize trigger patterns. Our method brings a siginificant reduction in false positive rates, leading to compelling proof of ownership. At the same time, it maintains the robustness of the watermark against attacks. We compare our method with the prior art and demonstrate its effectiveness on popular models and datasets.


Efficient structure learning with automatic sparsity selection for causal graph processes

arXiv.org Machine Learning

We propose a novel algorithm for efficiently computing a sparse directed adjacency matrix from a group of time series following a causal graph process. Our solution is scalable for both dense and sparse graphs and automatically selects the LASSO coefficient to obtain an appropriate number of edges in the adjacency matrix. Current state-of-the-art approaches rely on sparse-matrix-computation libraries to scale, and either avoid automatic selection of the LASSO penalty coefficient or rely on the prediction mean squared error, which is not directly related to the correct number of edges. Instead, we propose a cyclical coordinate descent algorithm that employs two new non-parametric error metrics to automatically select the LASSO coefficient. We demonstrate state-of-the-art performance of our algorithm on simulated stochastic block models and a real dataset of stocks from the S\&P$500$.


Meta-Learning Neural Bloom Filters

arXiv.org Machine Learning

There has been a recent trend in training neural networks to replace data structures that have been crafted by hand, with an aim for faster execution, better accuracy, or greater compression. In this setting, a neural data structure is instantiated by training a network over many epochs of its inputs until convergence. In applications where inputs arrive at high throughput, or are ephemeral, training a network from scratch is not practical. This motivates the need for few-shot neural data structures. In this paper we explore the learning of approximate set membership over a set of data in one-shot via meta-learning. We propose a novel memory architecture, the Neural Bloom Filter, which is able to achieve significant compression gains over classical Bloom Filters and existing memory-augmented neural networks.


CRCEN: A Generalized Cost-sensitive Neural Network Approach for Imbalanced Classification

arXiv.org Machine Learning

Classification on imbalanced datasets is a challenging task in real-world applications. Training conventional classification algorithms directly by minimizing classification error in this scenario can compromise model performance for minority class while optimizing performance for majority class. Traditional approaches to the imbalance problem include re-sampling and cost-sensitive methods. In this paper, we propose a neural network model with novel loss function, CRCEN, for imbalanced classification. Based on the weighted version of cross entropy loss, we provide a theoretical relation for model predicted probability, imbalance ratio and the weighting mechanism. To demonstrate the effectiveness of our proposed model, CRCEN is tested on several benchmark datasets and compared with baseline models.


Topology Attack and Defense for Graph Neural Networks: An Optimization Perspective

arXiv.org Machine Learning

Graph neural networks (GNNs) which apply the deep neural networks to graph data have achieved significant performance for the task of semi-supervised node classification. However, only few work has addressed the adversarial robustness of GNNs. In this paper, we first present a novel gradient-based attack method that facilitates the difficulty of tackling discrete graph data. When comparing to current adversarial attacks on GNNs, the results show that by only perturbing a small number of edge perturbations, including addition and deletion, our optimization-based attack can lead to a noticeable decrease in classification performance. Moreover, leveraging our gradient-based attack, we propose the first optimization-based adversarial training for GNNs. Our method yields higher robustness against both different gradient based and greedy attack methods without sacrificing classification accuracy on original graph.


Learning to combine Grammatical Error Corrections

arXiv.org Artificial Intelligence

The field of Grammatical Error Correction (GEC) has produced various systems to deal with focused phenomena or general text editing. We propose an automatic way to combine black-box systems. Our method automatically detects the strength of a system or the combination of several systems per error type, improving precision and recall while optimizing $F$ score directly. We show consistent improvement over the best standalone system in all the configurations tested. This approach also outperforms average ensembling of different RNN models with random initializations. In addition, we analyze the use of BERT for GEC - reporting promising results on this end. We also present a spellchecker created for this task which outperforms standard spellcheckers tested on the task of spellchecking. This paper describes a system submission to Building Educational Applications 2019 Shared Task: Grammatical Error Correction. Combining the output of top BEA 2019 shared task systems using our approach, currently holds the highest reported score in the open phase of the BEA 2019 shared task, improving F0.5 by 3.7 points over the best result reported.