Collaborating Authors

 Reeve, Henry WJ


Optimistic bounds for multi-output prediction

arXiv.org Machine Learning

Multi-output prediction represents an important class of problems that includes multi-class classification Crammer and Singer (2001), multi-label classification Tsoumakas and Katakis (2007); Zhang and Zhou (2013), multi-target regression Borchani et al. (2015), label distribution learning Geng (2016), structured regression Cortes et al. (2016) and others, with a wide range of practical applications Xu et al. (2019). Our objective is to provide a general framework for establishing guarantees for multi-output prediction problems. A fundamental challenge in the statistical learning theory of multi-output prediction is to obtain bounds which allow for (i) a favourable convergence rate with the sample size, and (ii) a favourable dependence of the risk on the dimensionality of the output space. Whilst modern applications of multi-output prediction deal with increasingly large data sets, they also incorporate problems where the target dimensionality is increasingly large. For example, the number of categories in multi-label classification is often of the order of tens of thousands, an emergent problem referred to as extreme classification Agrawal et al. (2013); Babbar and Schölkopf (2017); Bhatia et al. (2015); Jain et al. (2019).
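To make the two desiderata concrete, the following minimal sketch (illustrative only; the linear estimator, synthetic data, and Hamming loss are assumptions, not taken from the paper) fits a simple multi-label predictor and evaluates its empirical Hamming risk, the kind of quantity whose behaviour in both the sample size n and the output dimension m such bounds aim to control.

    import numpy as np

    # Minimal sketch (not from the paper): a linear multi-label
    # predictor with m outputs, evaluated by its empirical Hamming
    # risk. The bounds discussed above concern how such risks behave
    # as the sample size n grows and the output dimension m is large.
    rng = np.random.default_rng(0)
    n, d, m = 500, 20, 50                  # sample size, feature dim, output dim
    X = rng.normal(size=(n, d))
    W_true = rng.normal(size=(d, m))
    Y = (X @ W_true + rng.normal(size=(n, m)) > 0).astype(int)  # labels in {0,1}^m

    # Least-squares fit of one linear scorer per output, a stand-in
    # for an arbitrary multi-output hypothesis class.
    W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Y_pred = (X @ W_hat > 0.5).astype(int)

    # Empirical Hamming risk: error rate averaged over all n * m entries.
    hamming_risk = np.mean(Y_pred != Y)
    print(f"empirical Hamming risk: {hamming_risk:.3f}")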


Minimax rates for cost-sensitive learning on manifolds with approximate nearest neighbours

arXiv.org Machine Learning

We study the approximate nearest neighbour method for cost-sensitive classification on low-dimensional manifolds embedded within a high-dimensional feature space. We determine the minimax learning rates for distributions on a smooth manifold in a cost-sensitive setting, generalising a classic result of Audibert and Tsybakov. Building upon recent work of Chaudhuri and Dasgupta, we prove that these minimax rates are attained by the approximate nearest neighbour algorithm, where neighbours are computed in a randomly projected low-dimensional space. In addition, we give a bound on the number of dimensions required for the projection which depends solely upon the reach and dimension of the manifold, combined with the regularity of the marginal distribution.
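The sketch below illustrates the general shape of such a procedure: k-nearest-neighbour classification with neighbours computed after a Gaussian random projection, using the standard cost-sensitive threshold. It is an assumption-laden rendering, not the paper's exact algorithm; the projection matrix, neighbour count k, costs, and toy manifold data are all illustrative choices.

    import numpy as np

    rng = np.random.default_rng(1)

    def knn_cost_sensitive(X_train, y_train, X_test, k=15,
                           proj_dim=5, cost_fp=1.0, cost_fn=3.0):
        """Project to proj_dim with a Gaussian random matrix, then label
        each test point by comparing the local positive-class frequency
        among its k nearest projected neighbours to the cost-sensitive
        threshold cost_fp / (cost_fp + cost_fn)."""
        D = X_train.shape[1]
        R = rng.normal(size=(D, proj_dim)) / np.sqrt(proj_dim)  # JL-style projection
        Xp_tr, Xp_te = X_train @ R, X_test @ R
        thresh = cost_fp / (cost_fp + cost_fn)
        preds = np.empty(len(X_test), dtype=int)
        for i, x in enumerate(Xp_te):
            idx = np.argsort(np.linalg.norm(Xp_tr - x, axis=1))[:k]
            preds[i] = int(y_train[idx].mean() > thresh)
        return preds

    # Toy data: a 1-dimensional manifold (a closed curve) embedded in R^50.
    t = rng.uniform(0, 2 * np.pi, size=400)
    basis = rng.normal(size=(2, 50))
    X = np.column_stack([np.cos(t), np.sin(t)]) @ basis
    y = (t < np.pi).astype(int)
    print(knn_cost_sensitive(X[:300], y[:300], X[300:])[:10])

Note the threshold cost_fp / (cost_fp + cost_fn) is the usual plug-in rule for cost-sensitive binary classification: predict the positive class when the estimated conditional probability exceeds it.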


The K-Nearest Neighbour UCB algorithm for multi-armed bandits with covariates

arXiv.org Machine Learning

In this paper we propose and explore the k-Nearest Neighbour UCB algorithm for multi-armed bandits with covariates. We focus on a setting where the covariates are supported on a metric space of low intrinsic dimension, such as a manifold embedded within a high-dimensional ambient feature space. The algorithm is conceptually simple and straightforward to implement. The k-Nearest Neighbour UCB algorithm requires no prior knowledge of either the intrinsic dimension of the marginal distribution or the time horizon. We prove a regret bound for the k-Nearest Neighbour UCB algorithm which is minimax optimal up to logarithmic factors. In particular, the algorithm automatically takes advantage of both low intrinsic dimensionality of the marginal distribution over the covariates and low noise in the data, expressed as a margin condition. In addition, focusing on the case of bounded rewards, we give corresponding regret bounds for the k-Nearest Neighbour KL-UCB algorithm, an analogue of the KL-UCB algorithm adapted to the setting of multi-armed bandits with covariates. Finally, we present empirical results which demonstrate the ability of both algorithms to take advantage of situations where the data is supported on an unknown sub-manifold of a high-dimensional feature space.
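The following sketch conveys the basic idea: at each round, estimate each arm's mean reward at the current covariate by averaging the rewards observed at its k nearest past contexts, add a UCB exploration bonus that shrinks with the number of neighbours, and pull the arm with the highest index. The bonus sqrt(2 log(t + 1) / k), the fixed k, and the toy reward model are illustrative assumptions; the paper's index and tuning differ in detail.

    import numpy as np

    rng = np.random.default_rng(2)

    def knn_ucb(contexts, reward_fn, n_arms=2, k=10):
        hist_x = [[] for _ in range(n_arms)]   # past contexts per arm
        hist_r = [[] for _ in range(n_arms)]   # past rewards per arm
        total = 0.0
        for t, x in enumerate(contexts):
            indices = np.zeros(n_arms)
            for a in range(n_arms):
                if len(hist_r[a]) < k:
                    indices[a] = np.inf        # force initial exploration
                else:
                    Xa = np.asarray(hist_x[a])
                    nn = np.argsort(np.linalg.norm(Xa - x, axis=1))[:k]
                    mean = np.asarray(hist_r[a])[nn].mean()
                    # k-NN mean reward plus a UCB-style exploration bonus.
                    indices[a] = mean + np.sqrt(2 * np.log(t + 1) / k)
            a = int(np.argmax(indices))
            r = reward_fn(x, a)
            hist_x[a].append(x)
            hist_r[a].append(r)
            total += r
        return total

    # Toy problem: arm 0 is better on one half of the covariate space.
    ctx = rng.uniform(size=(2000, 3))
    def reward_fn(x, a):
        p = 0.7 if (x[0] < 0.5) == (a == 0) else 0.3
        return float(rng.random() < p)
    print("total reward:", knn_ucb(ctx, reward_fn))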