Goto

Collaborating Authors

 Learning Management


Community Exploration: From Offline Optimization to Online Learning

Neural Information Processing Systems

We introduce the community exploration problem that has various real-world applications such as online advertising. In the problem, an explorer allocates limited budget to explore communities so as to maximize the number of members he could meet. We provide a systematic study of the community exploration problem, from offline optimization to online learning. For the offline setting where the sizes of communities are known, we prove that the greedy methods for both of non-adaptive exploration and adaptive exploration are optimal. For the online setting where the sizes of communities are not known and need to be learned from the multi-round explorations, we propose an upper confidence'' like algorithm that achieves the logarithmic regret bounds.


Stochastic and Adversarial Online Learning without Hyperparameters

Neural Information Processing Systems

Most online optimization algorithms focus on one of two things: performing well in adversarial settings by adapting to unknown data parameters (such as Lipschitz constants), typically achieving $O(\sqrt{T})$ regret, or performing well in stochastic settings where they can leverage some structure in the losses (such as strong convexity), typically achieving $O(\log(T))$ regret. Algorithms that focus on the former problem hitherto achieved $O(\sqrt{T})$ in the stochastic setting rather than $O(\log(T))$. Here we introduce an online optimization algorithm that achieves $O(\log 4(T))$ regret in a wide class of stochastic settings while gracefully degrading to the optimal $O(\sqrt{T})$ regret in adversarial settings (up to logarithmic factors). Our algorithm does not require any prior knowledge about the data or tuning of parameters to achieve superior performance. Papers published at the Neural Information Processing Systems Conference.


Online Learning for Multivariate Hawkes Processes

Neural Information Processing Systems

We develop a nonparametric and online learning algorithm that estimates the triggering functions of a multivariate Hawkes process (MHP). The approach we take approximates the triggering function $f_{i,j}(t)$ by functions in a reproducing kernel Hilbert space (RKHS), and maximizes a time-discretized version of the log-likelihood, with Tikhonov regularization. Theoretically, our algorithm achieves an $\calO(\log T)$ regret bound. Numerical results show that our algorithm offers a competing performance to that of the nonparametric batch learning algorithm, with a run time comparable to the parametric online learning algorithm. Papers published at the Neural Information Processing Systems Conference.


Dual Space Gradient Descent for Online Learning

Neural Information Processing Systems

One crucial goal in kernel online learning is to bound the model size. Common approaches employ budget maintenance procedures to restrict the model sizes using removal, projection, or merging strategies. Although projection and merging, in the literature, are known to be the most effective strategies, they demand extensive computation whilst removal strategy fails to retain information of the removed vectors. An alternative way to address the model size problem is to apply random features to approximate the kernel function. This allows the model to be maintained directly in the random feature space, hence effectively resolve the curse of kernelization. However, this approach still suffers from a serious shortcoming as it needs to use a high dimensional random feature space to achieve a sufficiently accurate kernel approximation.


Combining Adversarial Guarantees and Stochastic Fast Rates in Online Learning

Neural Information Processing Systems

We consider online learning algorithms that guarantee worst-case regret rates in adversarial environments (so they can be deployed safely and will perform robustly), yet adapt optimally to favorable stochastic environments (so they will perform well in a variety of settings of practical importance). We quantify the friendliness of stochastic environments by means of the well-known Bernstein (a.k.a. For two recent algorithms (Squint for the Hedge setting and MetaGrad for online convex optimization) we show that the particular form of their data-dependent individual-sequence regret guarantees implies that they adapt automatically to the Bernstein parameters of the stochastic environment. We prove that these algorithms attain fast rates in their respective settings both in expectation and with high probability. Papers published at the Neural Information Processing Systems Conference.


Online Learning of Optimal Bidding Strategy in Repeated Multi-Commodity Auctions

Neural Information Processing Systems

We study the online learning problem of a bidder who participates in repeated auctions. With the goal of maximizing his T-period payoff, the bidder determines the optimal allocation of his budget among his bids for $K$ goods at each period. As a bidding strategy, we propose a polynomial-time algorithm, inspired by the dynamic programming approach to the knapsack problem. The proposed algorithm, referred to as dynamic programming on discrete set (DPDS), achieves a regret order of $O(\sqrt{T\log{T}})$. By showing that the regret is lower bounded by $\Omega(\sqrt{T})$ for any strategy, we conclude that DPDS is order optimal up to a $\sqrt{\log{T}}$ term.


Adaptive Online Learning

Neural Information Processing Systems

We propose a general framework for studying adaptive regret bounds in the online learning setting, subsuming model selection and data-dependent bounds. Given a data- or model-dependent bound we ask, "Does there exist some algorithm achieving this bound?" We show that modifications to recently introduced sequential complexity measures can be used to answer this question by providing sufficient conditions under which adaptive rates can be achieved. In particular each adaptive rate induces a set of so-called offset complexity measures, and obtaining small upper bounds on these quantities is sufficient to demonstrate achievability. A cornerstone of our analysis technique is the use of one-sided tail inequalities to bound suprema of offset random processes.Our framework recovers and improves a wide variety of adaptive bounds including quantile bounds, second order data-dependent bounds, and small loss bounds.


MetaGrad: Multiple Learning Rates in Online Learning

Neural Information Processing Systems

In online convex optimization it is well known that certain subclasses of objective functions are much easier than arbitrary convex functions. We are interested in designing adaptive methods that can automatically get fast rates in as many such subclasses as possible, without any manual tuning. Previous adaptive methods are able to interpolate between strongly convex and general convex functions. We present a new method, MetaGrad, that adapts to a much broader class of functions, including exp-concave and strongly convex functions, but also various types of stochastic and non-stochastic functions without any curvature. For instance, MetaGrad can achieve logarithmic regret on the unregularized hinge loss, even though it has no curvature, if the data come from a favourable probability distribution.


Faster Online Learning of Optimal Threshold for Consistent F-measure Optimization

Neural Information Processing Systems

In this paper, we consider online F-measure optimization (OFO). Unlike traditional performance metrics (e.g., classification error rate), F-measure is non-decomposable over training examples and is a non-convex function of model parameters, making it much more difficult to be optimized in an online fashion. Most existing results of OFO usually suffer from high memory/computational costs and/or lack statistical consistency guarantee for optimizing F-measure at the population level. To advance OFO, we propose an efficient online algorithm based on simultaneously learning a posterior probability of class and learning an optimal threshold by minimizing a stochastic strongly convex function with unknown strong convexity parameter. A key component of the proposed method is a novel stochastic algorithm with low memory and computational costs, which can enjoy a convergence rate of $\widetilde O(1/\sqrt{n})$ for learning the optimal threshold under a mild condition on the convergence of the posterior probability, where $n$ is the number of processed examples.


Delay-Tolerant Algorithms for Asynchronous Distributed Online Learning

Neural Information Processing Systems

We analyze new online gradient descent algorithms for distributed systems with large delays between gradient computations and the corresponding updates. Using insights from adaptive gradient methods, we develop algorithms that adapt not only to the sequence of gradients, but also to the precise update delays that occur. We first give an impractical algorithm that achieves a regret bound that precisely quantifies the impact of the delays. We then analyze AdaptiveRevision, an algorithm that is efficiently implementable and achieves comparable guarantees. The key algorithmic technique is appropriately and efficiently revising the learning rate used for previous gradient steps.