AITopics | Koolen, Wouter M.

Collaborating Authors

Koolen, Wouter M.

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Characterizing the First-order Query Complexity of Learning (Approximate) Nash Equilibria in Zero-sum Matrix Games

Hadiji, Hédi, Sachs, Sarah, van Erven, Tim, Koolen, Wouter M.

arXiv.org Machine LearningNov-2-2023

In the first-order query model for zero-sum $K\times K$ matrix games, players observe the expected pay-offs for all their possible actions under the randomized action played by their opponent. This classical model has received renewed interest after the discovery by Rakhlin and Sridharan that $\epsilon$-approximate Nash equilibria can be computed efficiently from $O(\frac{\ln K}{\epsilon})$ instead of $O(\frac{\ln K}{\epsilon^2})$ queries. Surprisingly, the optimal number of such queries, as a function of both $\epsilon$ and $K$, is not known. We make progress on this question on two fronts. First, we fully characterise the query complexity of learning exact equilibria ($\epsilon=0$), by showing that they require a number of queries that is linear in $K$, which means that it is essentially as hard as querying the whole matrix, which can also be done with $K$ queries. Second, for $\epsilon > 0$, the current query complexity upper bound stands at $O(\min(\frac{\ln(K)}{\epsilon} , K))$. We argue that, unfortunately, obtaining a matching lower bound is not possible with existing techniques: we prove that no lower bound can be derived by constructing hard matrices whose entries take values in a known countable set, because such matrices can be fully identified by a single query. This rules out, for instance, reducing to an optimization problem over the hypercube by encoding it as a binary payoff matrix. We then introduce a new technique for lower bounds, which allows us to obtain lower bounds of order $\tilde\Omega(\log(\frac{1}{K\epsilon})$ for any $\epsilon \leq 1 / (cK^4)$, where $c$ is a constant independent of $K$. We further discuss possible future directions to improve on our techniques in order to close the gap with the upper bounds.

machine learning, natural language, query complexity, (16 more...)

arXiv.org Machine Learning

2304.12768

Country: Europe > Netherlands (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.84)

Add feedback

MetaGrad: Adaptation using Multiple Learning Rates in Online Learning

van Erven, Tim, Koolen, Wouter M., van der Hoeven, Dirk

arXiv.org Machine LearningFeb-12-2021

We provide a new adaptive method for online convex optimization, MetaGrad, that is robust to general convex losses but achieves faster rates for a broad class of special functions, including exp-concave and strongly convex functions, but also various types of stochastic and non-stochastic functions without any curvature. We prove this by drawing a connection to the Bernstein condition, which is known to imply fast rates in offline statistical learning. MetaGrad further adapts automatically to the size of the gradients. Its main feature is that it simultaneously considers multiple learning rates, which are weighted directly proportional to their empirical performance on the data using a new meta-algorithm. We provide three versions of MetaGrad. The full matrix version maintains a full covariance matrix and is applicable to learning tasks for which we can afford update time quadratic in the dimension. The other two versions provide speed-ups for high-dimensional learning tasks with an update time that is linear in the dimension: one is based on sketching, the other on running a separate copy of the basic algorithm per coordinate. We evaluate all versions of MetaGrad on benchmark online classification and regression tasks, on which they consistently outperform both online gradient descent and AdaGrad.

algorithm, computer based training, educational technology, (22 more...)

arXiv.org Machine Learning

2102.06622

Country:

Europe > Netherlands (0.28)
North America > United States > California (0.14)

Genre: Research Report (0.81)

Industry: Education > Educational Setting > Online (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

Regret Minimization in Heavy-Tailed Bandits

Agrawal, Shubhada, Juneja, Sandeep, Koolen, Wouter M.

arXiv.org Machine LearningFeb-7-2021

We revisit the classic regret-minimization problem in the stochastic multi-armed bandit setting when the arm-distributions are allowed to be heavy-tailed. Regret minimization has been well studied in simpler settings of either bounded support reward distributions or distributions that belong to a single parameter exponential family. We work under the much weaker assumption that the moments of order $(1+\epsilon)$ are uniformly bounded by a known constant B, for some given $\epsilon > 0$. We propose an optimal algorithm that matches the lower bound exactly in the first-order term. We also give a finite-time bound on its regret. We show that our index concentrates faster than the well known truncated or trimmed empirical mean estimators for the mean of heavy-tailed distributions. Computing our index can be computationally demanding. To address this, we develop a batch-based algorithm that is optimal up to a multiplicative constant depending on the batch size. We hence provide a controlled trade-off between statistical optimality and computational cost.

algorithm, big data, survey article, (21 more...)

arXiv.org Machine Learning

2102.03734

Country:

Asia > India (0.14)
Europe > Netherlands (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.67)

Add feedback

Optimal Best-Arm Identification Methods for Tail-Risk Measures

Agrawal, Shubhada, Koolen, Wouter M., Juneja, Sandeep

arXiv.org Machine LearningAug-17-2020

Conditional value-at-risk (CVaR) and value-at-risk (VaR) are popular tail-risk measures in finance and insurance industries where often the underlying probability distributions are heavy-tailed. We use the multi-armed bandit best-arm identification framework and consider the problem of identifying the arm-distribution from amongst finitely many that has the smallest CVaR or VaR. We first show that in the special case of arm-distributions belonging to a single-parameter exponential family, both these problems are equivalent to the best mean-arm identification problem, which is widely studied in the literature. This equivalence however is not true in general. We then propose optimal $\delta$-correct algorithms that act on general arm-distributions, including heavy-tailed distributions, that match the lower bound on the expected number of samples needed, asymptotically (as $ \delta$ approaches $0$). En-route, we also develop new non-asymptotic concentration inequalities for certain functions of these risk measures for the empirical distribution, that may have wider applicability.

algorithm, banking & finance, big data, (20 more...)

arXiv.org Machine Learning

2008.07606

Country:

Europe (0.14)
Asia > India (0.14)

Genre: Research Report (0.50)

Industry: Banking & Finance (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.66)

Add feedback

Structure Adaptive Algorithms for Stochastic Bandits

Degenne, Rémy, Shao, Han, Koolen, Wouter M.

arXiv.org Machine LearningJul-2-2020

We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, where the mean rewards of arms satisfy some given structural constraints, e.g. linear, unimodal, sparse, etc. Our aim is to develop methods that are flexible (in that they easily adapt to different structures), powerful (in that they perform well empirically and/or provably match instance-dependent lower bounds) and efficient in that the per-round computational burden is small. We develop asymptotically optimal algorithms from instance-dependent lower-bounds using iterative saddle-point solvers. Our approach generalises recent iterative methods for pure exploration to reward maximisation, where a major challenge arises from the estimation of the sub-optimality gaps and their reciprocals. Still we manage to achieve all the above desiderata. Notably, our technique avoids the computational cost of the full-blown saddle point oracle employed by previous work, while at the same time enabling finite-time regret bounds. Our experiments reveal that our method successfully leverages the structural assumptions, while its regret is at worst comparable to that of vanilla UCB.

algorithm, artificial intelligence, big data, (17 more...)

arXiv.org Machine Learning

2007.00969

Country:

North America > United States (0.28)
Europe > France (0.28)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.81)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Combining Adversarial Guarantees and Stochastic Fast Rates in Online Learning

Koolen, Wouter M., Grünwald, Peter, Erven, Tim van

Neural Information Processing SystemsFeb-14-2020, 15:57:42 GMT

We consider online learning algorithms that guarantee worst-case regret rates in adversarial environments (so they can be deployed safely and will perform robustly), yet adapt optimally to favorable stochastic environments (so they will perform well in a variety of settings of practical importance). We quantify the friendliness of stochastic environments by means of the well-known Bernstein (a.k.a. For two recent algorithms (Squint for the Hedge setting and MetaGrad for online convex optimization) we show that the particular form of their data-dependent individual-sequence regret guarantees implies that they adapt automatically to the Bernstein parameters of the stochastic environment. We prove that these algorithms attain fast rates in their respective settings both in expectation and with high probability. Papers published at the Neural Information Processing Systems Conference.

computer based training, educational technology, guarantee and stochastic fast rate, (7 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting > Online (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.74)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.67)

Add feedback

MetaGrad: Multiple Learning Rates in Online Learning

Erven, Tim van, Koolen, Wouter M.

Neural Information Processing SystemsFeb-14-2020, 14:28:33 GMT

In online convex optimization it is well known that certain subclasses of objective functions are much easier than arbitrary convex functions. We are interested in designing adaptive methods that can automatically get fast rates in as many such subclasses as possible, without any manual tuning. Previous adaptive methods are able to interpolate between strongly convex and general convex functions. We present a new method, MetaGrad, that adapts to a much broader class of functions, including exp-concave and strongly convex functions, but also various types of stochastic and non-stochastic functions without any curvature. For instance, MetaGrad can achieve logarithmic regret on the unregularized hinge loss, even though it has no curvature, if the data come from a favourable probability distribution.

computer based training, educational technology, metagrad, (10 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.69)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)

Add feedback

Non-Asymptotic Pure Exploration by Solving Games

Degenne, Rémy, Koolen, Wouter M., Ménard, Pierre

arXiv.org Machine LearningJun-25-2019

Pure exploration (aka active testing) is the fundamental task of sequentially gathering information to answer a query about a stochastic environment. Good algorithms make few mistakes and take few samples. Lower bounds (for multi-armed bandit models with arms in an exponential family) reveal that the sample complexity is determined by the solution to an optimisation problem. The existing state of the art algorithms achieve asymptotic optimality by solving a plug-in estimate of that optimisation problem at each step. We interpret the optimisation problem as an unknown game, and propose sampling rules based on iterative strategies to estimate and converge to its saddle point. We apply no-regret learners to obtain the first finite confidence guarantees that are adapted to the exponential family and which apply to any pure exploration query and bandit structure. Moreover, our algorithms only use a best response oracle instead of fully solving the optimisation problem.

algorithm, artificial intelligence, big data, (17 more...)

arXiv.org Machine Learning

1906.10431

Country:

Europe (0.28)
North America > United States > New York (0.14)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.48)

Add feedback

Lipschitz Adaptivity with Multiple Learning Rates in Online Learning

Mhammedi, Zakaria, Koolen, Wouter M., van Erven, Tim

arXiv.org Machine LearningFeb-27-2019

We aim to design adaptive online learning algorithms that take advantage of any special structure that might be present in the learning task at hand, with as little manual tuning by the user as possible. A fundamental obstacle that comes up in the design of such adaptive algorithms is to calibrate a so-called step-size or learning rate hyperparameter depending on variance, gradient norms, etc. A recent technique promises to overcome this difficulty by maintaining multiple learning rates in parallel. This technique has been applied in the MetaGrad algorithm for online convex optimization and the Squint algorithm for prediction with expert advice. However, in both cases the user still has to provide in advance a Lipschitz hyperparameter that bounds the norm of the gradients. Although this hyperparameter is typically not available in advance, tuning it correctly is crucial: if it is set too small, the methods may fail completely; but if it is taken too large, performance deteriorates significantly. In the present work we remove this Lipschitz hyperparameter by designing new versions of MetaGrad and Squint that adapt to its optimal value automatically. We achieve this by dynamically updating the set of active learning rates. For MetaGrad, we further improve the computational efficiency of handling constraints on the domain of prediction, and we remove the need to specify the number of rounds in advance.

algorithm, computer based training, educational technology, (20 more...)

arXiv.org Machine Learning

1902.10797

Country: Europe > Netherlands (0.28)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)

Add feedback

Pure Exploration with Multiple Correct Answers

Degenne, Rémy, Koolen, Wouter M.

arXiv.org Machine LearningFeb-9-2019

We determine the sample complexity of pure exploration bandit problems with multiple good answers. We derive a lower bound using a new game equilibrium argument. We show how continuity and convexity properties of single-answer problems ensures that the Track-and-Stop algorithm has asymptotically optimal sample complexity. However, that convexity is lost when going to the multiple-answer setting. We present a new algorithm which extends Track-and-Stop to the multiple-answer case and has asymptotic sample complexity matching the lower bound.

big data, game theory, pure exploration, (18 more...)

arXiv.org Machine Learning

1902.03475

Country:

Europe (0.46)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Game Theory (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.49)

Add feedback