Collaborating Authors: Vernade, Claire


Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting

arXiv.org Machine Learning

We consider off-policy evaluation in the contextual bandit setting for the purpose of obtaining a robust off-policy selection strategy, where the selection strategy is evaluated based on the value of the chosen policy in a set of proposal (target) policies. We propose a new method to compute a lower bound on the value of an arbitrary target policy given some logged data in contextual bandits for a desired coverage. The lower bound is built around the so-called Self-normalized Importance Weighting (SN) estimator. It combines the use of a semi-empirical Efron-Stein tail inequality to control the concentration and Harris' inequality to control the bias. The new approach is evaluated on a number of synthetic and real datasets and is found to be superior to its main competitors, both in terms of tightness of the confidence intervals and the quality of the policies chosen.
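
For context, the SN estimator itself is standard and can be sketched in a few lines of Python (illustrative names; the paper's actual contribution, the Efron-Stein/Harris-based lower confidence bound built around this estimate, is not reproduced here):

import numpy as np

def sn_estimate(target_probs, logging_probs, rewards):
    """Self-normalized importance weighting (SN) estimate of a target policy's value."""
    # target_probs[i]  = pi(a_i | x_i): probability of the logged action under the target policy
    # logging_probs[i] = mu(a_i | x_i): probability of the same action under the logging policy
    weights = np.asarray(target_probs, dtype=float) / np.asarray(logging_probs, dtype=float)
    # Dividing by the sum of the weights (instead of by n) introduces a small bias
    # but typically reduces the variance compared with plain importance weighting.
    return np.sum(weights * np.asarray(rewards, dtype=float)) / np.sum(weights)

# Example: three logged interactions
print(sn_estimate([0.9, 0.1, 0.5], [0.5, 0.5, 0.5], [1.0, 0.0, 1.0]))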


The Elliptical Potential Lemma Revisited

arXiv.org Machine Learning

This note proposes a new proof of, and new perspectives on, the so-called Elliptical Potential Lemma. This result is important in online learning, especially for linear stochastic bandits. The original proof, though short and elegant, does not offer much flexibility in the type of potentials considered, and we believe that this new interpretation can be of interest for future research in this field.
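
For reference, the result in question is usually stated along the following lines (this is the standard formulation from the linear bandit literature, not a statement copied from the note itself): let $x_1, \dots, x_T \in \mathbb{R}^d$ with $\|x_t\|_2 \le L$, and let $V_t = \lambda I + \sum_{s=1}^{t} x_s x_s^\top$ for some $\lambda > 0$. Then
$$\sum_{t=1}^{T} \min\left(1, \|x_t\|_{V_{t-1}^{-1}}^2\right) \;\le\; 2 \log \frac{\det V_T}{\det V_0} \;\le\; 2 d \log\left(1 + \frac{T L^2}{\lambda d}\right).$$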


EigenGame: PCA as a Nash Equilibrium

arXiv.org Machine Learning

We present a novel view on principal component analysis (PCA) as a competitive game in which each approximate eigenvector is controlled by a player whose goal is to maximize their own utility function. We analyze the properties of this PCA game and the behavior of its gradient-based updates. The resulting algorithm--which combines elements from Oja's rule with a generalized Gram-Schmidt orthogonalization--is naturally decentralized and hence parallelizable through message passing. We demonstrate the scalability of the algorithm with experiments on large image datasets and neural network activations. We discuss how this new view of PCA as a differentiable game can lead to further algorithmic developments and insights.
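
As a rough illustration of the game dynamics described above, here is a minimal sketch of the per-player update, assuming a utility in which player $i$ is rewarded for alignment with the data covariance and penalized for alignment with the players ranked before it; step sizes, stopping criteria and the message-passing parallelization are omitted:

import numpy as np

def eigengame_gradient(M, V, i):
    # M : (d, d) data covariance (e.g. X.T @ X); V : (d, k) current approximate eigenvectors
    Mv_i = M @ V[:, i]
    grad = 2.0 * Mv_i
    for j in range(i):                       # only the players ranked before player i
        Mv_j = M @ V[:, j]
        grad -= 2.0 * (V[:, i] @ Mv_j) / (V[:, j] @ Mv_j) * Mv_j
    return grad

def eigengame_step(M, V, i, lr=1e-2):
    g = eigengame_gradient(M, V, i)
    g = g - (g @ V[:, i]) * V[:, i]          # project the gradient onto the tangent space of the sphere
    v = V[:, i] + lr * g
    V[:, i] = v / np.linalg.norm(v)          # retract back onto the unit sphere
    return V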


Non-Stationary Delayed Bandits with Intermediate Observations

arXiv.org Machine Learning

Online recommender systems often face long delays in receiving feedback, especially when optimizing for some long-term metrics. While mitigating the effects of delays in learning is well understood in stationary environments, the problem becomes much more challenging when the environment changes. In fact, if the timescale of the change is comparable to the delay, it is impossible to learn about the environment, since the available observations are already obsolete. Delayed feedback in online learning has been addressed both in the full-information setting (see, e.g., Joulani et al., 2013, and the references therein) and in the bandit setting (see, e.g., Mandel et al., 2015; Vernade et al., 2017; Cesa-Bianchi et al., 2019, and the references therein), assuming both stochastic and adversarial environments. The main takeaway from these studies is that, for bandits, the impact of a constant delay $D$ results in an extra additive $O(\sqrt{DT})$ term in the regret in adversarial settings, or an additive $O(D)$ term in stochastic settings.


Stochastic bandits with arm-dependent delays

arXiv.org Machine Learning

Significant work has recently been dedicated to the stochastic delayed bandit setting because of its relevance in applications. The applicability of existing algorithms is however restricted by the fact that strong assumptions are often made on the delay distributions, such as full observability, restrictive shape constraints, or uniformity over arms. In this work, we weaken these assumptions significantly and only assume that there is a bound on the tail of the delay. In particular, we cover the important case where the delay distributions vary across arms, and the case where the delays are heavy-tailed. Addressing these difficulties, we propose a simple but efficient UCB-based algorithm called PatientBandits. We provide both problem-dependent and problem-independent bounds on the regret as well as performance lower bounds.


Weighted Linear Bandits for Non-Stationary Environments

arXiv.org Machine Learning

We consider a stochastic linear bandit model in which the available actions correspond to arbitrary context vectors whose associated rewards follow a non-stationary linear regression model. In this setting, the unknown regression parameter is allowed to vary in time. To address this problem, we propose D-LinUCB, a novel optimistic algorithm based on discounted linear regression, where exponential weights are used to smoothly forget the past. This involves studying the deviations of the sequential weighted least-squares estimator under generic assumptions. As a by-product, we obtain novel deviation results that can be used beyond non-stationary environments. We provide theoretical guarantees on the behavior of D-LinUCB in both slowly-varying and abruptly-changing environments. We obtain an upper bound on the dynamic regret that is of order $d^{2/3} B_T^{1/3}T^{2/3}$, where $B_T$ is a measure of non-stationarity (d and T being, respectively, dimension and horizon). This rate is known to be optimal. We also illustrate the empirical performance of D-LinUCB and compare it with recently proposed alternatives in simulated environments.
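
A minimal sketch of the discounted least-squares machinery behind D-LinUCB, assuming the usual exponentially weighted ridge-regression formulation (the exploration bonus and the second design matrix used for the confidence width in the paper are omitted, and names are illustrative):

import numpy as np

class DiscountedLinearRegression:
    """Exponentially weighted ("discounted") ridge regression: older observations are
    geometrically down-weighted, so a drifting or switching parameter can be tracked."""

    def __init__(self, dim, gamma=0.99, lam=1.0):
        self.gamma = gamma            # discount factor in (0, 1); gamma = 1 recovers undiscounted statistics
        self.V = lam * np.eye(dim)    # discounted design matrix (with ridge regularization)
        self.b = np.zeros(dim)        # discounted sum of reward-weighted contexts

    def update(self, x, reward):
        # Discount all past statistics, then add the newest observation at full weight.
        # (The paper handles the regularization term slightly differently; this keeps the sketch short.)
        self.V = self.gamma * self.V + np.outer(x, x)
        self.b = self.gamma * self.b + reward * x

    def estimate(self):
        # Weighted least-squares estimate of the (time-varying) regression parameter.
        return np.linalg.solve(self.V, self.b)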


Contextual Bandits under Delayed Feedback

arXiv.org Machine Learning

Delayed feedback is a ubiquitous problem in many industrial systems employing bandit algorithms. Most of those systems seek to optimize binary indicators such as clicks. In that case, when the reward is not sent immediately, the learner cannot distinguish a negative signal from a not-yet-sent positive one: she might be waiting for feedback that will never come. In this paper, we define and address the contextual bandit problem with delayed and censored feedback by providing a new UCB-based algorithm. In order to demonstrate its effectiveness, we provide a finite-time regret analysis and an empirical evaluation that compares it against a baseline commonly used in practice.
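
To make the censoring issue concrete, here is a toy simulation of the observation model (hypothetical names; it illustrates only the feedback mechanism, not the UCB algorithm proposed in the paper): a positive reward generated at time $s$ becomes visible only at time $s + d_s$, so an unobserved positive is indistinguishable from a true zero.

import numpy as np

rng = np.random.default_rng(0)
T = 20

clicks = rng.binomial(1, 0.3, size=T)     # true binary rewards, one per round
delays = rng.geometric(0.2, size=T)       # random conversion delays (in rounds)

def observed_reward(s, now):
    """What the learner sees at time `now` about the action played at time s."""
    if clicks[s] == 1 and s + delays[s] <= now:
        return 1                          # the conversion has arrived
    return 0                              # either a true 0 or a not-yet-arrived 1

# At time t = 10, some positive rewards are still "in flight" and look like zeros.
print([observed_reward(s, now=10) for s in range(10)])
print(clicks[:10])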


Max K-armed bandit: On the ExtremeHunter algorithm and beyond

arXiv.org Machine Learning

This paper is devoted to the study of the max K-armed bandit problem, which consists in sequentially allocating resources in order to detect extreme values. Our contribution is twofold. We first significantly refine the analysis of the ExtremeHunter algorithm carried out in Carpentier and Valko (2014), and next propose an alternative approach, showing that, remarkably, Extreme Bandits can be reduced to a classical version of the bandit problem to a certain extent. Beyond the formal analysis, these two approaches are compared through numerical experiments.


Bernoulli Rank-$1$ Bandits for Click Feedback

arXiv.org Machine Learning

The probability that a user will click a search result depends both on its relevance and its position on the results page. The position-based model explains this behavior by ascribing to every item an attraction probability, and to every position an examination probability. To be clicked, a result must be both attractive and examined. The click probabilities of item-position pairs thus form the entries of a rank-$1$ matrix. We propose the learning problem of a Bernoulli rank-$1$ bandit where at each step, the learning agent chooses a pair of row and column arms, and receives the product of their Bernoulli-distributed values as a reward. This is a special case of the stochastic rank-$1$ bandit problem considered in recent work that proposed an elimination-based algorithm Rank1Elim, and showed that Rank1Elim's regret scales linearly with the number of rows and columns on "benign" instances. These are the instances where the minimum of the average row and column rewards $\mu$ is bounded away from zero. The issue with Rank1Elim is that it fails to be competitive with straightforward bandit strategies as $\mu \rightarrow 0$. In this paper we propose Rank1ElimKL, which simply replaces the (crude) confidence intervals of Rank1Elim with confidence intervals based on Kullback-Leibler (KL) divergences, and with the help of a novel result concerning the scaling of KL divergences we prove that, with this change, our algorithm is competitive no matter the value of $\mu$. Experiments with synthetic data confirm that on benign instances the performance of Rank1ElimKL is significantly better than even that of Rank1Elim, while experiments with models derived from real data confirm that the improvements are significant across the board, regardless of whether the data is benign or not.
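
The key ingredient can be sketched as a generic Bernoulli KL upper confidence bound, computed by bisection (a standard KL-UCB-style interval with an illustrative threshold, not necessarily the exact definition used in the paper):

import numpy as np

def bern_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p, q = min(max(p, eps), 1 - eps), min(max(q, eps), 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def kl_ucb(p_hat, n, threshold):
    """Largest q >= p_hat with n * kl(p_hat, q) <= threshold, found by bisection."""
    lo, hi = p_hat, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if n * bern_kl(p_hat, mid) <= threshold:
            lo = mid
        else:
            hi = mid
    return lo

# Example: 30 successes out of 100 pulls, with threshold log(t) at t = 1000.
print(kl_ucb(0.3, 100, np.log(1000)))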


Stochastic Rank-1 Bandits

arXiv.org Machine Learning

We propose stochastic rank-$1$ bandits, a class of online learning problems where at each step a learning agent chooses a pair of row and column arms, and receives the product of their values as a reward. The main challenge of the problem is that the individual values of the row and column are unobserved. We assume that these values are stochastic and drawn independently. We propose a computationally efficient algorithm for solving our problem, which we call Rank1Elim. We derive an $O((K + L) (1 / \Delta) \log n)$ upper bound on its $n$-step regret, where $K$ is the number of rows, $L$ is the number of columns, and $\Delta$ is the minimum of the row and column gaps, under the assumption that the mean row and column rewards are bounded away from zero. To the best of our knowledge, we present the first bandit algorithm that finds the maximum entry of a rank-$1$ matrix whose regret is linear in $K + L$, $1 / \Delta$, and $\log n$. We also derive a nearly matching lower bound. Finally, we evaluate Rank1Elim empirically on multiple problems. We observe that it leverages the structure of our problems and can learn near-optimal solutions even if our modeling assumptions are mildly violated.
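
The environment itself is easy to simulate; the following sketch (illustrative parameters, not code from the paper) shows why the problem is hard: the learner observes only the product of the two Bernoulli draws, never the individual row and column values.

import numpy as np

rng = np.random.default_rng(1)
u_bar = np.array([0.7, 0.4, 0.2])   # mean row (e.g., attraction) parameters
v_bar = np.array([0.9, 0.5])        # mean column (e.g., examination) parameters

def pull(i, j):
    """Reward for choosing row arm i and column arm j: the product of two
    independent Bernoulli draws, so E[reward] = u_bar[i] * v_bar[j]."""
    return rng.binomial(1, u_bar[i]) * rng.binomial(1, v_bar[j])

# The goal is to find argmax_{i, j} u_bar[i] * v_bar[j] from product feedback alone.
print(np.mean([pull(0, 0) for _ in range(10000)]))   # close to 0.7 * 0.9 = 0.63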