Perchet, Vianney
Pareto-Optimality, Smoothness, and Stochasticity in Learning-Augmented One-Max-Search
Benomar, Ziyad, Croissant, Lorenzo, Perchet, Vianney, Angelopoulos, Spyros
One-max search is a classic problem in online decision-making, in which a trader acts on a sequence of revealed prices and accepts one of them irrevocably to maximise its profit. The problem has been studied both in probabilistic and in worst-case settings, notably through competitive analysis, and more recently in learning-augmented settings in which the trader has access to a prediction on the sequence. However, existing approaches either lack smoothness or do not achieve optimal worst-case guarantees: they do not attain the best possible trade-off between the consistency and the robustness of the algorithm. We close this gap by presenting the first algorithm that simultaneously achieves both of these important objectives. Furthermore, we show how to leverage the obtained smoothness to provide an analysis of one-max search in stochastic learning-augmented settings, which capture randomness in both the observed prices and the prediction.
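As an illustration of the setting (not the paper's Pareto-optimal algorithm), here is a minimal sketch of one-max search with a naive prediction-based threshold; the convex combination of the prediction with the classical worst-case threshold $\sqrt{mM}$ and the parameter `lam` are illustrative assumptions only.

```python
import math

def one_max_search(prices, threshold):
    """Accept the first price that meets the threshold; otherwise take the last one."""
    for p in prices[:-1]:
        if p >= threshold:
            return p
    return prices[-1]  # forced to accept the final price

def prediction_threshold(pred, m, M, lam):
    """Illustrative threshold mixing the prediction `pred` with the classical
    worst-case threshold sqrt(m*M); lam in [0, 1] trades consistency for robustness."""
    worst_case = math.sqrt(m * M)
    return lam * worst_case + (1 - lam) * pred

# toy run: prices in [m, M], prediction of the maximum price
m, M = 1.0, 100.0
prices = [12.0, 55.0, 30.0, 80.0, 20.0]
for lam in (0.0, 0.5, 1.0):
    thr = prediction_threshold(pred=80.0, m=m, M=M, lam=lam)
    print(lam, thr, one_max_search(prices, thr))
```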
On Tradeoffs in Learning-Augmented Algorithms
Benomar, Ziyad, Perchet, Vianney
Many decision-making problems under uncertainty are commonly studied using competitive analysis. In this context, the performance of online algorithms, operating under uncertainty, is compared to that of the optimal offline algorithm, which has full knowledge of the problem instance. While competitive analysis provides a rigorous method for evaluating online algorithms, it is often overly pessimistic. In real-world scenarios, decision-makers can have some prior knowledge, though possibly imperfect, about the complete problem instance. For example, predictions of unknown variables might be obtained via machine learning models, or an expert might provide advice on the best course of action. This more realistic setting was formalized by Lykouris and Vassilvitskii [2018] and Purohit et al. [2018], leading to the development of what is now known as learning-augmented algorithms. In this paradigm, the algorithm receives predictions about the current problem instance, but without any guarantees on their accuracy, and must satisfy three main properties. Consistency: perform almost as well as the optimal offline algorithm if the predictions are perfect. Robustness: maintain a performance level close to the worst-case scenario without predictions when the predictions are arbitrarily bad. Smoothness: degrade gracefully as the prediction error increases.
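One common formalization of consistency and robustness, stated here for a cost-minimization problem; the notation ($\mathrm{ALG}$, $\mathrm{OPT}$, $\mathrm{err}$, $c$, $r$) is illustrative and the paper's exact definitions may differ. Writing $\mathrm{CR}(\mathrm{ALG},\eta)$ for the worst-case ratio over instances whose prediction error is at most $\eta$,
\[
\mathrm{CR}(\mathrm{ALG},\eta) \;=\; \sup_{I\,:\,\mathrm{err}(I)\le \eta}\ \frac{\mathrm{ALG}(I)}{\mathrm{OPT}(I)},
\]
the algorithm is said to be $c$-consistent if $\mathrm{CR}(\mathrm{ALG},0)\le c$, and $r$-robust if $\sup_{\eta\ge 0}\mathrm{CR}(\mathrm{ALG},\eta)\le r$.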
Improved learning rates in multi-unit uniform price auctions
Potfer, Marius, Baudry, Dorian, Richard, Hugo, Perchet, Vianney, Wan, Cheng
Motivated by the strategic participation of electricity producers in the electricity day-ahead market, we study the problem of online learning in repeated multi-unit uniform price auctions, focusing on the setting with adversarial opposing bids. The main contribution of this paper is the introduction of a new modeling of the bid space. Indeed, we prove that a learning algorithm leveraging the structure of this problem achieves a regret of $\tilde{O}(K^{4/3}T^{2/3})$ under bandit feedback, improving over the bound of $\tilde{O}(K^{7/4}T^{3/4})$ previously obtained in the literature. This improved regret rate is tight up to logarithmic terms. Inspired by electricity reserve markets, we further introduce a different feedback model under which all winning bids are revealed. This feedback interpolates between the full-information and bandit scenarios depending on the auctions' results. We prove that, under this feedback, the algorithm that we propose achieves a regret of $\tilde{O}(K^{5/2}\sqrt{T})$.
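A toy sketch of the payoff in a $K$-unit uniform-price auction, under one common pricing convention (every winner pays the highest losing bid); the function names, tie-breaking, and payment rule are illustrative assumptions, not the paper's bid-space model.

```python
def uniform_price_utility(my_bids, my_values, opponent_bids, K):
    """Toy K-unit uniform-price auction payoff (one convention among several).

    All bids compete for K identical units; the K highest bids win and every
    winner pays the same clearing price, taken here as the highest losing bid.
    `my_bids`/`my_values` are the bidder's decreasing marginal bids/values.
    """
    all_bids = sorted(my_bids + opponent_bids, reverse=True)
    clearing_price = all_bids[K] if len(all_bids) > K else 0.0
    units_won = sum(1 for b in my_bids if b > clearing_price)  # ties ignored for simplicity
    return sum(my_values[:units_won]) - units_won * clearing_price

# example: K = 3 units, opposing bids flattened into one list
print(uniform_price_utility(my_bids=[10, 8, 2], my_values=[10, 9, 7],
                            opponent_bids=[9, 6, 5, 1], K=3))
```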
Stable Matching with Ties: Approximation Ratios and Learning
Lin, Shiyun, Mauras, Simon, Merlis, Nadav, Perchet, Vianney
We study the problem of matching markets with ties, where one side of the market does not necessarily have strict preferences over members of the other side. For example, workers do not always have strict preferences over jobs, and students may rank different schools identically. In particular, assume w.l.o.g. that workers' preferences are determined by their utility from being matched to each job, which might admit ties. Notably, in contrast to classical two-sided markets with strict preferences, there is no longer a single stable matching that simultaneously maximizes the utility of all workers. We aim to guarantee each worker the largest possible share of the utility in her best possible stable matching. We call the ratio between a worker's best possible stable utility and her assigned utility the \emph{Optimal Stable Share} (OSS)-ratio. We first prove that distributions over stable matchings cannot guarantee an OSS-ratio that is sublinear in the number of workers. Instead, randomizing over possibly non-stable matchings, we show how to achieve a tight logarithmic OSS-ratio. Then, we analyze the case where the real utility is not necessarily known and can only be approximated. In particular, we provide an algorithm that still guarantees a similar fraction of the best possible utility. Finally, we move to a bandit setting, where we select a matching at each round and only observe the utilities for the matches we perform. We show how to utilize our results for approximate utilities to gracefully interpolate between problems without ties and problems with statistical ties (small suboptimality gaps).
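One plausible formalization of the OSS-ratio, with hypothetical notation ($u_w(\mu)$ for worker $w$'s utility under matching $\mu$, $S$ for the set of stable matchings, $D$ for a distribution over matchings); the paper's exact definition may differ:
\[
U_w^{\star} \;=\; \max_{\mu\in S} u_w(\mu), \qquad \text{OSS-ratio of } D \;=\; \max_{w}\ \frac{U_w^{\star}}{\mathbb{E}_{\mu\sim D}\!\left[u_w(\mu)\right]}.
\]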
Improved Algorithms for Contextual Dynamic Pricing
Tullii, Matilde, Gaucher, Solenne, Merlis, Nadav, Perchet, Vianney
In contextual dynamic pricing, a seller sequentially prices goods based on contextual information. Buyers will purchase products only if the prices are below their valuations. The goal of the seller is to design a pricing strategy that collects as much revenue as possible. We focus on two different valuation models. The first assumes that valuations depend linearly on the context and are further distorted by noise. Under minor regularity assumptions, our algorithm achieves an optimal regret bound of $\tilde{\mathcal{O}}(T^{2/3})$, improving upon existing results. The second model removes the linearity assumption, requiring only that the expected buyer valuation is $\beta$-H\"older in the context. For this model, our algorithm obtains a regret of $\tilde{\mathcal{O}}(T^{(d+2\beta)/(d+3\beta)})$, where $d$ is the dimension of the context space.
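A minimal sketch of the interaction protocol in the linear-valuation setting, assuming Gaussian noise for illustration; the names (`pricing_round`, `valuation_fn`, `noise_scale`) and the fixed posted price are hypothetical, not the paper's pricing strategy.

```python
import numpy as np

rng = np.random.default_rng(0)

def pricing_round(context, price, valuation_fn, noise_scale=0.1):
    """One round of contextual pricing: the seller posts `price` after seeing
    `context`; the buyer purchases iff their (noisy) valuation is at least the
    price, and the seller collects `price` on a sale, nothing otherwise."""
    valuation = valuation_fn(context) + noise_scale * rng.standard_normal()
    return price if valuation >= price else 0.0

# toy linear valuation model v(x) = <theta, x>, as in the first setting
theta = np.array([0.5, 1.0])
revenue = sum(
    pricing_round(rng.random(2), price=0.8, valuation_fn=lambda c: theta @ c)
    for _ in range(1000)
)
print(revenue)
```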
Non-clairvoyant Scheduling with Partial Predictions
Benomar, Ziyad, Perchet, Vianney
The non-clairvoyant scheduling problem has gained new interest within learning-augmented algorithms, where the decision-maker is equipped with predictions without any quality guarantees. In practical settings, access to predictions may be limited to specific instances, due to cost or data constraints. Our investigation focuses on scenarios where predictions for only $B$ job sizes out of $n$ are available to the algorithm. We first establish near-optimal lower bounds and algorithms in the case of perfect predictions. Subsequently, we present a learning-augmented algorithm satisfying the robustness, consistency, and smoothness criteria, revealing a novel tradeoff between consistency and smoothness that is inherent to the setting with a restricted number of predictions.
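For context, a small sketch of the usual objective (total completion time) and two simple baselines: the classical non-clairvoyant round-robin schedule and scheduling by predicted sizes. This is not the paper's algorithm, and the job sizes and predictions below are made up.

```python
def total_completion_time(order, sizes):
    """Total completion time when jobs run to completion in the given order (e.g. SPT)."""
    t, total = 0.0, 0.0
    for j in order:
        t += sizes[j]
        total += t
    return total

def round_robin_total(sizes):
    """Total completion time under idealized round-robin (processor sharing):
    job j completes at sum_i min(p_i, p_j)."""
    return sum(sum(min(pi, pj) for pi in sizes) for pj in sizes)

sizes = [3.0, 1.0, 4.0, 2.0]
predicted = [2.5, 1.2, 5.0, 2.2]   # hypothetical noisy predictions
by_prediction = sorted(range(len(sizes)), key=lambda j: predicted[j])
by_true_size = sorted(range(len(sizes)), key=lambda j: sizes[j])
print(round_robin_total(sizes))                      # non-clairvoyant baseline
print(total_completion_time(by_prediction, sizes))   # follow the predictions
print(total_completion_time(by_true_size, sizes))    # offline optimum (SPT)
```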
The Value of Reward Lookahead in Reinforcement Learning
Merlis, Nadav, Baudry, Dorian, Perchet, Vianney
In reinforcement learning (RL), agents sequentially interact with changing environments while aiming to maximize the obtained rewards. Usually, rewards are observed only after acting, and so the goal is to maximize the expected cumulative reward. Yet, in many practical settings, reward information is observed in advance -- prices are observed before performing transactions; nearby traffic information is partially known; and goals are oftentimes given to agents prior to the interaction. In this work, we aim to quantitatively analyze the value of such future reward information through the lens of competitive analysis. In particular, we measure the ratio between the value of standard RL agents and that of agents with partial future-reward lookahead. We characterize the worst-case reward distribution and derive exact ratios for the worst-case reward expectations. Surprisingly, the resulting ratios relate to known quantities in offline RL and reward-free exploration. We further provide tight bounds for the ratio given the worst-case dynamics. Our results cover the full spectrum, from observing the immediate rewards before acting to observing all the rewards before the interaction starts.
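One way to write the ratio in question, with hypothetical notation ($V^\star_0$ for the optimal expected return when rewards are observed only after acting, $V^\star_\ell$ for the optimal return when the rewards of the next $\ell$ steps are revealed before acting, $\mathcal{M}$ for a class of environments); the paper's exact definition may differ:
\[
\mathrm{CR}_\ell \;=\; \inf_{M\in\mathcal{M}}\ \frac{V^\star_0(M)}{V^\star_\ell(M)} \;\in\; (0,1].
\]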
Mode Estimation with Partial Feedback
Arnal, Charles, Cabannes, Vivien, Perchet, Vianney
The combination of lightly supervised pre-training and online fine-tuning has played a key role in recent AI developments. These new learning pipelines call for new theoretical frameworks. In this paper, we formalize core aspects of weakly supervised and active learning with a simple problem: the estimation of the mode of a distribution using partial feedback. We show how entropy coding allows for optimal information acquisition from partial feedback, develop coarse sufficient statistics for mode identification, and adapt bandit algorithms to our new setting. Finally, we combine those contributions into a statistically and computationally efficient solution to our problem.
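To illustrate the entropy-coding ingredient mentioned above (not the paper's procedure), here is a sketch that builds Huffman code lengths over probability estimates; identifying a sample's class through binary partial-feedback questions along the code tree then takes, in expectation, within one bit of the entropy.

```python
import heapq
import math

def huffman_code_lengths(probs):
    """Code length (number of binary partial-feedback questions) per symbol
    for a Huffman code built on the probability estimates `probs`."""
    heap = [(p, [i]) for i, p in enumerate(probs)]
    lengths = [0] * len(probs)
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, g1 = heapq.heappop(heap)
        p2, g2 = heapq.heappop(heap)
        for i in g1 + g2:
            lengths[i] += 1           # symbols in a merged group gain one question
        heapq.heappush(heap, (p1 + p2, g1 + g2))
    return lengths

probs = [0.5, 0.25, 0.15, 0.1]
lengths = huffman_code_lengths(probs)
expected_questions = sum(p * l for p, l in zip(probs, lengths))
entropy = -sum(p * math.log2(p) for p in probs)
print(lengths, expected_questions, entropy)  # expected questions within 1 bit of entropy
```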
Trading-off price for data quality to achieve fair online allocation
Molina, Mathieu, Gast, Nicolas, Loiseau, Patrick, Perchet, Vianney
We consider the problem of online allocation subject to a long-term fairness penalty. Contrary to existing works, however, we do not assume that the decision-maker observes the protected attributes -- which is often unrealistic in practice. Instead, they can purchase data that help estimate them from sources of different quality, and hence reduce the fairness penalty at some cost. We model this problem as a multi-armed bandit problem where each arm corresponds to the choice of a data source, coupled with the online allocation problem. We propose an algorithm that jointly solves both problems and show that it has a regret bounded by $\mathcal{O}(\sqrt{T})$. A key difficulty is that the rewards received by selecting a source are correlated through the fairness penalty, which leads to a need for randomization (despite a stochastic setting). Our algorithm takes into account contextual information available before the source selection, and can adapt to many different fairness notions. We also show that in some instances, the estimates used can be learned on the fly.
Local and adaptive mirror descents in extensive-form games
Fiegel, Côme, Ménard, Pierre, Kozuno, Tadashi, Munos, Rémi, Perchet, Vianney, Valko, Michal
We study how to learn $\epsilon$-optimal strategies in zero-sum imperfect information games (IIG) with trajectory feedback. In this setting, players update their policies sequentially based on their observations over a fixed number of episodes, denoted by $T$. Existing procedures suffer from high variance due to the use of importance sampling over sequences of actions (Steinberger et al., 2020; McAleer et al., 2022). To reduce this variance, we consider a fixed sampling approach, where players still update their policies over time, but with observations obtained through a given fixed sampling policy. Our approach is based on an adaptive Online Mirror Descent (OMD) algorithm that applies OMD locally to each information set, using individually decreasing learning rates and a regularized loss. We show that this approach guarantees a convergence rate of $\tilde{\mathcal{O}}(T^{-1/2})$ with high probability and has a near-optimal dependence on the game parameters when applied with the best theoretical choices of learning rates and sampling policies. To achieve these results, we generalize the notion of OMD stabilization, allowing for time-varying regularization with convex increments.
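As a minimal illustration of the local OMD ingredient, here is the standard entropic (multiplicative-weights) OMD step at a single information set with a decreasing learning rate; the random loss estimate is a stand-in, and this omits the paper's regularized loss and stabilization arguments.

```python
import numpy as np

def local_omd_step(policy, loss_estimate, lr):
    """One Online Mirror Descent update with the negative-entropy regularizer
    (a multiplicative-weights step) for the policy at a single information set."""
    logits = np.log(policy) - lr * loss_estimate
    logits -= logits.max()                      # numerical stability
    new_policy = np.exp(logits)
    return new_policy / new_policy.sum()

# toy usage at one information set with 3 actions and a decreasing learning rate
rng = np.random.default_rng(0)
policy = np.ones(3) / 3
for t in range(1, 101):
    loss_estimate = rng.random(3)               # stands in for an importance-weighted loss
    policy = local_omd_step(policy, loss_estimate, lr=1.0 / np.sqrt(t))
print(policy)
```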