Stoltz, Gilles
Conformal Prediction for Hierarchical Data
Principato, Guillaume, Amara-Ouali, Yvenn, Goude, Yannig, Hamrouche, Bachir, Poggi, Jean-Michel, Stoltz, Gilles
Reconciliation has become an essential tool in multivariate point forecasting for hierarchical time series. However, the theoretical properties of probabilistic Forecast Reconciliation techniques are still not well understood. Meanwhile, Conformal Prediction is a general framework with growing appeal that provides prediction sets with probabilistic guarantees in finite samples. In this paper, we propose a first step towards combining Conformal Prediction and Forecast Reconciliation by analyzing how including a reconciliation step in the Split Conformal Prediction (SCP) procedure enhances the resulting prediction sets. In particular, we show that the validity granted by SCP is preserved while the efficiency of the prediction sets is improved. We also advocate a variation of the theoretical procedure for practical use. Finally, we illustrate these results with simulations.
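As a rough illustration of the idea (and not the paper's exact procedure), the following Python sketch projects vector-valued base forecasts onto the coherent subspace spanned by a summing matrix S, then builds coordinate-wise split conformal intervals from absolute residuals; the function names, the projection-based reconciliation, and the choice of score are all illustrative assumptions.

import numpy as np

def reconcile(y_hat, S):
    # Orthogonal projection onto the coherent subspace spanned by the
    # columns of the summing matrix S; a simple stand-in for more
    # refined reconciliation methods.
    P = S @ np.linalg.pinv(S)
    return y_hat @ P

def split_conformal(y_cal, y_hat_cal, y_hat_test, alpha=0.1):
    # Split conformal prediction with absolute-residual scores, applied
    # coordinate-wise; reconcile the forecasts before calling this.
    scores = np.abs(y_cal - y_hat_cal)             # shape (n_cal, d)
    n = scores.shape[0]
    k = min(n, int(np.ceil((n + 1) * (1 - alpha))))
    q = np.sort(scores, axis=0)[k - 1]             # per-coordinate quantile
    return y_hat_test - q, y_hat_test + q          # interval endpoints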
Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization
Tiapkin, Daniil, Chzhen, Evgenii, Stoltz, Gilles
In this paper, we consider the problem of learning in adversarial Markov decision processes (MDPs) with an oblivious adversary in a full-information setting. The agent interacts with an environment during $T$ episodes, each of which consists of $H$ stages, and each episode is evaluated with respect to a reward function that is revealed only at the end of the episode. We propose an algorithm, called APO-MVP, that achieves a regret bound of order $\tilde{\mathcal{O}}(\mathrm{poly}(H)\sqrt{SAT})$, where $S$ and $A$ are the sizes of the state and action spaces, respectively. This result improves upon the best-known regret bound by a factor of $\sqrt{S}$, bridging the gap between adversarial and stochastic MDPs, and matching the minimax lower bound $\Omega(\sqrt{H^3SAT})$ as far as the dependencies on $S$, $A$, and $T$ are concerned. The proposed algorithm and analysis completely avoid the typical tool given by occupancy measures; instead, policy optimization is performed based only on dynamic programming and on a black-box online linear optimization strategy run over estimated advantage functions, which makes the algorithm easy to implement. The analysis leverages two recent techniques: policy optimization based on online linear optimization strategies (Jonckheere et al., 2023) and a refined martingale analysis of the impact on values of estimating transition kernels (Zhang et al., 2023).
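To make the policy-optimization component concrete, here is a minimal sketch, under illustrative assumptions, of an exponential-weights update over estimated advantage functions, one possible instance of the black-box online linear optimization strategy mentioned above; the dynamic-programming estimation of values and transitions is omitted, and the function name and array shapes are hypothetical.

import numpy as np

def policy_step(policy, adv_hat, eta):
    # One exponential-weights step per state, on (S, A)-shaped arrays:
    # pi_{k+1}(a|s) is proportional to pi_k(a|s) * exp(eta * A_hat(s, a)).
    logits = np.log(policy + 1e-12) + eta * adv_hat
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    new_policy = np.exp(logits)
    return new_policy / new_policy.sum(axis=1, keepdims=True)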
Small Total-Cost Constraints in Contextual Bandits with Knapsacks, with Application to Fairness
Chzhen, Evgenii, Giraud, Christophe, Li, Zhen, Stoltz, Gilles
We consider contextual bandit problems with knapsacks (CBwK), a setting where, at each round, a scalar reward is obtained and vector-valued costs are suffered. The learner aims to maximize the cumulative rewards while ensuring that the cumulative costs are lower than some predetermined cost constraints. We assume that contexts come from a continuous set, that costs can be signed, and that the expected reward and cost functions, while unknown, may be uniformly estimated -- a typical assumption in the literature. In this setting, total-cost constraints so far had to be at least of order $T^{3/4}$, where $T$ is the number of rounds, and were even typically assumed to depend linearly on $T$. We are, however, motivated to use CBwK to impose a fairness constraint of equalized average costs between groups: the budget associated with the corresponding cost constraints should be as close as possible to the natural deviations, of order $\sqrt{T}$. To that end, we introduce a dual strategy based on projected-gradient-descent updates that is able to deal with total-cost constraints of order $\sqrt{T}$, up to poly-logarithmic terms. This strategy is more direct and simpler than existing strategies in the literature. It relies on a careful, adaptive tuning of the step size.
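The dual update at the heart of such a strategy can be sketched as follows (a minimal illustration, not the paper's exact tuning): the dual variables move in the direction of the estimated constraint violation and are projected back onto a bounded set, here an l2 ball, which is one plausible choice when costs are signed. The primal step, omitted here, picks the action maximizing the estimated reward minus the dual-weighted estimated costs; all names are hypothetical.

import numpy as np

def dual_step(lmbda, cost_hat, budget_per_round, gamma, radius):
    # Projected-gradient step on the dual variables.
    lmbda = lmbda + gamma * (cost_hat - budget_per_round)
    norm = np.linalg.norm(lmbda)
    return lmbda if norm <= radius else lmbda * (radius / norm)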
Symphony of experts: orchestration with adversarial insights in reinforcement learning
Jonckheere, Matthieu, Mignacco, Chiara, Stoltz, Gilles
Structured reinforcement learning leverages policies with advantageous properties to reach better performance, particularly in scenarios where exploration poses challenges. We explore this field through the concept of orchestration, where a (small) set of expert policies guides decision-making; the modeling thereof constitutes our first contribution. We then establish value-function regret bounds for orchestration in the tabular setting by transferring regret-bound results from adversarial settings. We generalize and extend the analysis of natural policy gradient in Agarwal et al. (2021, Section 5.3) to arbitrary adversarial aggregation strategies. We also extend it to the case of estimated advantage functions, providing insights into sample complexity both in expectation and with high probability. A key point of our approach lies in its arguably more transparent proofs compared to existing methods. Finally, we present simulations for a stochastic matching toy model.
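As an illustration of what an adversarial aggregation strategy over experts might look like, here is a sketch under assumptions: exponential weights stand in for an arbitrary aggregation strategy, with one scalar advantage estimate per expert; names and shapes are hypothetical.

import numpy as np

def orchestration_step(weights, expert_policies, expert_adv_hat, eta):
    # Exponentially weighted aggregation over a small set of expert
    # policies, each of shape (S, A), driven by estimated advantages.
    weights = weights * np.exp(eta * expert_adv_hat)
    weights = weights / weights.sum()
    # The orchestrated policy is the corresponding mixture of experts.
    mixed_policy = np.tensordot(weights, expert_policies, axes=1)
    return weights, mixed_policy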
Parameter-free projected gradient descent
Chzhen, Evgenii, Giraud, Christophe, Stoltz, Gilles
We consider the problem of minimizing a convex function over a closed convex set with Projected Gradient Descent (PGD). We propose a fully parameter-free version of AdaGrad, which is adaptive to the distance between the initialization and the optimum, and to the sum of the squared norms of the subgradients. Our algorithm is able to handle projection steps and, compared to classical PGD, involves no restarts, no reweighting along the trajectory, and no additional gradient evaluations. It also achieves optimal rates of convergence for the cumulative regret, up to logarithmic factors. We provide an extension of our approach to stochastic optimization and conduct numerical experiments supporting the developed theory.
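For reference, a minimal sketch of AdaGrad-norm-style projected subgradient descent, the kind of scheme the paper's parameter-free version builds on; the fully parameter-free tuning adaptive to the initial distance is more involved, and the numerator of the step size is fixed to 1 here for simplicity.

import numpy as np

def projected_adagrad(x0, subgrad, project, T):
    # Step sizes scale as the inverse square root of the cumulative
    # squared subgradient norms; projections are applied at each step.
    x, avg, sq_sum = x0.copy(), np.zeros_like(x0), 0.0
    for t in range(T):
        g = subgrad(x)
        sq_sum += float(np.dot(g, g))
        x = project(x - g / (np.sqrt(sq_sum) + 1e-12))
        avg += (x - avg) / (t + 1)    # running average of the iterates
    return avg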
On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits
Barrier, Antoine, Garivier, Aurélien, Stoltz, Gilles
We lay the foundations of a non-parametric theory of best-arm identification in multi-armed bandits with a fixed budget $T$. We consider general, possibly non-parametric, models $\mathcal{D}$ for distributions over the arms; an overarching example is the model $\mathcal{D} = \mathcal{P}([0,1])$ of all probability distributions over $[0,1]$. We propose upper bounds on the average log-probability of misidentifying the optimal arm, based on information-theoretic quantities that correspond to infima over Kullback-Leibler divergences between some distributions in $\mathcal{D}$ and a given distribution. This is made possible by a refined analysis of the successive-rejects strategy of Audibert, Bubeck, and Munos (2010). We finally provide lower bounds on the same average log-probability, also in terms of the same new information-theoretic quantities; these lower bounds are larger when the (natural) assumptions on the considered strategies are stronger. All these new upper and lower bounds generalize existing bounds based, e.g., on gaps between distributions.
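For concreteness, here is a standard implementation sketch of the successive-rejects strategy analyzed in the paper; the interface choices are ours (arms are modeled as callables returning one reward per draw, and $T$ is assumed large enough that each phase length is positive).

import numpy as np

def successive_rejects(arms, T):
    # Successive rejects (Audibert, Bubeck, and Munos, 2010): K - 1
    # phases; after each phase, the surviving arm with the lowest
    # empirical mean is eliminated.
    K = len(arms)
    log_bar = 0.5 + sum(1.0 / i for i in range(2, K + 1))
    active = list(range(K))
    sums, counts = np.zeros(K), np.zeros(K)
    n_prev = 0
    for k in range(1, K):
        n_k = int(np.ceil((T - K) / (log_bar * (K + 1 - k))))
        for a in active:
            for _ in range(n_k - n_prev):
                sums[a] += arms[a]()          # one draw from arm a
                counts[a] += 1
        active.remove(min(active, key=lambda a: sums[a] / counts[a]))
        n_prev = n_k
    return active[0]                          # recommended arm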
Adaptation to the Range in $K$-Armed Bandits
Hadiji, Hédi, Stoltz, Gilles
We consider stochastic bandit problems with $K$ arms, each associated with a bounded distribution supported on the range $[m,M]$. We do not assume that the range $[m,M]$ is known and show that there is a cost for learning this range. Indeed, a new trade-off between distribution-dependent and distribution-free regret bounds arises, which prevents simultaneously achieving the typical $\ln T$ and \smash{$\sqrt{T}$} bounds. For instance, a \smash{$\sqrt{T}$} distribution-free regret bound may only be achieved if the distribution-dependent regret bounds are at least of order \smash{$\sqrt{T}$}. We exhibit a strategy achieving the rates for regret indicated by the new trade-off.
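To fix ideas on what adapting to the range means operationally, here is a naive plug-in sketch that rescales a UCB exploration bonus by the range observed so far; this is not the paper's strategy and carries no guarantee (the trade-off above shows some price must be paid), but it illustrates the mechanism.

import numpy as np

def range_plugin_ucb(arms, T):
    # UCB with an exploration bonus rescaled by the empirical range.
    K = len(arms)
    sums, counts = np.zeros(K), np.zeros(K)
    lo, hi = np.inf, -np.inf
    for a in range(K):                         # pull each arm once
        r = arms[a](); sums[a] += r; counts[a] += 1
        lo, hi = min(lo, r), max(hi, r)
    for t in range(K, T):
        width = max(hi - lo, 1e-12)
        ucb = sums / counts + width * np.sqrt(2.0 * np.log(t + 1) / counts)
        a = int(np.argmax(ucb))
        r = arms[a](); sums[a] += r; counts[a] += 1
        lo, hi = min(lo, r), max(hi, r)
    return sums / counts                       # empirical means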
Diversity-Preserving K-Armed Bandits, Revisited
Hadiji, Hédi, Gerchinovitz, Sébastien, Loubes, Jean-Michel, Stoltz, Gilles
We consider the bandit-based framework for diversity-preserving recommendations introduced by Celis et al. (2019), who approached it mainly through a reduction to the setting of linear bandits. We design a UCB algorithm that exploits the specific structure of the setting and show that it enjoys a bounded distribution-dependent regret in the natural cases where the optimal mixed actions put some probability mass on all actions (i.e., when diversity is desirable). Simulations illustrate this fact. We also provide regret lower bounds and briefly discuss distribution-free regret bounds.
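A sketch of one step of such a UCB rule over constrained mixed actions, under the assumption that the diversity constraints take the form of per-arm lower bounds summing to at most 1 (in that simple case the optimistic linear maximization has a closed form); the interface is hypothetical.

import numpy as np

def diversity_ucb_step(sums, counts, t, lower_bounds):
    # Linear maximization of the UCB index over mixed actions p with
    # p[a] >= lower_bounds[a] for all a: place the floor mass everywhere
    # and the leftover mass on the arm with the highest index value.
    ucb = sums / counts + np.sqrt(2.0 * np.log(t + 1) / counts)
    p = lower_bounds.copy()
    p[int(np.argmax(ucb))] += 1.0 - lower_bounds.sum()
    return p      # the arm actually played is then sampled from p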
Hierarchical robust aggregation of sales forecasts at aggregated levels in e-commerce, based on exponential smoothing and Holt's linear trend method
Huard, Malo, Garnier, Rémy, Stoltz, Gilles
We revisit the usefulness of classical statistical techniques for sales forecasting, such as exponential smoothing and extensions thereof (e.g., Holt's linear trend method). We do so by considering ensemble forecasts, given by several instances of these classical techniques tuned with different (sets of) parameters, and by forming convex combinations of the elements of the ensemble forecasts over time, in a robust and sequential manner. The machine-learning theory behind this is called "robust online aggregation", or "prediction with expert advice", or "prediction of individual sequences" (see Cesa-Bianchi and Lugosi, 2006). We apply this methodology to a hierarchical data set of sales provided by the e-commerce company Cdiscount and output forecasts at the levels of subsubfamilies, subfamilies, and families of items sold, for various forecasting horizons (up to six weeks ahead). The performance achieved is better than what would be obtained by optimally tuning the classical techniques on a training set and using their forecasts on the test set. The performance is also good from an intrinsic point of view (in terms of mean absolute percentage error). While getting these better forecasts of sales at the levels of subsubfamilies, subfamilies, and families is interesting per se, we also suggest using them as additional features when forecasting demand at the item level.
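The aggregation step can be sketched with the exponentially weighted average forecaster, one standard strategy from this theory (the paper's exact aggregation rule and loss may differ; the square loss and function name here are illustrative):

import numpy as np

def ewa_round(weights, forecasts, y, eta):
    # One round of the exponentially weighted average forecaster:
    # predict the convex combination of the ensemble members, then
    # reweight each member according to its squared-error loss.
    prediction = float(np.dot(weights, forecasts))
    losses = (forecasts - y) ** 2
    weights = weights * np.exp(-eta * losses)
    return prediction, weights / weights.sum()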
Target Tracking for Contextual Bandits: Application to Demand Side Management
Brégère, Margaux, Gaillard, Pierre, Goude, Yannig, Stoltz, Gilles
We propose a contextual-bandit approach to demand-side management via price incentives. More precisely, a target mean consumption is set at each round, and the mean consumption is modeled as a complex function of the distribution of prices sent and of contextual variables such as the temperature, weather conditions, and so on. The performance of our strategies is measured in quadratic losses through a regret criterion. We offer $\sqrt{T}$ upper bounds on this regret (up to poly-logarithmic terms) for strategies inspired by standard strategies for contextual bandits (such as LinUCB; Li et al., 2010). Simulations on a real data set gathered by UK Power Networks, in which price incentives were offered, show that our strategies are effective and may indeed manage demand response by suitably picking the price levels.
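A LinUCB-flavored sketch of target tracking, under illustrative assumptions: a linear consumption model in a feature vector encoding the price distribution and contexts; the names, candidate-set interface, and distance-to-target criterion are ours, not the paper's exact strategy.

import numpy as np

def linucb_target_step(A, b, candidates, target, alpha):
    # Ridge-regression estimate (A = lambda * I + sum of x x^T, b = sum
    # of y * x), then pick the candidate feature vector whose confidence
    # interval for the mean consumption comes closest to the target.
    theta = np.linalg.solve(A, b)
    best, best_gap = None, np.inf
    for x in candidates:
        mean = float(x @ theta)
        width = alpha * float(np.sqrt(x @ np.linalg.solve(A, x)))
        gap = max(0.0, abs(mean - target) - width)
        if gap < best_gap:
            best, best_gap = x, gap
    return best

def linucb_update(A, b, x, y):
    # Rank-one update of the regression statistics after observing y.
    return A + np.outer(x, x), b + y * x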