 Vernade, Claire


Partially Observable Reinforcement Learning with Memory Traces

arXiv.org Artificial Intelligence

Partially observable environments present a considerable computational challenge in reinforcement learning due to the need to consider long histories. Learning with a finite window of observations quickly becomes intractable as the window length grows. In this work, we introduce memory traces. Inspired by eligibility traces, these are compact representations of the history of observations in the form of exponential moving averages. We prove sample complexity bounds for the problem of offline on-policy evaluation that quantify the value errors achieved with memory traces for the class of Lipschitz continuous value estimates. We establish a close connection to the window approach, and demonstrate that, in certain environments, learning with memory traces is significantly more sample efficient. Finally, we underline the effectiveness of memory traces empirically in online reinforcement learning experiments for both value prediction and control.
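
As a concrete illustration of the memory-trace construction described above, here is a minimal sketch, assuming one-hot observation encodings and a decay parameter lam of our own choosing; the paper's exact parametrization and normalization may differ:

```python
import numpy as np

def update_memory_trace(trace, obs_encoding, lam=0.9):
    # Exponential moving average of observation encodings; lam is a
    # decay parameter of our choosing (the paper's parametrization
    # and normalization may differ).
    return lam * trace + (1.0 - lam) * obs_encoding

rng = np.random.default_rng(0)
trace = np.zeros(4)
for _ in range(20):
    obs = np.eye(4)[rng.integers(4)]  # one-hot observation, 4-symbol alphabet
    trace = update_memory_trace(trace, obs)
print(trace)  # a fixed-size summary of the entire observation history
```

Unlike a length-k window, whose size grows with k, this trace has constant size regardless of how much history it summarizes, which is the source of the sample-efficiency gains the abstract describes.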


Quantization-Free Autoregressive Action Transformer

arXiv.org Artificial Intelligence

Current transformer-based imitation learning approaches introduce discrete action representations and train an autoregressive transformer decoder on the resulting latent code. However, the initial quantization breaks the continuous structure of the action space, thereby limiting the capabilities of the generative model. We propose a quantization-free method instead that leverages Generative Infinite-Vocabulary Transformers (GIVT) as a direct, […]

(From the introduction:) […] Psenka et al., 2023), which will be discussed in the next two paragraphs. Existing autoregressive policies, on the one hand, sidestep the challenge of learning in a continuous domain by discretizing the actions (Lee et al., 2024; Shafiullah et al., 2022). This discretization can introduce several drawbacks: it discards the inherent structure of the continuous space, increases complexity by adding a separate quantization step, and may limit expressiveness or accuracy when fine-grained control is required.
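
To make the contrast with quantized heads concrete, here is a hypothetical continuous output head in the GIVT spirit: the hidden state parametrizes a Gaussian mixture over a continuous action rather than logits over a fixed vocabulary of action bins. All names, shapes, and the mixture form are our illustration, not the paper's implementation:

```python
import numpy as np

def sample_continuous_action(h, W_mu, W_logsig, W_pi, rng):
    # Hypothetical GIVT-style head: the transformer hidden state h
    # parametrizes a K-component Gaussian mixture over a 1-D action,
    # rather than logits over a fixed vocabulary of quantized actions.
    mu = h @ W_mu                       # (K,) component means
    sig = np.exp(h @ W_logsig)          # (K,) component scales
    logits = h @ W_pi
    pi = np.exp(logits - logits.max())
    pi /= pi.sum()                      # (K,) mixture weights
    k = rng.choice(len(pi), p=pi)
    return rng.normal(mu[k], sig[k])    # continuous action, no quantizer

rng = np.random.default_rng(0)
h = rng.normal(size=16)                 # hidden state, d = 16; K = 3
W_mu, W_logsig, W_pi = (0.1 * rng.normal(size=(16, 3)) for _ in range(3))
print(sample_continuous_action(h, W_mu, W_logsig, W_pi, rng))
```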


Efficient Risk-sensitive Planning via Entropic Risk Measures

arXiv.org Machine Learning

Risk-sensitive planning aims to identify policies that maximize tail-focused metrics in Markov Decision Processes (MDPs). Such optimization can be very costly for the most widely used and interpretable metrics, such as threshold probabilities or (Conditional) Values at Risk. Indeed, previous work showed that only Entropic Risk Measures (EntRM) can be efficiently optimized through dynamic programming, at the cost of a hard-to-interpret risk parameter that must be chosen. We show that computing the full set of optimal policies for EntRM across parameter values yields tight approximations for the metrics of interest. We prove that this optimality front can be computed effectively thanks to a novel structural analysis and the smoothness properties of entropic risks. Empirical results demonstrate that our approach achieves strong performance in a variety of decision-making scenarios.
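
For reference, the entropic risk measure in question has a standard closed form (our notation; sign and normalization conventions vary across the literature):

```latex
% Entropic risk measure of a random return X at risk parameter \beta:
\[
  \mathrm{EntRM}_{\beta}(X) \;=\; \frac{1}{\beta}\,
  \log \mathbb{E}\!\left[ e^{\beta X} \right], \qquad \beta \neq 0 .
\]
% As \beta \to 0 this recovers \mathbb{E}[X]; for rewards, \beta < 0 is
% risk-averse and \beta > 0 risk-seeking, which is why the choice of
% \beta is hard to interpret in terms of tail-focused metrics.
```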


Variational Bayes Portfolio Construction

arXiv.org Machine Learning

Portfolio construction is the science of balancing reward and risk; it is at the core of modern finance. In this paper, we tackle the question of optimal decision-making within a Bayesian paradigm, starting from a decision-theoretic formulation. Although the optimal decision is intractable in any interesting scenario, we manage to rewrite it as a saddle-point problem. Leveraging the literature on variational Bayes (VB), we propose a relaxation of the original problem. This novel methodology results in an efficient algorithm that not only performs well but is also provably convergent. Furthermore, we provide theoretical results on the statistical consistency of the resulting decision with the optimal Bayesian decision. On real data, our proposal significantly improves the speed and scalability of portfolio selection. We benchmark our results against state-of-the-art algorithms, as well as a Monte Carlo algorithm targeting the optimal decision.
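
The decision-theoretic starting point alluded to above can be written generically as follows (notation is ours; the specific utility and its saddle-point rewriting are the paper's contribution):

```latex
% Generic Bayesian decision-theoretic portfolio objective: weights w,
% market parameters \theta, observed data \mathcal{D}, utility U.
\[
  w^{\star} \;=\; \operatorname*{arg\,max}_{w \in \mathcal{W}}\;
  \mathbb{E}_{p(\theta \mid \mathcal{D})}\big[\, U(w, \theta) \,\big].
\]
```

The intractability comes from the posterior expectation, which is exactly where the variational Bayes relaxation enters.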


Online Decision Deferral under Budget Constraints

arXiv.org Artificial Intelligence

Machine Learning (ML) models are increasingly used to support or substitute for decision making. In applications where skilled experts are a limited resource, it is crucial to reduce their burden and automate decisions when an ML model's decisions are of at least equal quality. However, models are often pre-trained and fixed, while tasks arrive sequentially and their distribution may shift. In that case, the respective performance of the decision makers may change, and the deferral algorithm must remain adaptive. We propose a contextual bandit model of this online decision-making problem. Our framework includes budget constraints and different types of partial feedback models. Beyond the theoretical guarantees of our algorithm, we propose efficient extensions that achieve remarkable performance on real-world datasets.
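
For intuition only, a toy deferral rule under a budget might look like the sketch below; the paper's contribution is a contextual bandit algorithm that learns the value of each option online from partial feedback, rather than a fixed threshold like this one:

```python
def route_task(context, model_confidence, budget_left, threshold=0.7):
    # Toy deferral rule, for illustration only (not the paper's
    # algorithm): defer to the scarce expert when the model is unsure
    # and budget remains. The paper's contextual bandit instead learns
    # the value of each "arm" (model vs. expert) online from partial
    # feedback, subject to the budget constraint.
    if model_confidence(context) < threshold and budget_left > 0:
        return "expert", budget_left - 1
    return "model", budget_left

decision, budget = route_task({"difficulty": "high"}, lambda c: 0.4, 10)
print(decision, budget)  # -> expert 9
```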


A Pontryagin Perspective on Reinforcement Learning

arXiv.org Artificial Intelligence

Reinforcement learning has traditionally focused on learning state-dependent policies to solve optimal control problems in a closed-loop fashion. In this work, we introduce the paradigm of open-loop reinforcement learning where a fixed action sequence is learned instead. We present three new algorithms: one robust model-based method and two sample-efficient model-free methods. Rather than basing our algorithms on Bellman's equation from dynamic programming, our work builds on Pontryagin's principle from the theory of open-loop optimal control. We provide convergence guarantees and evaluate all methods empirically on a pendulum swing-up task, as well as on two high-dimensional MuJoCo tasks, demonstrating remarkable performance compared to existing baselines.
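
For readers unfamiliar with the open-loop viewpoint, the standard discrete-time adjoint recursion behind Pontryagin-style gradient computation is sketched below (notation is ours and is not taken from the paper):

```latex
% Dynamics x_{t+1} = f(x_t, u_t), return J = \sum_{t=0}^{T} r(x_t, u_t).
% Backward (costate) recursion, with \lambda_{T+1} = 0:
\[
  \lambda_t \;=\; \nabla_x r(x_t, u_t)
  \;+\; \Big( \tfrac{\partial f}{\partial x}(x_t, u_t) \Big)^{\!\top} \lambda_{t+1},
\]
% Gradient of the return with respect to each action in the sequence:
\[
  \nabla_{u_t} J \;=\; \nabla_u r(x_t, u_t)
  \;+\; \Big( \tfrac{\partial f}{\partial u}(x_t, u_t) \Big)^{\!\top} \lambda_{t+1}.
\]
% Gradient ascent on (u_0, ..., u_T) then improves the open-loop
% action sequence without ever learning a state-dependent policy.
```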


Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm Identification in Structured Bandits

arXiv.org Artificial Intelligence

Best arm identification (BAI) addresses the challenge of finding the optimal arm in a bandit environment (Lattimore and Szepesvári, 2020), with wide-ranging applications in online advertising, drug discovery or hyperparameter tuning. BAI is commonly approached through two primary paradigms: fixed-confidence and fixed-budget. In the fixed-confidence setting (Even-Dar et al., 2006; Kaufmann et al., 2016), the objective is to find the optimal arm with a pre-specified confidence level. Conversely, fixed-budget BAI (Audibert et al., 2010; Karnin et al., 2013; Carpentier and Locatelli, 2016) involves identifying the optimal arm within a fixed number of observations. Within this fixed-budget context, two main metrics are used: the probability of error (PoE) (Audibert et al., 2010; Karnin et al., 2013; Carpentier and Locatelli, 2016)--the likelihood of incorrectly identifying the optimal arm--and the simple regret (Bubeck et al., 2009; Russo, 2016; Komiyama et al., 2023)--the expected performance disparity between the chosen and the optimal arm.
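
In symbols, with \mu_a the mean reward of arm a, a^\star the optimal arm, and \hat{A} the arm recommended once the budget is exhausted, the two metrics read:

```latex
\[
  \mathrm{PoE} \;=\; \mathbb{P}\big( \hat{A} \neq a^{\star} \big),
  \qquad
  \mathrm{SimpleRegret} \;=\; \mathbb{E}\big[ \mu_{a^{\star}} - \mu_{\hat{A}} \big].
\]
```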


Beyond Average Return in Markov Decision Processes

arXiv.org Artificial Intelligence

What are the functionals of the reward that can be computed and optimized exactly in Markov Decision Processes? In the finite-horizon, undiscounted setting, Dynamic Programming (DP) can only handle these operations efficiently for certain classes of statistics. We summarize the characterization of these classes for policy evaluation, and give a new answer for the planning problem. Interestingly, we prove that only generalized means can be optimized exactly, even in the more general framework of Distributional Reinforcement Learning (DistRL). DistRL does, however, make it possible to evaluate other functionals approximately. We provide error bounds on the resulting estimators, and discuss the potential of this approach as well as its limitations. These results advance the theory of Markov Decision Processes by examining overall characteristics of the return, with particular relevance to risk-conscious strategies.
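
For concreteness, generalized means are functionals of the return R of the following standard form:

```latex
% Generalized mean of the return R, for a continuous strictly
% monotone function f:
\[
  M_f(R) \;=\; f^{-1}\!\big( \mathbb{E}\big[ f(R) \big] \big).
\]
% f(x) = x recovers the expected return; f(x) = e^{\beta x} recovers
% the entropic risk measure discussed in the entry above.
```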


EigenGame Unloaded: When playing games is better than optimizing

arXiv.org Artificial Intelligence

We build on the recently proposed EigenGame, which views eigendecomposition as a competitive game. EigenGame's updates are biased when computed from minibatches of data, which hinders convergence and more sophisticated parallelism in the stochastic setting. In this work, we propose an unbiased stochastic update that is asymptotically equivalent to EigenGame, enjoys greater parallelism, allowing computation on datasets of larger sample sizes, and outperforms EigenGame in experiments. We present applications to finding the principal components of massive datasets and performing spectral clustering of graphs. We analyze and discuss our proposed update in the context of EigenGame and the shift in perspective from optimization to games.
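
A minimal sketch of the kind of minibatch update at play, under our reading that unbiasedness comes from making every term linear in the minibatch covariance estimate; the exact utilities, step sizes, and parallel scheme in the paper differ:

```python
import numpy as np

def unloaded_step(V, X_batch, lr=0.1):
    # One minibatch step in the spirit of EigenGame Unloaded (our
    # sketch): column v_i ascends M_hat v_i minus penalties from its
    # "parents" j < i. Every term is linear in M_hat, so an unbiased
    # minibatch covariance gives an unbiased update, unlike the
    # ratio-of-expectations terms in the original EigenGame.
    M_hat = X_batch.T @ X_batch / X_batch.shape[0]
    grads = np.empty_like(V)
    for i in range(V.shape[1]):
        g = M_hat @ V[:, i]
        for j in range(i):
            g -= (V[:, j] @ M_hat @ V[:, i]) * V[:, j]
        grads[:, i] = g
    V = V + lr * grads
    return V / np.linalg.norm(V, axis=0)  # retract columns to unit norm

rng = np.random.default_rng(0)
V = np.linalg.qr(rng.normal(size=(5, 2)))[0]  # two eigenvector guesses
for _ in range(300):
    V = unloaded_step(V, rng.normal(size=(64, 5)) * np.array([3, 2, 1, 1, 1]))
```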


Asymptotically Optimal Information-Directed Sampling

arXiv.org Machine Learning

We introduce a computationally efficient algorithm for finite stochastic linear bandits. The approach is based on the frequentist information-directed sampling (IDS) framework, with an information gain potential derived directly from the asymptotic regret lower bound. We establish frequentist regret bounds which show that the proposed algorithm is both asymptotically optimal and worst-case rate optimal in finite time. Our analysis sheds light on how IDS trades off regret and information to incrementally solve the semi-infinite concave program that defines the optimal asymptotic regret. Along the way, we uncover interesting connections to a recently proposed two-player game approach and to the Bayesian IDS algorithm.
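
Generic frequentist IDS selects, at each round, the action distribution minimizing a squared-gap-to-information ratio of the following form (our notation; the paper's specific information gain I_t is what is derived from the asymptotic lower bound):

```latex
% Frequentist IDS action distribution at round t, over actions a with
% estimated suboptimality gaps \hat{\Delta}_t(a) and information
% gains I_t(a):
\[
  \pi_t \;\in\; \operatorname*{arg\,min}_{\pi \in \Delta(\mathcal{A})}\;
  \frac{\big( \sum_{a} \pi(a)\, \hat{\Delta}_t(a) \big)^{2}}
       {\sum_{a} \pi(a)\, I_t(a)} .
\]
```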