Preux, Philippe
Evaluating Interpretable Reinforcement Learning by Distilling Policies into Programs
Kohler, Hector, Delfosse, Quentin, Radji, Waris, Akrour, Riad, Preux, Philippe
There exist applications of reinforcement learning, such as medicine, where policies need to be ''interpretable'' by humans. User studies have shown that some policy classes might be more interpretable than others. However, conducting human studies of policy interpretability is costly. Furthermore, there is no clear definition of policy interpretability, i.e., no agreed-upon metrics, so claims depend on the chosen definition. We tackle the problem of empirically evaluating policy interpretability without humans. Despite this lack of a clear definition, researchers agree on the notion of ''simulatability'': policy interpretability should relate to how well humans understand policy actions given states. To advance research in interpretable reinforcement learning, we contribute a new methodology to evaluate policy interpretability. This methodology relies on proxies for simulatability, which we use to conduct a large-scale empirical evaluation of policy interpretability. We use imitation learning to compute baseline policies by distilling expert neural networks into small programs. We then show that using our methodology to evaluate the baselines' interpretability leads to conclusions similar to those of user studies. We show that increasing interpretability does not necessarily reduce performance and can sometimes increase it. We also show that no policy class better trades off interpretability and performance across all tasks, making it necessary for researchers to have methodologies for comparing policy interpretability.
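As an illustration of the distillation step described above, here is a minimal sketch, assuming a Gymnasium-style environment and a generic `expert` callable (both hypothetical; the paper's exact procedure and program class are not reproduced here): an expert neural policy is rolled out to collect state-action pairs, and a small decision tree is fit on them by behavioral cloning. Capping the tree depth is one simple proxy for program size, and hence interpretability.

```python
# Illustrative sketch only: `expert` and `env` are hypothetical placeholders.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def distill_expert_to_tree(expert, env, n_episodes=100, max_depth=4):
    """Collect (state, action) pairs from the expert, then fit a small tree."""
    states, actions = [], []
    for _ in range(n_episodes):
        obs, _ = env.reset()
        done = False
        while not done:
            a = expert(obs)                  # expert: neural network policy
            states.append(obs)
            actions.append(a)
            obs, _, terminated, truncated, _ = env.step(a)
            done = terminated or truncated
    tree = DecisionTreeClassifier(max_depth=max_depth)  # depth caps program size
    tree.fit(np.array(states), np.array(actions))
    print(export_text(tree))                 # the distilled policy as readable rules
    return tree
```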
IDEQ: an improved diffusion model for the TSP
Basson, Mickael, Preux, Philippe
Recent years have seen a surge in machine learning models for solving combinatorial optimization (CO) problems. Combinatorial optimization is a long-standing field of research and application in computer science, and after decades of progress, very efficient algorithms exist that provide exact or approximate solutions to many CO problems. The Traveling Salesman Problem (TSP) stands as a prominent example: the TSP is very appealing because it is simple to understand, it has a wide range of applications, rather large instances can be solved exactly on a mere laptop (an instance defined over a few thousand cities can be solved within one hour), and approximate algorithms find tours that are very close to optimality (LKH3 [4] can solve instances of 40,000 cities in about one hour on a laptop, although without any guarantee that the result is optimal). These facts cannot be ignored when proposing alternative approaches to solve the TSP. Because of its appeal, the TSP has also drawn the attention of deep neural network researchers in recent years. While the first attempts struggled to solve even small TSP instances of a dozen cities, progress has been made. In this paper, we build on these previous works and go a step further.
Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning
Kohler, Hector, Delfosse, Quentin, Akrour, Riad, Kersting, Kristian, Preux, Philippe
Deep reinforcement learning agents are prone to goal misalignments. The black-box nature of their policies hinders the detection and correction of such misalignments, as well as the trust necessary for real-world deployment. So far, methods for learning interpretable policies are inefficient or require many human priors. We propose INTERPRETER, a fast distillation method producing INTerpretable Editable tRee Programs for ReinforcEmenT lEaRning. We empirically demonstrate that INTERPRETER's compact tree programs match oracles across a diverse set of sequential decision tasks, and we evaluate the impact of our design choices on interpretability and performance. We show that our policies can be interpreted and edited to correct misalignments on Atari games and to explain real farming strategies.
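One way to obtain an editable program from a distilled tree is to print it as plain Python source. The sketch below, assuming a fitted scikit-learn classifier (illustrative only, not INTERPRETER's actual code generator), renders the tree as an if/else policy that a user can read and hand-edit to correct misalignments:

```python
# Hedged sketch: render a fitted decision tree as an editable if/else program,
# in the spirit of (but not identical to) INTERPRETER's tree programs.
from sklearn.tree import _tree

def tree_to_program(tree, feature_names, action_names):
    """Emit the tree as plain Python source a user can read and edit."""
    t = tree.tree_
    lines = ["def policy(obs):"]
    def recurse(node, depth):
        pad = "    " * depth
        if t.feature[node] != _tree.TREE_UNDEFINED:
            name = feature_names[t.feature[node]]
            thr = t.threshold[node]
            lines.append(f"{pad}if obs['{name}'] <= {thr:.3f}:")
            recurse(t.children_left[node], depth + 1)
            lines.append(f"{pad}else:")
            recurse(t.children_right[node], depth + 1)
        else:
            action = action_names[t.value[node].argmax()]
            lines.append(f"{pad}return '{action}'")
    recurse(0, 1)
    return "\n".join(lines)
```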
Towards a Research Community in Interpretable Reinforcement Learning: the InterpPol Workshop
Kohler, Hector, Delfosse, Quentin, Festor, Paul, Preux, Philippe
Embracing the pursuit of intrinsically explainable reinforcement learning raises crucial questions: what distinguishes explainability from interpretability? Should explainable and interpretable agents be developed outside of domains where transparency is imperative? What advantages do interpretable policies offer over neural networks? How can we rigorously define and measure interpretability in policies without user studies? Which reinforcement learning paradigms are best suited to developing interpretable agents? Can Markov Decision Processes integrate interpretable state representations? In addition to motivating an interpretable RL community centered around these questions, we propose the first venue dedicated to interpretable RL: the InterpPol Workshop.
PAQA: Toward ProActive Open-Retrieval Question Answering
Erbacher, Pierre, Nie, Jian-Yun, Preux, Philippe, Soulier, Laure
Conversational systems have made significant progress in generating natural language responses. However, their potential as conversational search systems is currently limited due to their passive role in the information-seeking process. One major limitation is the scarcity of datasets that provide labelled ambiguous questions along with a supporting corpus of documents and relevant clarifying questions. This work aims to tackle the challenge of generating relevant clarifying questions by taking into account the inherent ambiguities present in both user queries and documents. To achieve this, we propose PAQA, an extension to the existing AmbiNQ dataset, incorporating clarifying questions. We then evaluate various models and assess how passage retrieval impacts ambiguity detection and the generation of clarifying questions. By addressing this gap in conversational search systems, we aim to provide additional supervision to enhance their active participation in the information-seeking process and offer users more accurate results.
Decision Tree Search as a Markov Decision Problem
Kohler, Hector, Akrour, Riad, Preux, Philippe
Finding an optimal decision tree for a supervised learning task is a challenging combinatorial problem to solve at scale. It was recently proposed to frame the problem as a Markov Decision Problem (MDP) and use deep reinforcement learning to tackle scaling. Unfortunately, these methods are not competitive with the current branch-and-bound state of the art. We propose instead to scale the resolution of such MDPs using an information-theoretic test-generating function that heuristically, and dynamically for every state, limits the set of admissible test actions to a few good candidates. As a solver, we show empirically that our algorithm is at the very least competitive with branch-and-bound alternatives. As a machine learning tool, a key advantage of our approach is that it solves for multiple complexity-performance trade-offs at virtually no additional cost. With such a set of solutions, a user can select the tree that generalizes best and whose interpretability level best suits their needs, which no current branch-and-bound method allows.
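To convey the idea of the test-generating function, here is a simplified sketch, assuming entropy-based information gain as the scoring heuristic (the paper's actual function may differ): in each MDP state, every (feature, threshold) split is scored on the data reaching that state, and only the top-k become admissible actions.

```python
# Minimal sketch of restricting MDP actions to a few high-information-gain
# tests per state; not the paper's exact heuristic.
import numpy as np

def entropy(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def top_k_tests(X, y, k=5):
    """Score every (feature, threshold) split by information gain, keep the best k."""
    candidates = []
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j])[:-1]:   # thresholds keep both sides non-empty
            left = y[X[:, j] <= thr]
            right = y[X[:, j] > thr]
            gain = entropy(y) - (len(left) * entropy(left)
                                 + len(right) * entropy(right)) / len(y)
            candidates.append((gain, j, thr))
    candidates.sort(reverse=True)
    return candidates[:k]   # the only admissible test actions in this state
```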
Limits of Actor-Critic Algorithms for Decision Tree Policies Learning in IBMDPs
Kohler, Hector, Akrour, Riad, Preux, Philippe
Interpretability of AI models allows for user safety checks to build trust in such AIs. In particular, Decision Trees (DTs) provide a global view of the learned model and transparently reveal which features of the input are critical for making a decision. However, interpretability is hindered if the DT is too large. To learn compact trees, a recent Reinforcement Learning (RL) framework has been proposed to explore the space of DTs using deep RL. This framework augments a decision problem (e.g., a supervised classification task) with additional actions that gather information about the features of an otherwise hidden input. By appropriately penalizing these actions, the agent learns to optimally trade off the size and performance of DTs. In practice, however, a reactive policy for a partially observable Markov decision process (MDP) needs to be learned, which remains an open problem. We show in this paper that deep RL can fail even on simple toy tasks of this class. However, when the underlying decision problem is a supervised classification task, we show that finding the optimal tree can be cast as a fully observable Markov decision problem and solved efficiently, giving rise to a new family of algorithms for learning DTs that go beyond the classical greedy maximization ones.
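A toy rendition of the augmented decision problem may help, under the assumption of a simple classification task (class and variable names are hypothetical, and the actual framework tracks feature bounds rather than raw values): each episode hides one input; the agent either pays a penalty zeta to reveal a feature or commits to a label.

```python
# Toy sketch of an information-gathering augmentation of a classification
# task; hypothetical and simplified relative to the actual framework.
import numpy as np

class InfoGatherEnv:
    def __init__(self, X, y, zeta=0.01):
        self.X, self.y, self.zeta = X, y, zeta

    def reset(self):
        self.i = np.random.randint(len(self.X))
        self.revealed = np.full(self.X.shape[1], np.nan)  # all features hidden
        return self.revealed.copy()

    def step(self, action):
        n_features = self.X.shape[1]
        if action < n_features:                    # info-gathering action
            self.revealed[action] = self.X[self.i, action]
            return self.revealed.copy(), -self.zeta, False
        label = action - n_features                # classification action
        reward = 1.0 if label == self.y[self.i] else 0.0
        return self.revealed.copy(), reward, True
```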
Reinforcement-learning robotic sailboats: simulator and preliminary results
Vasconcellos, Eduardo Charles, Sampaio, Ronald M, Araújo, André P D, Clua, Esteban Walter Gonzales, Preux, Philippe, Guerra, Raphael, Gonçalves, Luiz M G, Martí, Luis, Lira, Hernan, Sanchez-Pi, Nayat
This work focuses on the main challenges and problems in developing a virtual oceanic environment that reproduces real experiments using Unmanned Surface Vehicle (USV) digital twins. We introduce the key features for building virtual worlds, considering the use of Reinforcement Learning (RL) agents for autonomous navigation and control. With this in mind, the main problems concern the definition of the simulation equations (physics and mathematics), their effective implementation, and how to include strategies for simulated control and perception (sensors) to be used with RL. We present the modeling and implementation steps, and the challenges involved in creating a functional digital twin based on a real robotic sailing vessel. The application is immediate for developing RL-based navigation algorithms to be applied on real boats.
Development and validation of an interpretable machine learning-based calculator for predicting 5-year weight trajectories after bariatric surgery: a multinational retrospective cohort SOPHIA study
Saux, Patrick, Bauvin, Pierre, Raverdy, Violeta, Teigny, Julien, Verkindt, Hélène, Soumphonphakdy, Tomy, Debert, Maxence, Jacobs, Anne, Jacobs, Daan, Monpellier, Valerie, Lee, Phong Ching, Lim, Chin Hong, Andersson-Assarsson, Johanna C, Carlsson, Lena, Svensson, Per-Arne, Galtier, Florence, Dezfoulian, Guelareh, Moldovanu, Mihaela, Andrieux, Severine, Couster, Julien, Lepage, Marie, Lembo, Erminia, Verrastro, Ornella, Robert, Maud, Salminen, Paulina, Mingrone, Geltrude, Peterli, Ralph, Cohen, Ricardo V, Zerrweck, Carlos, Nocca, David, Roux, Carel W Le, Caiazzo, Robert, Preux, Philippe, Pattou, François
Background Weight loss trajectories after bariatric surgery vary widely between individuals, and predicting weight loss before the operation remains challenging. We aimed to develop a model using machine learning to provide individual preoperative prediction of 5-year weight loss trajectories after surgery. Methods In this multinational retrospective observational study we enrolled adult participants (aged ≥18 years) from ten prospective cohorts (including ABOS [NCT01129297], BAREVAL [NCT02310178], the Swedish Obese Subjects study, and a large cohort from the Dutch Obesity Clinic [Nederlandse Obesitas Kliniek]) and two randomised trials (SleevePass [NCT00793143] and SM-BOSS [NCT00356213]) in Europe, the Americas, and Asia, with a 5-year follow-up after Roux-en-Y gastric bypass, sleeve gastrectomy, or gastric band. Patients with a previous history of bariatric surgery or large delays between scheduled and actual visits were excluded. The training cohort comprised patients from two centres in France (ABOS and BAREVAL). The primary outcome was BMI at 5 years. A model was developed using least absolute shrinkage and selection operator to select variables and the classification and regression trees algorithm to build interpretable regression trees. The performances of the model were assessed through the median absolute deviation (MAD) and root mean squared error (RMSE) of BMI. Findings 10 231 patients from 12 centres in ten countries were included in the analysis, corresponding to 30 602 patient-years. Among participants in all 12 cohorts, 7701 (75·3%) were female and 2530 (24·7%) were male. Among 434 baseline attributes available in the training cohort, seven variables were selected: height, weight, intervention type, age, diabetes status, diabetes duration, and smoking status. At 5 years, across external testing cohorts the overall mean MAD BMI was 2·8 kg/m² (95% CI 2·6-3·0) and mean RMSE BMI was 4·7 kg/m² (4·4-5·0), and the mean difference between predicted and observed BMI was −0·3 kg/m² (SD 4·7). This model is incorporated in an easy-to-use and interpretable web-based prediction tool to help inform clinical decisions before surgery. Interpretation We developed a machine learning-based model, which is internationally validated, for predicting individual 5-year weight loss trajectories after three common bariatric interventions.
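For readers who want the modelling pipeline in code, here is a hedged sketch with scikit-learn, assuming tabular arrays in place of the actual cohort data (hypothetical variables; not the study's code): LASSO selects a sparse set of variables, a CART regression tree is fit on them, and MAD and RMSE are computed on an external test set.

```python
# Hedged sketch of the LASSO-then-CART pipeline described above;
# X_train, y_train, X_test, y_test are hypothetical numpy arrays.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.tree import DecisionTreeRegressor

def fit_weight_trajectory_model(X_train, y_train, max_depth=4):
    lasso = LassoCV(cv=5).fit(X_train, y_train)        # sparse variable selection
    selected = np.flatnonzero(lasso.coef_ != 0)
    tree = DecisionTreeRegressor(max_depth=max_depth)  # interpretable CART model
    tree.fit(X_train[:, selected], y_train)
    return tree, selected

def evaluate(tree, selected, X_test, y_test):
    pred = tree.predict(X_test[:, selected])
    mad = np.median(np.abs(pred - y_test))             # median absolute deviation
    rmse = np.sqrt(np.mean((pred - y_test) ** 2))      # root mean squared error
    return mad, rmse
```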
AdaStop: sequential testing for efficient and reliable comparisons of Deep RL Agents
Mathieu, Timothée, Della Vecchia, Riccardo, Shilova, Alena, de Medeiros, Matheus Centa, Kohler, Hector, Maillard, Odalric-Ambrym, Preux, Philippe
The reproducibility of many experimental results in Deep Reinforcement Learning (RL) is under question. To address this reproducibility crisis, we propose a theoretically sound methodology for comparing multiple Deep RL algorithms. The performance of one execution of a Deep RL algorithm is random, so independent executions are needed to assess it precisely. When comparing several RL algorithms, a major question is how many executions must be made and how we can ensure that the results of such a comparison are theoretically sound. Researchers in Deep RL often use fewer than 5 independent executions to compare algorithms: we claim that this is not enough in general. Moreover, when comparing several algorithms at once, the error of each comparison accumulates and must be taken into account with a multiple-tests procedure to preserve low error guarantees. To address this problem in a statistically sound way, we introduce AdaStop, a new statistical test based on multiple group sequential tests. When comparing algorithms, AdaStop adapts the number of executions to stop as early as possible while ensuring that we have enough information to distinguish algorithms that perform better than the others in a statistically significant way. We prove both theoretically and empirically that AdaStop has a low probability of making an error (family-wise error). Finally, we illustrate the effectiveness of AdaStop in multiple use cases, including toy examples and difficult cases such as Mujoco environments.
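The flavour of the procedure can be conveyed with a simplified group sequential comparison of two agents (a naive alpha-spending rule stands in for AdaStop's actual test; `run_a` and `run_b` are hypothetical functions returning the score of one independent execution): scores arrive in batches, and at each interim look a permutation test decides whether more executions are needed.

```python
# Simplified illustration of group sequential comparison, NOT the exact
# AdaStop test: new executions are requested only while the agents remain
# statistically indistinguishable.
import numpy as np

rng = np.random.default_rng(0)

def perm_pvalue(a, b, n_perm=10_000):
    """Two-sided permutation test on the difference of mean scores."""
    obs = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        count += abs(pooled[:len(a)].mean() - pooled[len(a):].mean()) >= obs
    return count / n_perm

def sequential_compare(run_a, run_b, batch=5, max_batches=10, alpha=0.05):
    a, b = np.array([]), np.array([])
    for k in range(1, max_batches + 1):
        a = np.append(a, [run_a() for _ in range(batch)])  # new executions only
        b = np.append(b, [run_b() for _ in range(batch)])  # while undecided
        # naive alpha-spending across interim looks (AdaStop does this rigorously)
        if perm_pvalue(a, b) < alpha / max_batches:
            return "different", k * batch
    return "indistinguishable", max_batches * batch
```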