AITopics | Thome, Nicolas

Collaborating Authors

Thome, Nicolas

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

VIPER: Visual Perception and Explainable Reasoning for Sequential Decision-Making

Aissi, Mohamed Salim, Grislain, Clemence, Chetouani, Mohamed, Sigaud, Olivier, Soulier, Laure, Thome, Nicolas

arXiv.org Artificial IntelligenceMar-19-2025

While Large Language Models (LLMs) excel at reasoning on text and Vision-Language Models (VLMs) are highly effective for visual perception, applying those models for visual instruction-based planning remains a widely open problem. In this paper, we introduce VIPER, a novel framework for multimodal instruction-based planning that integrates VLM-based perception with LLM-based reasoning. Our approach uses a modular pipeline where a frozen VLM generates textual descriptions of image observations, which are then processed by an LLM policy to predict actions based on the task goal. We fine-tune the reasoning module using behavioral cloning and reinforcement learning, improving our agent's decision-making capabilities. Experiments on the ALFWorld benchmark show that VIPER significantly outperforms state-of-the-art visual instruction-based planners while narrowing the gap with purely text-based oracles. By leveraging text as an intermediate representation, VIPER also enhances explainability, paving the way for a fine-grained analysis of perception and reasoning components.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.15108

Country: Europe > France (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Supra-Laplacian Encoding for Transformer on Dynamic Graphs

Karmim, Yannis, Lafon, Marc, S'niehotta, Raphael Fournier, Thome, Nicolas

arXiv.org Artificial IntelligenceNov-15-2024

Fully connected Graph Transformers (GT) have rapidly become prominent in the static graph community as an alternative to Message-Passing models, which suffer from a lack of expressivity, oversquashing, and under-reaching. However, in a dynamic context, by interconnecting all nodes at multiple snapshots with self-attention, GT loose both structural and temporal information. In this work, we introduce Supra-LAplacian encoding for spatio-temporal TransformErs (SLATE), a new spatio-temporal encoding to leverage the GT architecture while keeping spatio-temporal information. Specifically, we transform Discrete Time Dynamic Graphs into multi-layer graphs and take advantage of the spectral properties of their associated supra-Laplacian matrix. Our second contribution explicitly model nodes' pairwise relationships with a cross-attention mechanism, providing an accurate edge representation for dynamic link prediction. SLATE outperforms numerous state-of-the-art methods based on Message-Passing Graph Neural Networks combined with recurrent models (e.g LSTM), and Dynamic Graph Transformers, on 9 datasets. Code is available at: github.com/ykrmm/SLATE.

data mining, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2409.17986

Country:

North America > United States (0.92)
Europe (0.68)

Genre:

Research Report > Experimental Study (0.93)
Research Report > Promising Solution (0.87)
Research Report > New Finding (0.67)

Industry: Government (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting

Aissi, Mohamed Salim, Romac, Clement, Carta, Thomas, Lamprier, Sylvain, Oudeyer, Pierre-Yves, Sigaud, Olivier, Soulier, Laure, Thome, Nicolas

arXiv.org Artificial IntelligenceOct-29-2024

Reinforcement learning (RL) is a promising approach for aligning large language models (LLMs) knowledge with sequential decision-making tasks. However, few studies have thoroughly investigated the impact on LLM agents capabilities of fine-tuning them with RL in a specific environment. In this paper, we propose a novel framework to analyze the sensitivity of LLMs to prompt formulations following RL training in a textual environment. Our findings reveal that the performance of LLMs degrades when faced with prompt formulations different from those used during the RL training phase. Besides, we analyze the source of this sensitivity by examining the model's internal representations and salient tokens. Finally, we propose to use a contrastive loss to mitigate this sensitivity and improve the robustness and generalization capabilities of LLMs.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.1992

Country: Europe > France (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Temporal receptive field in dynamic graph learning: A comprehensive analysis

Karmim, Yannis, Yang, Leshanshui, S'Niehotta, Raphaël Fournier, Chatelain, Clément, Adam, Sébastien, Thome, Nicolas

arXiv.org Artificial IntelligenceJul-17-2024

Dynamic link prediction is a critical task in the analysis of evolving networks, with applications ranging from recommender systems to economic exchanges. However, the concept of the temporal receptive field, which refers to the temporal context that models use for making predictions, has been largely overlooked and insufficiently analyzed in existing research. In this study, we present a comprehensive analysis of the temporal receptive field in dynamic graph learning. By examining multiple datasets and models, we formalize the role of temporal receptive field and highlight their crucial influence on predictive accuracy. Our results demonstrate that appropriately chosen temporal receptive field can significantly enhance model performance, while for some models, overly large windows may introduce noise and reduce accuracy. We conduct extensive benchmarking to validate our findings, ensuring that all experiments are fully reproducible.

information management, machine learning, temporal receptive field, (3 more...)

arXiv.org Artificial Intelligence

2407.1237

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.89)

Add feedback

ITEM: Improving Training and Evaluation of Message-Passing based GNNs for top-k recommendation

Karmim, Yannis, Ramzi, Elias, Fournier-S'niehotta, Raphaël, Thome, Nicolas

arXiv.org Artificial IntelligenceJul-3-2024

Graph Neural Networks (GNNs), especially message-passing-based models, have become prominent in top-k recommendation tasks, outperforming matrix factorization models due to their ability to efficiently aggregate information from a broader context. Although GNNs are evaluated with ranking-based metrics, e.g. NDCG@k and Recall@k, they remain largely trained with proxy losses, e.g. the BPR loss. In this work we explore the use of ranking loss functions to directly optimize the evaluation metrics, an area not extensively investigated in the GNN community for collaborative filtering. We take advantage of smooth approximations of the rank to facilitate end-to-end training of GNNs and propose a Personalized PageRank-based negative sampling strategy tailored for ranking loss functions. Moreover, we extend the evaluation of GNN models for top-k recommendation tasks with an inductive user-centric protocol, providing a more accurate reflection of real-world applications. Our proposed method significantly outperforms the standard BPR loss and more advanced losses across four datasets and four recent GNN architectures while also exhibiting faster training.

artificial intelligence, deep learning, ndcg, (17 more...)

arXiv.org Artificial Intelligence

2407.07912

Country: Europe > France (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Physics-Informed Model and Hybrid Planning for Efficient Dyna-Style Reinforcement Learning

Asri, Zakariae El, Sigaud, Olivier, Thome, Nicolas

arXiv.org Artificial IntelligenceJul-2-2024

Applying reinforcement learning (RL) to real-world applications requires addressing a trade-off between asymptotic performance, sample efficiency, and inference time. In this work, we demonstrate how to address this triple challenge by leveraging partial physical knowledge about the system dynamics. Our approach involves learning a physics-informed model to boost sample efficiency and generating imaginary trajectories from this model to learn a model-free policy and Q-function. Furthermore, we propose a hybrid planning strategy, combining the learned policy and Q-function with the learned model to enhance time efficiency in planning. Through practical demonstrations, we illustrate that our method improves the compromise between sample efficiency, time efficiency, and performance over state-of-the-art methods. Code is available at https://github.com/elasriz/PHIHP/

efficiency, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2407.02217

Country: Europe > France (0.14)

Genre:

Research Report > Promising Solution (0.48)
Research Report > Experimental Study (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Energy Correction Model in the Feature Space for Out-of-Distribution Detection

Lafon, Marc, Rambour, Clément, Thome, Nicolas

arXiv.org Artificial IntelligenceMar-15-2024

In this work, we study the out-of-distribution (OOD) detection problem through the use of the feature space of a pre-trained deep classifier. We show that learning the density of in-distribution (ID) features with an energy-based models (EBM) leads to competitive detection results. However, we found that the non-mixing of MCMC sampling during the EBM's training undermines its detection performance. To overcome this an energy-based correction of a mixture of class-conditional Gaussian distributions. We obtains favorable results when compared to a strong baseline like the KNN detector on the CIFAR-10/CIFAR-100 OOD detection benchmarks.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2403.10403

Country:

North America > United States > New York (0.14)
North America > United States > Maryland (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.50)

Add feedback

Eagle: Large-Scale Learning of Turbulent Fluid Dynamics with Mesh Transformers

Janny, Steeven, Béneteau, Aurélien, Nadri, Madiha, Digne, Julie, Thome, Nicolas, Wolf, Christian

arXiv.org Artificial IntelligenceMar-17-2023

Estimating fluid dynamics is classically done through the simulation and integration of numerical models solving the Navier-Stokes equations, which is computationally complex and time-consuming even on high-end hardware. This is a notoriously hard problem to solve, which has recently been addressed with machine learning, in particular graph neural networks (GNN) and variants trained and evaluated on datasets of static objects in static scenes with fixed geometry. We attempt to go beyond existing work in complexity and introduce a new model, method and benchmark. We propose EAGLE, a large-scale dataset of 1.1 million 2D meshes resulting from simulations of unsteady fluid dynamics caused by a moving flow source interacting with nonlinear scene structure, comprised of 600 different scenes of three different types. To perform future forecasting of pressure and velocity on the challenging EAGLE dataset, we introduce a new mesh transformer. It leverages node clustering, graph pooling and global attention to learn long-range dependencies between spatially distant data points without needing a large number of iterations, as existing GNN methods do. We show that our transformer outperforms state-of-the-art performance on, both, existing synthetic and real datasets and on EAGLE. Finally, we highlight that our approach learns to attend to airflow, integrating complex information in a single iteration.

artificial intelligence, machine learning, turbulent fluid dynamic, (3 more...)

arXiv.org Artificial Intelligence

2302.10803

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)

Add feedback

Diverse Probabilistic Trajectory Forecasting with Admissibility Constraints

Calem, Laura, Ben-Younes, Hedi, Pérez, Patrick, Thome, Nicolas

arXiv.org Artificial IntelligenceFeb-7-2023

Predicting multiple trajectories for road users is important for automated driving systems: ego-vehicle motion planning indeed requires a clear view of the possible motions of the surrounding agents. However, the generative models used for multiple-trajectory forecasting suffer from a lack of diversity in their proposals. To avoid this form of collapse, we propose a novel method for structured prediction of diverse trajectories. To this end, we complement an underlying pretrained generative model with a diversity component, based on a determinantal point process (DPP). We balance and structure this diversity with the inclusion of knowledge-based quality constraints, independent from the underlying generative model. We combine these two novel components with a gating operation, ensuring that the predictions are both diverse and within the drivable area. We demonstrate on the nuScenes driving dataset the relevance of our compound approach, which yields significant improvements in the diversity and the quality of the generated trajectories.

artificial intelligence, machine learning, trajectory, (19 more...)

arXiv.org Artificial Intelligence

2302.03462

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (0.48)
Automobiles & Trucks (0.48)
Information Technology > Robotics & Automation (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.66)

Add feedback

Take One Gram of Neural Features, Get Enhanced Group Robustness

Roburin, Simon, Corbière, Charles, Puy, Gilles, Thome, Nicolas, Aubry, Matthieu, Marlet, Renaud, Pérez, Patrick

arXiv.org Artificial IntelligenceFeb-7-2023

Predictive performance of machine learning models trained with empirical risk minimization (ERM) can degrade considerably under distribution shifts. The presence of spurious correlations in training datasets leads ERM-trained models to display high loss when evaluated on minority groups not presenting such correlations. Extensive attempts have been made to develop methods improving worst-group robustness. However, they require group information for each training input or at least, a validation set with group labels to tune their hyperparameters, which may be expensive to get or unknown a priori. In this paper, we address the challenge of improving group robustness without group annotation during training or validation. To this end, we propose to partition the training dataset into groups based on Gram matrices of features extracted by an ``identification'' model and to apply robust optimization based on these pseudo-groups. In the realistic context where no group labels are available, our experiments show that our approach not only improves group robustness over ERM but also outperforms all recent baselines

artificial intelligence, dataset, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2208.12625

Country: Europe > France (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Add feedback