AITopics | Schoenauer, Marc

Collaborating Authors

Schoenauer, Marc

Evolutionary Pre-Prompt Optimization for Mathematical Reasoning

Videau, Mathurin, Leite, Alessandro, Schoenauer, Marc, Teytaud, Olivier

arXiv.org Artificial Intelligence Dec-5-2024

Despite their size and complexity, large language models (LLMs) still struggle with multi-step reasoning, particularly in tasks that require arithmetic, logic, or mathematical reasoning [Cobbe et al. 2021; Rae et al. 2021]. To address this limitation, recent work has focused on enhancing the reasoning abilities of LLMs. A significant advance in this direction is the chain-of-thought (CoT) prompting method [Wei et al. 2022b], which guides LLMs to articulate intermediate reasoning steps in a manner akin to human thought processes, leading to more accurate and interpretable solutions. This method has shown substantial improvements on complex tasks, including mathematics and commonsense reasoning [Lu et al. 2022b; Suzgun et al. 2022; Wei et al. 2022b]. The success of CoT prompting has opened new pathways in the design of effective CoT prompts [Fu et al. 2022; Jiang et al. 2023; Kojima et al. 2022; Zhou et al. 2022].

large language model, machine learning, publication date, (17 more...)

arXiv.org Artificial Intelligence

2412.04291

Country:

North America > United States (0.14)
Asia (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(2 more...)

Mixture of Experts in Image Classification: What's the Sweet Spot?

Videau, Mathurin, Leite, Alessandro, Schoenauer, Marc, Teytaud, Olivier

arXiv.org Artificial Intelligence Nov-27-2024

Mixture-of-Experts (MoE) models have shown promising potential for parameter-efficient scaling across various domains. However, the implementation in computer vision remains limited, and often requires large-scale datasets comprising billions of samples. In this study, we investigate the integration of MoE within computer vision models and explore various MoE configurations on open datasets. When introducing MoE layers in image classification, the best results are obtained for models with a moderate number of activated parameters per sample. However, such improvements gradually vanish when the number of parameters per sample increases.

architecture, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2411.18322

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.46)

Evolutionary Retrofitting

Videau, Mathurin, Zameshina, Mariia, Leite, Alessandro, Najman, Laurent, Schoenauer, Marc, Teytaud, Olivier

arXiv.org Artificial Intelligence Oct-15-2024

AfterLearnER (After Learning Evolutionary Retrofitting) consists of applying non-differentiable optimization, including evolutionary methods, to refine fully trained machine learning models. It optimizes a set of carefully chosen parameters or hyperparameters of the model with respect to an actual, exact, and hence possibly non-differentiable error signal, computed on a subset of the standard validation set. The efficiency of AfterLearnER is demonstrated by tackling non-differentiable signals such as threshold-based criteria in depth sensing, the word error rate in speech re-synthesis, image quality in 3D generative adversarial networks (GANs), image generation via Latent Diffusion Models (LDM), the number of kills per life at Doom, computational accuracy or BLEU in code translation, and human appreciations in image synthesis. In some cases, this retrofitting is performed dynamically at inference time by taking into account user inputs. The advantages of AfterLearnER are its versatility (no gradient is needed), the possibility of using non-differentiable feedback including human evaluations, its limited overfitting (supported by a theoretical study), and its anytime behavior. Last but not least, AfterLearnER requires only a minimal amount of feedback, i.e., a few dozen to a few hundred scalars, rather than the tens of thousands needed in most related published works. Compared to fine-tuning (which typically uses the same loss and gradient-based optimization on a smaller but still large dataset at a fine grain), AfterLearnER uses a minimal amount of data on the real objective function without requiring differentiability.
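To make the idea of gradient-free retrofitting concrete, here is a minimal sketch that is not taken from the paper. It assumes a hypothetical frozen model whose scores are already computed, and it tunes a single decision threshold against a non-differentiable error rate using a simple (1+1) evolution strategy with a fixed mutation step. The names `error_rate`, `retrofit_threshold`, and the toy data are illustrative assumptions.

```python
import random

def error_rate(threshold, scores, labels):
    """Non-differentiable signal: fraction of validation items misclassified."""
    wrong = sum((s >= threshold) != bool(y) for s, y in zip(scores, labels))
    return wrong / len(labels)

def retrofit_threshold(scores, labels, start=0.5, sigma=0.2, budget=200, seed=0):
    """(1+1) evolution strategy over one scalar parameter (illustrative only)."""
    rng = random.Random(seed)
    best, best_err = start, error_rate(start, scores, labels)
    for _ in range(budget):
        # Mutate the current best threshold with Gaussian noise.
        candidate = best + rng.gauss(0.0, sigma)
        err = error_rate(candidate, scores, labels)
        # Elitist selection: keep the candidate only if it is no worse,
        # so the best error never increases (anytime behavior).
        if err <= best_err:
            best, best_err = candidate, err
    return best, best_err

# Toy outputs of a hypothetical frozen model on a small validation subset.
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.45, 0.60]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    print("error at default threshold:", error_rate(0.5, scores, labels))
    print("retrofitted (threshold, error):", retrofit_threshold(scores, labels))
```

Each evaluation consumes one scalar of feedback, so the `budget` argument plays the role of the few hundred scalars mentioned in the abstract.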

evolutionary algorithm, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.1133

Country:

North America > United States (0.93)
Europe (0.68)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

NeurIPS 2024 ML4CFD Competition: Harnessing Machine Learning for Computational Fluid Dynamics in Airfoil Design

Yagoubi, Mouadh, Danan, David, Leyli-Abadi, Milad, Brunet, Jean-Patrick, Mazari, Jocelyn Ahmed, Bonnet, Florent, Gmati, Maroua, Farjallah, Asma, Cinnella, Paola, Gallinari, Patrick, Schoenauer, Marc

arXiv.org Artificial Intelligence Jun-30-2024

The integration of machine learning (ML) techniques for addressing intricate physics problems is increasingly recognized as a promising avenue for expediting simulations. However, assessing ML-derived physical models poses a significant challenge for their adoption within industrial contexts. This competition is designed to promote the development of innovative ML approaches for tackling physical challenges, leveraging our recently introduced unified evaluation framework known as Learning Industrial Physical Simulations (LIPS). Building upon the preliminary edition held from November 2023 to March 2024, this iteration centers on a task fundamental to a well-established physical application: airfoil design simulation, utilizing our proposed AirfRANS dataset. The competition evaluates solutions based on various criteria encompassing ML accuracy, computational efficiency, Out-Of-Distribution performance, and adherence to physical principles. Notably, this competition represents a pioneering effort in exploring ML-driven surrogate methods aimed at optimizing the trade-off between computational efficiency and accuracy in physical simulations. Hosted on the Codabench platform, the competition offers online training and evaluation for all participating solutions.

artificial intelligence, machine learning, participant, (15 more...)

arXiv.org Artificial Intelligence

2407.01641

Country: Europe > France (0.29)

Genre: Research Report (0.50)

Industry:

Energy (0.93)
Education > Educational Setting > Online (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Learning Structural Causal Models through Deep Generative Models: Methods, Guarantees, and Challenges

Poinsot, Audrey, Leite, Alessandro, Chesneau, Nicolas, Sébag, Michèle, Schoenauer, Marc

arXiv.org Machine Learning May-8-2024

This paper provides a comprehensive review of deep structural causal models (DSCMs), focusing on their ability to answer counterfactual queries using observational data within known causal structures. It examines the characteristics of DSCMs by analyzing the hypotheses, guarantees, and applications inherent to the underlying deep learning components and structural causal models, fostering a finer understanding of their capabilities and limitations in addressing different counterfactual queries. Furthermore, it highlights the challenges and open questions in the field of deep structural causal modeling. It sets the stage for researchers to identify future work directions and gives practitioners an overview from which to select the most appropriate methods for their needs.

artificial intelligence, conference, machine learning, (18 more...)

arXiv.org Machine Learning

2405.05025

Country: Europe (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.68)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.41)

ML4PhySim: Machine Learning for Physical Simulations Challenge (The airfoil design)

Yagoubi, Mouadh, Leyli-Abadi, Milad, Danan, David, Brunet, Jean-Patrick, Mazari, Jocelyn Ahmed, Bonnet, Florent, Farjallah, Asma, Schoenauer, Marc, Gallinari, Patrick

arXiv.org Artificial Intelligence Mar-3-2024

The use of machine learning (ML) techniques to solve complex physical problems has recently been considered a promising approach. However, the evaluation of such learned physical models remains an important issue for industrial use. The aim of this competition is to encourage the development of new ML techniques to solve physical problems using a recently proposed unified evaluation framework called Learning Industrial Physical Simulations (LIPS). We propose learning a task representing a well-known physical use case: the airfoil design simulation, using a dataset called AirfRANS. The global score calculated for each submitted solution is based on three main categories of criteria covering different aspects, namely: ML-related, Out-Of-Distribution, and physical compliance criteria. To the best of our knowledge, this is the first competition addressing the use of ML-based surrogate approaches to improve the trade-off between computational cost and accuracy of physical simulation. The competition is hosted by the Codabench platform with online training and evaluation of all submitted solutions.

artificial intelligence, evaluation, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2403.01623

Country:

North America > United States (0.14)
Europe > France (0.14)
Europe > Slovakia (0.14)

Genre: Research Report (0.70)

Industry: Education > Educational Setting > Online (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Multi-Level GNN Preconditioner for Solving Large Scale Problems

Nastorg, Matthieu, Gratien, Jean-Marc, Faney, Thibault, Bucci, Michele Alessandro, Charpiat, Guillaume, Schoenauer, Marc

arXiv.org Artificial Intelligence Feb-13-2024

Large-scale numerical simulations often come at the expense of daunting computations. High-Performance Computing has enhanced the process, but adapting legacy codes to leverage parallel GPU computations remains challenging. Meanwhile, Machine Learning models can harness GPU computations effectively but often struggle with generalization and accuracy. Graph Neural Networks (GNNs), in particular, are great for learning from unstructured data like meshes but are often limited to small-scale problems. Moreover, the capabilities of the trained model usually restrict the accuracy of the data-driven solution. To benefit from both worlds, this paper introduces a novel preconditioner integrating a GNN model within a multi-level Domain Decomposition framework. The proposed GNN-based preconditioner is used to enhance the efficiency of a Krylov method, resulting in a hybrid solver that can converge with any desired level of accuracy. The efficiency of the Krylov method greatly benefits from the GNN preconditioner, which is adaptable to meshes of any size and shape, is executed on GPUs, and features a multi-level approach to enforce the scalability of the entire process. Several experiments are conducted to validate the numerical behavior of the hybrid solver, and an in-depth analysis of its performance is proposed to assess its competitiveness against a C++ legacy solver.
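The role a preconditioner plays inside a Krylov method can be illustrated with a small sketch of the preconditioned conjugate gradient algorithm. This is not the paper's solver: a plain Jacobi (diagonal) preconditioner stands in for the GNN, the test matrix is a toy symmetric positive-definite tridiagonal system, and all function names are illustrative assumptions. The point is only that the preconditioner is a pluggable callable applied to the residual, while the Krylov iteration drives the residual down to the requested tolerance.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, apply_preconditioner, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient for symmetric positive-definite A."""
    x = [0.0] * len(b)
    r = list(b)                       # residual b - A x for the zero start
    z = apply_preconditioner(r)       # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:   # stop at the requested accuracy
            return x, iteration
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = apply_preconditioner(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# Toy SPD system: strictly diagonally dominant symmetric tridiagonal matrix.
n = 20
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0 + 0.5 * i
    if i > 0:
        A[i][i - 1] = A[i - 1][i] = -1.0
b = [1.0] * n

def jacobi(r):
    """Stand-in preconditioner: divide the residual by the diagonal of A."""
    return [ri / A[i][i] for i, ri in enumerate(r)]

if __name__ == "__main__":
    x, iterations = pcg(A, b, jacobi)
    print("iterations:", iterations)
```

Swapping `jacobi` for another callable changes the convergence speed but not the accuracy target, which is set by `tol`.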

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2402.08296

Country: Europe > France (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Interpretable learning of effective dynamics for multiscale systems

Menier, Emmanuel, Kaltenbach, Sebastian, Yagoubi, Mouadh, Schoenauer, Marc, Koumoutsakos, Petros

arXiv.org Machine Learning Sep-11-2023

The modeling and simulation of high-dimensional multiscale systems is a critical challenge across all areas of science and engineering. It is broadly believed that, even with today's computer advances, resolving all spatiotemporal scales described by the governing equations remains a remote target. This realization has prompted intense efforts to develop model order reduction techniques. In recent years, techniques based on deep recurrent neural networks have produced promising results for the modeling and simulation of complex spatiotemporal systems and offer great flexibility in model development, as they can incorporate experimental and computational data. However, neural networks lack interpretability, which limits their utility and generalizability across complex systems. Here we propose a novel framework of Interpretable Learning Effective Dynamics (iLED) that offers comparable accuracy to state-of-the-art recurrent neural network-based approaches while providing the added benefit of interpretability. The iLED framework is motivated by Mori-Zwanzig and Koopman operator theory, which justifies the choice of the specific architecture. We demonstrate the effectiveness of the proposed framework in simulations of three benchmark multiscale systems. Our results show that the iLED framework can generate accurate predictions and obtain interpretable dynamics, making it a promising approach for solving high-dimensional multiscale systems.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Machine Learning

2309.05812

Country:

Europe (0.68)
North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Meta-Learning for Airflow Simulations with Graph Neural Networks

Liu, Wenzhuo, Yagoubi, Mouadh, Schoenauer, Marc

arXiv.org Artificial Intelligence Jun-18-2023

The field of numerical simulation is of significant importance for the design and management of real-world systems, with partial differential equations (PDEs) being a commonly used mathematical modeling tool. However, solving PDEs remains a challenge, as traditional numerical solvers often incur high computational costs. As a result, data-driven methods leveraging machine learning (more particularly Deep Learning) algorithms have been increasingly proposed to learn models that can predict solutions to complex PDEs, such as those arising in computational fluid dynamics (CFD). However, these methods are known to suffer from poor generalization performance on out-of-distribution (OoD) samples, highlighting the need for more efficient approaches. To this end, we present a meta-learning approach to enhance the performance of learned models on OoD samples. Specifically, we set the airflow simulation in CFD over various airfoils as a meta-learning problem, where each set of examples defined on a single airfoil shape is treated as a separate task. Through the use of model-agnostic meta-learning (MAML), we learn a meta-learner capable of adapting to new tasks, i.e., previously unseen airfoil shapes, using only a small amount of task-specific data. We experimentally demonstrate the efficiency of the proposed approach for improving the OoD generalization performance of learned models while maintaining efficiency.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2306.10624

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CD-ROM: Complemented Deep-Reduced Order Model

Menier, Emmanuel, Bucci, Michele Alessandro, Yagoubi, Mouadh, Mathelin, Lionel, Schoenauer, Marc

arXiv.org Artificial Intelligence May-2-2023

Model order reduction through the POD-Galerkin method can lead to dramatic gains in terms of computational efficiency in solving physical problems. However, the applicability of the method to nonlinear high-dimensional dynamical systems such as the Navier-Stokes equations has been shown to be limited, producing inaccurate and sometimes unstable models. This paper proposes a deep-learning-based closure modeling approach for classical POD-Galerkin reduced order models (ROM). The proposed approach is theoretically grounded, using neural networks to approximate well-studied operators. In contrast with most previous works, the present CD-ROM approach is based on an interpretable continuous memory formulation, derived from simple hypotheses on the behavior of partially observed dynamical systems. The final corrected models can hence be simulated using most classical time stepping schemes. The capabilities of the CD-ROM approach are demonstrated on two classical examples from Computational Fluid Dynamics, as well as a parametric case, the Kuramoto-Sivashinsky equation.

artificial intelligence, machine learning, trajectory, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.cma.2023.115985

2202.10746

Country: Europe (0.93)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.93)
Energy > Oil & Gas > Upstream (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
