AITopics | Besnier, Victor

Besnier, Victor

VaViM and VaVAM: Autonomous Driving through Video Generative Modeling

Bartoccioni, Florent, Ramzi, Elias, Besnier, Victor, Venkataramanan, Shashanka, Vu, Tuan-Hung, Xu, Yihong, Chambon, Loick, Gidaris, Spyros, Odabas, Serkan, Hurych, David, Marlet, Renaud, Boulch, Alexandre, Chen, Mickael, Zablocki, Éloi, Bursuc, Andrei, Valle, Eduardo, Cord, Matthieu

arXiv.org Artificial Intelligence, Feb-21-2025

We explore the potential of large-scale generative video models for autonomous driving, introducing an open-source auto-regressive video model (VaViM) and its companion video-action model (VaVAM) to investigate how video pre-training transfers to real-world driving. VaViM is a simple auto-regressive video model that predicts frames using spatio-temporal token sequences. We show that it captures the semantics and dynamics of driving scenes. VaVAM, the video-action model, leverages the learned representations of VaViM to generate driving trajectories through imitation learning. Together, the models form a complete perception-to-action pipeline. We evaluate our models in open- and closed-loop driving scenarios, revealing that video-based pre-training holds promise for autonomous driving. Key insights include the semantic richness of the learned representations, the benefits of scaling for video synthesis, and the complex relationship between model size, data, and safety metrics in closed-loop evaluations. We release code and model weights at https://github.com/valeoai/VideoActionModel
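As a rough illustration of what "predicts frames using spatio-temporal token sequences" can mean, the following minimal sketch flattens per-frame token grids into one sequence and builds next-token training pairs. The frame-by-frame, row-major ordering and the helper names `flatten_tokens` and `next_token_pairs` are assumptions made for this example; the abstract does not specify the tokenizer or the ordering VaViM uses.

```python
def flatten_tokens(frames):
    """Flatten a list of 2D token grids into one sequence.

    Assumed ordering (hypothetical): frame by frame, then row-major
    within each frame.
    """
    return [token for frame in frames for row in frame for token in row]


def next_token_pairs(sequence):
    """Build (context, target) pairs for auto-regressive training.

    Each target token is predicted from all tokens that precede it.
    """
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


# Two toy "frames", each a 2x2 grid of discrete token ids.
frames = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
sequence = flatten_tokens(frames)
pairs = next_token_pairs(sequence)
```

With this layout, the tokens of a later frame are always conditioned on every token of the earlier frames, which is the property an auto-regressive video model relies on.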

large language model, machine learning, trajectory

arXiv:2502.15672

Country: Asia > Middle East > Republic of Türkiye (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.91)
Information Technology > Robotics & Automation (0.82)
Energy (0.69)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Don't drop your samples! Coherence-aware training benefits Conditional diffusion

Dufour, Nicolas, Besnier, Victor, Kalogeiton, Vicky, Picard, David

arXiv.org Artificial Intelligence, May-30-2024

Conditional diffusion models are powerful generative models that can leverage various types of conditional information, such as class labels, segmentation masks, or text captions. However, in many real-world scenarios, conditional information may be noisy or unreliable due to human annotation errors or weak alignment. In this paper, we propose the Coherence-Aware Diffusion (CAD), a novel method that integrates coherence in conditional information into diffusion models, allowing them to learn from noisy annotations without discarding data. We assume that each data point has an associated coherence score that reflects the quality of the conditional information. We then condition the diffusion model on both the conditional information and the coherence score. In this way, the model learns to ignore or discount the conditioning when the coherence is low. We show that CAD is theoretically sound and empirically effective on various conditional generation tasks. Moreover, we show that leveraging coherence generates realistic and diverse samples that respect conditional information better than models trained on cleaned datasets where samples with low coherence have been discarded.
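The difference between conditioning on a coherence score and discarding low-coherence samples can be sketched as follows. This is only an illustration under stated assumptions: the condition is a class label encoded as a one-hot vector, the coherence score lies in [0, 1], and the score is simply appended to the conditioning vector. The function names are hypothetical, and the abstract does not say how CAD actually injects the score into the network.

```python
def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot list of floats."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def build_conditioning(label, coherence, num_classes):
    """Concatenate the label encoding with its coherence score.

    Assumption: coherence is a float in [0, 1], where low values mean
    the label is likely noisy. Every sample is kept.
    """
    if not 0.0 <= coherence <= 1.0:
        raise ValueError("coherence must be in [0, 1]")
    return one_hot(label, num_classes) + [coherence]


def filtered_baseline(samples, threshold):
    """Baseline for comparison: drop samples whose coherence is too low."""
    return [s for s in samples if s[1] >= threshold]


# Each sample is (label, coherence); the second label is probably noisy.
samples = [(0, 0.9), (2, 0.1), (1, 0.6)]
conditioned = [build_conditioning(lbl, coh, num_classes=3) for lbl, coh in samples]
kept = filtered_baseline(samples, threshold=0.5)
```

In the coherence-aware variant all three samples still contribute to training, while the filtered baseline loses the low-coherence one entirely.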

coherence, machine learning, natural language

arXiv:2405.20324

Country: Europe (0.28)

Genre: Research Report > Promising Solution (0.34)

Industry: Media (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

Supervised Anomaly Detection for Complex Industrial Images

Baitieva, Aimira, Hurych, David, Besnier, Victor, Bernard, Olivier

arXiv.org Artificial Intelligence, May-11-2024

Automating visual inspection in industrial production lines is essential for increasing product quality across various industries. Anomaly detection (AD) methods serve as robust tools for this purpose. However, existing public datasets primarily consist of images without anomalies, limiting the practical application of AD methods in production settings. To address this challenge, we present (1) the Valeo Anomaly Dataset (VAD), a novel real-world industrial dataset comprising 5000 images, including 2000 instances of challenging real defects across more than 20 subclasses. Acknowledging that traditional AD methods struggle with this dataset, we introduce (2) Segmentation-based Anomaly Detector (SegAD). First, SegAD leverages anomaly maps as well as segmentation maps to compute local statistics. Next, SegAD uses these statistics and an optional supervised classifier score as input features for a Boosted Random Forest (BRF) classifier, yielding the final anomaly score. Our SegAD achieves state-of-the-art performance on both VAD (+2.1% AUROC) and the VisA dataset (+0.4% AUROC). The code and the models are publicly available.
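The first SegAD step, computing local statistics from an anomaly map and a segmentation map, might look roughly like the sketch below. The choice of per-segment mean and maximum, and the helper names `local_statistics` and `feature_vector`, are assumptions for illustration; the abstract does not list the statistics used. The Boosted Random Forest classifier that consumes these features is not implemented here.

```python
from collections import defaultdict
from statistics import mean


def local_statistics(anomaly_map, segmentation_map):
    """Summarize anomaly scores per segment.

    Both inputs are 2D lists of the same shape: anomaly scores (floats)
    and segment labels (ints). Mean and max are assumed statistics.
    """
    scores_by_segment = defaultdict(list)
    for score_row, label_row in zip(anomaly_map, segmentation_map):
        for score, label in zip(score_row, label_row):
            scores_by_segment[label].append(score)
    return {
        label: {"mean": mean(scores), "max": max(scores)}
        for label, scores in sorted(scores_by_segment.items())
    }


def feature_vector(stats, classifier_score=None):
    """Flatten per-segment statistics into one feature list.

    The supervised classifier score is optional, as in the abstract.
    """
    features = []
    for label in sorted(stats):
        features.extend([stats[label]["mean"], stats[label]["max"]])
    if classifier_score is not None:
        features.append(classifier_score)
    return features


anomaly_map = [[0.1, 0.2], [0.9, 0.4]]
segmentation_map = [[0, 0], [1, 1]]
stats = local_statistics(anomaly_map, segmentation_map)
features = feature_vector(stats, classifier_score=0.7)
```

The resulting feature list would then be passed to a downstream classifier that outputs the final anomaly score.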

artificial intelligence, data mining, machine learning

arXiv:2405.04953

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
