Labatut, Patrick
Accelerating Transformer Inference and Training with 2:4 Activation Sparsity
Haziza, Daniel, Chou, Timothy, Choudhary, Dhruv, Wehrstedt, Luca, Massa, Francisco, Yu, Jiecao, Jeong, Geonhwa, Rao, Supriya, Labatut, Patrick, Cai, Jesse
In this paper, we demonstrate how to apply 2:4 sparsity, a popular hardware-accelerated GPU sparsity pattern, to activations in order to accelerate large language model training and inference. Crucially, we exploit the intrinsic sparsity of Squared-ReLU activations to provide this acceleration with no accuracy loss. Our approach achieves up to 1.3x faster Feed Forward Networks (FFNs) in both the forward and backward passes. This work highlights the potential for sparsity to play a key role in accelerating large language model training and inference. The rapid growth of Large Language Models (LLMs) in recent years has been driven by a corresponding surge in GPU FLOPs.
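A minimal sketch of the activation-pruning idea described above, assuming PyTorch (the function names are illustrative, not the paper's code): Squared-ReLU outputs are mostly zero, so keeping the two largest-magnitude values in every contiguous group of four discards little information, and the resulting 2:4 pattern is the one that NVIDIA sparse tensor cores can accelerate. The dense emulation below only shows the masking step; the reported speedups require dispatching the pruned activations to sparse kernels.

import torch

def squared_relu(x: torch.Tensor) -> torch.Tensor:
    # Squared-ReLU activations are naturally very sparse: most entries are exactly zero.
    return torch.relu(x) ** 2

def two_four_prune(x: torch.Tensor) -> torch.Tensor:
    # Keep the 2 largest-magnitude values in every group of 4 along the last
    # dimension and zero the rest, producing a 2:4-sparse tensor.
    groups = x.reshape(-1, 4)
    idx = groups.abs().topk(2, dim=-1).indices
    mask = torch.zeros_like(groups).scatter_(-1, idx, 1.0)
    return (groups * mask).reshape(x.shape)

x = torch.randn(8, 16)                 # toy activations entering the second FFN matmul
a = two_four_prune(squared_relu(x))
print((a == 0).float().mean())         # at least 50% zeros by construction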
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach
Vo, Huy V., Khalidov, Vasil, Darcet, Timothée, Moutakanni, Théo, Smetanin, Nikita, Szafraniec, Marc, Touvron, Hugo, Couprie, Camille, Oquab, Maxime, Joulin, Armand, Jégou, Hervé, Labatut, Patrick, Bojanowski, Piotr
Self-supervised features are the cornerstone of modern machine learning systems. They are typically pre-trained on data collections whose construction and curation require extensive human effort. This manual process has limitations similar to those encountered in supervised learning, e.g., the crowd-sourced selection of data is costly and time-consuming, which prevents scaling the dataset size. In this work, we consider the problem of automatic curation of high-quality datasets for self-supervised pre-training. We posit that such datasets should be large, diverse and balanced, and propose a clustering-based approach for building datasets satisfying all these criteria. Our method involves successive and hierarchical applications of $k$-means on a large and diverse data repository to obtain clusters that distribute uniformly among data concepts, followed by a hierarchical, balanced sampling step from these clusters. Extensive experiments on three different data domains including web-based images, satellite images and text show that features trained on our automatically curated datasets outperform those trained on uncurated data while being on par or better than ones trained on manually curated data. Code is available at https://github.com/facebookresearch/ssl-data-curation.
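As a rough illustration of the curation recipe, assuming scikit-learn and synthetic embeddings (the repository linked above is the reference implementation; the cluster counts and budget below are arbitrary): two levels of k-means are applied, and the same sampling budget is drawn from each cluster so that frequent concepts do not dominate the curated set.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 32))        # stand-in for self-supervised embeddings

# Level 1: coarse clusters over the whole data repository.
coarse = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X)

# Level 2: refine each coarse cluster, then sample the same budget from each,
# yielding a curated set that is balanced across clusters.
budget_per_cluster = 50
curated = []
for c in range(20):
    members = np.flatnonzero(coarse.labels_ == c)
    fine_k = min(5, len(members))
    fine = KMeans(n_clusters=fine_k, n_init=10, random_state=0).fit(X[members])
    for f in range(fine_k):
        sub = members[fine.labels_ == f]
        take = min(len(sub), budget_per_cluster // fine_k)
        curated.extend(rng.choice(sub, size=take, replace=False))

print(len(curated))                      # roughly uniform coverage of clusters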
Code Translation with Compiler Representations
Szafraniec, Marc, Roziere, Baptiste, Leather, Hugh, Charton, Francois, Labatut, Patrick, Synnaeve, Gabriel
In this paper, we leverage low-level compiler intermediate representations (IR) to improve code translation. Traditional transpilers rely on syntactic information and handcrafted rules, which limits their applicability and produces unnatural-looking code. Applying neural machine translation (NMT) approaches to code has successfully broadened the set of programs on which one can get a natural-looking translation. However, they treat the code as sequences of text tokens, and still do not differentiate well enough between similar pieces of code which have different semantics in different languages. The consequence is low-quality translation, reducing the practicality of NMT and underscoring the need for approaches that significantly increase its accuracy. Here we propose to augment code translation with IRs, specifically LLVM IR, with results on the C++, Java, Rust, and Go languages. Our method improves upon the state of the art for unsupervised code translation, increasing the number of correct translations by 11% on average, and up to 79% for the Java-Rust pair with greedy decoding. We extend previous test sets for code translation by adding hundreds of Go and Rust functions. Additionally, we train models with high performance on the problem of IR decompilation, generating source code from IR, and study using IRs as a pivot for translation. Automatic code translation makes it possible to port old codebases to new frameworks, or high-level (but slow) languages to low-level (and fast) ones. Rule-based transpilers, however, produce unidiomatic translations that prove hard to read for human programmers. This is a serious limitation: the translated code should be easy to read and understand, as it will eventually be maintained by human developers. In recent years, Neural Machine Translation (NMT) was proposed as an alternative to rule-based code translation (Roziere et al., 2020; Weisz et al., 2021; 2022).
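A small sketch of how source/IR pairs can be obtained, assuming clang++ is installed; the pairing format and the -O1 optimization level are illustrative choices, not necessarily the paper's exact pipeline.

import subprocess, tempfile, pathlib

CPP_FUNC = """
int add(int a, int b) { return a + b; }
"""

def source_to_llvm_ir(cpp_source: str) -> str:
    # clang++ -S -emit-llvm prints textual LLVM IR; -O1 gives a reasonably
    # compact form without aggressive inlining.
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "f.cpp"
        src.write_text(cpp_source)
        out = subprocess.run(
            ["clang++", "-S", "-emit-llvm", "-O1", "-o", "-", str(src)],
            check=True, capture_output=True, text=True,
        )
        return out.stdout

# A (source, IR) pair of this kind can serve as an additional signal alongside
# the token sequence when training a translation model.
pair = {"source": CPP_FUNC, "ir": source_to_llvm_ir(CPP_FUNC)}
print(pair["ir"].splitlines()[:5])   # first lines of the IR, for inspection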