Charpentier, Lucas Georges Gabriel
GPT or BERT: why not both?
Charpentier, Lucas Georges Gabriel, Samuel, David
We present a simple way to merge masked language modeling with causal language modeling. This hybrid training objective results in a model that combines the strengths of both modeling paradigms within a single transformer stack: GPT-BERT can be used transparently like any standard causal or masked language model. We test the pretraining process that enables this flexible behavior on the BabyLM Challenge 2024. The results show that hybrid pretraining outperforms both masked-only and causal-only models. We openly release the models, training corpora, and code.
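As a purely illustrative sketch of how such a hybrid objective could be set up, the snippet below mixes causal batches with masked batches that are realigned to the same next-token prediction head. The per-batch mixing, the masking rate, and the model's causal_attention argument are assumptions made for this example, not the exact GPT-BERT recipe.

    import torch
    import torch.nn.functional as F

    def causal_lm_batch(tokens: torch.Tensor):
        """Standard next-token prediction: predict tokens[:, 1:] from tokens[:, :-1]."""
        return tokens[:, :-1], tokens[:, 1:], True            # inputs, targets, causal attention

    def masked_lm_batch(tokens: torch.Tensor, mask_id: int, p: float = 0.15):
        """Masked prediction realigned to the next-token head: mask some tokens and
        predict each masked token from the position *before* it, so both objectives
        can share one output layer (an assumption of this sketch)."""
        corrupt = torch.rand_like(tokens, dtype=torch.float32) < p
        corrupt[:, 0] = False                                 # never mask the first token
        masked = tokens.clone()
        masked[corrupt] = mask_id
        inputs = masked[:, :-1]
        targets = tokens[:, 1:].clone()
        targets[~corrupt[:, 1:]] = -100                       # score only masked positions
        return inputs, targets, False                         # bidirectional attention

    def hybrid_step(model, tokens, mask_id, p_masked=0.5):
        """Train the current batch with either objective, chosen at random."""
        if torch.rand(()) < p_masked:
            inputs, targets, causal = masked_lm_batch(tokens, mask_id)
        else:
            inputs, targets, causal = causal_lm_batch(tokens)
        logits = model(inputs, causal_attention=causal)       # hypothetical model signature
        return F.cross_entropy(logits.flatten(0, 1), targets.flatten(), ignore_index=-100)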
Small Languages, Big Models: A Study of Continual Training on Languages of Norway
Samuel, David, Mikhailov, Vladislav, Velldal, Erik, Øvrelid, Lilja, Charpentier, Lucas Georges Gabriel, Kutuzov, Andrey
Training large language models requires vast amounts of data, posing a challenge for less widely spoken languages like Norwegian and even more so for truly low-resource languages like Sámi. To address this issue, we present a novel three-stage continual training approach for efficiently adapting existing language models to lower-resource languages. We also experiment with combining causal and masked language modeling to get more flexible models. Based on our findings, we train, evaluate, and openly release a new large language model: this method enables us to train an 11.4B parameter model that achieves state-of-the-art performance across Norwegian language tasks while maintaining strong capabilities in Northern Sámi.
Compositional Generalization with Grounded Language Models
Wold, Sondre, Simon, Étienne, Charpentier, Lucas Georges Gabriel, Kostylev, Egor V., Velldal, Erik, Øvrelid, Lilja
Grounded language models use external sources of information, such as knowledge graphs, to meet some of the general challenges associated with pre-training. By extending previous work on compositional generalization in semantic parsing, we allow for a controlled evaluation of the degree to which these models learn and generalize from patterns in knowledge graphs. We develop a procedure for generating natural language questions paired with knowledge graphs that targets different aspects of compositionality and further avoids grounding the language models in information already encoded implicitly in their weights. We evaluate existing methods for combining language models with knowledge graphs and find them to struggle with generalization to sequences of unseen lengths and to novel combinations of seen base components. While our experimental results provide some insight into the expressive power of these models, we hope our work and released datasets motivate future research on how to better combine language models with structured knowledge representations.
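As a toy illustration of the kind of compositional setup described above (not the paper's actual generation procedure), the snippet below composes atomic relation templates over a small knowledge graph of invented entities, so that longer question chains can be held out to test generalization to unseen lengths. All entity names, relations, and templates are hypothetical.

    # Toy illustration only: invented entities avoid facts the model may already know.
    KG = [  # (head, relation, tail) triples
        ("Velmor", "capital_of", "Qandia"),
        ("Qandia", "neighbour_of", "Brelt"),
        ("Brelt", "capital_of", "Osmere"),
    ]

    TEMPLATES = {
        "capital_of": "the capital of {}",
        "neighbour_of": "a neighbour of {}",
    }

    def compose_question(start: str, relations: list[str]) -> str:
        """Nest relation templates to build an n-hop question; generalization is
        tested by training on short chains and evaluating on longer, unseen ones."""
        phrase = start
        for relation in relations:
            phrase = TEMPLATES[relation].format(phrase)
        return f"What is {phrase}?"

    print(compose_question("Osmere", ["capital_of"]))                  # seen length (1 hop)
    print(compose_question("Osmere", ["capital_of", "neighbour_of"]))  # novel length (2 hops)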
More Room for Language: Investigating the Effect of Retrieval on Language Models
Samuel, David, Charpentier, Lucas Georges Gabriel, Wold, Sondre
Retrieval-augmented language models pose a promising alternative to standard language modeling. During pretraining, these models search in a corpus of documents for contextually relevant information that could aid the language modeling objective. We introduce an 'ideal retrieval' methodology to study these models in a fully controllable setting. We conduct an extensive evaluation to examine how retrieval augmentation affects the behavior of the underlying language model. Among other things, we observe that these models i) save substantially less world knowledge in their weights and ii) are better at understanding …
Figure 1: Aggregated absolute differences from the baseline across three categories of benchmarks; the models exhibit consistent differences for each category.
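The abstract does not spell out how 'ideal retrieval' is realized. As one hedged reading, the sketch below pairs each training chunk with the preceding chunk of the same document, so the retrieved context is relevant by construction and fully under the experimenter's control; this is an illustrative assumption, not necessarily the paper's construction.

    from typing import Iterator

    def chunk(text: str, size: int) -> list[str]:
        """Split a document into fixed-size character chunks (a simplification)."""
        return [text[i:i + size] for i in range(0, len(text), size)]

    def ideal_retrieval_pairs(documents: list[str], size: int = 512) -> Iterator[tuple[str, str]]:
        """Yield (retrieved_context, target_chunk) pairs where the 'retrieved'
        context is simply the preceding chunk of the same document, so relevance
        is guaranteed by construction rather than by a learned retriever."""
        for document in documents:
            chunks = chunk(document, size)
            for previous, current in zip(chunks, chunks[1:]):
                yield previous, current   # condition on `previous` while modeling `current`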
Not all layers are equally as important: Every Layer Counts BERT
Charpentier, Lucas Georges Gabriel, Samuel, David
This paper introduces a novel modification of the transformer architecture, tailored for the data-efficient pretraining of language models. We evaluate this aspect by participating in the BabyLM challenge, where our solution won both the strict and strict-small tracks. Our approach allows each transformer layer to select which outputs of previous layers to process. The empirical results verify the potential of this simple modification and show that not all layers are equally as important.
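A minimal sketch of the layer-selection idea, assuming it is implemented as a learned weighting over all previous layers' outputs (the exact parameterization in the paper may differ):

    import torch
    import torch.nn as nn

    class LayerSelectingEncoder(nn.Module):
        """Wraps a stack of transformer layers so that each layer consumes a learned
        weighted combination of all previous outputs (including the embeddings)."""
        def __init__(self, layers: nn.ModuleList):
            super().__init__()
            self.layers = layers
            # one learnable weight per (layer, earlier output) pair
            self.alpha = nn.ParameterList(
                [nn.Parameter(torch.zeros(i + 1)) for i in range(len(layers))]
            )

        def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
            outputs = [embeddings]                      # index 0: embedding output
            for i, layer in enumerate(self.layers):
                weights = torch.softmax(self.alpha[i], dim=0)
                layer_input = sum(w * h for w, h in zip(weights, outputs))
                outputs.append(layer(layer_input))
            return outputs[-1]

Initializing the weights to zero makes each softmax start out uniform, so every earlier layer contributes equally at the beginning of training and the model is free to up- or down-weight individual layers as it learns.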
BRENT: Bidirectional Retrieval Enhanced Norwegian Transformer
Charpentier, Lucas Georges Gabriel, Wold, Sondre, Samuel, David, Rønningstad, Egil
Retrieval-based language models are increasingly employed in question-answering tasks. These models search in a corpus of documents for relevant information instead of having all factual knowledge stored in their parameters, thereby enhancing efficiency, transparency, and adaptability. We develop the first Norwegian retrieval-based model by adapting the REALM framework and evaluate it on various tasks. After training, we also separate the language model, which we call the reader, from the retriever components, and show that the reader can be fine-tuned on a range of downstream tasks. Results show that retrieval-augmented language modeling improves the reader's performance on extractive question answering, suggesting that this type of training improves language models' general ability to use context and that this does not come at the expense of other abilities such as part-of-speech tagging, dependency parsing, named entity recognition, and lemmatization. Code, trained models, and data are made publicly available.
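To make the reader/retriever split concrete, here is a hedged sketch in the REALM style (class names and the retriever interface are assumptions, not BRENT's actual code): during retrieval-augmented pretraining the reader encodes the query together with a retrieved passage, and after pretraining the reader can be detached and fine-tuned like any bidirectional encoder.

    import torch
    import torch.nn as nn

    class RetrievalAugmentedLM(nn.Module):
        """Hypothetical REALM-style wrapper: a retriever selects a passage from the
        corpus and a bidirectional reader models the concatenated [query; passage]."""
        def __init__(self, retriever: nn.Module, reader: nn.Module):
            super().__init__()
            self.retriever = retriever
            self.reader = reader

        def forward(self, query_ids: torch.Tensor, corpus_ids: torch.Tensor) -> torch.Tensor:
            passage_ids = self.retriever(query_ids, corpus_ids)   # assumed interface
            joint = torch.cat([query_ids, passage_ids], dim=-1)   # concatenate token ids
            return self.reader(joint)                             # masked-LM / QA logits

    # After pretraining, only the reader needs to be kept for downstream fine-tuning:
    # reader = pretrained_model.reader
    # (fine-tune `reader` on extractive QA, POS tagging, NER, lemmatization, etc.)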