Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs
Ali, Mehdi, Fromm, Michael, Thellmann, Klaudia, Ebert, Jan, Weber, Alexander Arno, Rutmann, Richard, Jain, Charvi, Lübbering, Max, Steinigen, Daniel, Leveling, Johannes, Klug, Katrin, Buschhoff, Jasper Schulze, Jurkschat, Lena, Abdelwahab, Hammam, Stein, Benny Jörg, Sylla, Karl-Heinz, Denisov, Pavel, Brandizzi, Nicolo', Saleem, Qasid, Bhowmick, Anirban, Helmer, Lennard, John, Chelsea, Suarez, Pedro Ortiz, Ostendorff, Malte, Jude, Alex, Manjunath, Lalith, Weinbach, Samuel, Penke, Carolin, Filatov, Oleg, Asaadi, Shima, Barth, Fabio, Sifa, Rafet, Küch, Fabian, Herten, Andreas, Jäkel, René, Rehm, Georg, Kesselheim, Stefan, Köhler, Joachim, Flores-Herr, Nicolas
We present two multilingual LLMs designed to embrace Europe's linguistic diversity by supporting all 24 official languages of the European Union. Trained on a dataset comprising around 60% non-English data and utilizing a custom multilingual tokenizer, our models address the limitations of existing LLMs that predominantly focus on English or a few high-resource languages. We detail the models' development principles, namely data composition, tokenizer optimization, and training methodology. The models achieve competitive results on multilingual benchmarks, including European versions of ARC, HellaSwag, MMLU, and TruthfulQA.
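The abstract does not spell out the sampling recipe behind the roughly 60/40 data split, but a common way to keep a multilingual tokenizer and training mix from being dominated by English is temperature-based upsampling of low-resource corpora. The sketch below shows that general technique; the `alpha` value and corpus sizes are illustrative assumptions, not the paper's actual settings:

```python
def sampling_weights(corpus_sizes: dict, alpha: float = 0.3) -> dict:
    """Temperature-based sampling: each language's probability is proportional
    to its corpus size raised to alpha < 1, which flattens the distribution so
    low-resource languages are upsampled relative to their raw share."""
    scaled = {lang: n ** alpha for lang, n in corpus_sizes.items()}
    total = sum(scaled.values())
    return {lang: w / total for lang, w in scaled.items()}

# Hypothetical corpus sizes (documents per language), for illustration only.
weights = sampling_weights({"en": 1000, "de": 200, "mt": 10}, alpha=0.3)
```

With `alpha=1.0` the weights reduce to the raw corpus proportions; lowering `alpha` trades some English data efficiency for better coverage of small languages such as Maltese.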
Data Processing for the OpenGPT-X Model Family
Brandizzi, Nicolo', Abdelwahab, Hammam, Bhowmick, Anirban, Helmer, Lennard, Stein, Benny Jörg, Denisov, Pavel, Saleem, Qasid, Fromm, Michael, Ali, Mehdi, Rutmann, Richard, Naderi, Farzad, Agy, Mohamad Saif, Schwirjow, Alexander, Küch, Fabian, Hahn, Luzian, Ostendorff, Malte, Suarez, Pedro Ortiz, Rehm, Georg, Wegener, Dennis, Flores-Herr, Nicolas, Köhler, Joachim, Leveling, Johannes
This paper presents a comprehensive overview of the data preparation pipeline developed for the OpenGPT-X project, a large-scale initiative aimed at creating open and high-performance multilingual large language models (LLMs). The project's goal is to deliver models that cover all major European languages, with a particular focus on real-world applications within the European Union. We describe all data processing steps, from data selection and requirement definition to the preparation of the final datasets for model training. We distinguish between curated data and web data, as each of these categories is handled by a distinct pipeline, with curated data undergoing minimal filtering and web data requiring extensive filtering and deduplication. This distinction guided the development of specialized algorithmic solutions for both pipelines. In addition to describing the processing methodologies, we provide an in-depth analysis of the datasets, increasing transparency and alignment with European data regulations. Finally, we share key insights and challenges faced during the project, offering recommendations for future endeavors in large-scale multilingual data preparation for LLMs.
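The abstract does not give the pipeline's concrete algorithms, but the curated-versus-web split it describes can be sketched minimally: curated documents skip quality filtering, while both streams pass through exact deduplication. The filter thresholds and the hash-based dedup scheme below are illustrative assumptions, not the project's actual implementation:

```python
import hashlib
import re

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially differing copies hash alike."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def passes_quality_filters(doc: str, min_words: int = 20,
                           max_symbol_ratio: float = 0.3) -> bool:
    """Cheap heuristic filters of the kind typically applied to web text:
    drop very short documents and documents dominated by non-alphanumerics."""
    if len(doc.split()) < min_words:
        return False
    symbols = sum(1 for ch in doc if not ch.isalnum() and not ch.isspace())
    return symbols / max(len(doc), 1) <= max_symbol_ratio

def process(corpus, curated: bool):
    """Curated data: minimal filtering. Web data: quality filtering first.
    Both streams are exact-deduplicated via a hash of the normalized text."""
    seen, out = set(), []
    for doc in corpus:
        if not curated and not passes_quality_filters(doc):
            continue
        digest = hashlib.sha256(normalize(doc).encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            out.append(doc)
    return out
```

Real web pipelines typically add fuzzy deduplication (e.g. MinHash over shingles) on top of exact matching, since near-duplicates dominate crawled data.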
Towards More Human-like AI Communication: A Review of Emergent Communication Research
Brandizzi, Nicolo'
In the initial phase of AI research following the second AI winter, the focus was on identifying new areas where AI could outperform humans, with famous examples including chess [Silver et al., 2018], Go [Silver et al., 2016], and StarCraft [Vinyals et al., 2019]. While these applications were limited to games, they set the tone for research that prioritized building AI agents with superhuman capabilities. However, over the last decade, the research community has witnessed a shift towards a human-centric approach that aims to leverage AI to aid humans in everyday tasks and relieve them of repetitive duties [Xu, 2019, Riedl, 2019, Shneiderman, 2021]. The interaction between humans and machines is a crucial aspect of human-centric AI [Mikolov et al., 2016], and it should take place in domains with which humans are already familiar and that require little to no training. Therefore, applications that involve niche practices, such as coding and mathematics, should be avoided in favor of language-based applications. In particular, human-machine communication should be grounded in natural language, which presents the challenge of teaching artificial agents to communicate in multiple languages. Recent advances in natural language processing (NLP) have led to the emergence of the transformer architecture [Vaswani et al., 2017], which has become the preferred approach for language-based applications, as exemplified by language models (LMs) such as GPT-3 [Brown et al., 2020], LLaMA [Touvron et al., 2023], and LaMDA [Thoppilan et al., 2022]. One challenge for such architectures is their focus on predicting the next word in a sentence rather than comprehending the broader context and purpose of language use. While humans use language as a tool for coordination and communication to thrive in a shared environment, artificial agents may struggle to fully grasp its subtleties and complexities.
Speaking the Language of Your Listener: Audience-Aware Adaptation via Plug-and-Play Theory of Mind
Takmaz, Ece, Brandizzi, Nicolo', Giulianelli, Mario, Pezzelle, Sandro, Fernández, Raquel
Dialogue participants may have varying levels of knowledge about the topic under discussion. In such cases, it is essential for speakers to adapt their utterances by taking their audience into account. Yet, it is an open question how such adaptation can be modelled in computational agents. In this paper, we model a visually grounded referential game between a knowledgeable speaker and a listener with more limited visual and linguistic experience. Inspired by psycholinguistic theories, we endow our speaker with the ability to adapt its referring expressions via a simulation module that monitors the effectiveness of planned utterances from the listener's perspective. We propose an adaptation mechanism building on plug-and-play approaches to controlled language generation, where utterance generation is steered on the fly by the simulator without finetuning the speaker's underlying language model. Our results and analyses show that our approach is effective: the speaker's utterances become closer to the listener's domain of expertise, which leads to higher communicative success.
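The paper steers a neural language model on the fly via a simulation module; as a toy illustration of the underlying idea (scoring candidate utterances with an internal model of the listener before speaking), the sketch below uses set-valued utterances and a vocabulary-limited listener. All names, the object encoding, and the scoring rule are illustrative assumptions, not the paper's architecture:

```python
# Toy referential game: the speaker describes a target object with attribute
# words; the listener picks the object whose attributes best overlap with the
# words it actually knows. The speaker's "simulation module" is an internal
# copy of the listener used to vet candidate utterances before speaking.

def listener_guess(utterance: set, objects: list, vocab: set) -> int:
    """The listener only understands words in its (limited) vocabulary and
    returns the index of the best-matching object (ties -> lowest index)."""
    understood = utterance & vocab
    return max(range(len(objects)), key=lambda i: len(understood & objects[i]))

def adapted_utterance(target_idx: int, objects: list,
                      candidates: list, listener_vocab: set) -> set:
    """Audience-aware selection: prefer candidates the simulated listener
    resolves to the target, breaking ties by how many words it understands."""
    def score(u):
        guess = listener_guess(u, objects, listener_vocab)
        return (guess == target_idx, len(u & listener_vocab))
    return max(candidates, key=score)

# The expert speaker might prefer "crimson", but a listener who only knows
# basic color words resolves "red" correctly, so adaptation selects it.
objects = [{"blue", "round"}, {"crimson", "red", "square"}]
novice_vocab = {"red", "blue", "round", "square"}
choice = adapted_utterance(1, objects, [{"crimson"}, {"red"}], novice_vocab)
```

The paper's actual mechanism steers token-level generation without fine-tuning the speaker's language model; this rerank-by-simulation variant only conveys the shared intuition that the speaker evaluates planned utterances from the listener's perspective.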