AITopics | Fromm, Michael

Collaborating Authors

Fromm, Michael

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Multilingual LLM Evaluation for European Languages

Thellmann, Klaudia, Stadler, Bernhard, Fromm, Michael, Buschhoff, Jasper Schulze, Jude, Alex, Barth, Fabio, Leveling, Johannes, Flores-Herr, Nicolas, Köhler, Joachim, Jäkel, René, Ali, Mehdi

arXiv.org Artificial IntelligenceOct-17-2024

The rise of Large Language Models (LLMs) has revolutionized natural language processing across numerous languages and tasks. However, evaluating LLM performance in a consistent and meaningful way across multiple European languages remains challenging, especially due to the scarcity of language-parallel multilingual benchmarks. We introduce a multilingual evaluation approach tailored for European languages. We employ translated versions of five widely-used benchmarks to assess the capabilities of 40 LLMs across 21 European languages. Our contributions include examining the effectiveness of translated benchmarks, assessing the impact of different translation services, and offering a multilingual evaluation framework for LLMs that includes newly created datasets: EU20-MMLU, EU20-HellaSwag, EU20-ARC, EU20-TruthfulQA, and EU20-GSM8K. The benchmarks and results are made publicly available to encourage further research in multilingual LLM evaluation.

large language model, machine learning, mistral-7b-instruct-v0, (22 more...)

arXiv.org Artificial Intelligence

2410.08928

Country:

Europe > Germany (0.46)
Asia > Middle East > Republic of Türkiye (0.14)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.92)

Industry:

Health & Medicine (0.45)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs

Ali, Mehdi, Fromm, Michael, Thellmann, Klaudia, Ebert, Jan, Weber, Alexander Arno, Rutmann, Richard, Jain, Charvi, Lübbering, Max, Steinigen, Daniel, Leveling, Johannes, Klug, Katrin, Buschhoff, Jasper Schulze, Jurkschat, Lena, Abdelwahab, Hammam, Stein, Benny Jörg, Sylla, Karl-Heinz, Denisov, Pavel, Brandizzi, Nicolo', Saleem, Qasid, Bhowmick, Anirban, Helmer, Lennard, John, Chelsea, Suarez, Pedro Ortiz, Ostendorff, Malte, Jude, Alex, Manjunath, Lalith, Weinbach, Samuel, Penke, Carolin, Filatov, Oleg, Asaadi, Shima, Barth, Fabio, Sifa, Rafet, Küch, Fabian, Herten, Andreas, Jäkel, René, Rehm, Georg, Kesselheim, Stefan, Köhler, Joachim, Flores-Herr, Nicolas

arXiv.org Artificial IntelligenceOct-15-2024

We present two multilingual LLMs designed to embrace Europe's linguistic diversity by supporting all 24 official languages of the European Union. Trained on a dataset comprising around 60% non-English data and utilizing a custom multilingual tokenizer, our models address the limitations of existing LLMs that predominantly focus on English or a few high-resource languages. We detail the models' development principles, i.e., data composition, tokenizer optimization, and training methodologies. The models demonstrate competitive performance across multilingual benchmarks, as evidenced by their performance on European versions of ARC, HellaSwag, MMLU, and TruthfulQA.

large language model, machine learning, meta-llama-3, (15 more...)

arXiv.org Artificial Intelligence

2410.0373

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry: Government > Regional Government > Europe Government (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)

Add feedback

Data Processing for the OpenGPT-X Model Family

Brandizzi, Nicolo', Abdelwahab, Hammam, Bhowmick, Anirban, Helmer, Lennard, Stein, Benny Jörg, Denisov, Pavel, Saleem, Qasid, Fromm, Michael, Ali, Mehdi, Rutmann, Richard, Naderi, Farzad, Agy, Mohamad Saif, Schwirjow, Alexander, Küch, Fabian, Hahn, Luzian, Ostendorff, Malte, Suarez, Pedro Ortiz, Rehm, Georg, Wegener, Dennis, Flores-Herr, Nicolas, Köhler, Joachim, Leveling, Johannes

arXiv.org Artificial IntelligenceOct-11-2024

This paper presents a comprehensive overview of the data preparation pipeline developed for the OpenGPT-X project, a large-scale initiative aimed at creating open and high-performance multilingual large language models (LLMs). The project goal is to deliver models that cover all major European languages, with a particular focus on real-world applications within the European Union. We explain all data processing steps, starting with the data selection and requirement definition to the preparation of the final datasets for model training. We distinguish between curated data and web data, as each of these categories is handled by distinct pipelines, with curated data undergoing minimal filtering and web data requiring extensive filtering and deduplication. This distinction guided the development of specialized algorithmic solutions for both pipelines. In addition to describing the processing methodologies, we provide an in-depth analysis of the datasets, increasing transparency and alignment with European data regulations. Finally, we share key insights and challenges faced during the project, offering recommendations for future endeavors in large-scale multilingual data preparation for LLMs.

data mining, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.088

Country:

Europe (0.87)
North America > United States (0.46)
North America > Mexico > Mexico City (0.14)

Genre:

Overview (0.86)
Research Report > New Finding (0.46)

Industry:

Law (1.00)
Government (1.00)
Information Technology > Security & Privacy (0.93)
Information Technology > Software (0.71)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Do Multilingual Large Language Models Mitigate Stereotype Bias?

Nie, Shangrui, Fromm, Michael, Welch, Charles, Görge, Rebekka, Karimi, Akbar, Plepi, Joan, Mowmita, Nazia Afsan, Flores-Herr, Nicolas, Ali, Mehdi, Flek, Lucie

arXiv.org Artificial IntelligenceJul-9-2024

While preliminary findings indicate that multilingual LLMs exhibit reduced bias compared to monolingual ones, a comprehensive understanding of the effect of multilingual training on bias mitigation, is lacking. This study addresses this gap by systematically training six LLMs of identical size (2.6B parameters) and architecture: five monolingual models (English, German, French, Italian, and Spanish) and one multilingual model trained on an equal distribution of data across these languages, all using publicly available data. To ensure robust evaluation, standard bias benchmarks were automatically translated into the five target languages and verified for both translation quality and bias preservation by human annotators. Our results consistently demonstrate that multilingual training effectively mitigates bias. Moreover, we observe that multilingual models achieve not only lower bias but also superior prediction accuracy when compared to monolingual models with the same amount of training data, model architecture, and size.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2407.0574

Country:

Europe (1.00)
Asia (0.93)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Law (1.00)
Government > Regional Government > Europe Government (0.46)
Education > Educational Setting > K-12 Education (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?

Weber, Alexander Arno, Thellmann, Klaudia, Ebert, Jan, Flores-Herr, Nicolas, Lehmann, Jens, Fromm, Michael, Ali, Mehdi

arXiv.org Artificial IntelligenceFeb-21-2024

The adaption of multilingual pre-trained Large Language Models (LLMs) into eloquent and helpful assistants is essential to facilitate their use across different language regions. In that spirit, we are the first to conduct an extensive study of the performance of multilingual models on parallel, multi-turn instruction-tuning benchmarks across a selection of the most-spoken Indo-European languages. We systematically examine the effects of language and instruction dataset size on a mid-sized, multilingual LLM by instruction-tuning it on parallel instruction-tuning datasets. Our results demonstrate that instruction-tuning on parallel instead of monolingual corpora benefits cross-lingual instruction following capabilities by up to 4.6%. Furthermore, we show that the Superficial Alignment Hypothesis does not hold in general, as the investigated multilingual 7B parameter model presents a counter-example requiring large-scale instruction-tuning datasets. Finally, we conduct a human annotation study to understand the alignment between human-based and GPT-4-based evaluation within multilingual chat scenarios.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2402.13703

Country:

Europe > Germany (0.28)
North America > United States > Hawaii (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Construction & Engineering (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Tokenizer Choice For LLM Training: Negligible or Crucial?

Ali, Mehdi, Fromm, Michael, Thellmann, Klaudia, Rutmann, Richard, Lübbering, Max, Leveling, Johannes, Klug, Katrin, Ebert, Jan, Doll, Niclas, Buschhoff, Jasper Schulze, Jain, Charvi, Weber, Alexander Arno, Jurkschat, Lena, Abdelwahab, Hammam, John, Chelsea, Suarez, Pedro Ortiz, Ostendorff, Malte, Weinbach, Samuel, Sifa, Rafet, Kesselheim, Stefan, Flores-Herr, Nicolas

arXiv.org Artificial IntelligenceOct-18-2023

The recent success of LLMs has been predominantly driven by curating the training dataset composition, scaling of model architectures and dataset sizes and advancements in pretraining objectives, leaving tokenizer influence as a blind spot. Shedding light on this underexplored area, we conduct a comprehensive study on the influence of tokenizer choice on LLM downstream performance by training 24 mono- and multilingual LLMs at a 2.6B parameter scale, ablating different tokenizer algorithms and parameterizations. Our studies highlight that the tokenizer choice can significantly impact the model's downstream performance, training and inference costs. In particular, we find that the common tokenizer evaluation metrics fertility and parity are not always predictive of model downstream performance, rendering these metrics a questionable proxy for the model's downstream performance. Furthermore, we show that multilingual tokenizers trained on the five most frequent European languages require vocabulary size increases of factor three in comparison to English. While English-only tokenizers have been applied to the training of multi-lingual LLMs, we find that this approach results in a severe downstream performance degradation and additional training costs of up to 68%, due to an inefficient tokenization vocabulary.

large language model, natural language, tokenizer choice, (2 more...)

arXiv.org Artificial Intelligence

2310.08754

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Conifer Seedling Detection in UAV-Imagery with RGB-Depth Information

Jooste, Jason, Fromm, Michael, Schubert, Matthias

arXiv.org Artificial IntelligenceNov-22-2021

Monitoring of reforestation is currently being considerably streamlined through the use of drones and image recognition algorithms, which have already proven to be effective on colour imagery. In addition to colour imagery, elevation data is often also available. The primary aim of this work was to improve the performance of the faster-RCNN object detection algorithm by integrating this height information, which showed itself to notably improve performance. Interestingly, the structure of the network played a key role, with direct addition of the height information as a fourth image channel showing no improvement, while integration after the backbone network and before the region proposal network led to marked improvements. This effect persisted with very long training regimes. Increasing the resolution of this height information also showed little effect.

artificial intelligence, machine learning, neural network, (20 more...)

arXiv.org Artificial Intelligence

2111.11388

Country: North America > Canada (0.48)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.46)

Add feedback

Active Learning for Argument Strength Estimation

Kees, Nataliia, Fromm, Michael, Faerman, Evgeniy, Seidl, Thomas

arXiv.org Artificial IntelligenceSep-23-2021

High-quality arguments are an essential part of decision-making. Automatically predicting the quality of an argument is a complex task that recently got much attention in argument mining. However, the annotation effort for this task is exceptionally high. Therefore, we test uncertainty-based active learning (AL) methods on two popular argument-strength data sets to estimate whether sample-efficient learning can be enabled. Our extensive empirical evaluation shows that uncertainty-based acquisition functions can not surpass the accuracy reached with the random acquisition on these data sets.

acquisition, educational method, mentoring method, (29 more...)

arXiv.org Artificial Intelligence

2109.11319

Country:

Europe > Germany (0.46)
North America > United States > New York (0.14)
North America > United States > Texas (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.68)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Argument Mining Driven Analysis of Peer-Reviews

Fromm, Michael, Faerman, Evgeniy, Berrendorf, Max, Bhargava, Siddharth, Qi, Ruoxia, Zhang, Yao, Dennert, Lukas, Selle, Sophia, Mao, Yang, Seidl, Thomas

arXiv.org Artificial IntelligenceDec-10-2020

Peer reviewing is a central process in modern research and essential for ensuring high quality and reliability of published work. At the same time, it is a time-consuming process and increasing interest in emerging fields often results in a high review workload, especially for senior researchers in this area. How to cope with this problem is an open question and it is vividly discussed across all major conferences. In this work, we propose an Argument Mining based approach for the assistance of editors, meta-reviewers, and reviewers. We demonstrate that the decision process in the field of scientific publications is driven by arguments and automatic argument identification is helpful in various use-cases. One of our findings is that arguments used in the peer-review process differ from arguments in other domains making the transfer of pre-trained models difficult. Therefore, we provide the community with a new peer-review dataset from different computer science conferences with annotated arguments. In our extensive empirical evaluation, we show that Argument Mining can be used to efficiently extract the most relevant parts from reviews, which are paramount for the publication decision. The process remains interpretable since the extracted arguments can be highlighted in a review without detaching them from their context.

argument, neural network, survey article, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.5281/zenodo.4314390

2012.07743

Country:

Europe (1.00)
Asia (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Using Supervised Learning to Classify Metadata of Research Data by Discipline of Research

Weber, Tobias, Kranzlmüller, Dieter, Fromm, Michael, de Sousa, Nelson Tavares

arXiv.org Machine LearningOct-16-2019

Automated classification of metadata of research data by their discipline(s) of research can be used in scientometric research, by repository service providers, and in the context of research data aggregation services. Openly available metadata of the DataCite index for research data were used to compile a large training and evaluation set comprised of 609,524 records, which is published alongside this paper. These data allow to reproducibly assess classification approaches, such as tree-based models and neural networks. According to our experiments with 20 base classes (multi-label classification), multi-layer perceptron models perform best with a f1-macro score of 0.760 closely followed by Long Short-Term Memory models (f1-macro score of 0.755). A possible application of the trained classification models is the quantitative analysis of trends towards interdisciplinarity of digital scholarly output or the characterization of growth patterns of research data, stratified by discipline of research. Both applications perform at scale with the proposed models which are available for re-use.

deep learning, neural network, research data, (19 more...)

arXiv.org Machine Learning

1910.09313

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback