Schottmann, Florian
A comparison of translation performance between DeepL and Supertext
Flückiger, Alex, Amrhein, Chantal, Graf, Tim, Odermatt, Frédéric, Pömsl, Martin, Schläpfer, Philippe, Schottmann, Florian, Läubli, Samuel
As strong machine translation (MT) systems are increasingly based on large language models (LLMs), reliable quality benchmarking requires methods that capture their ability to leverage extended context. This study compares two commercial MT systems -- DeepL and Supertext -- by assessing their performance on unsegmented texts. We evaluate translation quality across four language directions, with professional translators assessing segments in full document-level context. While segment-level assessments indicate no strong preference between the systems in most cases, document-level analysis reveals a preference for Supertext in three out of four language directions, suggesting superior consistency across longer texts. We advocate for more context-sensitive evaluation methodologies to ensure that MT quality assessments reflect real-world usability. We release all evaluation data and scripts for further analysis and reproduction at https://github.com/supertext/evaluation_deepl_supertext.
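To make the contrast between segment-level and document-level assessment concrete, here is a minimal sketch that aggregates hypothetical per-segment translator preferences into per-document majority votes. The record format and the majority-vote rule are illustrative assumptions, not the study's exact protocol; the released scripts in the repository above implement the actual analysis.

```python
from collections import Counter

# Hypothetical per-segment preference judgments: one record per (document, segment),
# where "preference" is "A", "B", or "tie" as assigned by a professional translator.
judgments = [
    {"doc_id": "doc1", "segment_id": 0, "preference": "A"},
    {"doc_id": "doc1", "segment_id": 1, "preference": "tie"},
    {"doc_id": "doc1", "segment_id": 2, "preference": "B"},
    {"doc_id": "doc2", "segment_id": 0, "preference": "B"},
    {"doc_id": "doc2", "segment_id": 1, "preference": "B"},
]

# Segment-level view: overall distribution of preferences across all segments.
segment_counts = Counter(j["preference"] for j in judgments)
print("Segment-level:", segment_counts)

# Document-level view: majority vote over the segments of each document.
votes_per_doc = {}
for j in judgments:
    votes_per_doc.setdefault(j["doc_id"], Counter())[j["preference"]] += 1

doc_counts = Counter()
for votes in votes_per_doc.values():
    a, b = votes["A"], votes["B"]
    doc_counts["A" if a > b else "B" if b > a else "tie"] += 1
print("Document-level:", doc_counts)
```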
Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?
Kew, Tannon, Schottmann, Florian, Sennrich, Rico
The vast majority of today's large language models are English-centric, having been pretrained predominantly on English text. Yet, in order to meet user expectations, models need to be able to respond appropriately in multiple languages once deployed in downstream applications. Given limited exposure to other languages during pretraining, crosslingual transfer is important for achieving decent performance in non-English settings. In this work, we investigate just how much multilinguality is required during finetuning to elicit strong cross-lingual generalisation across a range of tasks and target languages. We find that, compared to English-only finetuning, multilingual instruction tuning with as few as three […]
[Figure 1: Input/output (IO) language agreement for English (en), German (de), Bulgarian (bg) and Icelandic (is) when instruction tuning on monolingual English (Mono) or on multilingual data (Multi-Guanaco).]
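As an illustration of the input/output (IO) language agreement measure referenced in Figure 1, the sketch below counts how often a model's response is detected to be in the same language as its prompt. It assumes a generic language identifier (here the langid package) and toy data; it is not the paper's evaluation code.

```python
import langid  # off-the-shelf language identifier; a stand-in, not necessarily the paper's tooling

def io_language_agreement(prompts, responses):
    """Share of (prompt, response) pairs whose detected languages match."""
    matches = 0
    for prompt, response in zip(prompts, responses):
        prompt_lang, _ = langid.classify(prompt)
        response_lang, _ = langid.classify(response)
        matches += prompt_lang == response_lang
    return matches / len(prompts) if prompts else 0.0

# Toy usage: a German prompt answered in German counts as agreement,
# an Icelandic prompt answered in English would not.
prompts = ["Wie spät ist es?", "What is the capital of Iceland?"]
responses = ["Es ist drei Uhr.", "The capital of Iceland is Reykjavik."]
print(io_language_agreement(prompts, responses))
```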
A Benchmark for Evaluating Machine Translation Metrics on Dialects Without Standard Orthography
Aepli, Noëmi, Amrhein, Chantal, Schottmann, Florian, Sennrich, Rico
For sensible progress in natural language processing, it is important that we are aware of the limitations of the evaluation metrics we use. In this work, we evaluate how robust metrics are to non-standardized dialects, i.e. spelling differences in language varieties that do not have a standard orthography. To investigate this, we collect a dataset of human translations and human judgments for automatic machine translations from English to two Swiss German dialects. We further create a challenge set for dialect variation and benchmark existing metrics' performance. Our results show that existing metrics cannot reliably evaluate Swiss German text generation outputs, especially at the segment level. We propose initial design adaptations that increase robustness in the face of non-standardized dialects, although there remains much room for further improvement. The dataset, code, and models are available here: https://github.com/textshuttle/dialect_eval
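For a concrete picture of segment-level meta-evaluation, the sketch below correlates metric scores with human judgments using Kendall's tau. The score values are invented for illustration; the linked repository contains the actual data and evaluation scripts.

```python
from scipy.stats import kendalltau

# Invented segment-level scores for five translations (illustrative only).
human_scores = [0.9, 0.4, 0.7, 0.2, 0.8]        # human quality judgments per segment
metric_scores = [0.85, 0.55, 0.60, 0.30, 0.65]  # automatic metric scores per segment

# A robust metric should rank segments similarly to humans, i.e. high tau.
tau, p_value = kendalltau(human_scores, metric_scores)
print(f"Kendall's tau = {tau:.2f} (p = {p_value:.3f})")
```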
Exploiting Biased Models to De-bias Text: A Gender-Fair Rewriting Model
Amrhein, Chantal, Schottmann, Florian, Sennrich, Rico, Läubli, Samuel
Natural language generation models reproduce and often amplify the biases present in their training data. Previous research explored using sequence-to-sequence rewriting models to transform biased model outputs (or original texts) into more gender-fair language by creating pseudo training data through linguistic rules. However, this approach is not practical for languages with more complex morphology than English. We hypothesise that creating training data in the reverse direction, i.e. starting from gender-fair text, is easier for morphologically complex languages and show that it matches the performance of state-of-the-art rewriting models for English. To eliminate the rule-based nature of data creation, we instead propose using machine translation models to create gender-biased text from real gender-fair text via round-trip translation. Our approach allows us to train a rewriting model for German without the need for elaborate handcrafted rules. The outputs of this model increase gender-fairness, as shown in a human evaluation study.
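The sketch below illustrates the round-trip translation idea with off-the-shelf translation models: gender-fair German text is translated to English and back, and the (often gender-biased) result is paired with the original as pseudo training data for the rewriting model. The specific models and the example sentence are assumptions for illustration, not the paper's setup.

```python
from transformers import pipeline

# Off-the-shelf German<->English translation models as stand-ins.
de_to_en = pipeline("translation", model="Helsinki-NLP/opus-mt-de-en")
en_to_de = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")

def round_trip(gender_fair_sentence: str) -> str:
    """German -> English -> German, yielding a pseudo-source for the rewriting corpus."""
    english = de_to_en(gender_fair_sentence)[0]["translation_text"]
    return en_to_de(english)[0]["translation_text"]

gender_fair = "Die Mitarbeitenden treffen sich morgen."  # hypothetical gender-fair input
biased_source = round_trip(gender_fair)

# Training pair for the rewriting model: round-trip output -> original gender-fair text.
print(biased_source, "->", gender_fair)
```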
State-of-the-art generalisation research in NLP: A taxonomy and review
Hupkes, Dieuwke, Giulianelli, Mario, Dankers, Verna, Artetxe, Mikel, Elazar, Yanai, Pimentel, Tiago, Christodoulopoulos, Christos, Lasri, Karim, Saphra, Naomi, Sinclair, Arabella, Ulmer, Dennis, Schottmann, Florian, Batsuren, Khuyagbaatar, Sun, Kaiser, Sinha, Koustuv, Khalatbari, Leila, Ryskina, Maria, Frieske, Rita, Cotterell, Ryan, Jin, Zhijing
The ability to generalise well is one of the primary desiderata of natural language processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is not well understood, nor are there any evaluation standards for generalisation. In this paper, we lay the groundwork to address both of these issues. We present a taxonomy for characterising and understanding generalisation research in NLP. Our taxonomy is based on an extensive literature review of generalisation research, and contains five axes along which studies can differ: their main motivation, the type of generalisation they investigate, the type of data shift they consider, the source of this data shift, and the locus of the shift within the modelling pipeline. We use our taxonomy to classify over 400 papers that test generalisation, for a total of more than 600 individual experiments. Considering the results of this review, we present an in-depth analysis that maps out the current state of generalisation research in NLP, and we make recommendations for which areas might deserve attention in the future. Along with this paper, we release a webpage where the results of our review can be dynamically explored, and which we intend to update as new NLP generalisation studies are published. With this work, we aim to take steps towards making state-of-the-art generalisation testing the new status quo in NLP.
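As a rough illustration of how a single reviewed experiment might be recorded along the five taxonomy axes named above, here is a minimal sketch of such a record. The example axis values are illustrative assumptions; the paper's webpage holds the actual annotations.

```python
from dataclasses import dataclass

@dataclass
class GeneralisationExperiment:
    """One reviewed experiment, characterised along the taxonomy's five axes."""
    motivation: str            # why generalisation is tested
    generalisation_type: str   # what kind of generalisation is investigated
    shift_type: str            # what kind of data shift is considered
    shift_source: str          # where the data shift comes from
    shift_locus: str           # where in the modelling pipeline the shift occurs

# Illustrative (assumed) axis values for one hypothetical study.
example = GeneralisationExperiment(
    motivation="practical",
    generalisation_type="cross-lingual",
    shift_type="covariate shift",
    shift_source="naturally occurring",
    shift_locus="train-test",
)
print(example)
```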