AITopics | Birch, Alexandra

Plotting

Birch, Alexandra

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

XL-Instruct: Synthetic Data for Cross-Lingual Open-Ended Generation

Iyer, Vivek, Rei, Ricardo, Chen, Pinzhen, Birch, Alexandra

arXiv.org Artificial IntelligenceMar-29-2025

Cross-lingual open-ended generation -- i.e. generating responses in a desired language different from that of the user's query -- is an important yet understudied problem. We introduce XL-AlpacaEval, a new benchmark for evaluating cross-lingual generation capabilities in Large Language Models (LLMs), and propose XL-Instruct, a high-quality synthetic data generation method. Fine-tuning with just 8K XL-Instruct-generated instructions significantly improves model performance, increasing the win rate against GPT-4o-Mini from 7.4% to 21.5%, and improving on several fine-grained quality metrics. Additionally, models fine-tuned on XL-Instruct exhibit strong zero-shot transfer to both English-only and multilingual generation tasks. Given its consistent gains across the board, we strongly recommend incorporating XL-Instruct in the post-training pipeline of future multilingual LLMs. To facilitate further research, we will publicly and freely release the XL-Instruct and XL-AlpacaEval datasets, which constitute two of the few cross-lingual resources currently available in the literature.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2503.22973

Country:

Europe (0.93)
North America > Mexico (0.28)
Asia > Middle East (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

Add feedback

Generalizing From Short to Long: Effective Data Synthesis for Long-Context Instruction Tuning

Zhu, Wenhao, Chen, Pinzhen, Hu, Hanxu, Huang, Shujian, Yuan, Fei, Chen, Jiajun, Birch, Alexandra

arXiv.org Artificial IntelligenceFeb-21-2025

Long-context modelling for large language models (LLMs) has been a key area of recent research because many real world use cases require reasoning over longer inputs such as documents. The focus of research into modelling long context has been on how to model position and there has been little investigation into other important aspects of language modelling such as instruction tuning. Long context training examples are challenging and expensive to create and use. In this paper, we investigate how to design instruction data for the post-training phase of a long context pre-trained model: how much and what type of context is needed for optimal and efficient post-training. Our controlled study reveals that models instruction-tuned on short contexts can effectively generalize to longer ones, while also identifying other critical factors such as instruction difficulty and context composition. Based on these findings, we propose context synthesis, a novel data synthesis framework that leverages off-the-shelf LLMs to generate extended background contexts for high-quality instruction-answer pairs. Experiment results on the document-level benchmark (LongBench) demonstrate that our proposed approach outperforms previous instruction synthesis approaches and comes close to the performance of human-annotated long-context instruction data. The project will be available at: https://github.com/NJUNLP/context-synthesis.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.15592

Country:

North America > United States (0.46)
Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Health & Medicine > Consumer Health (1.00)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Demystifying Multilingual Chain-of-Thought in Process Reward Modeling

Wang, Weixuan, Wu, Minghao, Haddow, Barry, Birch, Alexandra

arXiv.org Artificial IntelligenceFeb-18-2025

Large language models (LLMs) are designed to perform a wide range of tasks. To improve their ability to solve complex problems requiring multi-step reasoning, recent research leverages process reward modeling to provide fine-grained feedback at each step of the reasoning process for reinforcement learning (RL), but it predominantly focuses on English. In this paper, we tackle the critical challenge of extending process reward models (PRMs) to multilingual settings. To achieve this, we train multilingual PRMs on a dataset spanning seven languages, which is translated from English. Through comprehensive evaluations on two widely used reasoning benchmarks across 11 languages, we demonstrate that multilingual PRMs not only improve average accuracy but also reduce early-stage reasoning errors. Furthermore, our results highlight the sensitivity of multilingual PRMs to both the number of training languages and the volume of English data, while also uncovering the benefits arising from more candidate responses and trainable parameters. This work opens promising avenues for robust multilingual applications in complex, multi-step reasoning tasks. In addition, we release the code to foster research along this line.

demystifying multilingual chain-of-thought, large language model, natural language, (2 more...)

arXiv.org Artificial Intelligence

2502.12663

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.53)

Add feedback

The Only Way is Ethics: A Guide to Ethical Research with Large Language Models

Ungless, Eddie L., Vitsakis, Nikolas, Talat, Zeerak, Garforth, James, Ross, Björn, Onken, Arno, Kasirzadeh, Atoosa, Birch, Alexandra

arXiv.org Artificial IntelligenceDec-20-2024

There is a significant body of work looking at the ethical considerations of large language models (LLMs): critiquing tools to measure performance and harms; proposing toolkits to aid in ideation; discussing the risks to workers; considering legislation around privacy and security etc. As yet there is no work that integrates these resources into a single practical guide that focuses on LLMs; we attempt this ambitious goal. We introduce 'LLM Ethics Whitepaper', which we provide as an open and living resource for NLP practitioners, and those tasked with evaluating the ethical implications of others' work. Our goal is to translate ethics literature into concrete recommendations and provocations for thinking with clear first steps, aimed at computer scientists. 'LLM Ethics Whitepaper' distils a thorough literature review into clear Do's and Don'ts, which we present also in this paper. We likewise identify useful toolkits to support ethical work. We refer the interested reader to the full LLM Ethics Whitepaper, which provides a succinct discussion of ethical considerations at each stage in a project lifecycle, as well as citations for the hundreds of papers from which we drew our recommendations. The present paper can be thought of as a pocket guide to conducting ethical research with LLMs.

computational linguistic, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2412.16022

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Generics are puzzling. Can language models find the missing piece?

Calderón, Gustavo Cilleruelo, Allaway, Emily, Haddow, Barry, Birch, Alexandra

arXiv.org Artificial IntelligenceDec-15-2024

Generic sentences express generalisations about the world without explicit quantification. Although generics are central to everyday communication, building a precise semantic framework has proven difficult, in part because speakers use generics to generalise properties with widely different statistical prevalence. In this work, we study the implicit quantification and context-sensitivity of generics by leveraging language models as models of language. We create ConGen, a dataset of 2873 naturally occurring generic and quantified sentences in context, and define p-acceptability, a metric based on surprisal that is sensitive to quantification. Our experiments show generics are more context-sensitive than determiner quantifiers and about 20% of naturally occurring generics we analyze express weak generalisations. We also explore how human biases in stereotypes can be observed in language models.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.11318

Country:

North America > United States (0.93)
Europe (0.67)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models

Ungless, Eddie L., Vitsakis, Nikolas, Talat, Zeerak, Garforth, James, Ross, Björn, Onken, Arno, Kasirzadeh, Atoosa, Birch, Alexandra

arXiv.org Artificial IntelligenceOct-17-2024

This whitepaper offers an overview of the ethical considerations surrounding research into or with large language models (LLMs). As LLMs become more integrated into widely used applications, their societal impact increases, bringing important ethical questions to the forefront. With a growing body of work examining the ethical development, deployment, and use of LLMs, this whitepaper provides a comprehensive and practical guide to best practices, designed to help those in research and in industry to uphold the highest ethical standards in their work.

computational linguistic, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.19812

Country:

Europe (1.00)
North America > Canada (0.68)
Asia > Middle East > UAE (0.28)
(2 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.45)

Industry:

Media > News (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)

Add feedback

Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention

Wang, Weixuan, Wu, Minghao, Haddow, Barry, Birch, Alexandra

arXiv.org Artificial IntelligenceOct-16-2024

Large Language Models (LLMs) have shown remarkable capabilities in natural language processing but exhibit significant performance gaps among different languages. Most existing approaches to address these disparities rely on pretraining or fine-tuning, which are resource-intensive. To overcome these limitations without incurring significant costs, we propose Inference-Time Cross-Lingual Intervention (INCLINE), a novel framework that enhances LLM performance on low-performing (source) languages by aligning their internal representations with those of high-performing (target) languages during inference. INCLINE initially learns alignment matrices using parallel sentences from source and target languages through a Least-Squares optimization, and then applies these matrices during inference to transform the low-performing language representations toward the high-performing language space. Extensive experiments on nine benchmarks with five LLMs demonstrate that INCLINE significantly improves performance across diverse tasks and languages, compared to recent strong baselines. Our analysis demonstrates that INCLINE is highly cost-effective and applicable to a wide range of applications. In addition, we release the code to foster research along this line: https://github.com/weixuan-wang123/INCLINE.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.12462

Country:

Europe (0.67)
North America > Mexico (0.28)
North America > United States (0.28)
North America > Canada (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

EuroLLM: Multilingual Language Models for Europe

Martins, Pedro Henrique, Fernandes, Patrick, Alves, João, Guerreiro, Nuno M., Rei, Ricardo, Alves, Duarte M., Pombal, José, Farajian, Amin, Faysse, Manuel, Klimaszewski, Mateusz, Colombo, Pierre, Haddow, Barry, de Souza, José G. C., Birch, Alexandra, Martins, André F. T.

arXiv.org Artificial IntelligenceSep-24-2024

The quality of open-weight LLMs has seen significant improvement, yet they remain predominantly focused on English. In this paper, we introduce the EuroLLM project, aimed at developing a suite of open-weight multilingual LLMs capable of understanding and generating text in all official European Union languages, as well as several additional relevant languages. We outline the progress made to date, detailing our data collection and filtering process, the development of scaling laws, the creation of our multilingual tokenizer, and the data mix and modeling configurations. Additionally, we release our initial models: EuroLLM-1.7B and EuroLLM-1.7B-Instruct and report their performance on multilingual general benchmarks and machine translation.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2409.16235

Country:

Europe > Portugal (0.14)
Asia > Thailand (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report (0.83)

Industry: Government > Regional Government > Europe Government (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights

Zhu, Wenhao, Huang, Shujian, Yuan, Fei, Chen, Cheng, Chen, Jiajun, Birch, Alexandra

arXiv.org Artificial IntelligenceJun-29-2024

Bridging the significant gap between large language model's English and non-English performance presents a great challenge. While some previous studies attempt to mitigate this gap with translated training data, the recently proposed question alignment approach leverages the model's English expertise to improve multilingual performance with minimum usage of expensive, error-prone translation. In this paper, we explore how broadly this method can be applied by examining its effects in reasoning with executable code and reasoning with common sense. We also explore how to apply this approach efficiently to extremely large language models using proxy-tuning. Experiment results on multilingual reasoning benchmarks mGSM, mSVAMP and xCSQA demonstrate that the question alignment approach can be used to boost multilingual performance across diverse reasoning scenarios, model families, and sizes. For instance, when applied to the LLaMA2 models, our method brings an average accuracy improvements of 12.2% on mGSM even with the 70B model. To understand the mechanism of its success, we analyze representation space, chain-of-thought and translation data scales, which reveals how question translation training strengthens language alignment within LLMs and shapes their working patterns.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2405.01345

Country:

Asia > China (0.14)
North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Compact Speech Translation Models via Discrete Speech Units Pretraining

Lam, Tsz Kin, Birch, Alexandra, Haddow, Barry

arXiv.org Artificial IntelligenceJun-26-2024

We propose a pretraining method to use Self-Supervised Speech (SSS) model to creating more compact Speech-to-text Translation. In contrast to using the SSS model for initialization, our method is more suitable to memory constrained scenario such as on-device deployment. Our method is based on Discrete Speech Units (DSU) extracted from the SSS model. In the first step, our method pretrains two smaller encoder-decoder models on 1) Filterbank-to-DSU (Fbk-to-DSU) and 2) DSU-to-Translation (DSU-to-Trl) data respectively. The DSU thus become the distillation inputs of the smaller models. Subsequently, the encoder from the Fbk-to-DSU model and the decoder from the DSU-to-Trl model are taken to initialise the compact model. Finally, the compact model is finetuned on the paired Fbk-Trl data. In addition to being compact, our method requires no transcripts, making it applicable to low-resource settings. It also avoids speech discretization in inference and is more robust to the DSU tokenization. Evaluation on CoVoST-2 (X-En) shows that our method has consistent improvement over the baseline in three metrics while being compact i.e., only half the SSS model size.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2402.19333

Country:

Europe (1.00)
Asia (0.93)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback