AITopics

2502.12895

Country:

North America > United States (0.46)
Europe > Germany (0.29)
North America > Mexico (0.28)

Genre: Research Report (0.70)

Industry: Education (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial Intelligence Feb-18-2025

AlignFreeze: Navigating the Impact of Realignment on the Layers of Multilingual Models Across Diverse Languages

Bakos, Steve, Gaschi, Félix, Guzmán, David, More, Riddhi, Li, Kelly Chutong, Lee, En-Shiun Annie

Realignment techniques are often employed to enhance cross-lingual transfer in multilingual language models. Still, they can sometimes degrade performance in languages that differ significantly from the fine-tuned source language. This paper introduces AlignFreeze, a method that freezes either the lower half or the upper half of the layers during realignment. Through controlled experiments on 4 tasks, 3 models, and 35 languages, we find that realignment affects all the layers but can be the most detrimental to the lower ones. Freezing the lower layers can prevent performance degradation. In particular, AlignFreeze improves Part-of-Speech (PoS) tagging performance in languages where full realignment fails: with XLM-R, it provides improvements of more than one standard deviation in accuracy in seven more languages than full realignment.
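The idea of freezing one half of the layers can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `layers_to_freeze` and `trainable_mask` are hypothetical, and assigning the middle layer to the upper half when the layer count is odd is an assumption. In a real training setup the resulting mask would typically be used to switch off gradient updates for the frozen layers.

```python
def layers_to_freeze(num_layers: int, half: str) -> list:
    """Return the indices of the layers to keep fixed during realignment.

    `half` is "lower" (layers closest to the input) or "upper".
    Assumption: with an odd layer count, the middle layer goes to the upper half.
    """
    if half not in ("lower", "upper"):
        raise ValueError("half must be 'lower' or 'upper'")
    mid = num_layers // 2
    if half == "lower":
        return list(range(mid))
    return list(range(mid, num_layers))


def trainable_mask(num_layers: int, half: str) -> list:
    """Return one boolean per layer: True if the layer is still updated."""
    frozen = set(layers_to_freeze(num_layers, half))
    return [index not in frozen for index in range(num_layers)]


if __name__ == "__main__":
    # A 12-layer encoder is used purely as an illustrative size.
    print(layers_to_freeze(12, "lower"))
    print(trainable_mask(12, "lower"))
```

Freezing the lower half corresponds to the configuration the abstract reports as preventing performance degradation.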

artificial intelligence, machine learning, natural language, (15 more...)

2502.12959

Country:

Europe (1.00)
Asia (0.92)
North America > Canada > Ontario (0.28)
North America > United States > Minnesota (0.27)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.45)

Kaneko, Masahiro, Aji, Alham Fikri, Baldwin, Timothy

Balanced Multi-Factor In-Context Learning for Multilingual Large Language Models

Multilingual large language models (MLLMs) can use in-context learning (ICL) to achieve high performance by leveraging cross-lingual knowledge transfer without parameter updates. However, their effectiveness is highly sensitive to example selection, particularly in multilingual settings. Based on the findings of existing work, three key factors influence multilingual ICL: (1) semantic similarity, (2) linguistic alignment, and (3) language-specific performance. However, existing approaches address these factors independently, without explicitly disentangling their combined impact, leaving optimal example selection underexplored. To address this gap, we propose balanced multi-factor ICL (BMF-ICL), a method that quantifies and optimally balances these factors for improved example selection. Experiments on mCSQA and TYDI across four MLLMs demonstrate that BMF-ICL outperforms existing methods. Further analysis highlights the importance of incorporating all three factors and of selecting examples from multiple languages.

computational linguistic, large language model, natural language, (17 more...)

2502.11495

Country:

Asia (0.93)
Europe (0.68)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)

Gomes, Gonçalo, Zerva, Chrysoula, Martins, Bruno

Evaluation of Multilingual Image Captioning: How far can we get with CLIP models?

The evaluation of image captions, looking at both linguistic fluency and semantic correspondence to visual contents, has received significant research effort. Still, despite advancements such as the CLIPScore metric, multilingual captioning evaluation has remained relatively unexplored. This work presents several strategies, and extensive experiments, related to evaluating CLIPScore variants in multilingual settings. To address the lack of multilingual test data, we consider two different strategies: (1) using quality-aware machine-translated datasets with human judgements, and (2) re-purposing multilingual datasets that target semantic inference and reasoning. Our results highlight the potential of finetuned multilingual models to generalize across languages and to handle complex linguistic challenges. Tests with machine-translated data show that multilingual CLIPScore models can maintain a high correlation with human judgements across different languages, and additional tests with natively multilingual and multicultural data further attest to the high-quality assessments.

large language model, machine learning, natural language, (20 more...)

2502.066

Country: Europe (1.00)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Mohammadi, Fatemeh, Tamborini, Marta Annamaria, Ceravolo, Paolo, Nardocci, Costanza, Maghool, Samira

Identifying Gender Stereotypes and Biases in Automated Translation from English to Italian using Similarity Networks

This paper is a collaborative effort between Linguistics, Law, and Computer Science to evaluate stereotypes and biases in automated translation systems. We advocate gender-neutral translation as a means to promote gender inclusion and improve the objectivity of machine translation. Our approach focuses on identifying gender bias in English-to-Italian translations. First, we define gender bias following human rights law and linguistics literature. Then we proceed by identifying gender-specific terms such as she/lei and he/lui as key elements. We then evaluate the cosine similarity between these target terms and others in the dataset to reveal the model's perception of semantic relations. Using numerical features, we effectively evaluate the intensity and direction of the bias. Our findings provide tangible insights for developing and training gender-neutral translation algorithms.
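The cosine-similarity step described above can be sketched as follows. This is only an illustration under stated assumptions: the two-dimensional vectors are toy values, not embeddings from any translation model, and the `gender_lean` difference score is a hypothetical way to express direction and intensity, not necessarily the measure used in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def gender_lean(word_vec, she_vec, he_vec):
    """Hypothetical score: positive if the word sits closer to 'she/lei',
    negative if closer to 'he/lui'; the magnitude reflects intensity."""
    return cosine(word_vec, she_vec) - cosine(word_vec, he_vec)


if __name__ == "__main__":
    # Toy vectors chosen for illustration only.
    she, he = [1.0, 0.0], [0.0, 1.0]
    occupation = [0.8, 0.6]
    print(gender_lean(occupation, she, he))
```

A score near zero would suggest no lean toward either target term under this toy measure.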

artificial intelligence, natural language, translation, (15 more...)

2502.11611

Country:

Europe (0.69)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Government (1.00)
Law > International Law (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis

Wu, Chengyan, Ma, Bolei, Liu, Yihong, Zhang, Zheyu, Deng, Ningyuan, Li, Yanshu, Chen, Baolan, Zhang, Yi, Plank, Barbara, Xue, Yun

Aspect-based sentiment analysis (ABSA) is a crucial task in information extraction and sentiment analysis, aiming to identify aspects with associated sentiment elements in text. However, existing ABSA datasets are predominantly English-centric, limiting the scope for multilingual evaluation and research. To bridge this gap, we present M-ABSA, a comprehensive dataset spanning 7 domains and 21 languages, making it the most extensive multilingual parallel dataset for ABSA to date. Our primary focus is on triplet extraction, which involves identifying aspect terms, aspect categories, and sentiment polarities. The dataset is constructed through an automatic translation process with human review to ensure quality. We perform extensive experiments using various baselines to assess performance and compatibility on M-ABSA. Our empirical findings highlight that the dataset enables diverse evaluation tasks, such as multilingual and multi-domain transfer learning, and large language model evaluation, underscoring its inclusivity and its potential to drive advancements in multilingual ABSA research.

large language model, machine learning, natural language, (17 more...)

2502.11824

Country:

Europe (1.00)
Asia > Middle East > UAE (0.28)
North America > United States > Minnesota (0.27)

Genre:

Research Report (0.81)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Education (0.50)
Consumer Products & Services (0.46)
Energy (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
(2 more...)

Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu

Pei, Renhao, Liu, Yihong, Lin, Peiqin, Yvon, François, Schütze, Hinrich

In-context machine translation (MT) with large language models (LLMs) is a promising approach for low-resource MT, as it can readily take advantage of linguistic resources such as grammar books and dictionaries. Such resources are usually selectively integrated into the prompt so that LLMs can directly perform translation without any specific training, via their in-context learning (ICL) capability. However, the relative importance of each type of resource, e.g., dictionary, grammar book, and retrieved parallel examples, is not entirely clear. To address this gap, this study systematically investigates how each resource and its quality affect translation performance, with the Manchu language as our case study. To remove any prior knowledge of Manchu encoded in the LLM parameters and single out the effect of ICL, we also experiment with an encrypted version of Manchu texts. Our results indicate that high-quality dictionaries and good parallel examples are very helpful, while grammars hardly help. In a follow-up study, we showcase a promising application of in-context MT: parallel data augmentation as a way to bootstrap the conventional MT model. When monolingual data abound, generating synthetic parallel data through in-context MT offers a pathway to mitigate data scarcity and build effective and efficient low-resource neural MT systems.

large language model, machine learning, natural language, (19 more...)

2502.11862

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Moslem, Yasmin, Morán, Juan Julián Cea, Gonzalez-Gomez, Mariano, Farouq, Muhammad Hazim Al, Abdou, Farah, Deb, Satarupa

SpeechT: Findings of the First Mentorship in Speech Translation

This work presents the details and findings of the first mentorship in speech translation (SpeechT), which took place in December 2024 and January 2025. To fulfil the requirements of the mentorship, the participants engaged in key activities, including data preparation, modelling, and advanced research.

artificial intelligence, natural language, translation, (14 more...)

2502.1205

Country:

Europe (0.93)
North America > United States (0.29)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Adak, Sayantan, Mukherjee, Animesh

RA-MTR: A Retrieval Augmented Multi-Task Reader based Approach for Inspirational Quote Extraction from Long Documents

Inspirational quotes from famous individuals are often used to convey thoughts in news articles, essays, and everyday conversations. In this paper, we propose a novel context-based quote extraction system that aims to extract the most relevant quote from a long text. We formulate this quote extraction as an open domain question answering problem first by employing a vector-store based retriever and then applying a multi-task reader. We curate three context-based quote extraction datasets and introduce a novel multi-task framework RA-MTR that improves the state-of-the-art performance, achieving a maximum improvement of 5.08% in BoW F1-score.

information retrieval, large language model, machine learning, (23 more...)

2502.12124

Country:

Europe (0.67)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Shahin, Nada, Ismail, Leila

GLoT: A Novel Gated-Logarithmic Transformer for Efficient Sign Language Translation

Emirates Center for Mobility Research, UAE University, Al-Ain, United Arab Emirates. Correspondence: Leila@uaeu.ac.ae

Abstract - Machine Translation has played a critical role in [...]

The number of Deaf and Hard of Hearing (DHH) population is expected to double to 860 million by 2050 [1]. In addition to the existence of more than hundreds of sign languages [2], and an acute shortage of sign language interpreters [3], there is a pressing need for automated and precise Sign Language Machine Translation (SLMT) systems. Having an inclusive communication could be lifesaving in a tragic event such as a medical emergency.

[...] Language Machine Translation (SLMT) has been less explored. In this paper, we aim to address this void by proposing the Gated Logarithmic Transformer (GLoT), which introduces a gating mechanism that selectively filters out irrelevant information, ensuring that only the most critical temporal dependencies are retained [16]. By incorporating logarithmic transformations, GLoT is designed to better capture long-range temporal patterns, improving the [...]

machine learning, natural language, translation, (19 more...)

2502.12223

Country: Asia > Middle East > UAE (0.55)

Genre: Research Report (0.82)

Industry:

Health & Medicine (1.00)
Education > Curriculum > Subject-Specific Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)