AITopics

2503.07137

Country:

North America > United States > Texas > Harris County > Houston (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.92)
Research Report > New Finding (0.67)

Industry:

Information Technology (1.00)
Education (1.00)
Health & Medicine (0.92)
Leisure & Entertainment > Sports (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Artificial Intelligence, Mar-10-2025

Revisiting Noise in Natural Language Processing for Computational Social Science

Borenstein, Nadav

Computational Social Science (CSS) is an emerging field driven by the unprecedented availability of human-generated content for researchers. This field, however, presents a unique set of challenges due to the nature of the theories and datasets it explores, including highly subjective tasks and complex, unstructured textual corpora. Among these challenges, one of the less well-studied topics is the pervasive presence of noise. This thesis aims to address this gap in the literature by presenting a series of interconnected case studies that examine different manifestations of noise in CSS. These include character-level errors following the OCR processing of historical records, archaic language, inconsistencies in annotations for subjective and ambiguous tasks, and even noise and biases introduced by large language models during content generation. This thesis challenges the conventional notion that noise in CSS is inherently harmful or useless. Rather, it argues that certain forms of noise can encode meaningful information that is invaluable for advancing CSS research, such as the unique communication styles of individuals or the culture-dependent nature of datasets and tasks. Further, this thesis highlights the importance of nuance in dealing with noise and the considerations CSS researchers must address when encountering it, demonstrating that different types of noise require distinct strategies.

camembert-ft-sq-fr, convenient qualitative analysis and visualisation, hedonism pleasure and sensuous gratification

2503.07395

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Poland (0.14)
Europe > Finland (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Media > News (1.00)
Leisure & Entertainment (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)

Shahnazaryan, Lia, Simianer, Patrick, Wuebker, Joern

Contextual Cues in Machine Translation: Investigating the Potential of Multi-Source Input Strategies in LLMs and NMT Systems

arXiv.org Artificial Intelligence, Mar-10-2025

We explore the impact of multi-source input strategies on machine translation (MT) quality, comparing GPT-4o, a large language model (LLM), with a traditional multilingual neural machine translation (NMT) system. Using intermediate language translations as contextual cues, we evaluate their effectiveness in enhancing English and Chinese translations into Portuguese. Results suggest that contextual information significantly improves translation quality for domain-specific datasets and potentially for linguistically distant language pairs, with diminishing returns observed in benchmarks with high linguistic variability. Additionally, we demonstrate that shallow fusion, a multi-source approach we apply within the NMT system, shows improved results when using high-resource languages as context for other translation pairs, highlighting the importance of strategic context language selection.

context language, dataset, translation

2503.07195

Country:

Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > Virginia (0.04)
North America > United States > Pennsylvania (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.32)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

arXiv.org Artificial Intelligence, Mar-9-2025

Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation

Luo, Yingfeng, Zheng, Tong, Mu, Yongyu, Li, Bei, Zhang, Qinghong, Gao, Yongqi, Xu, Ziqiang, Feng, Peinan, Liu, Xiaoqian, Xiao, Tong, Zhu, Jingbo

The field of neural machine translation (NMT) has changed with the advent of large language models (LLMs). Much of the recent emphasis in natural language processing (NLP) has been on modeling machine translation and many other problems using a single pre-trained Transformer decoder, while encoder-decoder architectures, which were the standard in earlier NMT models, have received relatively less attention. In this paper, we explore translation models that are universal, efficient, and easy to optimize, by marrying the world of LLMs with the world of NMT. We apply LLMs to NMT encoding and leave the NMT decoder unchanged. We also develop methods for adapting LLMs to work better with the NMT decoder. Furthermore, we construct a new dataset involving multiple tasks to assess how well the machine translation system generalizes across various tasks. Evaluations on WMT and our datasets show that our method matches or surpasses a range of baselines in translation quality while achieving $2.4 \sim 6.5 \times$ inference speedups and a $75\%$ reduction in the memory footprint of the KV cache. The method also demonstrates strong generalization across a variety of translation-related tasks.

computational linguistic, representation, translation

2503.06594

Country:

North America > United States (0.14)
Asia > Singapore (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

arXiv.org Artificial Intelligence, Mar-9-2025

Sign Language Translation using Frame and Event Stream: Benchmark Dataset and Algorithms

Wang, Xiao, Li, Yuehang, Wang, Fuling, Jiang, Bo, Wang, Yaowei, Tian, Yonghong, Tang, Jin, Luo, Bin

Accurate sign language understanding serves as a crucial communication channel for individuals with disabilities. Current sign language translation algorithms predominantly rely on RGB frames, which may be limited by fixed frame rates, variable lighting conditions, and motion blur caused by rapid hand movements. Inspired by the recent successful application of event cameras in other fields, we propose to leverage event streams to assist RGB cameras in capturing gesture data, addressing the various challenges mentioned above. Specifically, we first collect a large-scale RGB-Event sign language translation dataset using the DVS346 camera, termed VECSL, which contains 15,676 RGB-Event samples, 15,191 glosses, and covers 2,568 Chinese characters. These samples were gathered across a diverse range of indoor and outdoor environments, capturing multiple viewing angles, varying light intensities, and different camera motions. Due to the absence of benchmark algorithms for comparison in this new task, we retrained and evaluated multiple state-of-the-art SLT algorithms, and believe that this benchmark can effectively support subsequent related research. Additionally, we propose a novel RGB-Event sign language translation framework (i.e., M$^2$-SLT) that incorporates fine-grained micro-sign and coarse-grained macro-sign retrieval, achieving state-of-the-art results on the proposed dataset. Both the source code and dataset will be released on https://github.com/Event-AHU/OpenESL.

dataset, proceedings, sign language translation

2503.06484

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)
Asia > China > Beijing > Beijing (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report (0.64)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Media (0.89)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Sultan, Afnan, Rausch-Dupont, Max, Khan, Shahrukh, Kalinina, Olga, Volkamer, Andrea, Klakow, Dietrich

Transformers for molecular property prediction: Domain adaptation efficiently improves performance

arXiv.org Artificial Intelligence, Mar-7-2025

Most of the current transformer-based chemical language models are pre-trained on millions to billions of molecules. However, the improvement from such scaling in dataset size is not confidently linked to improved molecular property prediction. The aim of this study is to investigate and overcome some of the limitations of transformer models in predicting molecular properties. Specifically, we examine the impact of pre-training dataset size and diversity on the performance of transformer models and investigate the use of domain adaptation as a technique for improving model performance. First, our findings indicate that increasing pre-training dataset size beyond 400K molecules from the GuacaMol dataset does not result in a significant improvement on four ADME endpoints, namely, solubility, permeability, microsomal stability, and plasma protein binding. Second, our results demonstrate that using domain adaptation by further training the transformer model on a small set of domain-relevant molecules, i.e., a few hundred to a few thousand, using multi-task regression of physicochemical properties was sufficient to significantly improve performance for three out of the four investigated ADME endpoints (P-value < 0.001). Finally, we observe that a model pre-trained on 400K molecules and domain adapted on a few hundred/thousand molecules performs similarly (P-value > 0.05) to more complicated transformer models like MolBERT (pre-trained on 1.3M molecules) and MolFormer (pre-trained on 100M molecules). A comparison to a random forest model trained on basic physicochemical properties showed similar performance to the examined transformer models. We believe that current transformer models can be improved through further systematic analysis of pre-training and downstream data, pre-training objectives, and scaling laws, ultimately leading to better and more helpful models.

dataset size, molecule, objective

2503.0336

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Saarland > Saarbrücken (0.04)
North America > United States > Virginia (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Large-Scale AI in Telecom: Charting the Roadmap for Innovation, Scalability, and Enhanced Digital Experiences

Shahid, Adnan, Kliks, Adrian, Al-Tahmeesschi, Ahmed, Elbakary, Ahmed, Nikou, Alexandros, Maatouk, Ali, Mokh, Ali, Kazemi, Amirreza, De Domenico, Antonio, Karapantelakis, Athanasios, Cheng, Bo, Yang, Bo, Wang, Bohao, Fischione, Carlo, Zhang, Chao, Issaid, Chaouki Ben, Yuen, Chau, Peng, Chenghui, Huang, Chongwen, Chaccour, Christina, Thomas, Christo Kurisummoottil, Sharma, Dheeraj, Kalogiros, Dimitris, Niyato, Dusit, De Poorter, Eli, Mhanna, Elissa, Strinati, Emilio Calvanese, Bader, Faouzi, Abdeldayem, Fathi, Wang, Fei, Zhu, Fenghao, Fontanesi, Gianluca, Geraci, Giovanni, Zhou, Haibo, Purmehdi, Hakimeh, Ahmadi, Hamed, Zou, Hang, Du, Hongyang, Lee, Hoon, Yang, Howard H., Poli, Iacopo, Carron, Igor, Chatzistefanidis, Ilias, Lee, Inkyu, Pitsiorlas, Ioannis, Fontaine, Jaron, Wu, Jiajun, Zeng, Jie, Li, Jinan, Karam, Jinane, Gemayel, Johny, Deng, Juan, Frison, Julien, Huang, Kaibin, Qiu, Kehai, Ball, Keith, Wang, Kezhi, Guo, Kun, Tassiulas, Leandros, Gwenole, Lecorve, Yue, Liexiang, Bariah, Lina, Powell, Louis, Dryjanski, Marcin, Galdon, Maria Amparo Canaveras, Kountouris, Marios, Hafeez, Maryam, Elkael, Maxime, Bennis, Mehdi, Boudjelli, Mehdi, Dai, Meiling, Debbah, Merouane, Polese, Michele, Assaad, Mohamad, Benzaghta, Mohamed, Refai, Mohammad Al, Djerrab, Moussab, Syed, Mubeen, Amir, Muhammad, Yan, Na, Alkaabi, Najla, Li, Nan, Sehad, Nassim, Nikaein, Navid, Hashash, Omar, Sroka, Pawel, Yang, Qianqian, Zhao, Qiyang, Silab, Rasoul Nikbakht, Ying, Rex, Morabito, Roberto, Li, Rongpeng, Madi, Ryad, Ayoubi, Salah Eddine El, D'Oro, Salvatore, Lasaulce, Samson, Shalmashi, Serveh, Liu, Sige, Cherrared, Sihem, Chetty, Swarna Bindu, Dutta, Swastika, Zaidi, Syed A. R., Chen, Tianjiao, Murphy, Timothy, Melodia, Tommaso, Quek, Tony Q. S., Ram, Vishnu, Saad, Walid, Hamidouche, Wassim, Chen, Weilong, Liu, Xiaoou, Yu, Xiaoxue, Wang, Xijun, Shang, Xingyu, Wang, Xinquan, Cao, Xuelin, Su, Yang, Liang, Yanping, Deng, Yansha, Yang, Yifan, Cui, Yingping, Sun, Yu, Chen, Yuxuan, Pointurier, Yvan, Nehme, Zeinab, Nezami, Zeinab, Yang, Zhaohui, Zhang, Zhaoyang, Liu, Zhe, Yang, Zhenyu, Han, Zhu, Zhou, Zhuang, Chen, Zihan, Chen, Zirui, Shuai, Zitao

The rise of generative artificial intelligence (AI) as a novel frontier that uniquely merges advanced levels of intelligence with revolutionary user experiences is redefining the AI landscape for future cellular networks. In particular, the transition towards 6G systems has introduced a myriad of challenges inherent to their AI-native network design, requiring innovative solutions to enable real-time network orchestration, intelligent decision-making, and adaptive dynamic configurations. Meanwhile, the envisioned user experiences for 6G are growing increasingly complex, exceeding the capabilities offered by vintage wireless technologies and conventional AI solutions to satisfy their advanced demands. With its disruptive impact evident across diverse fields, generative AI possesses immense potential to tackle these challenges, leveraging its exceptional capabilities to manage complex tasks, operate autonomously, and adapt seamlessly to scenarios beyond its training domain. Remarkably, generative AI provides a transformative opportunity for telecom and cellular networks to bridge this defined gap in 6G systems, thereby shifting towards a new era with cutting-edge AI innovations across the different system and user levels.

large language model, machine learning, real time system

2503.04184

Country:

Asia > China (0.27)
North America > Canada (0.27)
Europe > Germany (0.14)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Telecommunications > Networks (1.00)
Media (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)

Mastromichalakis, Orfeas Menis, Filandrianos, Giorgos, Symeonaki, Maria, Stamou, Giorgos

Assumed Identities: Quantifying Gender Bias in Machine Translation of Ambiguous Occupational Terms

Machine Translation (MT) systems frequently encounter ambiguous scenarios where they must assign gender to certain occupations when translating without explicit guidance or contextual cues. While individual translations in such cases may not be inherently biased, systematic patterns, such as the repeated association of certain professions with specific genders, can emerge, reflecting and perpetuating societal stereotypes. This ambiguity challenges traditional instance-level single-answer evaluation approaches, as no single gold standard translation exists. To address this, we propose an approach that evaluates gender bias through aggregated model responses. Specifically, we introduce a methodology to detect gender imbalances between source texts and translations, a benchmarking dataset with ambiguous English inputs, and probability-based metrics to quantify a model's divergence from normative standards or reference distributions.
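As a rough illustration of what an aggregate, probability-based measure of this kind could look like, the sketch below turns per-translation gender labels into a distribution and compares it with a reference distribution. The abstract does not specify the metrics used, so total variation distance is a stand-in chosen for illustration only. The function names, the label strings, and the 50/50 reference are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def gender_distribution(labels: Iterable[str]) -> Dict[str, float]:
    """Turn per-translation gender labels into relative frequencies.

    The labels are assumed to have been extracted from the translations
    beforehand; how that is done is outside this sketch.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels given")
    return {gender: n / total for gender, n in counts.items()}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance between two discrete distributions.

    Stand-in divergence (an assumption, not the paper's metric):
    0.0 means identical distributions, 1.0 means disjoint support.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical aggregate: ten translations of one ambiguous occupation term.
observed = gender_distribution(["male"] * 8 + ["female"] * 2)
reference = {"male": 0.5, "female": 0.5}  # assumed reference distribution
divergence = total_variation(observed, reference)
```

Scoring the aggregate rather than each output matches the abstract's point that no single translation of an ambiguous input can be marked right or wrong.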

computational linguistic, occupation, translation

2503.04372

Country:

Europe > Italy (0.14)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Louisiana (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Law > Civil Rights & Constitutional Law (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Leveraging Domain Knowledge at Inference Time for LLM Translation: Retrieval versus Generation

Li, Bryan, Luo, Jiaming, Briakou, Eleftheria, Cherry, Colin

While large language models (LLMs) have been increasingly adopted for machine translation (MT), their performance for specialist domains such as medicine and law remains an open challenge. Prior work has shown that LLMs can be domain-adapted at test-time by retrieving targeted few-shot demonstrations or terminologies for inclusion in the prompt. Meanwhile, for general-purpose LLM MT, recent studies have found some success in generating similarly useful domain knowledge from an LLM itself, prior to translation. Our work studies domain-adapted MT with LLMs through a careful prompting setup, finding that demonstrations consistently outperform terminology, and retrieval consistently outperforms generation. We find that generating demonstrations with weaker models can close the gap with larger models' zero-shot performance. Given the effectiveness of demonstrations, we perform detailed analyses to understand their value. We find that domain-specificity is particularly important, and that the popular multi-domain benchmark is testing adaptation to a particular writing style more so than to a specific domain.

demonstration, terminology, translation

2503.0501

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Zebaze, Armel, Sagot, Benoît, Bawden, Rachel

Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation

The ability of generative large language models (LLMs) to perform in-context learning has given rise to a large body of research into how best to prompt models for various natural language processing tasks. Machine Translation (MT) has been shown to benefit from in-context examples, in particular when they are semantically similar to the sentence to translate. In this paper, we propose a new LLM-based translation paradigm, compositional translation, to replace naive few-shot MT with similarity-based demonstrations. An LLM is used to decompose a sentence into simpler phrases, and then to translate each phrase with the help of retrieved demonstrations. Finally, the LLM is prompted to translate the initial sentence with the help of the self-generated phrase-translation pairs. Our intuition is that this approach should improve translation because these shorter phrases should be intrinsically easier to translate and easier to match with relevant examples. This is especially beneficial in low-resource scenarios, and more generally whenever the selection pool is small or out of domain. We show that compositional translation boosts LLM translation performance on a wide range of popular MT benchmarks, including FLORES 200, NTREX 128 and TICO-19. Code and outputs are available at https://github.com/ArmelRandy/compositional-translation
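The three steps described in the abstract (decompose, translate each phrase with retrieved demonstrations, then translate the full sentence using the self-generated phrase pairs) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the `llm` and `retrieve` callables, the prompt wording, and every function name are hypothetical placeholders.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces; none of these names come from the paper or its code.
LLM = Callable[[str], str]                                # prompt -> completion
Retriever = Callable[[str, int], List[Tuple[str, str]]]   # (query, k) -> [(src, tgt)]


def format_demos(pairs: List[Tuple[str, str]]) -> str:
    """Render (source, target) pairs as few-shot demonstration lines."""
    return "\n".join(f"Source: {src}\nTarget: {tgt}" for src, tgt in pairs)


def compositional_translate(sentence: str, llm: LLM, retrieve: Retriever, k: int = 2) -> str:
    # Step 1: ask the LLM to split the sentence into simpler phrases, one per line.
    raw = llm(f"Decompose into simple phrases, one per line:\n{sentence}")
    phrases = [line.strip() for line in raw.splitlines() if line.strip()]

    # Step 2: translate each phrase, prompting with demonstrations retrieved for it.
    phrase_pairs: List[Tuple[str, str]] = []
    for phrase in phrases:
        demos = format_demos(retrieve(phrase, k))
        translation = llm(f"{demos}\nSource: {phrase}\nTarget:").strip()
        phrase_pairs.append((phrase, translation))

    # Step 3: translate the original sentence, using the self-generated
    # phrase-translation pairs as the in-context examples.
    final_prompt = f"{format_demos(phrase_pairs)}\nSource: {sentence}\nTarget:"
    return llm(final_prompt).strip()
```

Passing the model and retriever in as plain callables keeps the sketch independent of any particular LLM API or retrieval library.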

compositional translation, comptra, translation

2503.04554

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > Canada > Ontario > Toronto (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)

Genre:

Research Report > New Finding (0.45)
Research Report > Experimental Study (0.45)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)