AITopics

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)

Lupascu, Marian, Rogoz, Ana-Cristina, Stupariu, Mihai Sorin, Ionescu, Radu Tudor

Large Multimodal Models for Low-Resource Languages: A Survey

arXiv.org Artificial Intelligence, Feb-8-2025

In this survey, we systematically analyze techniques used to adapt large multimodal models (LMMs) for low-resource (LR) languages, examining approaches ranging from visual enhancement and data creation to cross-modal transfer and fusion strategies. Through a comprehensive analysis of 106 studies across 75 LR languages, we identify key patterns in how researchers tackle the challenges of limited data and computational resources. We find that visual information often serves as a crucial bridge for improving model performance in LR settings, though significant challenges remain in areas such as hallucination mitigation and computational efficiency. We aim to provide researchers with a clear understanding of current approaches and remaining challenges in making LMMs more accessible to speakers of LR (understudied) languages. We complement our survey with an open-source repository available at: https://github.com/marianlupascu/LMM4LRL-Survey.

artificial intelligence, machine learning, natural language, (20 more...)

2502.05568

Country:

North America > United States (0.06)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)

Genre:

Research Report (0.82)
Overview (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Feb-8-2025

ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data

Liu, Xiaoyang, Bao, Kangjie, Zhang, Jiashuo, Liu, Yunqi, Chen, Yu, Liu, Yuntian, Jiao, Yang, Luo, Tao

Autoformalization, the process of automatically translating natural language mathematics into machine-verifiable formal language, has demonstrated advancements with the progress of large language models (LLMs). However, a key obstacle to further advancements is the scarcity of paired datasets that align natural language with formal language. To address this challenge, we introduce ATLAS (Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data), an iterative data generation framework designed to produce large-scale, high-quality parallel theorem statements. With the proposed ATLAS running for 10 iterations, we construct an undergraduate-level dataset comprising 300k theorem statements and develop the ATLAS translator, achieving accuracies of 80.59% (pass@8) and 92.99% (pass@128) on ProofNet, significantly outperforming the base model (23.99% and 47.17%) and InternLM2-Math-Plus-7B (50.94% and 80.32%). Furthermore, the ATLAS translator also achieves state-of-the-art performance on both the high-school-level miniF2F dataset and the graduate-level MathQual dataset introduced in this work. The datasets, model, and code will be released to the public soon.
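The pass@8 and pass@128 figures above count a theorem as solved if at least one of k sampled translations is accepted. The abstract does not say how the metric is computed; a minimal sketch, assuming the commonly used unbiased estimator over n samples of which c are correct (the function name `pass_at_k` is illustrative, not taken from the paper):

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples, c of which are correct.

    Assumption: the standard unbiased estimator 1 - C(n-c, k) / C(n, k),
    i.e. one minus the probability that a random size-k subset of the
    n samples contains no correct sample.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset has a hit.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical counts for illustration only: 4 samples, 1 correct.
    print(pass_at_k(n=4, c=1, k=2))
```

A benchmark-level score would then be the mean of this quantity over all theorem statements in the test set.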

large language model, logic & formal reasoning, machine learning, (24 more...)

2502.05567

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Texas (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Neural Information Processing Systems, Feb-7-2025, 21:19:23 GMT

Review for NeurIPS paper: Unsupervised Translation of Programming Languages

Reviewers agree that this paper is a significant advance in the problem of language translation. One lingering concern is with the positioning of the paper. In particular, the introduction needs to do a better job in recognizing that this paper focuses on small self-contained units of code. In order to be useful in a software engineering context, a translation tool would have to address a number of problems that are not addressed by this work, such as major differences in the design patterns used by APIs in different languages. Without a proper acknowledgment of the limitations of the approach early in the paper, this paper could make it difficult to publish follow-up work.

neurips paper, programming language, unsupervised translation

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.71)

Neural Information Processing Systems, Feb-7-2025, 14:02:34 GMT

Review for NeurIPS paper: Estimating Training Data Influence by Tracing Gradient Descent

Weaknesses: I have some major concerns with the evaluation part of the paper. A simple baseline could be a loss-based selection method: simply select training points based on loss change. A recent paper [DataLens IJCNN 20] shows that simple loss-based selection outperforms both influence functions and representer selection on mislabeled data identification when the fraction of mislabeled data is small. As the fraction of mislabeled data increases, influence functions work better than the loss-based method.
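The baseline the reviewer proposes can be sketched in a few lines. The review does not spell out the exact criterion, so this is one possible reading: rank training points by their cross-entropy loss under the trained model and inspect the highest-loss points first as mislabeling candidates. The names `cross_entropy` and `rank_by_loss` and the toy probabilities are assumptions for illustration.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy loss of one example given predicted class probabilities."""
    # Clamp to avoid log(0) when the model assigns zero probability.
    return -math.log(max(probs[label], 1e-12))


def rank_by_loss(pred_probs, labels):
    """Return training indices sorted from highest to lowest loss.

    Assumption: high loss on the given label is used as the signal that
    the label may be wrong.
    """
    losses = [cross_entropy(p, y) for p, y in zip(pred_probs, labels)]
    return sorted(range(len(labels)), key=lambda i: losses[i], reverse=True)


if __name__ == "__main__":
    # Toy predictions for three training points and their given labels.
    pred_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
    labels = [0, 0, 1]
    print(rank_by_loss(pred_probs, labels))
```

In this toy case the second point, labeled 0 while the model puts 0.8 on class 1, is ranked first.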

neurips paper, tracing gradient descent, training data influence, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.40)

Engadget, Feb-7-2025, 13:00:50 GMT

Meta and UNESCO team up to improve translation AI

Meta has partnered with UNESCO on a new plan to improve translation and speech recognition AI, TechCrunch reported. As part of its Language Technology Partner Program, Meta is seeking collaborators willing to donate at least 10 hours of speech recordings with transcriptions, large written texts (200-plus sentences) and sets of translated sentences. The aim is to focus on "underserved languages, in support of UNESCO's work," Meta wrote in a blog post. So far, Meta and UNESCO have signed on the government of Nunavut, a northern Canadian territory. The aim is to develop translation systems for the Inuit languages used there, Inuktitut and Inuinnaqtun.

meta and unesco team, translation ai, underserved language

Country:

North America > Canada > Nunavut (0.28)
North America > United States (0.08)

Industry: Government (0.61)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.61)

arXiv.org Artificial Intelligence, Feb-7-2025

Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study

Cui, Menglong, Gao, Pengzhi, Liu, Wei, Luan, Jian, Wang, Bin

Large language models (LLMs) have shown continuously improving multilingual capabilities, and even small-scale open-source models have demonstrated rapid performance enhancement. In this paper, we systematically explore the abilities of open LLMs with less than ten billion parameters to handle multilingual machine translation (MT) tasks. We conduct comprehensive evaluations on six popular LLMs and find that models like Gemma2-9B exhibit impressive multilingual translation capabilities. We then introduce the Parallel-First Monolingual-Second (PFMS) data mixing strategy in the continual pretraining stage to further enhance the MT performance and present GemmaX2-28, a 9B model achieving top-tier multilingual translation performance across 28 languages. Specifically, GemmaX2-28 consistently outperforms the state-of-the-art (SOTA) models such as TowerInstruct and XALMA and achieves competitive performance with Google Translate and GPT-4-turbo.

large language model, machine learning, natural language, (19 more...)

2502.02481

Country:

Europe > Austria > Vienna (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
(17 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Ticona, Belu, Carranza, Fernando, Cotik, Viviana

Indigenous Languages Spoken in Argentina: A Survey of NLP and Speech Resources

arXiv.org Artificial Intelligence, Feb-7-2025

Argentina has a large yet little-known Indigenous linguistic diversity, encompassing at least 40 different languages. The majority of these languages are at risk of disappearing, resulting in a significant loss of world heritage and cultural knowledge. Currently, unified information on speakers and computational tools is lacking for these languages. In this work, we present a systematization of the Indigenous languages spoken in Argentina, classifying them into seven language families: Mapuche, Tupí-Guaraní, Guaycurú, Quechua, Mataco-Mataguaya, Aymara, and Chon. For each one, we present an estimation of the national Indigenous population size, based on the most recent Argentinian census. We discuss potential reasons why the census questionnaire design may underestimate the actual number of speakers. We also provide a concise survey of computational resources available for these languages, whether or not they were specifically developed for Argentinian varieties.

computational linguistic, machine learning, natural language, (19 more...)

2501.09943

Country:

South America > Paraguay (0.15)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.06)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.05)
(26 more...)

Genre: Overview (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.96)

Neural Information Processing Systems, Feb-6-2025, 15:03:57 GMT

Export Reviews, Discussions, Author Feedback and Meta-Reviews

The paper tackles (constituent) syntactic parsing by mapping this prediction problem to a sequence-to-sequence alignment problem, and then essentially applying a method recently developed in the context of neural machine translation (LSTM encoder-decoder with an attention mechanism). The resulting parsing model achieves state-of-the-art results when used in the standard supervised set-up (PTB WSJ) and improves further when estimated in a semi-supervised / co-training regime. What I find especially interesting in this paper is that the attention mechanism is crucial for attaining good generalization properties: without the attention mechanism, the LSTM achieves very poor results in the supervised setting. This is an interesting observation which may in principle generate future work focusing on refining the attention model (e.g., moving further in the direction of the Neural Turing Machines of Graves et al.). It is also somewhat surprising that such a simple linearization strategy led to state-of-the-art performance.
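To make the linearization idea concrete, here is a minimal sketch of how a constituency tree can be turned into a flat token sequence that a sequence-to-sequence model could predict. It uses a depth-first traversal with labeled opening and closing brackets; this is an illustrative scheme and not necessarily the exact one used in the reviewed paper. The tree encoding (a `(label, children)` tuple, with plain strings as leaves) is an assumption.

```python
def linearize(node):
    """Flatten a constituency tree into a list of bracket tokens.

    A leaf is a plain string (here a part-of-speech tag); an internal
    node is a (label, children) tuple. Traversal is depth-first,
    left to right.
    """
    if isinstance(node, str):
        return [node]
    label, children = node
    tokens = [f"({label}"]
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(f"){label}")
    return tokens


if __name__ == "__main__":
    # Hypothetical tree for a sentence such as "John has a dog ."
    tree = ("S", [("NP", ["NNP"]), ("VP", ["VBZ", ("NP", ["DT", "NN"])]), "."])
    print(" ".join(linearize(tree)))
    # prints: (S (NP NNP )NP (VP VBZ (NP DT NN )NP )VP . )S
```

Because every opening bracket has a matching labeled close, the original tree can be recovered from a well-formed output sequence with a simple stack.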

author feedback and meta-review, discussion, export review, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)

arXiv.org Artificial Intelligence, Feb-6-2025

Multilingual Non-Autoregressive Machine Translation without Knowledge Distillation

Huang, Chenyang, Huang, Fei, Zheng, Zaixiang, Zaïane, Osmar R., Zhou, Hao, Mou, Lili

Multilingual neural machine translation (MNMT) aims at using one single model for multiple translation directions. Recent work applies non-autoregressive Transformers to improve the efficiency of MNMT, but requires expensive knowledge distillation (KD) processes. To this end, we propose an M-DAT approach to non-autoregressive multilingual machine translation. Our system leverages the recent advance of the directed acyclic Transformer (DAT), which does not require KD. We further propose a pivot back-translation (PivotBT) approach to improve the generalization to unseen translation directions. Experiments show that our M-DAT achieves state-of-the-art performance in non-autoregressive MNMT.

artificial intelligence, natural language, translation, (16 more...)

2502.04537

Country:

North America > Canada > Alberta (0.14)
Asia > China (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)