AITopics | Machine Translation

Collaborating Authors

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.

News Overviews Instructional Materials AI-Alerts Classics

Shared Latent Space by Both Languages in Non-Autoregressive Neural Machine Translation

Heo, DongNyeong, Choi, Heeyoul

arXiv.org Artificial IntelligenceMay-2-2023

Latent variable modeling in non-autoregressive neural machine translation (NAT) is a promising approach to mitigate the multimodality problem. In the previous works, they added an auxiliary model to estimate the posterior distribution of the latent variable conditioned on the source and target sentences. However, it causes several disadvantages, such as redundant information extraction in the latent variable, increasing parameters, and a tendency to ignore a part of the information from the inputs. In this paper, we propose a new latent variable modeling that is based on a dual reconstruction perspective and an advanced hierarchical latent modeling approach. Our proposed method, {\em LadderNMT}, shares a latent space across both languages so that it hypothetically alleviates or solves the above disadvantages. Experimental results quantitatively and qualitatively demonstrate that our proposed latent variable modeling learns an advantageous latent space and significantly improves translation quality in WMT translation tasks.

artificial intelligence, machine translation, natural language, (13 more...)

arXiv.org Artificial Intelligence

2305.03511

Country:

North America > Canada > Quebec (0.04)
Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

SLTUNET: A Simple Unified Model for Sign Language Translation

Zhang, Biao, Müller, Mathias, Sennrich, Rico

arXiv.org Artificial IntelligenceMay-2-2023

Despite recent successes with neural models for sign language translation (SLT), translation quality still lags behind spoken languages because of the data scarcity and modality gap between sign video and text. To address both problems, we investigate strategies for cross-modality representation sharing for SLT. We propose SLTUNET, a simple unified neural model designed to support multiple SLTrelated tasks jointly, such as sign-to-gloss, gloss-to-text and sign-to-text translation. Jointly modeling different tasks endows SLTUNET with the capability to explore the cross-task relatedness that could help narrow the modality gap. In addition, this allows us to leverage the knowledge from external resources, such as abundant parallel data used for spoken-language machine translation (MT). We show in experiments that SLTUNET achieves competitive and even state-of-the-art performance on PHOENIX-2014T and CSL-Daily when augmented with MT data and equipped with a set of optimization techniques. We further use the DGS Corpus for end-to-end SLT for the first time. It covers broader domains with a significantly larger vocabulary, which is more challenging and which we consider to allow for a more realistic assessment of the current state of SLT than the former two. Still, SLTUNET obtains improved results on the DGS Corpus. Code is available at https://github.com/bzhangGo/sltunet.

artificial intelligence, natural language, translation, (15 more...)

arXiv.org Artificial Intelligence

2305.01778

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(11 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.68)
Education > Curriculum > Subject-Specific Education (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval)

Muhammad, Shamsuddeen Hassan, Abdulmumin, Idris, Yimam, Seid Muhie, Adelani, David Ifeoluwa, Ahmad, Ibrahim Sa'id, Ousidhoum, Nedjma, Ayele, Abinew, Mohammad, Saif M., Beloucif, Meriem, Ruder, Sebastian

arXiv.org Artificial IntelligenceMay-1-2023

We present the first Africentric SemEval Shared task, Sentiment Analysis for African Languages (AfriSenti-SemEval) - The dataset is available at https://github.com/afrisenti-semeval/afrisent-semeval-2023. AfriSenti-SemEval is a sentiment classification challenge in 14 African languages: Amharic, Algerian Arabic, Hausa, Igbo, Kinyarwanda, Moroccan Arabic, Mozambican Portuguese, Nigerian Pidgin, Oromo, Swahili, Tigrinya, Twi, Xitsonga, and Yor\`ub\'a (Muhammad et al., 2023), using data labeled with 3 sentiment classes. We present three subtasks: (1) Task A: monolingual classification, which received 44 submissions; (2) Task B: multilingual classification, which received 32 submissions; and (3) Task C: zero-shot classification, which received 34 submissions. The best performance for tasks A and B was achieved by NLNDE team with 71.31 and 75.06 weighted F1, respectively. UCAS-IIE-NLP achieved the best average score for task C with 58.15 weighted F1. We describe the various approaches adopted by the top 10 systems and their approaches.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2304.06845

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.06)
(48 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

A Comprehensive Evaluation of the Copy Mechanism for Natural Language to SPARQL Query Generation

Reyd, Samuel, Zouaq, Amal, Diallo, Papa Abdou Karim Karou

arXiv.org Artificial IntelligenceMay-1-2023

In recent years, the field of neural machine translation (NMT) for SPARQL query generation has witnessed a significant growth. Recently, the incorporation of the copy mechanism with traditional encoder-decoder architectures and the use of pre-trained encoder-decoders have set new performance benchmarks. This paper presents a large variety of experiments that replicate and expand upon recent NMT-based SPARQL generation studies, comparing pre-trained and non-pre-trained models, question annotation formats, and the use of a copy mechanism for non-pre-trained and pre-trained models. Our results show that either adding the copy mechanism or using a question annotation improves performances for nonpre-trained models and for pre-trained models, setting new baselines for three popular datasets.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2304.07772

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
Europe > Austria > Vienna (0.14)
North America > Canada > Quebec > Montreal (0.04)
(9 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.88)
Information Technology > Communications > Web > Semantic Web (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Exploring Segmentation Approaches for Neural Machine Translation of Code-Switched Egyptian Arabic-English Text

Gaser, Marwa, Mager, Manuel, Hamed, Injy, Habash, Nizar, Abdennadher, Slim, Vu, Ngoc Thang

arXiv.org Artificial IntelligenceApr-30-2023

Data sparsity is one of the main challenges posed by code-switching (CS), which is further exacerbated in the case of morphologically rich languages. For the task of machine translation (MT), morphological segmentation has proven successful in alleviating data sparsity in monolingual contexts; however, it has not been investigated for CS settings. In this paper, we study the effectiveness of different segmentation approaches on MT performance, covering morphology-based and frequency-based segmentation techniques. We experiment on MT from code-switched Arabic-English to English. We provide detailed analysis, examining a variety of conditions, such as data size and sentences with different degrees of CS. Empirical results show that morphology-aware segmenters perform the best in segmentation tasks but under-perform in MT. Nevertheless, we find that the choice of the segmentation setup to use for MT is highly dependent on the data size. For extreme low-resource scenarios, a combination of frequency and morphology-based segmentations is shown to perform the best. For more resourced settings, such a combination does not bring significant improvements over the use of frequency-based segmentation.

machine learning, natural language, segmentation, (16 more...)

arXiv.org Artificial Intelligence

2210.0699

Country:

Europe (1.00)
Asia (1.00)
Africa > Middle East > Egypt (0.28)
North America > United States > Minnesota (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry: Energy > Oil & Gas (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Low-Resourced Machine Translation for Senegalese Wolof Language

Mbaye, Derguene, Diallo, Moussa, Diop, Thierno Ibrahima

arXiv.org Artificial IntelligenceApr-30-2023

Natural Language Processing (NLP) research has made great advancements in recent years with major breakthroughs that have established new benchmarks. However, these advances have mainly benefited a certain group of languages commonly referred to as resource-rich such as English and French. Majority of other languages with weaker resources are then left behind which is the case for most African languages including Wolof. In this work, we present a parallel Wolof/French corpus of 123,000 sentences on which we conducted experiments on machine translation models based on Recurrent Neural Networks (RNN) in different data configurations. We noted performance gains with the models trained on subworded data as well as those trained on the French-English language pair compared to those trained on the French-Wolof pair under the same experimental conditions.

computational linguistic, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2305.00606

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.05)
Africa > Senegal > Dakar Region > Dakar (0.05)
Europe > France > Île-de-France > Paris > Paris (0.04)
(15 more...)

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Multilingual Representation Distillation with Contrastive Learning

Tan, Weiting, Heffernan, Kevin, Schwenk, Holger, Koehn, Philipp

arXiv.org Artificial IntelligenceApr-30-2023

Multilingual sentence representations from large models encode semantic information from two or more languages and can be used for different cross-lingual information retrieval and matching tasks. In this paper, we integrate contrastive learning into multilingual representation distillation and use it for quality estimation of parallel sentences (i.e., find semantically similar sentences that can be used as translations of each other). We validate our approach with multilingual similarity search and corpus filtering tasks. Experiments across different low-resource languages show that our method greatly outperforms previous sentence encoders such as LASER, LASER3, and LaBSE.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2210.05033

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
Oceania > Australia (0.04)
(11 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)

Add feedback

Grammatical Error Correction: A Survey of the State of the Art

Bryant, Christopher, Yuan, Zheng, Qorib, Muhammad Reza, Cao, Hannan, Ng, Hwee Tou, Briscoe, Ted

arXiv.org Artificial IntelligenceApr-29-2023

Grammatical Error Correction (GEC) is the task of automatically detecting and correcting errors in text. The task not only includes the correction of grammatical errors, such as missing prepositions and mismatched subject-verb agreement, but also orthographic and semantic errors, such as misspellings and word choice errors respectively. The field has seen significant progress in the last decade, motivated in part by a series of five shared tasks, which drove the development of rule-based methods, statistical classifiers, statistical machine translation, and finally neural machine translation systems which represent the current dominant state of the art. In this survey paper, we condense the field into a single article and first outline some of the linguistic challenges of the task, introduce the most popular datasets that are available to researchers (for both English and other languages), and summarise the various methods and techniques that have been developed with a particular focus on artificial error generation. We next describe the many different approaches to evaluation as well as concerns surrounding metric reliability, especially in relation to subjective human judgements, before concluding with an overview of recent progress and suggestions for future work and remaining challenges. We hope that this survey will serve as comprehensive resource for researchers who are new to the field or who want to be kept apprised of recent developments.

correction, error correction, proceedings, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1162/coli_a_00478

2211.05166

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Singapore (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(5 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)
Research Report > Experimental Study (0.45)

Industry:

Education > Educational Setting (0.92)
Education > Educational Technology > Educational Software (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback

ResiDual: Transformer with Dual Residual Connections

Xie, Shufang, Zhang, Huishuai, Guo, Junliang, Tan, Xu, Bian, Jiang, Awadalla, Hany Hassan, Menezes, Arul, Qin, Tao, Yan, Rui

arXiv.org Artificial IntelligenceApr-28-2023

Transformer networks have become the preferred architecture for many tasks due to their state-of-the-art performance. However, the optimal way to implement residual connections in Transformer, which are essential for effective training, is still debated. Two widely used variants are the Post-Layer-Normalization (Post-LN) and Pre-Layer-Normalization (Pre-LN) Transformers, which apply layer normalization after each residual block's output or before each residual block's input, respectively. While both variants enjoy their advantages, they also suffer from severe limitations: Post-LN causes gradient vanishing issue that hinders training deep Transformers, and Pre-LN causes representation collapse issue that limits model capacity. In this paper, we propose ResiDual, a novel Transformer architecture with Pre-Post-LN (PPLN), which fuses the connections in Post-LN and Pre-LN together and inherits their advantages while avoids their limitations. We conduct both theoretical analyses and empirical experiments to verify the effectiveness of ResiDual. Theoretically, we prove that ResiDual has a lower bound on the gradient to avoid the vanishing issue due to the residual connection from Pre-LN. Moreover, ResiDual also has diverse model representations to avoid the collapse issue due to the residual connection from Post-LN. Empirically, ResiDual outperforms both Post-LN and Pre-LN on several machine translation benchmarks across different network depths and data sizes. Thanks to the good theoretical and empirical performance, ResiDual Transformer can serve as a foundation architecture for different AI models (e.g., large language models). Our code is available at https://github.com/microsoft/ResiDual.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2304.14802

Country:

Europe > Germany > Berlin (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Convolution-enhanced Evolving Attention Networks

Wang, Yujing, Yang, Yaming, Li, Zhuo, Bai, Jiangang, Zhang, Mingliang, Li, Xiangtai, Yu, Jing, Zhang, Ce, Huang, Gao, Tong, Yunhai

arXiv.org Artificial IntelligenceApr-28-2023

Attention-based neural networks, such as Transformers, have become ubiquitous in numerous applications, including computer vision, natural language processing, and time-series analysis. In all kinds of attention networks, the attention maps are crucial as they encode semantic dependencies between input tokens. However, most existing attention networks perform modeling or reasoning based on representations , wherein the attention maps of different layers are learned separately without explicit interactions. In this paper, we propose a novel and generic evolving attention mechanism, which directly models the evolution of inter-token relationships through a chain of residual convolutional modules. The major motivations are twofold. On the one hand, the attention maps in different layers share transferable knowledge, thus adding a residual connection can facilitate the information flow of inter-token relationships across layers. On the other hand, there is naturally an evolutionary trend among attention maps at different abstraction levels, so it is beneficial to exploit a dedicated convolution-based module to capture this process. Equipped with the proposed mechanism, the convolution-enhanced evolving attention networks achieve superior performance in various applications, including time-series representation, natural language understanding, machine translation, and image classification. Especially on time-series representation tasks, Evolving Attention-enhanced Dilated Convolutional (EA-DC-) Transformer outperforms state-of-the-art models significantly, achieving an average of 17% improvement compared to the best SOTA. To the best of our knowledge, this is the first work that explicitly models the layer-wise evolution of attention maps. Our implementation is available at https://github.com/pkuyym/EvolvingAttention.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TPAMI.2023.3236725

2212.0833

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Media (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.89)

Add feedback