AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice using free translators such as CAPITA, Google Translate, SDL International, and SYSTRAN.

Grammar Accuracy Evaluation (GAE): Quantifiable Quantitative Evaluation of Machine Translation Models

Park, Dojun, Jang, Youngjin, Kim, Harksoo

arXiv.org Artificial Intelligence, May-27-2022

Natural Language Generation (NLG) refers to expressing the computed results of a system in human language. Because the quality of sentences generated by an NLG model cannot be fully captured by quantitative evaluation alone, they are also evaluated qualitatively by humans, who score the meaning or grammar of a sentence according to subjective criteria. However, existing evaluation methods suffer from large score deviations that depend on each evaluator's criteria. In this paper, we propose Grammar Accuracy Evaluation (GAE), which provides specific evaluation criteria. Analyzing machine translation quality with both BLEU and GAE confirmed that the BLEU score does not represent the absolute performance of machine translation models, and that GAE compensates for the shortcomings of BLEU through flexible evaluation of alternative synonyms and changes in sentence structure.
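The abstract's point that BLEU is strict about synonyms can be illustrated with a minimal sketch. The function below computes only clipped unigram precision, one ingredient of BLEU (no higher-order n-grams, no brevity penalty); it is not GAE and not the authors' evaluation code. The function name and the example sentences are assumptions made for illustration.

```python
from collections import Counter


def clipped_unigram_precision(hypothesis: str, reference: str) -> float:
    """Fraction of hypothesis tokens that also occur in the reference.

    Each hypothesis token is credited at most as many times as it
    appears in the reference (the "clipping" used by BLEU).
    """
    hyp_tokens = hypothesis.split()
    if not hyp_tokens:
        return 0.0
    ref_counts = Counter(reference.split())
    hyp_counts = Counter(hyp_tokens)
    matched = sum(min(count, ref_counts[tok]) for tok, count in hyp_counts.items())
    return matched / len(hyp_tokens)


reference = "the cat is small"
exact = clipped_unigram_precision("the cat is small", reference)
synonym = clipped_unigram_precision("the cat is little", reference)
print(exact, synonym)  # the synonym "little" gets no credit
```

Here the second hypothesis is an acceptable translation, yet it scores lower purely because "little" does not match "small" on the surface.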

evaluation, grammar accuracy evaluation, quantifiable quantitative evaluation, (2 more...)

doi: 10.5626/jok.2022.49.7.514

2105.14277

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Google Translate Provides Assist for Extra Indian Languages - Channel969

#artificialintelligence, May-24-2022, 11:01:20 GMT

Google Translate has added support for several more Indian languages. While Hindi has been supported by Google Translate for a long time, a number of new regional languages have now been added to the platform. These include Assamese, a prominent language in Northeast India; Bhojpuri; Dhivehi (used in the Maldives); Dogri (Northern India); Konkani (central India); Maithili (spoken by about 34 million people in Northern India); Meiteilon, or Manipuri, used by about two million people in Northeast India; Mizo; and Sanskrit. Along with these languages, Google Translate has also added support for a number of international languages. Google Translate now supports over 133 languages spoken around the world, covering major Indian languages as well.

google translate, google translate provide assist, indian language, (5 more...)

Country:

Asia > India (1.00)
Asia > Maldives (0.27)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Semantics-aware Attention Improves Neural Machine Translation

Slobodkin, Aviv, Choshen, Leshem, Abend, Omri

arXiv.org Artificial Intelligence, May-24-2022

The integration of syntactic structures into Transformer machine translation has shown positive results, but to our knowledge, no work has attempted to do so with semantic structures. In this work we propose two novel parameter-free methods for injecting semantic information into Transformers, both of which rely on semantics-aware masking of (some of) the attention heads. One method operates on the encoder, through a Scene-Aware Self-Attention (SASA) head. The other operates on the decoder, through a Scene-Aware Cross-Attention (SACrA) head. We show a consistent improvement over the vanilla Transformer and syntax-aware models for four language pairs. We further show an additional gain when using both semantic and syntactic structures in some language pairs.
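As a rough illustration of what masking an attention head can look like, the sketch below applies a binary mask before a row-wise softmax, so that each token attends only to positions the mask allows. This is a generic masked-attention sketch, not the SASA or SACrA implementation: the function name, the toy scores, and the idea that the mask marks tokens belonging to the same scene are all assumptions made for this example.

```python
import math


def masked_attention(scores, mask):
    """Row-wise softmax over `scores`, ignoring positions where `mask` is 0.

    scores: list of rows of raw attention scores (floats).
    mask:   list of rows of 0/1 flags with the same shape.
    Rows with no allowed position receive all-zero weights.
    """
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        allowed = [s for s, m in zip(row_scores, row_mask) if m]
        if not allowed:
            weights.append([0.0] * len(row_scores))
            continue
        peak = max(allowed)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if m else 0.0
                for s, m in zip(row_scores, row_mask)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Hypothetical example: tokens 0 and 1 share a scene, token 2 is alone.
scores = [[2.0, 2.0, 5.0],
          [1.0, 1.0, 9.0],
          [3.0, 4.0, 0.5]]
scene_mask = [[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]]
weights = masked_attention(scores, scene_mask)
print(weights)
```

Because the mask is fixed rather than learned, a head built this way adds no trainable parameters, which is consistent with the "parameter-free" description in the abstract.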

neural machine translation

doi: 10.18653/v1/2022.starsem-1.3

2110.0692

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation

Khurana, Sameer, Laurent, Antoine, Glass, James

arXiv.org Artificial Intelligence, May-17-2022

We propose the SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation learning framework. Unlike previous works on speech representation learning, which learn multilingual contextual speech embeddings at the resolution of an acoustic frame (10-20ms), this work focuses on learning multimodal (speech-text) multilingual speech embeddings at the resolution of a sentence (5-10s) such that the embedding vector space is semantically aligned across different languages. We combine the state-of-the-art multilingual acoustic frame-level speech representation learning model XLS-R with the Language Agnostic BERT Sentence Embedding (LaBSE) model to create an utterance-level multimodal multilingual speech encoder, SAMU-XLSR. Although we train SAMU-XLSR with only multilingual transcribed speech data, cross-lingual speech-text and speech-speech associations emerge in its learned representation space. To substantiate our claims, we use the SAMU-XLSR speech encoder in combination with a pre-trained LaBSE text sentence encoder for cross-lingual speech-to-text translation retrieval, and SAMU-XLSR alone for cross-lingual speech-to-speech translation retrieval. We highlight these applications by performing several cross-lingual text and speech translation retrieval tasks across several datasets.
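Retrieval in a shared embedding space, as described in the abstract above, is commonly done by nearest-neighbor search. The sketch below assumes cosine similarity as the scoring function and uses made-up three-dimensional vectors in place of real encoder outputs; neither the similarity choice nor the helper names come from the paper, and no actual SAMU-XLSR or LaBSE model is called.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query, candidates):
    """Return the key of the candidate embedding most similar to `query`."""
    return max(candidates, key=lambda key: cosine(query, candidates[key]))


# Toy stand-ins: one utterance embedding and three sentence embeddings.
speech_embedding = [0.9, 0.1, 0.0]
text_embeddings = {
    "Guten Morgen": [1.0, 0.0, 0.0],
    "Gute Nacht": [0.0, 1.0, 0.0],
    "Danke": [0.0, 0.0, 1.0],
}
best = retrieve(speech_embedding, text_embeddings)
print(best)
```

The same routine would serve speech-to-speech retrieval if the candidate dictionary held utterance embeddings instead of text embeddings.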

artificial intelligence, machine learning, natural language, (17 more...)

doi: 10.1109/JSTSP.2022.3192714

2205.0818

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Washington > Okanogan County (0.04)
(5 more...)

Genre: Research Report (0.42)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.35)

Google Translate adds 24 new languages

BBC News, May-12-2022, 00:48:05 GMT

"For many supported languages, even the largest languages in Africa that we have supported - say like Yoruba, Igbo, the translation is not great. It will definitely get the idea across but often it will lose much of the subtlety of the language," Google Translate research scientist Isaac Caswell told the BBC.

new language

Country: Africa (0.43)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.79)

Google Translate adds support for 24 new languages

Engadget, May-11-2022, 17:18:10 GMT

Google is adding support for 24 new languages to its Translate tool, the company announced today during its I/O 2022 developer conference. Among the newly available languages are Sanskrit, Tsonga and Sorani Kurdish. One of the new additions, Assamese, is used by approximately 25 million people in Northeast India. Another, Dhivehi, is spoken by about 300,000 people in the Maldives. According to Google CEO Sundar Pichai, the expansion allows the company to cover languages spoken by more than 300 million people and brings the total number of languages supported by Translate to 133.

google, new language, pichai

Country:

Asia > Maldives (0.28)
Asia > India (0.28)

Genre: Press Release (0.86)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.40)

Efficient yet Competitive Speech Translation: FBK@IWSLT2022

Gaido, Marco, Papi, Sara, Fucci, Dennis, Fiameni, Giuseppe, Negri, Matteo, Turchi, Marco

arXiv.org Artificial Intelligence, May-5-2022

The primary goal of FBK's systems submission to the IWSLT 2022 offline and simultaneous speech translation tasks is to reduce model training costs without sacrificing translation quality. As such, we first question the need for ASR pre-training, showing that it is not essential to achieve competitive results. Second, we focus on data filtering, showing that a simple method that looks at the ratio between source and target characters yields a quality improvement of 1 BLEU. Third, we compare different methods to reduce the detrimental effect of the audio segmentation mismatch between training data manually segmented at sentence level and inference data that is automatically segmented. Towards the same goal of training cost reduction, we participate in the simultaneous task with the same model trained for offline ST. The effectiveness of our lightweight training strategy is shown by the high score obtained on the MuST-C en-de corpus (26.7 BLEU) and is confirmed in high-resource data conditions by a 1.6 BLEU improvement on the IWSLT2020 test set over last year's winning system.
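The character-ratio filtering mentioned in the abstract above can be sketched as follows. This is a minimal illustration that assumes the source side is available as text (for example, a transcript) and that pairs are kept when the target-to-source character ratio falls inside fixed bounds. The bounds of 0.5 and 2.0, the function names, and the sample pairs are assumptions for this example, not values or code from the paper.

```python
def char_ratio(source: str, target: str) -> float:
    """Target length divided by source length, in characters.

    Returns infinity for an empty source so such pairs are always rejected.
    """
    if not source:
        return float("inf")
    return len(target) / len(source)


def filter_pairs(pairs, min_ratio=0.5, max_ratio=2.0):
    """Keep (source, target) pairs whose character ratio lies within the bounds.

    The default bounds are illustrative assumptions only.
    """
    return [
        (src, tgt)
        for src, tgt in pairs
        if min_ratio <= char_ratio(src, tgt) <= max_ratio
    ]


pairs = [
    ("Good morning.", "Guten Morgen."),
    ("Thank you very much for coming today.", "Danke."),
    ("Yes.", "Ja, das ist eine sehr lange und unpassende Antwort."),
]
kept = filter_pairs(pairs)
print(kept)  # only the first, length-balanced pair survives
```

In practice the bounds would be tuned per language pair, since typical length ratios differ between languages.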

artificial intelligence, competitive speech translation, natural language, (4 more...)

doi: 10.18653/v1/2022.iwslt-1.13

2205.02629

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.89)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.60)

Learn To Remember: Transformer with Recurrent Memory for Document-Level Machine Translation

Feng, Yukun, Li, Feng, Song, Ziang, Zheng, Boyuan, Koehn, Philipp

arXiv.org Artificial Intelligence, May-3-2022

The Transformer architecture has led to significant gains in machine translation. However, most studies focus on only sentence-level translation without considering the context dependency within documents, leading to the inadequacy of document-level coherence. Some recent research tried to mitigate this issue by introducing an additional context encoder or translating with multiple sentences or even the entire document. Such methods may lose the information on the target side or have an increasing computational complexity as documents get longer. To address such problems, we introduce a recurrent memory unit to the vanilla Transformer, which supports the information exchange between the sentence and previous context. The memory unit is recurrently updated by acquiring information from sentences, and passing the aggregated knowledge back to subsequent sentence states. We follow a two-stage training strategy, in which the model is first trained at the sentence level and then finetuned for document-level translation. We conduct experiments on three popular datasets for document-level machine translation and our model has an average improvement of 0.91 s-BLEU over the sentence-level baseline. We also achieve state-of-the-art results on TED and News, outperforming the previous work by 0.36 s-BLEU and 1.49 d-BLEU on average.

machine learning, natural language, translation, (17 more...)

doi: 10.18653/v1/2022.findings-naacl.105

2205.01546

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(14 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Council Post: Automation Is Here: Ways AI And ML Are Transforming Digital Publishing

#artificialintelligence, May-2-2022, 06:40:07 GMT

According to Statista, digital publishing generates worldwide revenue of $22.05 billion. Globally, countries that have access to digital media have witnessed a sharp rise in its popularity. However, with global accessibility comes the challenge of producing high-quality content consistently in large volumes. Additionally, with the rise in voice-based and image searches, content discoverability is the need of the hour. Artificial intelligence (AI) can help in this endeavor.

machine learning, transforming digital publishing, translation, (13 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.73)

FFCI: A Framework for Interpretable Automatic Evaluation of Summarization

Koto, Fajri (University of Melbourne) | Baldwin, Timothy (University of Melbourne) | Lau, Jey Han (University of Melbourne)

Journal of Artificial Intelligence Research, Apr-29-2022

In this paper, we propose FFCI, a framework for fine-grained summarization evaluation that comprises four elements: faithfulness (degree of factual consistency with the source), focus (precision of summary content relative to the reference), coverage (recall of summary content relative to the reference), and inter-sentential coherence (document fluency between adjacent sentences). We construct a novel dataset for focus, coverage, and inter-sentential coherence, and develop automatic methods for evaluating each of the four dimensions of FFCI based on cross-comparison of evaluation metrics and model-based evaluation methods, including question answering (QA) approaches, semantic textual similarity (STS), next-sentence prediction (NSP), and scores derived from 19 pre-trained language models. We then apply the developed metrics in evaluating a broad range of summarization models across two datasets, with some surprising findings.

computational linguistic, linguistic, proceedings, (11 more...)

doi: 10.1613/jair.1.13167

AI Access Foundation

13167

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.14)
Europe > Italy > Tuscany > Florence (0.05)
(23 more...)

Industry: Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
