AITopics

2307.09701

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(15 more...)

Genre:

Research Report (1.00)
Overview (0.66)

Industry: Energy (0.90)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multilingual Speech-to-Speech Translation into Multiple Target Languages

Gong, Hongyu, Dong, Ning, Popuri, Sravya, Goswami, Vedanuj, Lee, Ann, Pino, Juan

Speech-to-speech translation (S2ST) enables spoken communication between people talking in different languages. Despite a few studies on multilingual S2ST, their focus is the multilinguality on the source side, i.e., the translation from multiple source languages to one target language. We present the first work on multilingual S2ST supporting multiple target languages. Leveraging recent advance in direct S2ST with speech-to-unit and vocoder, we equip these key components with multilingual capability. Speech-to-masked-unit (S2MU) is the multilingual extension of S2U, which applies masking to units which don't belong to the given target language to reduce the language interference. We also propose multilingual vocoder which is trained with language embedding and the auxiliary loss of language identification. On benchmark translation testsets, our proposed multilingual model shows superior performance than bilingual models in the translation from English into $16$ target languages.

artificial intelligence, machine learning, natural language, (17 more...)

2307.08655

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Asia > South Korea > Incheon > Incheon (0.04)
(8 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Syntax-Aware Complex-Valued Neural Machine Translation

Liu, Yang, Hou, Yuexian

Syntax has been proven to be remarkably effective in neural machine translation (NMT). Previous models obtained syntax information from syntactic parsing tools and integrated it into NMT models to improve translation performance. In this work, we propose a method to incorporate syntax information into a complex-valued Encoder-Decoder architecture. The proposed model jointly learns word-level and syntax-level attention scores from the source side to the target side using an attention mechanism. Importantly, it is not dependent on specific network architectures and can be directly integrated into any existing sequence-to-sequence (Seq2Seq) framework. The experimental results demonstrate that the proposed method can bring significant improvements in BLEU scores on two datasets. In particular, the proposed method achieves a greater improvement in BLEU scores in translation tasks involving language pairs with significant syntactic differences.

computational linguistic, machine learning, natural language, (16 more...)

2307.08586

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
(11 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Hubert, Rebekka, Sokolov, Artem, Riezler, Stefan

Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts

End-to-end automatic speech translation (AST) relies on data that combines audio inputs with text translation outputs. Previous work used existing large parallel corpora of transcriptions and translations in a knowledge distillation (KD) setup to distill a neural machine translation (NMT) into an AST student model. While KD allows using larger pretrained models, the reliance of previous KD approaches on manual audio transcripts in the data pipeline restricts the applicability of this framework to AST. We present an imitation learning approach where a teacher NMT system corrects the errors of an AST student without relying on manual transcripts. We show that the NMT teacher can recover from errors in automatic transcriptions and is able to correct erroneous translations of the AST student, leading to improvements of about 4 BLEU points over the standard AST end-to-end baseline on the English-German CoVoST-2 and MuST-C datasets, respectively. Code and data are publicly available.\footnote{\url{https://github.com/HubReb/imitkd_ast/releases/tag/v1.1}}

machine learning, natural language, transcript, (18 more...)

2307.08426

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Berlin (0.04)
North America > United States > Texas (0.04)
(13 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Berger, Nathaniel, Exel, Miriam, Huck, Matthias, Riezler, Stefan

Enhancing Supervised Learning with Contrastive Markings in Neural Machine Translation Training

Supervised learning in Neural Machine Translation (NMT) typically follows a teacher forcing paradigm where reference tokens constitute the conditioning context in the model's prediction, instead of its own previous predictions. In order to alleviate this lack of exploration in the space of translations, we present a simple extension of standard maximum likelihood estimation by a contrastive marking objective. The additional training signals are extracted automatically from reference translations by comparing the system hypothesis against the reference, and used for up/down-weighting correct/incorrect tokens. The proposed new training procedure requires one additional translation pass over the training set per epoch, and does not alter the standard inference setup. We show that training with contrastive markings yields improvements on top of supervised learning, and is especially useful when learning from postedits where contrastive markings indicate human error corrections to the original hypotheses. Code is publicly released.

machine learning, natural language, translation, (16 more...)

2307.08416

Country:

Europe > Portugal > Lisbon > Lisbon (0.14)
Europe > Germany > Berlin (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(18 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Fuzzy Alignments in Directed Acyclic Graph for Non-Autoregressive Machine Translation

Ma, Zhengrui, Shao, Chenze, Gui, Shangtong, Zhang, Min, Feng, Yang

Non-autoregressive translation (NAT) reduces the decoding latency but suffers from performance degradation due to the multi-modality problem. Recently, the structure of directed acyclic graph has achieved great success in NAT, which tackles the multi-modality problem by introducing dependency between vertices. However, training it with negative log-likelihood loss implicitly requires a strict alignment between reference tokens and vertices, weakening its ability to handle multiple translation modalities. In this paper, we hold the view that all paths in the graph are fuzzily aligned with the reference sentence. We do not require the exact alignment but train the model to maximize a fuzzy alignment score between the graph and reference, which takes captured translations in all modalities into account. Extensive experiments on major WMT benchmarks show that our method substantially improves translation performance and increases prediction confidence, setting a new state of the art for NAT on the raw training data.

machine learning, natural language, translation, (18 more...)

2303.06662

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(16 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceJul-14-2023

Gloss Attention for Gloss-free Sign Language Translation

Yin, Aoxiong, Zhong, Tianyun, Tang, Li, Jin, Weike, Jin, Tao, Zhao, Zhou

Most sign language translation (SLT) methods to date require the use of gloss annotations to provide additional supervision information, however, the acquisition of gloss is not easy. To solve this problem, we first perform an analysis of existing models to confirm how gloss annotations make SLT easier. We find that it can provide two aspects of information for the model, 1) it can help the model implicitly learn the location of semantic boundaries in continuous sign language videos, 2) it can help the model understand the sign language video globally. We then propose \emph{gloss attention}, which enables the model to keep its attention within video segments that have the same semantics locally, just as gloss helps existing models do. Furthermore, we transfer the knowledge of sentence-to-sentence similarity from the natural language model to our gloss attention SLT network (GASLT) to help it understand sign language videos at the sentence level. Experimental results on multiple large-scale sign language datasets show that our proposed GASLT model significantly outperforms existing methods. Our code is provided in \url{https://github.com/YinAoXiong/GASLT}.

artificial intelligence, machine learning, natural language, (15 more...)

2307.07361

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(14 more...)

Genre: Research Report (1.00)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJul-13-2023

MegaWika: Millions of reports and their sources across 50 diverse languages

Barham, Samuel, Weller, Orion, Yuan, Michelle, Murray, Kenton, Yarmohammadi, Mahsa, Jiang, Zhengping, Vashishtha, Siddharth, Martin, Alexander, Liu, Anqi, White, Aaron Steven, Boyd-Graber, Jordan, Van Durme, Benjamin

To foster the development of new models for collaborative AI-assisted report generation, we introduce MegaWika, consisting of 13 million Wikipedia articles in 50 diverse languages, along with their 71 million referenced source materials. We process this dataset for a myriad of applications, going beyond the initial Wikipedia citation extraction and web scraping of content, including translating non-English articles for cross-lingual applications and providing FrameNet parses for automated semantic analysis. MegaWika is the largest resource for sentence-level report generation and the only report generation dataset that is multilingual. We manually analyze the quality of this resource through a semantically stratified sample. Finally, we provide baseline results and trained models for crucial steps in automated report generation: cross-lingual question answering and citation retrieval.

computational linguistic, machine learning, question answering, (20 more...)

2307.07049

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Philippines > Mindanao > Bangsamoro > Province of Maguindanao del Norte > City of Cotabato (0.05)
Africa > Togo > Maritime Region > Lome (0.04)
(8 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.68)

arXiv.org Artificial IntelligenceJul-13-2023

Data Augmentation for Machine Translation via Dependency Subtree Swapping

Nagy, Attila, Lakatos, Dorina Petra, Barta, Botond, Nanys, Patrick, Ács, Judit

We present a generic framework for data augmentation via dependency subtree swapping that is applicable to machine translation. We extract corresponding subtrees from the dependency parse trees of the source and target sentences and swap these across bisentences to create augmented samples. We perform thorough filtering based on graphbased similarities of the dependency trees and additional heuristics to ensure that extracted subtrees correspond to the same meaning. We conduct resource-constrained experiments on 4 language pairs in both directions using the IWSLT text translation datasets and the Hunglish2 corpus. The results demonstrate consistent improvements in BLEU score over our baseline models in 3 out of 4 language pairs. Our code is available on GitHub.

artificial intelligence, machine translation, natural language, (15 more...)

2307.07025

Country:

Europe > Belgium > Brussels-Capital Region > Brussels (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(13 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Dinh, Tu Anh, Niehues, Jan

Perturbation-based QE: An Explainable, Unsupervised Word-level Quality Estimation Method for Blackbox Machine Translation

arXiv.org Artificial IntelligenceJul-13-2023

Quality Estimation (QE) is the task of predicting the quality of Machine Translation (MT) system output, without using any gold-standard translation references. State-of-the-art QE models are supervised: they require human-labeled quality of some MT system output on some datasets for training, making them domain-dependent and MT-system-dependent. There has been research on unsupervised QE, which requires glass-box access to the MT systems, or parallel MT data to generate synthetic errors for training QE models. In this paper, we present Perturbation-based QE - a word-level Quality Estimation approach that works simply by analyzing MT system output on perturbed input source sentences. Our approach is unsupervised, explainable, and can evaluate any type of blackbox MT systems, including the currently prominent large language models (LLMs) with opaque internal processes. For language directions with no labeled QE data, our approach has similar or better performance than the zero-shot supervised approach on the WMT21 shared task. Our approach is better at detecting gender bias and word-sense-disambiguation errors in translation than supervised QE, indicating its robustness to out-of-domain usage. The performance gap is larger when detecting errors on a nontraditional translation-prompting LLM, indicating that our approach is more generalizable to different MT systems. We give examples demonstrating our approach's explainability power, where it shows which input source words have influence on a certain MT output word.

artificial intelligence, machine translation, natural language, (14 more...)

2305.07457

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > Canada > Quebec > Montreal (0.04)
(10 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)