AITopics

doi: 10.1145/3564271

2203.10929

Country:

Asia > China > Hubei Province > Wuhan (0.04)
Asia > Japan > Honshū > Chūbu > Aichi Prefecture > Nagoya (0.04)
Asia > China > Beijing > Beijing (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceDec-7-2022

Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information

Ding, Fenglin, Wan, Genshun, Li, Pengcheng, Pan, Jia, Liu, Cong

Multilingual end-to-end models have shown great improvement over monolingual systems. With the development of pre-training methods on speech, self-supervised multilingual speech representation learning like XLSR has shown success in improving the performance of multilingual automatic speech recognition (ASR). However, similar to the supervised learning, multilingual pre-training may also suffer from language interference and further affect the application of multilingual system. In this paper, we introduce several techniques for improving self-supervised multilingual pre-training by leveraging auxiliary language information, including the language adversarial training, language embedding and language adaptive training during the pre-training stage. We conduct experiments on a multilingual ASR task consisting of 16 languages. Our experimental results demonstrate 14.3% relative gain over the standard XLSR model, and 19.8% relative gain over the no pre-training multilingual model.

artificial intelligence, machine learning, natural language, (16 more...)

2212.03476

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.72)

arXiv.org Artificial IntelligenceDec-7-2022

M3ST: Mix at Three Levels for Speech Translation

Cheng, Xuxin, Dong, Qianqian, Yue, Fengpeng, Ko, Tom, Wang, Mingxuan, Zou, Yuexian

How to solve the data scarcity problem for end-to-end speech-to-text translation (ST)? It's well known that data augmentation is an efficient method to improve performance for many tasks by enlarging the dataset. In this paper, we propose Mix at three levels for Speech Translation (M^3ST) method to increase the diversity of the augmented training corpus. Specifically, we conduct two phases of fine-tuning based on a pre-trained model using external machine translation (MT) data. In the first stage of fine-tuning, we mix the training corpus at three levels, including word level, sentence level and frame level, and fine-tune the entire model with mixed data. At the second stage of fine-tuning, we take both original speech sequences and original text sequences in parallel into the model to fine-tune the network, and use Jensen-Shannon divergence to regularize their outputs. Experiments on MuST-C speech translation benchmark and analysis show that M^3ST outperforms current strong baselines and achieves state-of-the-art results on eight directions with an average BLEU of 29.9.

artificial intelligence, natural language, translation, (18 more...)

2212.03657

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Amrhein, Chantal, Moghe, Nikita, Guillou, Liane

ACES: Translation Accuracy Challenge Sets for Evaluating Machine Translation Metrics

As machine translation (MT) metrics improve their correlation with human judgement every year, it is crucial to understand the limitations of such metrics at the segment level. Specifically, it is important to investigate metric behaviour when facing accuracy errors in MT because these can have dangerous consequences in certain contexts (e.g., legal, medical). We curate ACES, a translation accuracy challenge set, consisting of 68 phenomena ranging from simple perturbations at the word/character level to more complex errors based on discourse and real-world knowledge. We use ACES to evaluate a wide range of MT metrics including the submissions to the WMT 2022 metrics shared task and perform several analyses leading to general recommendations for metric developers. We recommend: a) combining metrics with different strengths, b) developing metrics that give more weight to the source and less to surface-level overlap with the reference and c) explicitly modelling additional language-specific information beyond what is available via multilingual embeddings.

artificial intelligence, natural language, translation, (16 more...)

2210.15615

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(24 more...)

Genre: Research Report (0.81)

Industry: Law (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Life-long Learning for Multilingual Neural Machine Translation with Knowledge Distillation

Zhao, Yang, Zhu, Junnan, Xiang, Lu, Zhang, Jiajun, Zhou, Yu, Zhai, Feifei, Zong, Chengqing

A common scenario of Multilingual Neural Machine Translation (MNMT) is that each translation task arrives in a sequential manner, and the training data of previous tasks is unavailable. In this scenario, the current methods suffer heavily from catastrophic forgetting (CF). To alleviate the CF, we investigate knowledge distillation based life-long learning methods. Specifically, in one-tomany scenario, we propose a multilingual distillation method to make the new model (student) jointly learn multilingual output from old model (teacher) and new task. In many-to one scenario, we find that direct distillation faces the extreme partial distillation problem, and we propose two different methods to address it: pseudo input distillation and reverse teacher distillation. The experimental results on twelve translation tasks show that the proposed methods can better consolidate the previous knowledge and sharply alleviate the CF.

artificial intelligence, distillation, natural language, (16 more...)

2212.028

Country:

Asia > China > Beijing > Beijing (0.05)
Europe > Italy (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report (0.65)

Industry: Education > Educational Setting > Continuing Education (0.85)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Behre, Piyush, Tan, Sharman, Varadharajan, Padma, Chang, Shuangyu

Streaming Punctuation for Long-form Dictation with Transformers

While speech recognition Word Error Rate (WER) has reached human parity for English, longform dictation scenarios still suffer from segmentation and punctuation problems resulting from irregular pausing patterns or slow speakers. Transformer sequence tagging models are effective at capturing long bi-directional context, which is crucial for automatic punctuation. Automatic Speech Recognition (ASR) production systems, however, are constrained by real-time requirements, making it hard to incorporate the right context when making punctuation decisions. In this paper, we propose a streaming approach for punctuation or re-punctuation of ASR output using dynamic decoding windows and measure its impact on punctuation and segmentation accuracy across scenarios. Streaming punctuation achieves an average BLEU-score improvement of 0.66 for the downstream task of Machine Translation (MT). NTRODUCTION Our hybrid Automatic Speech Recognition (ASR) generates punctuation with two systems working together.

artificial intelligence, machine learning, natural language, (15 more...)

2210.05756

Country: North America > United States > New York (0.05)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Machine Translation with Contrastive Translation Memories

Cheng, Xin, Gao, Shen, Liu, Lemao, Zhao, Dongyan, Yan, Rui

Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios. Different from previous works that make use of mutually similar but redundant translation memories~(TMs), we propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence while individually contrastive to each other providing maximal information gains in three phases. First, in TM retrieval phase, we adopt a contrastive retrieval algorithm to avoid redundancy and uninformativeness of similar translation pieces. Second, in memory encoding stage, given a set of TMs we propose a novel Hierarchical Group Attention module to gather both local context of each TM and global context of the whole TM set. Finally, in training phase, a Multi-TM contrastive learning objective is introduced to learn salient feature of each TM with respect to target sentence. Experimental results show that our framework obtains improvements over strong baselines on the benchmark datasets.

artificial intelligence, natural language, translation memory, (15 more...)

2212.0314

Country:

Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)
Asia > South Korea (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Armstrong, Ruth-Ann, Hewitt, John, Manning, Christopher

JamPatoisNLI: A Jamaican Patois Natural Language Inference Dataset

JamPatoisNLI provides the first dataset for natural language inference in a creole language, Jamaican Patois. Many of the most-spoken low-resource languages are creoles. These languages commonly have a lexicon derived from a major world language and a distinctive grammar reflecting the languages of the original speakers and the process of language birth by creolization. This gives them a distinctive place in exploring the effectiveness of transfer from large monolingual or multilingual pretrained models. While our work, along with previous work, shows that transfer from these models to low-resource languages that are unrelated to languages in their training set is not very effective, we would expect stronger results from transfer to creoles. Indeed, our experiments show considerably better results from few-shot learning of JamPatoisNLI than for such unrelated languages, and help us begin to understand how the unique relationship between creoles and their high-resource base languages affect cross-lingual transfer. JamPatoisNLI, which consists of naturally-occurring premises and expert-written hypotheses, is a step towards steering research into a traditionally underserved language and a useful benchmark for understanding cross-lingual NLP.

artificial intelligence, machine learning, natural language, (19 more...)

2212.03419

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Africa > Nigeria > Jigawa State > Dutse (0.04)
Pacific Ocean (0.04)
(9 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Ananthanarayana, Tejaswini, Chaudhary, Lipisha, Nwogu, Ifeoma

SignNet: Single Channel Sign Generation using Metric Embedded Learning

A true interpreting agent not only understands sign language and translates to text, but also understands text and translates to signs. Much of the AI work in sign language translation to date has focused mainly on translating from signs to text. Towards the latter goal, we propose a text-to-sign translation model, SignNet, which exploits the notion of similarity (and dissimilarity) of visual signs in translating. This module presented is only one part of a dual-learning two task process involving text-to-sign (T2S) as well as sign-to-text (S2T). We currently implement SignNet as a single channel architecture so that the output of the T2S task can be fed into S2T in a continuous dual learning framework. By single channel, we refer to a single modality, the body pose joints. In this work, we present SignNet, a T2S task using a novel metric embedding learning process, to preserve the distances between sign embeddings relative to their dissimilarity. We also describe how to choose positive and negative examples of signs for similarity testing. From our analysis, we observe that metric embedding learning-based model perform significantly better than the other models with traditional losses, when evaluated using BLEU scores. In the task of gloss to pose, SignNet performed as well as its state-of-the-art (SoTA) counterparts and outperformed them in the task of text to pose, by showing noteworthy enhancements in BLEU 1 - BLEU 4 scores (BLEU 1: 31->39; ~26% improvement and BLEU 4: 10.43->11.84; ~14\% improvement) when tested on the popular RWTH PHOENIX-Weather-2014T benchmark dataset

machine learning, natural language, signnet, (19 more...)

2212.02848

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
North America > United States > New York > Erie County > Buffalo (0.04)
Europe > Germany > Saxony > Dresden (0.04)

Genre: Research Report (0.50)

Industry: Education > Curriculum > Subject-Specific Education (0.62)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.90)

Baziotis, Christos, Artetxe, Mikel, Cross, James, Bhosale, Shruti

Multilingual Machine Translation with Hyper-Adapters

arXiv.org Artificial IntelligenceDec-5-2022

Multilingual machine translation suffers from negative interference across languages. A common solution is to relax parameter sharing with language-specific modules like adapters. However, adapters of related languages are unable to transfer information, and their total number of parameters becomes prohibitively expensive as the number of languages grows. In this work, we overcome these drawbacks using hyper-adapters -- hyper-networks that generate adapters from language and layer embeddings. While past work had poor results when scaling hyper-networks, we propose a rescaling fix that significantly improves convergence and enables training larger hyper-networks. We find that hyper-adapters are more parameter efficient than regular adapters, reaching the same performance with up to 12 times less parameters. When using the same number of parameters and FLOPS, our approach consistently outperforms regular adapters. Also, hyper-adapters converge faster than alternative approaches and scale better than regular dense networks. Our analysis shows that hyper-adapters learn to encode language relatedness, enabling positive transfer across languages.

artificial intelligence, machine learning, natural language, (16 more...)

2205.10835

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(9 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)