AITopics | Machine Translation


Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice by using free translators such as: CAPITA, Google Translate, SDL International, SYSTRAN.


TensorFlow Addons Networks: Sequence-to-Sequence NMT with Attention Mechanism

#artificialintelligence, Jan-15-2021, 19:20:17 GMT

The basic idea behind such a model, though, is simply the encoder-decoder architecture. These networks are used for a variety of tasks such as text summarization, machine translation, and image captioning.
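As a rough illustration of the attention step in an encoder-decoder model, the sketch below computes dot-product attention weights and a context vector in plain Python. It is not the code from the linked article and does not use TensorFlow Addons; the function names `softmax` and `attend` and the toy vectors are assumptions made for this example.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Dot-product attention: score each encoder state against the decoder state,
    normalise the scores, and return the weights plus the weighted-sum context."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc)) for enc in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context


# Toy encoder outputs for a three-token source sentence (hypothetical values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([1.0, 0.0], encoder_states)
```

At each decoding step the decoder would combine `context` with its own state before predicting the next token; real systems typically use learned projections rather than a raw dot product.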

attention mechanism, sequence-to-sequence nmt, tensorflow addon network, (2 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)


SG-Net: Syntax Guided Transformer for Language Representation

Zhang, Zhuosheng, Wu, Yuwei, Zhou, Junru, Duan, Sufeng, Zhao, Hai, Wang, Rui

arXiv.org Artificial Intelligence, Jan-7-2021

Understanding human language is one of the key themes of artificial intelligence. For language representation, the capacity of effectively modeling the linguistic knowledge from the detail-riddled and lengthy texts and getting rid of the noises is essential to improve its performance. Traditional attentive models attend to all words without explicit constraint, which results in inaccurate concentration on some dispensable words. In this work, we propose using syntax to guide the text modeling by incorporating explicit syntactic constraints into attention mechanisms for better linguistically motivated word representations. In detail, for self-attention network (SAN) sponsored Transformer-based encoder, we introduce syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention. Syntax-guided network (SG-Net) is then composed of this extra SDOI-SAN and the SAN from the original Transformer encoder through a dual contextual architecture for better linguistics inspired representation. The proposed SG-Net is applied to typical Transformer encoders. Extensive experiments on popular benchmark tasks, including machine reading comprehension, natural language inference, and neural machine translation show the effectiveness of the proposed SG-Net design.
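One way to picture a syntactic constraint on self-attention is a boolean mask derived from a dependency parse. The sketch below assumes, for illustration only, that each token may attend to itself and to its ancestors in the dependency tree. The function name `ancestor_mask`, the head-index encoding, and the toy sentence are hypothetical and are not taken from the paper's implementation.

```python
def ancestor_mask(heads):
    """Build a boolean attention mask from dependency heads.

    heads[i] is the index of token i's head, or -1 for the root.
    mask[i][j] is True when token j is token i itself or one of its ancestors.
    Assumes the heads form a tree (no cycles).
    """
    n = len(heads)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        j = i
        while j != -1:
            mask[i][j] = True
            j = heads[j]
    return mask


# Toy parse of "She reads long books": "reads" is the root,
# "She" and "books" depend on "reads", and "long" depends on "books".
heads = [1, -1, 3, 1]
mask = ancestor_mask(heads)
```

In a masked attention layer, positions where the mask is False would have their scores set to a large negative value before the softmax, so that the corresponding weights become effectively zero.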

computational linguistic, proceedings, representation, (15 more...)

arXiv.org Artificial Intelligence

2012.13915

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > Missouri (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(9 more...)

Genre: Research Report > Experimental Study (0.68)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(3 more...)


Can Everybody Sign Now? Exploring Sign Language Video Generation from 2D Poses

Ventura, Lucas, Duarte, Amanda, Giro-i-Nieto, Xavier

arXiv.org Artificial Intelligence, Jan-4-2021

Sign Language is the primary means of communication of the Deaf community but barely known by the rest of the population. This situation creates difficulties in conversations between sign and non-sign language speakers, which are normally addressed with textual transcriptions of the spoken language, or the sign-speakers developing lipreading and oral communication skills. The communication barrier between sign and non-sign language speakers may be reduced in the coming years thanks to the recent advances in neural machine translation and computer vision. Recent works [5,6,9] are making steps towards sign language translation by automatically generating detailed human pose skeletons from spoken language. Skeletons are represented by 2D/3D coordinates of human joints also known as keypoints; given a set of estimated keypoints, one can visualize them as a wired skeleton connecting the modeled joints (see the middle row of Figure 1).

keypoint, skeleton visualization, video, (10 more...)

arXiv.org Artificial Intelligence

2012.10941

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.05)

Genre: Research Report (0.50)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Faster Re-translation Using Non-Autoregressive Model For Simultaneous Neural Machine Translation

Han, Hyojung, Indurthi, Sathish, Zaidi, Mohd Abbas, Lakumarapu, Nikhil Kumar, Lee, Beomseok, Kim, Sangha, Kim, Chanwoo, Hwang, Inchul

arXiv.org Artificial Intelligence, Dec-29-2020

Recently, simultaneous translation has gathered a lot of attention since it enables compelling applications such as subtitle translation for a live event or real-time video-call translation. Some of these translation applications allow editing of partial translation giving rise to re-translation approaches. The current re-translation approaches are based on autoregressive sequence generation models (ReTA), which generate target tokens in the (partial) translation sequentially. The multiple re-translations with sequential generation in ReTA models lead to an increased inference time gap between the incoming source input and the corresponding target output as the source input grows. Besides, due to the large number of inference operations involved, the ReTA models are not favorable for resource-constrained devices. In this work, we propose a faster re-translation system based on a non-autoregressive sequence generation model (FReTNA) to overcome the aforementioned limitations. We evaluate the proposed model on multiple translation tasks and our model reduces the inference times by several orders and achieves a competitive BLEU score compared to the ReTA and streaming (Wait-k) models. The proposed model reduces the average computation time by a factor of 20 when compared to the ReTA model by incurring a small drop in the translation quality. It also outperforms the streaming-based Wait-k model both in terms of computation time (1.5 times lower) and translation quality.

classifier, sequence, translation, (16 more...)

arXiv.org Artificial Intelligence

2012.14681

Country:

North America > United States > North Carolina (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Germany > Berlin (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)


Neural Text Generation with Artificial Negative Examples

Shirai, Keisuke, Hashimoto, Kazuma, Eriguchi, Akiko, Ninomiya, Takashi, Mori, Shinsuke

arXiv.org Artificial Intelligence, Dec-28-2020

Neural text generation models conditioning on given input (e.g. machine translation and image captioning) are usually trained by maximum likelihood estimation of target text. However, the trained models suffer from various types of errors at inference time. In this paper, we propose to suppress an arbitrary type of errors by training the text generation model in a reinforcement learning framework, where we use a trainable reward function that is capable of discriminating between references and sentences containing the targeted type of errors. We create such negative examples by artificially injecting the targeted errors to the references. In experiments, we focus on two error types, repeated and dropped tokens in model-generated text. The experimental results show that our method can suppress the generation errors and achieve significant improvements on two machine translation and two image captioning tasks.
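The idea of creating negative examples by injecting targeted errors into references can be sketched in a few lines. The snippet below is only an assumption about what such injection might look like for the two error types named in the abstract (repeated and dropped tokens). The function names `inject_repeat` and `inject_drop` and the example sentence are hypothetical and do not reproduce the authors' procedure.

```python
import random


def inject_repeat(tokens, rng):
    """Return a copy of tokens with one randomly chosen token duplicated in place."""
    i = rng.randrange(len(tokens))
    return tokens[:i + 1] + [tokens[i]] + tokens[i + 1:]


def inject_drop(tokens, rng):
    """Return a copy of tokens with one randomly chosen token removed."""
    i = rng.randrange(len(tokens))
    return tokens[:i] + tokens[i + 1:]


rng = random.Random(0)  # fixed seed so the corruption is reproducible
reference = "the cat sat on the mat".split()
repeated = inject_repeat(reference, rng)
dropped = inject_drop(reference, rng)
```

Pairs of the form (reference, corrupted reference) could then serve as positive and negative examples for a reward model that learns to tell the two apart.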

discriminator, generation model, machine translation, (14 more...)

arXiv.org Artificial Intelligence

2012.14124

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(4 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
(2 more...)


Future-Guided Incremental Transformer for Simultaneous Translation

Zhang, Shaolei, Feng, Yang, Li, Liangyou

arXiv.org Artificial Intelligence, Dec-22-2020

Simultaneous translation (ST) starts translations synchronously while reading source sentences, and is used in many online scenarios. The previous wait-k policy is concise and achieved good results in ST. However, wait-k policy faces two weaknesses: low training speed caused by the recalculation of hidden states and lack of future source information to guide training. For the low training speed, we propose an incremental Transformer with an average embedding layer (AEL) to accelerate the speed of calculation of the hidden states during training. For future-guided training, we propose a conventional Transformer as the teacher of the incremental Transformer, and try to invisibly embed some future information in the model through knowledge distillation. We conducted experiments on Chinese-English and German-English simultaneous translation tasks and compared with the wait-k policy to evaluate the proposed method. Our method can effectively increase the training speed by about 28 times on average at different k and implicitly embed some predictive abilities in the model, achieving better translation quality than wait-k baseline.
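The wait-k policy mentioned in this abstract is commonly described as follows: read k source tokens first, then alternate between writing one target token and reading one more source token until the source is exhausted. A minimal sketch of that read/write schedule follows. The function names are hypothetical, and the sketch shows only the baseline policy, not the incremental Transformer proposed in the paper.

```python
def wait_k_visible(k, target_step, source_len):
    """Number of source tokens visible when producing the target token
    at 1-indexed position target_step under a wait-k policy."""
    return min(k + target_step - 1, source_len)


def wait_k_schedule(k, source_len, target_len):
    """List the READ/WRITE actions a wait-k policy takes for one sentence pair."""
    actions = []
    read = 0
    for t in range(1, target_len + 1):
        needed = wait_k_visible(k, t, source_len)
        while read < needed:
            actions.append("READ")
            read += 1
        actions.append("WRITE")
    return actions


schedule = wait_k_schedule(k=2, source_len=4, target_len=4)
```

With `k=2` the model lags the source by two tokens until the source ends, after which the remaining target tokens are written without further reads.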

baseline, conventional transformer, transformer, (13 more...)

arXiv.org Artificial Intelligence

2012.12465

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Asia > India > Karnataka > Bengaluru (0.04)
(14 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


A Distributional Approach to Controlled Text Generation

Khalifa, Muhammad, Elsahar, Hady, Dymetman, Marc

arXiv.org Artificial Intelligence, Dec-21-2020

We propose a Distributional Approach to address Controlled Text Generation from pre-trained Language Models (LMs). This view permits to define, in a single formal framework, "pointwise" and "distributional" constraints over the target LM -- to our knowledge, this is the first approach with such generality -- while minimizing KL divergence with the initial LM distribution. The optimal target distribution is then uniquely determined as an explicit EBM (Energy-Based Model) representation. From that optimal representation we then train the target controlled autoregressive LM through an adaptive distributional variant of Policy Gradient. We conduct a first set of experiments over pointwise constraints showing the advantages of our approach over a set of baselines, in terms of obtaining a controlled LM balancing constraint satisfaction with divergence from the initial LM (GPT-2). We then perform experiments over distributional constraints, a unique feature of our approach, demonstrating its potential as a remedy to the problem of Bias in Language Models. Through an ablation study we show the effectiveness of our adaptive technique for obtaining faster convergence.

constraint, iclr 2021, wikileaks, (14 more...)

arXiv.org Artificial Intelligence

2012.11635

Country:

Europe > United Kingdom (0.27)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Czechia (0.05)
(66 more...)

Genre:

Personal (0.92)
Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Media > Television (1.00)
Media > Film (1.00)
Law > Civil Rights & Constitutional Law (1.00)
(19 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)


Fundamental Limits and Tradeoffs in Invariant Representation Learning

Zhao, Han, Dan, Chen, Aragam, Bryon, Jaakkola, Tommi S., Gordon, Geoffrey J., Ravikumar, Pradeep

arXiv.org Machine Learning, Dec-19-2020

Many machine learning applications involve learning representations that achieve two competing goals: To maximize information or accuracy with respect to a subset of features (e.g., for prediction) while simultaneously maximizing invariance or independence with respect to another, potentially overlapping, subset of features (e.g., for fairness, privacy, etc). Typical examples include privacy-preserving learning, domain adaptation, and algorithmic fairness, just to name a few. In fact, all of the above problems admit a common minimax game-theoretic formulation, whose equilibrium represents a fundamental tradeoff between accuracy and invariance. Despite its abundant applications in the aforementioned domains, theoretical understanding on the limits and tradeoffs of invariant representations is severely lacking. In this paper, we provide an information-theoretic analysis of this general and important problem under both classification and regression settings. In both cases, we analyze the inherent tradeoffs between accuracy and invariance by providing a geometric characterization of the feasible region in the information plane, where we connect the geometric properties of this feasible region to the fundamental limitations of the tradeoff problem. In the regression setting, we also derive a tight lower bound on the Lagrangian objective that quantifies the tradeoff between accuracy and invariance. This lower bound leads to a better understanding of the tradeoff via the spectral properties of the joint distribution. In both cases, our results shed new light on this fundamental problem by providing insights on the interplay between accuracy and invariance. These results deepen our understanding of this fundamental problem and may be useful in guiding the design of adversarial representation learning algorithms.

representation, theorem 5, var, (15 more...)

arXiv.org Machine Learning

2012.10713

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)


Efficient Object-Level Visual Context Modeling for Multimodal Machine Translation: Masking Irrelevant Objects Helps Grounding

Wang, Dexin, Xiong, Deyi

arXiv.org Artificial Intelligence, Dec-18-2020

Visual context provides grounding information for multimodal machine translation (MMT). However, previous MMT models and probing studies on visual features suggest that visual information is less explored in MMT as it is often redundant to textual information. In this paper, we propose an object-level visual context modeling framework (OVC) to efficiently capture and explore visual information for multimodal machine translation. With detected objects, the proposed OVC encourages MMT to ground translation on desirable visual objects by masking irrelevant objects in the visual modality. We equip the proposed OVC with an additional object-masking loss to achieve this goal. The object-masking loss is estimated according to the similarity between masked objects and the source texts so as to encourage masking source-irrelevant objects. Additionally, in order to generate vision-consistent target words, we further propose a vision-weighted translation loss for OVC. Experiments on MMT datasets demonstrate that the proposed OVC model outperforms state-of-the-art MMT models and analyses show that masking irrelevant objects helps grounding in MMT.

ovc, source text, translation, (14 more...)

arXiv.org Artificial Intelligence

2101.05208

Country:

Europe > Italy > Tuscany > Florence (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
Europe > Germany > Berlin (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)


Finding Sparse Structure for Domain Specific Neural Machine Translation

Liang, Jianze, Zhao, Chengqi, Wang, Mingxuan, Qiu, Xipeng, Li, Lei

arXiv.org Artificial Intelligence, Dec-18-2020

Fine-tuning is a major approach for domain adaptation in Neural Machine Translation (NMT). However, unconstrained fine-tuning requires very careful hyper-parameter tuning otherwise it is easy to fall into over-fitting on the target domain and degradation on the general domain. To mitigate it, we propose PRUNE-TUNE, a novel domain adaptation method via gradual pruning. It learns tiny domain-specific subnetworks for tuning. During adaptation to a new domain, we only tune its corresponding subnetwork. PRUNE-TUNE alleviates the over-fitting and the degradation problem without model modification. Additionally, with no overlapping between domain-specific subnetworks, PRUNE-TUNE is also capable of sequential multi-domain learning. Empirical experiment results show that PRUNE-TUNE outperforms several strong competitors in the target domain test set without the quality degradation of the general domain in both single and multiple domain settings.
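Tuning only a domain's own subnetwork can be pictured as a gradient step gated by a binary mask. The sketch below is a simplified illustration under that assumption. The function name `masked_sgd_step`, the domain names, and the hand-written masks are hypothetical, and how the paper actually derives its subnetworks through gradual pruning is not shown here.

```python
def masked_sgd_step(weights, grads, mask, lr):
    """Apply one SGD step only where mask is True; all other weights stay frozen."""
    return [w - lr * g if m else w for w, g, m in zip(weights, grads, mask)]


weights = [1.0, 2.0, 3.0, 4.0]
grads = [0.5, 0.5, 0.5, 0.5]

# Hypothetical, non-overlapping subnetwork masks for two domains.
news_mask = [False, True, False, False]
legal_mask = [False, False, True, False]

# Adapt to one domain, then the other; each step touches only its own subnetwork.
after_news = masked_sgd_step(weights, grads, news_mask, lr=0.1)
after_legal = masked_sgd_step(after_news, grads, legal_mask, lr=0.1)
```

Because the two masks do not overlap, adapting to the second domain leaves the parameters tuned for the first domain, and the unmasked general-domain parameters, unchanged.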

domain adaptation, proceedings, subnetwork, (10 more...)

arXiv.org Artificial Intelligence

2012.10586

Country: Asia > China (0.04)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
