 en-fr



Appendix for Data Diversification: A Simple Strategy For Neural Machine Translation

Xuan-Phi Nguyen

Neural Information Processing Systems

Finally, we describe the training setup for our back-translation experiments. We continue to differentiate our method from other existing works. Our method does not train multiple peer models with EM training either. In each round, a forward (or backward) model takes its turn playing the "back-translation" role to train the other model. The role is switched in the next round. In other words, source and target are identical.



A Classification-Guided Approach for Adversarial Attacks against Neural Machine Translation

arXiv.org Artificial Intelligence

Neural Machine Translation (NMT) models have been shown to be vulnerable to adversarial attacks, wherein carefully crafted perturbations of the input can mislead the target model. In this paper, we introduce ACT, a novel adversarial attack framework against NMT systems guided by a classifier. In our attack, the adversary aims to craft meaning-preserving adversarial examples whose translations in the target language by the NMT model belong to a different class than the original translations. Unlike previous attacks, our new approach has a more substantial effect on the translation by altering the overall meaning, which then leads to a different class determined by an oracle classifier. To evaluate the robustness of NMT models to our attack, we propose enhancements to existing black-box word-replacement-based attacks by incorporating output translations of the target NMT model and the output logits of a classifier within the attack process. Extensive experiments, including a comparison with existing untargeted attacks, show that our attack is considerably more successful in altering the class of the output translation and has more effect on the translation. This new paradigm can reveal the vulnerabilities of NMT systems by focusing on the class of translation rather than the mere translation quality as studied traditionally.


An Empirical study of Unsupervised Neural Machine Translation: analyzing NMT output, model's behavior and sentences' contribution

arXiv.org Artificial Intelligence

Unsupervised Neural Machine Translation (UNMT) focuses on improving NMT results under the assumption that there is no human-translated parallel data, yet little work has been done so far to highlight its advantages over supervised methods or to analyze its output in aspects other than translation accuracy. We focus on three very diverse languages, French, Gujarati, and Kazakh, and train bilingual NMT models, to and from English, with various levels of supervision, in high- and low-resource setups. We measure the quality of the NMT output and compare the generated sequences' word order and semantic similarity to source and reference sentences. We also use Layer-wise Relevance Propagation to evaluate the source and target sentences' contribution to the result, expanding the findings of previous works to the UNMT paradigm.


Order Matters in the Presence of Dataset Imbalance for Multilingual Learning

arXiv.org Artificial Intelligence

In this paper, we empirically study the optimization dynamics of multi-task learning, particularly focusing on those that govern a collection of tasks with significant data imbalance. We present a simple yet effective method of pre-training on high-resource tasks, followed by fine-tuning on a mixture of high/low-resource tasks. We provide a thorough empirical study and analysis of this method's benefits showing that it achieves consistent improvements relative to the performance trade-off profile of standard static weighting. We analyze under what data regimes this method is applicable and show its improvements empirically in neural machine translation (NMT) and multi-lingual language modeling.
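The two-stage recipe described in this abstract can be pictured as a task-sampling schedule. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `task_schedule`, the `low_weight` mixing probability, and the task names are all hypothetical and chosen only for illustration.

```python
import random


def task_schedule(num_steps, pretrain_steps, high_tasks, low_tasks,
                  low_weight=0.5, seed=0):
    """Yield one task name per training step (illustrative only).

    Stage 1 (the first `pretrain_steps` steps): sample only high-resource tasks.
    Stage 2 (the remaining steps): sample from a mixture, choosing a
    low-resource task with probability `low_weight` (an assumed static weight).
    """
    rng = random.Random(seed)
    for step in range(num_steps):
        if step < pretrain_steps or not low_tasks:
            # Pre-training stage: high-resource tasks only.
            yield rng.choice(high_tasks)
        elif rng.random() < low_weight:
            # Fine-tuning stage: low-resource task drawn from the mixture.
            yield rng.choice(low_tasks)
        else:
            # Fine-tuning stage: high-resource task drawn from the mixture.
            yield rng.choice(high_tasks)


# Hypothetical task names: one high-resource pair, one low-resource pair.
schedule = list(task_schedule(num_steps=10, pretrain_steps=6,
                              high_tasks=["en-fr"], low_tasks=["en-gu"]))
```

In this sketch the first six steps train only on the high-resource task, and the low-resource task can appear only in the final four steps; how long to pre-train and how to weight the mixture are the choices the paper studies empirically.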


Rethinking GNN-based Entity Alignment on Heterogeneous Knowledge Graphs: New Datasets and A New Method

arXiv.org Artificial Intelligence

The development of knowledge graph (KG) applications has led to a rising need for entity alignment (EA) between heterogeneous KGs that are extracted from various sources. Recently, graph neural networks (GNNs) have been widely adopted in EA tasks due to GNNs' impressive ability to capture structure information. However, we have observed that the oversimplified settings of the existing common EA datasets are distant from real-world scenarios, which obstructs a full understanding of the advancements achieved by recent methods. This phenomenon makes us ponder: Do existing GNN-based EA methods really make great progress? In this paper, to study the performance of EA methods in realistic settings, we focus on the alignment of highly heterogeneous KGs (HHKGs) (e.g., event KGs and general KGs), which differ in scale and structure and share fewer overlapping entities. First, we sweep away the unreasonable settings and propose two new HHKG datasets that closely mimic real-world EA scenarios. Then, based on the proposed datasets, we conduct extensive experiments to evaluate previous representative EA methods, and reveal interesting findings about the progress of GNN-based EA methods. We find that the structural information becomes difficult to exploit but is still valuable in aligning HHKGs. This phenomenon leads to inferior performance of existing EA methods, especially GNN-based methods. Our findings shed light on the potential problems resulting from an impulsive application of GNN-based methods as a panacea for all EA datasets. Finally, we introduce a simple but effective method, Simple-HHEA, which comprehensively utilizes entity name, structure, and temporal information. Experimental results show that Simple-HHEA outperforms previous models on HHKG datasets.


Paraphrase Generation as Unsupervised Machine Translation

arXiv.org Artificial Intelligence

In this paper, we propose a new paradigm for paraphrase generation by treating the task as unsupervised machine translation (UMT), based on the assumption that there must be pairs of sentences expressing the same meaning in a large-scale unlabeled monolingual corpus. The proposed paradigm first splits a large unlabeled corpus into multiple clusters, and trains multiple UMT models using pairs of these clusters. Then, based on the paraphrase pairs produced by these UMT models, a unified surrogate model can be trained to serve as the final Seq2Seq model to generate paraphrases, which can be used directly for testing in the unsupervised setup, or be finetuned on labeled datasets in the supervised setup. The proposed method offers merits over machine-translation-based paraphrase generation methods, as it avoids reliance on bilingual sentence pairs. It also allows humans to intervene in the model so that more diverse paraphrases can be generated using different filtering criteria. Extensive experiments on existing paraphrase datasets for both the supervised and unsupervised setups demonstrate the effectiveness of the proposed paradigm.