AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Neural Information Processing Systems, Feb-8-2026, 21:46:07 GMT

Appendix for Data Diversification: A Simple Strategy For Neural Machine Translation Xuan-Phi Nguyen

Finally, we describe the training setup for our back-translation experiments. We continue to differentiate our method from other existing works. Our method does not train multiple peer models with EM training either. In each round, a forward (or backward) model takes its turn to play the "back-translation" role to train the other model. The role is switched in the next round. In other words, source and target are identical.

artificial intelligence, experiment, natural language

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > Canada (0.04)
Europe > Germany > Berlin (0.04)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Neural Information Processing Systems, Feb-8-2026, 21:46:00 GMT

Data Diversification: A Simple Strategy For Neural Machine Translation

Our method is applicable to all NMT models. It does not require extra monolingual data like back-translation, nor does it add more computations and parameters like ensembles of models.

machine learning, natural language

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.05)
North America > United States > Texas > Travis County > Austin (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.98)

Neural Information Processing Systems, Aug-20-2025, 01:03:12 GMT

c04c19c2c2474dbf5f7ac4372c5b9af1-AuthorFeedback.pdf

decoder, experiment, parallel sentence

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.33)

Sadrizadeh, Sahar, Dolamic, Ljiljana, Frossard, Pascal

A Classification-Guided Approach for Adversarial Attacks against Neural Machine Translation

arXiv.org Artificial Intelligence, Feb-22-2024

Neural Machine Translation (NMT) models have been shown to be vulnerable to adversarial attacks, wherein carefully crafted perturbations of the input can mislead the target model. In this paper, we introduce ACT, a novel adversarial attack framework against NMT systems guided by a classifier. In our attack, the adversary aims to craft meaning-preserving adversarial examples whose translations in the target language by the NMT model belong to a different class than the original translations. Unlike previous attacks, our new approach has a more substantial effect on the translation by altering the overall meaning, which then leads to a different class determined by an oracle classifier. To evaluate the robustness of NMT models to our attack, we propose enhancements to existing black-box word-replacement-based attacks by incorporating output translations of the target NMT model and the output logits of a classifier within the attack process. Extensive experiments, including a comparison with existing untargeted attacks, show that our attack is considerably more successful in altering the class of the output translation and has more effect on the translation. This new paradigm can reveal the vulnerabilities of NMT systems by focusing on the class of translation rather than the mere translation quality as studied traditionally.
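
The success criterion described in this abstract can be illustrated with a minimal sketch: an adversarial input counts as successful when it stays close to the original sentence while the class of its translation changes. The callables `translate`, `classify`, and `similarity`, the `min_similarity` threshold, and the toy stand-ins are all assumptions made for illustration; this is not the authors' ACT implementation.

```python
from typing import Callable


def attack_succeeds(
    original: str,
    adversarial: str,
    translate: Callable[[str], str],
    classify: Callable[[str], str],
    similarity: Callable[[str, str], float],
    min_similarity: float = 0.5,  # hypothetical threshold, not from the paper
) -> bool:
    """Return True if the adversarial input preserves meaning (by the given
    similarity measure) and its translation falls in a different class."""
    # Reject perturbations that drift too far from the original sentence.
    if similarity(original, adversarial) < min_similarity:
        return False
    # Compare the classes of the two translations.
    return classify(translate(original)) != classify(translate(adversarial))


# Toy stand-ins (hypothetical) so the sketch runs without any NMT model.
def toy_translate(text: str) -> str:
    return text.upper()


def toy_classify(translation: str) -> str:
    return "negative" if "BAD" in translation.split() else "positive"


def toy_similarity(a: str, b: str) -> float:
    # Jaccard overlap of word sets, a crude proxy for meaning preservation.
    set_a, set_b = set(a.split()), set(b.split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0
```

In a realistic setting the stand-ins would be replaced by the target NMT model, an oracle classifier, and a sentence-similarity measure.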

classifier, nmt model, translation

2308.15246

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Switzerland (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Tourni, Isidora Chara, Wijaya, Derry

An Empirical study of Unsupervised Neural Machine Translation: analyzing NMT output, model's behavior and sentences' contribution

arXiv.org Artificial Intelligence, Dec-19-2023

Unsupervised Neural Machine Translation (UNMT) focuses on improving NMT results under the assumption that there is no human-translated parallel data, yet little work has been done so far in highlighting its advantages compared to supervised methods and analyzing its output in aspects other than translation accuracy. We focus on three very diverse languages, French, Gujarati, and Kazakh, and train bilingual NMT models, to and from English, with various levels of supervision, in high- and low-resource setups. We measure the quality of the NMT output and compare the generated sequences' word order and semantic similarity to source and reference sentences. We also use Layer-wise Relevance Propagation to evaluate the source and target sentences' contribution to the result, expanding the findings of previous works to the UNMT paradigm.

contribution, experiment, translation

2312.12588

Country:

North America > United States > New York (0.04)
Europe > Germany > Saxony > Leipzig (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

arXiv.org Artificial Intelligence, Dec-11-2023

Order Matters in the Presence of Dataset Imbalance for Multilingual Learning

Choi, Dami, Xin, Derrick, Dadkhahi, Hamid, Gilmer, Justin, Garg, Ankush, Firat, Orhan, Yeh, Chih-Kuan, Dai, Andrew M., Ghorbani, Behrooz

In this paper, we empirically study the optimization dynamics of multi-task learning, particularly focusing on those that govern a collection of tasks with significant data imbalance. We present a simple yet effective method of pre-training on high-resource tasks, followed by fine-tuning on a mixture of high/low-resource tasks. We provide a thorough empirical study and analysis of this method's benefits showing that it achieves consistent improvements relative to the performance trade-off profile of standard static weighting. We analyze under what data regimes this method is applicable and show its improvements empirically in neural machine translation (NMT) and multi-lingual language modeling.
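
The two-stage schedule described in this abstract (pre-train on high-resource tasks, then fine-tune on a mixture) can be sketched as a function that returns per-task sampling probabilities for a given training step. The function name, the uniform split within each group, and the `low_weight` parameter are assumptions for illustration; the abstract does not specify the mixture weights.

```python
from typing import Dict, List, Set


def sampling_weights(
    step: int,
    pretrain_steps: int,
    tasks: List[str],
    high_resource: Set[str],
    low_weight: float = 0.5,  # hypothetical share given to low-resource tasks
) -> Dict[str, float]:
    """Return the probability of sampling each task at the given step."""
    high = [t for t in tasks if t in high_resource]
    low = [t for t in tasks if t not in high_resource]
    if not high:
        raise ValueError("at least one high-resource task is required")

    if step < pretrain_steps or not low:
        # Stage 1: pre-train on high-resource tasks only.
        return {t: (1.0 / len(high) if t in high_resource else 0.0) for t in tasks}

    # Stage 2: fine-tune on a mixture of high- and low-resource tasks.
    weights = {t: (1.0 - low_weight) / len(high) for t in high}
    weights.update({t: low_weight / len(low) for t in low})
    return weights
```

For example, with tasks `["en-fr", "en-gu"]` and `en-fr` marked as high-resource, every step before `pretrain_steps` samples only `en-fr`, and later steps split the sampling probability between the two tasks according to `low_weight`.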

cross-entropy loss, train cross-entropy loss, valid cross-entropy loss

2312.06134

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial Intelligence, Apr-10-2023

Rethinking GNN-based Entity Alignment on Heterogeneous Knowledge Graphs: New Datasets and A New Method

Jiang, Xuhui, Xu, Chengjin, Shen, Yinghan, Su, Fenglong, Wang, Yuanzhuo, Sun, Fei, Li, Zixuan, Shen, Huawei

The development of knowledge graph (KG) applications has led to a rising need for entity alignment (EA) between heterogeneous KGs that are extracted from various sources. Recently, graph neural networks (GNNs) have been widely adopted in EA tasks due to their impressive ability to capture structural information. However, we have observed that the oversimplified settings of the existing common EA datasets are distant from real-world scenarios, which obstructs a full understanding of the advancements achieved by recent methods. This raises the question: do existing GNN-based EA methods really make great progress? In this paper, to study the performance of EA methods in realistic settings, we focus on the alignment of highly heterogeneous KGs (HHKGs) (e.g., event KGs and general KGs), which differ in scale and structure and share fewer overlapping entities. First, we remove the unreasonable settings and propose two new HHKG datasets that closely mimic real-world EA scenarios. Then, based on the proposed datasets, we conduct extensive experiments to evaluate previous representative EA methods, and reveal interesting findings about the progress of GNN-based EA methods. We find that structural information becomes difficult to exploit but remains valuable in aligning HHKGs. This difficulty leads to inferior performance of existing EA methods, especially GNN-based methods. Our findings shed light on the potential problems resulting from an impulsive application of GNN-based methods as a panacea for all EA datasets. Finally, we introduce a simple but effective method, Simple-HHEA, which comprehensively utilizes entity name, structure, and temporal information. Experimental results show that Simple-HHEA outperforms previous models on HHKG datasets.

information, machine learning, natural language

2304.03468

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Dominican Republic (0.04)
Asia > Middle East > Iran (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.63)

arXiv.org Artificial Intelligence, Sep-9-2022

Paraphrase Generation as Unsupervised Machine Translation

Sun, Xiaofei, Tian, Yufei, Meng, Yuxian, Peng, Nanyun, Wu, Fei, Li, Jiwei, Fan, Chun

In this paper, we propose a new paradigm for paraphrase generation by treating the task as unsupervised machine translation (UMT), based on the assumption that there must be pairs of sentences expressing the same meaning in a large-scale unlabeled monolingual corpus. The proposed paradigm first splits a large unlabeled corpus into multiple clusters, and trains multiple UMT models using pairs of these clusters. Then, based on the paraphrase pairs produced by these UMT models, a unified surrogate model can be trained to serve as the final sequence-to-sequence model to generate paraphrases, which can be directly used for testing in the unsupervised setup, or be finetuned on labeled datasets in the supervised setup. The proposed method offers merits over machine-translation-based paraphrase generation methods, as it avoids reliance on bilingual sentence pairs. It also allows humans to intervene in the model so that more diverse paraphrases can be generated using different filtering criteria. Extensive experiments on existing paraphrase datasets for both the supervised and unsupervised setups demonstrate the effectiveness of the proposed paradigm.

machine translation, preprint arxiv, translation

2109.0295

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Germany > Berlin (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)