AITopics

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New York > Monroe County > Rochester (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.46)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.35)

AAAI Conferences, Jul-14-2014

Machine Translation with Real-Time Web Search

Cui, Lei (Harbin Institute of Technology) | Zhou, Ming (Microsoft Research) | Chen, Qiming (Shanghai Jiao Tong University) | Zhang, Dongdong (Microsoft Research) | Li, Mu (Microsoft Research)

Contemporary machine translation systems usually rely on offline data retrieved from the web for individual model training, such as translation models and language models. In contrast to existing methods, we propose a novel approach that treats machine translation as a web search task and utilizes the web on the fly to acquire translation knowledge. This end-to-end approach takes advantage of fresh web search results that are capable of leveraging tremendous web knowledge to obtain phrase-level candidates on demand and then compose sentence-level translations. Experimental results show that our web-based machine translation method demonstrates very promising performance in leveraging fresh translation knowledge and making translation decisions. Furthermore, when combined with offline models, it significantly outperforms a state-of-the-art phrase-based statistical machine translation system.

artificial intelligence, natural language, translation, (15 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.69)
Asia > China (0.68)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Journal of Artificial Intelligence Research, May-8-2014

Topic-Based Dissimilarity and Sensitivity Models for Translation Rule Selection

Zhang, M., Xiao, X., Xiong, D., Liu, Q.

Translation rule selection is the task of selecting appropriate translation rules for an ambiguous source-language segment. As translation ambiguities are pervasive in statistical machine translation, we introduce two topic-based models for translation rule selection which incorporate global topic information into translation disambiguation. We associate each synchronous translation rule with source- and target-side topic distributions. With these topic distributions, we propose a topic dissimilarity model to select desirable (less dissimilar) rules by imposing penalties for rules with a large value of dissimilarity of their topic distributions to those of given documents. In order to encourage the use of non-topic-specific translation rules, we also present a topic sensitivity model to balance translation rule selection between generic rules and topic-specific rules. Furthermore, we project target-side topic distributions onto the source-side topic model space so that we can benefit from topic information of both the source and target language. We integrate the proposed topic dissimilarity and sensitivity models into hierarchical phrase-based machine translation for synchronous translation rule selection. Experiments show that our topic-based translation rule selection model can substantially improve translation quality.
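
As a rough illustration of the dissimilarity idea described above, the sketch below scores each rule's topic distribution against a document's topic distribution and prefers the less dissimilar rule. The use of the Hellinger distance, the function name, and the toy distributions are all assumptions made for illustration; the abstract does not say which measure the authors use.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions over the same topics."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same number of topics")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


# Hypothetical topic distributions over three topics.
doc_topics = [0.7, 0.2, 0.1]
rules = {
    "rule_a": [0.6, 0.3, 0.1],    # close to the document's topics
    "rule_b": [0.05, 0.15, 0.8],  # dominated by a topic the document lacks
}

# Rank rules from least to most dissimilar; a decoder could use the
# distance as a penalty feature rather than a hard filter.
ranked = sorted(rules, key=lambda name: hellinger_distance(rules[name], doc_topics))
print(ranked)
```

Here the distance is 0 for identical distributions and 1 for distributions with no shared topic mass, so it can be read directly as a bounded penalty.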

topic model, translation, translation rule, (12 more...)

doi: 10.1613/jair.4265

AI Access Foundation

10878

Country:

Europe > Czechia > Prague (0.04)
Asia > South Korea (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

AAAI Conferences, May-7-2014

Mining Named Entity Translation from Non Parallel Corpora

Sellami, Rahma (MIRACL Sfax University) | Sadat, Fatiha (UQAM) | Belguith, Lamia Hadrich (MIRACL Sfax University)

In this paper, we address the problem of mining named entity translations, such as names of persons, organizations, and locations, from non parallel corpora. First, our study concentrates on different forms of named entity translation. Then, we introduce a new framework to extract all named entity translation types from a non parallel corpus. The proposed framework combines surface and linguistic-based approaches. It is language independent and does not rely on any external parallel resources such as bilingual lexicons or parallel corpora. Evaluations show that our approach for mining named entity translations from a non parallel corpus is highly effective and consistently improves the translation quality of an Arabic-to-French machine translation system.

entity translation, mining, non parallel corpora

The Twenty-Seventh International FLAIRS Conference

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

AAAI Conferences, May-7-2014

Comparison of Google Translation with Human Translation

Li, Haiying (University of Memphis) | Graesser, Arthur C. (University of Memphis) | Cai, Zhiqiang (University of Memphis)

Google Translate provides a multilingual machine-translation service by automatically translating one written language to another. Google Translate is allegedly limited in its translation accuracy, however. This study investigated the accuracy of Google Chinese-to-English translation from the perspectives of formality and cohesion with two comparisons: Google translation with human expert translation, and Google translation with the Chinese source language. The text sample was a collection of 289 spoken and written text excerpts from the Selected Works of Mao Zedong in both Chinese and English versions. Google Translate was used to translate the Chinese texts into English. These texts were analyzed with automated text analysis tools: the Chinese and English LIWC, and the Chinese and English Coh-Metrix. Results of Pearson correlations on formality and cohesion showed that the Google English translation was highly correlated with both the human English translation and the original Chinese texts.

google translation, human translation, translation

The Twenty-Seventh International FLAIRS Conference

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

P, Sarath Chandar A, Lauly, Stanislas, Larochelle, Hugo, Khapra, Mitesh M., Ravindran, Balaraman, Raykar, Vikas, Saha, Amrita

An Autoencoder Approach to Learning Bilingual Word Representations

arXiv.org Machine Learning, Feb-6-2014

Cross-language learning allows us to use training data from one language to build models for a different language. Many approaches to bilingual learning require that we have word-level alignment of sentences from parallel corpora. In this work we explore the use of autoencoder-based methods for cross-language learning of vectorial word representations that are aligned between two languages, while not relying on word-level alignments. We show that by simply learning to reconstruct the bag-of-words representations of aligned sentences, within and between languages, we can in fact learn high-quality representations and do without word alignments. Since training autoencoders on word observations presents certain computational issues, we propose and compare different variations adapted to this setting. We also propose an explicit correlation-maximizing regularizer that leads to significant improvement in performance. We empirically investigate the success of our approach on the problem of cross-language text classification, where a classifier trained on a given language (e.g., English) must learn to generalize to a different language (e.g., German). These experiments demonstrate that our approaches are competitive with the state of the art, achieving up to 10-14 percentage point improvements over the best reported results on this task.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

1402.1454

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.44)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.85)

Journal of Artificial Intelligence Research, Nov-22-2013

Unsupervised Sub-tree Alignment for Tree-to-Tree Translation

Xiao, T., Zhu, J.

This article presents a probabilistic sub-tree alignment model and its application to tree-to-tree machine translation. Unlike previous work, we do not resort to surface heuristics or expensive annotated data, but instead derive an unsupervised model to infer the syntactic correspondence between two languages. More importantly, the developed model is syntactically-motivated and does not rely on word alignments. As a by-product, our model outputs a sub-tree alignment matrix encoding a large number of diverse alignments between syntactic structures, from which machine translation systems can efficiently extract translation rules that are often filtered out due to the errors in 1-best alignment. Experimental results show that the proposed approach outperforms three state-of-the-art baseline approaches in both alignment accuracy and grammar quality. When applied to machine translation, our approach yields a +1.0 BLEU improvement and a -0.9 TER reduction on the NIST machine translation evaluation corpora. With tree binarization and fuzzy decoding, it even outperforms a state-of-the-art hierarchical phrase-based system.

alignment, probability, sub-tree alignment, (13 more...)

doi: 10.1613/jair.4033

AI Access Foundation

10850

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Costa-jussà, Marta R. (Institute for Infocomm Research) | Henríquez, Carlos (Universitat Politècnica de Catalunya) | Banchs, Rafael E. (Institute for Infocomm Research)

Evaluating Indirect Strategies for Chinese-Spanish Statistical Machine Translation: Extended Abstract

AAAI Conferences, Aug-3-2013

Although Chinese and Spanish are two of the most spoken languages in the world, not much research has been done in machine translation for this language pair. This paper focuses on investigating the state of the art of Chinese-to-Spanish statistical machine translation (SMT), which nowadays is one of the most popular approaches to machine translation. We conduct experimental work with the largest of these three corpora to explore alternative SMT strategies by means of using a pivot language. Three alternatives are considered for pivoting: cascading, pseudo-corpus and triangulation. As the pivot language, we use English, Arabic or French. Results show that, for a phrase-based SMT system, English is the best pivot language between Chinese and Spanish. We propose a system output combination using the pivot strategies which is capable of outperforming the direct translation strategy. The main objective of this work is to motivate and involve the research community in working on this important language pair, given its demographic impact.
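
To make the triangulation strategy more concrete, the sketch below combines a source-to-pivot phrase table with a pivot-to-target phrase table by summing over shared pivot phrases, i.e. p(target | source) = sum over pivot of p(pivot | source) * p(target | pivot). This is a commonly used formulation and not necessarily the exact one in the paper; the function name, the toy phrase tables, and their probabilities are hypothetical.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Combine two phrase tables through a pivot language.

    Each table maps a phrase to {translation: probability}. The result maps
    each source phrase to {target: sum_pivot p(pivot|src) * p(target|pivot)}.
    """
    src_to_tgt = {}
    for src, pivots in src_to_pivot.items():
        scores = defaultdict(float)
        for pivot, p_pivot in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt in pivot_to_tgt.get(pivot, {}).items():
                scores[tgt] += p_pivot * p_tgt
        src_to_tgt[src] = dict(scores)
    return src_to_tgt


# Hypothetical Chinese-English and English-Spanish entries.
zh_en = {"房子": {"house": 0.8, "home": 0.2}}
en_es = {
    "house": {"casa": 0.9, "vivienda": 0.1},
    "home": {"casa": 0.6, "hogar": 0.4},
}

zh_es = triangulate(zh_en, en_es)
best = max(zh_es["房子"], key=zh_es["房子"].get)
print(best, zh_es["房子"])
```

Cascading, by contrast, would run two full translation systems in sequence and pass only the best pivot sentence forward, so alternative pivot phrases like "home" above would not contribute.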

extended abstract, indirect strategy, spanish statistical machine translation

Twenty-Third International Joint Conference on Artificial Intelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

AAAI Conferences, Aug-3-2013

Fusion of Word and Letter Based Metrics for Automatic MT Evaluation

Yang, Muyun (Harbin Institute of Technology) | Zhu, Junguo (Harbin Institute of Technology) | Li, Sheng (Harbin Institute of Technology) | Zhao, Tiejun (Harbin Institute of Technology)

As machine translation progresses, it becomes more difficult to develop an evaluation metric that captures the differences between systems relative to human translations. In contrast to current efforts to leverage more linguistic information to describe translation quality, this paper combines language-independent features to build a robust MT evaluation metric. To compete with the finer modeling granularity brought by linguistic features, the proposed method augments word-level metrics with a letter-based calculation. An empirical study is then conducted over WMT data to train the metrics with ranking SVM. The results reveal that integrating current language-independent metrics can achieve sufficiently good performance for a variety of languages. Time-split data validation is promising as a better training setting, though the greedy strategy also works well.
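
The difference between word-level and letter-level matching can be sketched with a clipped n-gram precision computed over either words or characters. This is only an illustration of the general idea, not the metric from the paper; the function names and the example sentences are assumptions.

```python
from collections import Counter


def ngrams(seq, n):
    """Count the n-grams (as tuples) in a sequence of tokens or characters."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def ngram_precision(hyp, ref, n):
    """Clipped n-gram precision of a hypothesis against one reference."""
    hyp_counts = ngrams(hyp, n)
    ref_counts = ngrams(ref, n)
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return matched / total


hyp = "the cat sits"
ref = "the cats sit"

# Word unigrams: only "the" matches exactly.
word_p = ngram_precision(hyp.split(), ref.split(), 1)
# Letter bigrams (spaces removed): near-miss word forms still share most bigrams.
letter_p = ngram_precision(list(hyp.replace(" ", "")), list(ref.replace(" ", "")), 2)
print(word_p, letter_p)
```

In this toy pair the letter-level score is much higher than the word-level score, which shows how a letter-based calculation can give partial credit for morphological variants without any language-specific resources.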

automatic mt evaluation, fusion, word and letter, (1 more...)

Twenty-Third International Joint Conference on Artificial Intelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.87)

AAAI Conferences, Aug-3-2013

Modeling Lexical Cohesion for Document-Level Machine Translation

Xiong, Deyi (Soochow University) | Ben, Guosheng (Institute of Computing Technology) | Zhang, Min (Soochow University) | Lv, Yajuan (Institute of Computing Technology) | Liu, Qun (Dublin City University)

Lexical cohesion arises from a chain of lexical items that establish links between sentences in a text. In this paper we propose three different models to capture lexical cohesion for document-level machine translation: (a) a direct reward model where translation hypotheses are rewarded whenever lexical cohesion devices occur in them, (b) a conditional probability model where the appropriateness of using lexical cohesion devices is measured, and (c) a mutual information trigger model where a lexical cohesion relation is considered as a trigger pair and the strength of the association between the trigger and the triggered item is estimated by mutual information. We integrate the three models into hierarchical phrase-based machine translation and evaluate their effectiveness on the NIST Chinese-English translation tasks with large-scale training data. Experimental results show that all three models can achieve substantial improvements over the baseline and that the mutual information trigger model performs better than the others.
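
The association strength in model (c) can be illustrated with pointwise mutual information estimated from co-occurrence counts. The sketch below is a minimal example under that assumption; the function name and the counts are hypothetical, and the paper's exact estimator may differ.

```python
import math


def pmi(pair_count, x_count, y_count, total):
    """Pointwise mutual information log(p(x, y) / (p(x) * p(y))) from raw counts."""
    if min(pair_count, x_count, y_count, total) <= 0:
        raise ValueError("all counts must be positive")
    p_xy = pair_count / total
    p_x = x_count / total
    p_y = y_count / total
    return math.log(p_xy / (p_x * p_y))


# Hypothetical counts over 1000 sentence pairs: each word occurs 100 times.
# A pair that co-occurs 50 times is far more associated than chance predicts;
# a pair that co-occurs 10 times is at roughly chance level.
strong = pmi(pair_count=50, x_count=100, y_count=100, total=1000)
chance = pmi(pair_count=10, x_count=100, y_count=100, total=1000)
print(strong, chance)
```

A positive score suggests that the trigger makes the triggered item more likely than it would be independently, which is the kind of signal a cohesion feature could reward during decoding.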

document-level machine translation, modeling lexical cohesion

Twenty-Third International Joint Conference on Artificial Intelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.80)