AITopics | Machine Translation

Machine Translation

"Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains."
– Definition from the European Association for Machine Translation (EAMT).

You can translate text of your choice with free translators such as CAPITA, Google Translate, SDL International, and SYSTRAN.

Learning Inter-Related Statistical Query Translation Models for English-Chinese Bi-Directional CLIR

Zhang, Yuejie (Fudan University) | Cen, Lei (Fudan University) | Jin, Cheng (Fudan University) | Xue, Xiangyang (Fudan University) | Fan, Jianping (The University of North Carolina at Charlotte)

AAAI Conferences, Jul-19-2011

To support more precise query translation for English-Chinese Bi-Directional Cross-Language Information Retrieval (CLIR), we have developed a novel framework by integrating a semantic network to characterize the correlations between multiple inter-related text terms of interest and learn their inter-related statistical query translation models. First, a semantic network is automatically generated from large-scale English-Chinese bilingual parallel corpora to characterize the correlations between a large number of text terms of interest. Second, the semantic network is exploited to learn the statistical query translation models for such text terms of interest. Finally, these inter-related query translation models are used to translate the queries more precisely and achieve more effective CLIR. Our experiments on a large amount of official public data obtained very positive results.

semantic network, text term, translation, (16 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > North Carolina (0.04)
Asia > China > Hong Kong (0.04)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.34)

Constraint Optimization Approach to Context Based Word Selection

Matsuno, Jun (Kyoto University) | Ishida, Toru (Kyoto University)

AAAI Conferences, Jul-19-2011

Consistent word selection in machine translation is currently realized by resolving word sense ambiguity through the context of a single sentence or neighboring sentences. However, consistent word selection over the whole article has yet to be achieved. Consistency over the whole article is extremely important when applying machine translation to collectively developed documents like Wikipedia. In this paper, we propose to consider constraints between words in the whole article based on their semantic relatedness and contextual distance. The proposed method is successfully implemented in both statistical and rule-based translators. We evaluate those systems by translating 100 articles in the English Wikipedia into Japanese. The results show that the ratio of appropriate word selection for common nouns increased to around 75% with our method, while it was around 55% without our method.

selection, semantic relatedness, word selection, (12 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country: Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

SMT Versus AI Redux: How Semantic Frames Evaluate MT More Accurately

Lo, Chi-kiu (Hong Kong University of Science and Technology) | Wu, Dekai (Hong Kong University of Science and Technology)

AAAI Conferences, Jul-19-2011

We argue for an alternative paradigm in evaluating machine translation quality that is strongly empirical but more accurately reflects the utility of translations, by returning to a representational foundation based on AI oriented lexical semantics, rather than the superficial flat n-gram and string representations recently dominating the field. Driven by such metrics as BLEU and WER, current SMT frequently produces unusable translations where the semantic event structure is mistranslated: who did what to whom, when, where, why, and how? We argue that it is time for a new generation of more “intelligent” automatic and semi-automatic metrics, based clearly on getting the structure right at the lexical semantics level. We show empirically that it is possible to use simple PropBank style semantic frame representations to surpass all currently widespread metrics' correlation to human adequacy judgments, including even HTER. We also show that replacing human annotators with automatic semantic role labeling still yields much of the advantage of the approach. We combine the best of both worlds: from an SMT perspective, we provide superior yet low-cost quantitative objective functions for translation quality; and yet from an AI perspective, we regain the representational transparency and clear reflection of semantic utility of structural frame-based knowledge representations.
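The contrast between string overlap and semantic structure can be illustrated with a toy sketch. This is not the authors' metric: the frame layout (a plain mapping from role label to filler), the role names, and the function `role_f_score` are all assumptions made for illustration. It shows a translation that uses exactly the same words as the reference but swaps who did what to whom.

```python
from typing import Dict

# Hypothetical frame representation: role label -> filler string.
Frame = Dict[str, str]


def role_f_score(mt: Frame, ref: Frame) -> float:
    """F-score over role fillers that match the reference exactly."""
    matched = sum(1 for role, filler in mt.items() if ref.get(role) == filler)
    if matched == 0:
        return 0.0
    precision = matched / len(mt)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall)


reference = {"predicate": "gave", "who": "the teacher",
             "what": "a book", "whom": "the student"}
good_mt = {"predicate": "gave", "who": "the teacher",
           "what": "a book", "whom": "the student"}
# Same bag of words as the reference, but agent and recipient are swapped.
swapped_mt = {"predicate": "gave", "who": "the student",
              "what": "a book", "whom": "the teacher"}

print(role_f_score(good_mt, reference))
print(role_f_score(swapped_mt, reference))
```

A unigram-overlap score would rate both outputs identically, since they contain the same words; the role-based score penalizes only the one whose event structure is wrong.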

correlation, evaluation, translation, (16 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Europe > Czechia > Prague (0.04)
Asia > China > Hong Kong (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(13 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Translation of Pronominal Anaphora between English and Spanish: Discrepancies and Evaluation

Ferrandez, A., Peral, J.

arXiv.org Artificial Intelligence, Jun-23-2011

This paper evaluates the different tasks carried out in the translation of pronominal anaphora in a machine translation (MT) system. The MT interlingua approach named AGIR (Anaphora Generation with an Interlingua Representation) improves upon other proposals presented to date because it is able to translate intersentential anaphors, detect co-reference chains, and translate Spanish zero pronouns into English, issues hardly considered by other systems. The paper presents the resolution and evaluation of these anaphora problems in AGIR with the use of different kinds of knowledge (lexical, morphological, syntactic, and semantic). The translation of English and Spanish anaphoric third-person personal pronouns (including Spanish zero pronouns) into the target language has been evaluated on unrestricted corpora. We have obtained a precision of 80.4% and 84.8% in the translation of Spanish and English pronouns, respectively. Although we have only studied the Spanish and English languages, our approach can be easily extended to other languages such as Portuguese, Italian, or Japanese.

artificial intelligence, machine translation, natural language, (15 more...)

doi: 10.1613/jair.1115

1106.4862

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > Spain > Valencian Community > Alicante Province > Alicante (0.04)
(25 more...)

Genre: Research Report > New Finding (0.67)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Correction of Noisy Sentences using a Monolingual Corpus

Chatterjee, Diptesh

arXiv.org Artificial Intelligence, May-22-2011

Correction of noisy natural language text is an important and well-studied problem in Natural Language Processing. It has a number of applications in domains like Statistical Machine Translation, Second Language Learning and Natural Language Generation. In this work, we consider some statistical techniques for text correction. We define the classes of errors commonly found in text and describe algorithms to correct them. The data has been taken from a poorly trained Machine Translation system. The algorithms use only a language model in the target language in order to correct the sentences. We use phrase-based correction methods in both algorithms. The phrases are replaced and combined to give us the final corrected sentence. We also present methods to model different kinds of errors, along with the results of the algorithms on the test set. We show that one of the approaches fails to achieve the desired goal, whereas the other succeeds. In the end, we analyze the possible reasons for such a trend in performance.

artificial intelligence, natural language, text processing, (17 more...)

1105.4318

Country:

Europe (0.30)
Asia > Thailand (0.14)
Asia > India > West Bengal > Kolkata (0.14)
(8 more...)

Genre: Research Report (0.82)

Industry: Government > Regional Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Statistical Machine Translation with Factored Translation Model: MWEs, Separation of Affixes, and Others

Okita, Tsuyoshi (Dublin City University) | Ceausu, Alexandru (Dublin City University) | Way, Andy (Dublin City University)

AAAI Conferences, May-18-2011

... Expressions (MWEs) (Okita et al. 2010), this may improve the overall translation. For example, in EN-JP, the empirical evidence suggests that we separate affix(es) and word stem(s), since doing so obtains a better BLEU score than not separating them, although the adequacy decreases. ... 2007; Koehn 2010) intends to handle morphologically rich languages on the target side by integrating additional linguistic markup at the word level, where each type of additional ...

separation, statistical machine translation, translation model, (10 more...)

Twenty-Fourth International FLAIRS Conference

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
Europe > Ireland (0.05)
Africa > Middle East > Egypt > Giza Governorate > Giza (0.05)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.54)

Given Bilingual Terminology in Statistical Machine Translation: MWE-Sensitive Word Alignment and Hierarchical Pitman-Yor Process-Based Translation Model Smoothing

Okita, Tsuyoshi (Dublin City University) | Way, Andy (Dublin City University)

AAAI Conferences, May-18-2011

This paper considers a scenario in which we are given almost perfect knowledge of the bilingual terminology in a test corpus in Statistical Machine Translation (SMT). When the given terminology is part of a training corpus, one natural strategy in SMT is to use the trained translation model and ignore the given terminology. Two questions then arise. 1) Can a word aligner capture the given terminology? This matters because, even if the terminology is in a training corpus, the resulting translation model often does not include it. 2) Are the probabilities in a translation model correctly calculated? To answer these questions, we ran experiments introducing a Multi-Word Expression-sensitive (MWE-sensitive) word aligner and hierarchical Pitman-Yor process-based translation model smoothing. Using a 200k JP-EN NTCIR corpus, our experimental results show that introducing an MWE-sensitive word aligner and the new translation model smoothing gives an overall improvement of 1.35 BLEU points absolute and 6.0% relative compared with introducing neither.
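As a quick consistency check on the reported figures, the absolute and relative gains together imply an approximate baseline score. The values below are derived here, not stated in the abstract, and are approximate because the 6.0% figure is rounded.

```python
# Figures quoted in the abstract.
absolute_gain = 1.35   # BLEU points
relative_gain = 0.060  # 6.0% relative improvement

# relative_gain = absolute_gain / baseline, so the baseline can be recovered.
implied_baseline = absolute_gain / relative_gain
implied_improved = implied_baseline + absolute_gain

print(f"implied baseline BLEU: {implied_baseline:.2f}")
print(f"implied improved BLEU: {implied_improved:.2f}")
```

Under this reading, the baseline would be roughly 22.5 BLEU and the improved system roughly 23.85 BLEU.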

knowledge, terminology, translation model, (13 more...)

Twenty-Fourth International FLAIRS Conference

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Ireland (0.04)
Africa > Middle East > Egypt > Giza Governorate > Giza (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Semantic Vector Machines

Etter, Vincent

arXiv.org Artificial Intelligence, May-14-2011

We first present our work in machine translation, during which we used aligned sentences to train a neural network to embed n-grams of different languages into a $d$-dimensional space, such that n-grams that are translations of each other are close with respect to some metric. Good n-gram-to-n-gram translation results were achieved, but full-sentence translation is still problematic. We realized that learning the semantics of sentences and documents was the key to solving many natural language processing problems, and thus moved to the second part of our work: sentence compression. We introduce a flexible neural network architecture for learning embeddings of words and sentences that extract their semantics, propose an efficient implementation in the Torch framework, and present embedding results comparable to those obtained with classical neural language models, while being more powerful.

artificial intelligence, machine learning, natural language, (19 more...)

1105.2868

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.14)
Europe > Switzerland > Vaud > Lausanne (0.04)
Asia > Middle East > Republic of Türkiye (0.04)
(14 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Notes on a New Philosophy of Empirical Science

Burfoot, Daniel

arXiv.org Machine Learning, Apr-28-2011

This book presents a methodology and philosophy of empirical science based on large-scale lossless data compression. In this view a theory is scientific if it can be used to build a data compression program, and it is valuable if it can compress a standard benchmark database to a small size, taking into account the length of the compressor itself. This methodology therefore includes an Occam principle as well as a solution to the problem of demarcation. Because of the fundamental difficulty of lossless compression, this type of research must be empirical in nature: compression can only be achieved by discovering and characterizing empirical regularities in the data. Because of this, the philosophy provides a way to reformulate fields such as computer vision and computational linguistics as empirical sciences: the former by attempting to compress databases of natural images, the latter by attempting to compress large text databases. The book argues that the rigor and objectivity of the compression principle should set the stage for systematic progress in these fields. The argument is especially strong in the context of computer vision, which is plagued by chronic problems of evaluation. The book also considers the field of machine learning. Here the traditional approach requires that the models proposed to solve learning problems be extremely simple, in order to avoid overfitting. However, the world may contain intrinsically complex phenomena, which would require complex models to understand. The compression philosophy can justify complex models because of the large quantity of data being modeled (if the target database is 100 Gb, it is easy to justify a 10 Mb model). The complex models and abstractions learned on the basis of the raw data (images, language, etc.) can then be reused to solve any specific learning problem, such as face recognition or machine translation.
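The scoring idea described above (compressed size plus the length of the compressor itself) can be sketched in a few lines. This is a minimal illustration, not the book's procedure: the toy corpus, the function `description_length`, and the 2000-byte compressor size charged to zlib are assumptions chosen for the example.

```python
import zlib


def description_length(data: bytes, compress, compressor_size: int) -> int:
    """Total cost in bytes: compressed data plus the compressor program itself."""
    return len(compress(data)) + compressor_size


# Toy stand-in for a benchmark text database (24 bytes repeated 1000 times).
corpus = b"the cat sat on the mat. " * 1000

# "Theory" 1: no regularities assumed; store the data verbatim, no program needed.
identity_cost = description_length(corpus, lambda d: d, compressor_size=0)

# "Theory" 2: a general-purpose compressor. The 2000-byte program size is a
# hypothetical figure; a real comparison would measure the actual program length.
zlib_cost = description_length(
    corpus, lambda d: zlib.compress(d, 9), compressor_size=2000
)

print(identity_cost, zlib_cost)
```

Charging for the compressor is what makes a larger model acceptable only when it saves more bytes on the data than it costs to state, which is the trade-off the abstract's 100 Gb versus 10 Mb remark refers to.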

artificial intelligence, machine learning, natural language, (25 more...)

1104.5466

Country:

Europe (0.67)
Asia (0.67)
North America > United States > New York (0.45)

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Media (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
(11 more...)

True Knowledge: Open-Domain Question Answering Using Structured Knowledge and Inference

Tunstall-Pedoe, William (True Knowledge Ltd)

AI Magazine, Oct-10-2010

This article gives a detailed description of True Knowledge: a commercial, open-domain question answering platform. The system combines a large and growing structured knowledge base of common sense, factual and lexical knowledge; a natural language translation system that turns user questions into internal language-independent queries; and an inference system that can answer those queries using both directly represented and inferred knowledge. The system is live and answers millions of questions per month asked by internet users.

management and information, open-domain question, question answering, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.79)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.73)