Machine Translation
Introduction to the Special Issue on Innovative Applications of Artificial Intelligence
We are very pleased to republish here extended versions of a sample of the papers drawn from the Innovative Applications of Artificial Intelligence Conference (IAAI-06), which was held July 17-20, 2006, in Boston, Massachusetts. Three of these articles describe deployed applications and two describe emerging applications. By this measure, artificial intelligence is going strong. Evidence comes from the annual Conference on Innovative Applications of Artificial Intelligence (IAAI), the premier conference on AI applications. The papers presented at the conference provide compelling case studies of the value and impact of AI technology.
Automating Knowledge Acquisition for Machine Translation
Machine translation of human languages (for example, Japanese, English, Spanish) was one of the earliest goals of computer science research, and it remains an elusive one. Like many AI tasks, translation requires an immense amount of knowledge about language and the world. Recent approaches to machine translation frequently make use of text-based learning algorithms to fully or partially automate the acquisition of knowledge. This article illustrates these approaches. Anyone who has taken a graduate-level course in AI knows the answer.
Applying Automated Language Translation at a Global Enterprise Level
This project has continued to evolve with the addition of new languages and improvements to the translation process. However, we discovered that there was a large demand for automated language translation across all of Ford Motor Company, and we decided to expand the scope of our project to address these requirements. This article describes our efforts to meet all of Ford's internal translation requirements with AI and machine translation (MT) technology and focuses on the challenges we faced and the lessons we learned from applying advanced technology across an entire corporation. Our initial goal was to use MT to translate vehicle build instructions from English to the native languages of the countries and regions where our assembly plants are located. The source text used a controlled language that we developed, called Standard Language, and we initially thought that applying MT technology would be a straightforward process. Controlled languages, such as Standard Language, reduce the complexity and ambiguity of human languages by restricting syntax and terminology (Huijsen 1998). As such, they have been used in a number of different industrial applications (Godden 2000). However, many issues needed to be addressed: technical terminology, ungrammatical aspects of Standard Language, Ford-specific terminology, and the need to process uncontrolled text. We partnered with Systran Software Incorporated and with AppTek (now SAIC) to use their machine-translation technology and also incorporated natural language processing (NLP) algorithms within our artificial intelligence environment to analyze terminology and modify the source text to improve translation accuracy (Rychtyckyj 2007). The need to support manufacturing expansion in non-English-speaking countries in Eastern Europe and Asia (requiring languages such as Russian and Chinese) led us to add language capability and to develop translation glossaries for all of the supported languages.
The automated language translation for manufacturing work continues and will expand as Ford's global manufacturing footprint increases.
Applied AI News
The Lockheed Corp. (Calabasas, CA) and AT&T (New York, NY) have signed an agreement to jointly develop and market intelligent transportation systems. The two companies are responding to the Intermodal Surface Transportation Act of 1991, which calls for enhancing roadway capacity, safety, efficiency, and air quality through the development of intelligent vehicle highway systems. Electronic toll collection systems, traffic management systems, and in-car navigational and route planning systems are among the systems being developed. UK-based Empires Stores, a mail order company, has reduced the clerical work in its credit department by about 30%, thanks to the implementation of an intelligent system. The company has successfully automated the decision-making process for passing or rejecting orders referred by its performance scoring system.
A Lexical Semantic and Statistical Approach to Lexical Collocation Extraction for Natural Language Generation
These collocations are used by native speakers of a language almost without thought, yet they must be learned by nonnative speakers of the language. A native speaker of English might say that he/she drinks "strong coffee," but a nonnative speaker might say either "powerful coffee" or "sturdy coffee." Collocations tend to vary among languages and topic domains. Unfortunately, the task of correctly identifying lexical collocations, even by native speakers of the language, has been shown to be difficult. Computer systems that translate natural languages, or machine-translation systems, need lexical collocation information to produce natural-sounding or colloquially proper text.
Tree-Structured Neural Machine for Linguistics-Aware Sentence Generation
Zhou, Ganbin, Luo, Ping, Cao, Rongyu, Xiao, Yijun, Lin, Fen, Chen, Bo, He, Qing
Different from other sequential data, sentences in natural language are structured by linguistic grammars. Previous generative conversational models with chain-structured decoders ignore this structure in human language and might generate plausible responses with less satisfactory relevance and fluency. In this study, we aim to incorporate the results from linguistic analysis into the process of sentence generation for high-quality conversation generation. Specifically, we use a dependency parser to transform each response sentence into a dependency tree and construct a training corpus of sentence-tree pairs. A tree-structured decoder is developed to learn the mapping from a sentence to its tree, where different types of hidden states are used to depict the local dependencies from an internal tree node to its children. For training acceleration, we propose a tree canonicalization method, which transforms trees into equivalent ternary trees. Then, with a proposed tree-structured search method, the model is able to generate the most probable responses in the form of dependency trees, which are finally flattened into sequences as the system output. Experimental results demonstrate that the proposed X2Tree framework outperforms baseline methods, with an increase of over 11.15% in acceptance ratio.
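The final step described in this abstract, flattening a generated dependency tree into a word sequence, can be illustrated with a small sketch. This is not the X2Tree implementation: the DepNode class, its left/right fields, and the toy sentence are hypothetical, and the sketch assumes a projective tree in which each head's left dependents precede it and its right dependents follow it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DepNode:
    """One word in a dependency tree (hypothetical structure, for illustration only)."""
    word: str
    left: List["DepNode"] = field(default_factory=list)   # dependents that precede the head
    right: List["DepNode"] = field(default_factory=list)  # dependents that follow the head


def flatten(node: DepNode) -> List[str]:
    """Return the subtree's words in surface order: left dependents, head, right dependents."""
    words: List[str] = []
    for child in node.left:
        words.extend(flatten(child))
    words.append(node.word)
    for child in node.right:
        words.extend(flatten(child))
    return words


# Toy tree for "the model generates fluent responses", rooted at the verb.
tree = DepNode(
    "generates",
    left=[DepNode("model", left=[DepNode("the")])],
    right=[DepNode("responses", left=[DepNode("fluent")])],
)
sentence = " ".join(flatten(tree))
print(sentence)
```

The recursion is an in-order traversal, so the output order depends only on how dependents are split between the left and right lists.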
One-shot and few-shot learning of word embeddings
Lampinen, Andrew K., McClelland, James L.
Standard deep learning systems require thousands or millions of examples to learn a concept, and cannot integrate new concepts easily. By contrast, humans have an incredible ability to do one-shot or few-shot learning. For instance, from just hearing a word used in a sentence, humans can infer a great deal about it, by leveraging what the syntax and semantics of the surrounding words tell us. Here, we draw inspiration from this to highlight a simple technique by which deep recurrent networks can similarly exploit their prior knowledge to learn a useful representation for a new word from little data. This could make natural language processing systems much more flexible, by allowing them to learn continually from the new words they encounter.
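The abstract does not spell out the technique, so the following is only a point of comparison and not the authors' method. One very simple way to use the surrounding words is to start a new word's embedding at the average of the known embeddings in its context. The function name, the two-dimensional toy vectors, and the made-up word "wug" are all assumptions for illustration.

```python
from typing import Dict, List

Vector = List[float]


def centroid_embedding(context_words: List[str], embeddings: Dict[str, Vector]) -> Vector:
    """Average the embeddings of the context words that are already known."""
    known = [embeddings[w] for w in context_words if w in embeddings]
    if not known:
        raise ValueError("no known context words to average")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Toy vocabulary with two-dimensional embeddings (illustrative values only).
embeddings: Dict[str, Vector] = {
    "drink": [1.0, 0.0],
    "hot": [0.0, 1.0],
    "cup": [1.0, 1.0],
}

# One sentence containing the unseen word "wug"; unknown words are skipped.
context = ["drink", "a", "hot", "cup", "of"]
embeddings["wug"] = centroid_embedding(context, embeddings)
print(embeddings["wug"])
```

A learned approach would typically refine such an initial vector further; the averaging step only shows how context can supply a first estimate from a single sentence.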
Automated Translation with R and Google Translate API
This course will help you learn how to use the Google Translate API. You will learn how to set up your computer to automatically translate your files from one language into many others. We will learn by translating closed-caption (*.vtt) files, but you can translate any other text. If you have subtitle files for your videos that you want to translate automatically into many different languages, then this is the course for you! You will be able to translate those files right away. We will use R as our programming environment, which will allow us to achieve our goal with the minimum possible effort. This course is designed to help you quickly learn how to set up your computer to translate your text automatically. Join this course and you will get all these additional benefits: you will learn to set up a Google Cloud Platform API key; you will learn how to encrypt your API key and use it securely in your scripts; you will learn string manipulation in R; you will automate your translation tasks; and finally, you will be able to have your computer do all the work even while you are sleeping! The Google Translate API is a paid service; however, you can use your bonus credit from Google to start learning and applying your knowledge.
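The core idea of translating a caption file, changing only the spoken text while leaving timings untouched, can be sketched as follows. The course itself works in R; this sketch uses Python purely for illustration. The translate_vtt function is hypothetical, and fake_translate is a stand-in that upper-cases text so the example runs without an API key; a real script would call a translation service at that point.

```python
from typing import Callable, List


def translate_vtt(lines: List[str], translate: Callable[[str], str]) -> List[str]:
    """Translate only cue text; keep the header, cue identifiers, timings, and blank lines."""
    out: List[str] = []
    in_cue = False  # True between a timing line and the next blank line
    for line in lines:
        stripped = line.strip()
        if not stripped:
            in_cue = False
            out.append(line)
        elif "-->" in stripped:
            in_cue = True          # timing line: the cue text follows
            out.append(line)
        elif in_cue:
            out.append(translate(stripped))
        else:
            out.append(line)       # "WEBVTT" header, NOTE blocks, cue identifiers
    return out


def fake_translate(text: str) -> str:
    """Stand-in for a real translation call (assumption: replace with an API request)."""
    return text.upper()


captions = [
    "WEBVTT",
    "",
    "1",
    "00:00:01.000 --> 00:00:04.000",
    "Hello and welcome.",
    "",
    "2",
    "00:00:05.000 --> 00:00:07.000",
    "Let's get started.",
]
translated = translate_vtt(captions, fake_translate)
print("\n".join(translated))
```

Passing the translator in as a function keeps the file handling separate from the paid service, so the parsing logic can be checked offline before any credit is spent.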
A Gentle Introduction to Neural Machine Translation - Machine Learning Mastery
One of the earliest goals for computers was the automatic translation of text from one language to another. Automatic or machine translation is perhaps one of the most challenging artificial intelligence tasks given the fluidity of human language. Classically, rule-based systems were used for this task, which were replaced in the 1990s with statistical methods. More recently, deep neural network models achieve state-of-the-art results in a field that is aptly named neural machine translation. In this post, you will discover the challenge of machine translation and the effectiveness of neural machine translation models.
Neural Program Meta-Induction
Devlin, Jacob, Bunel, Rudy R., Singh, Rishabh, Hausknecht, Matthew, Kohli, Pushmeet
Most recently proposed methods for neural program induction work under the assumption of having a large set of input/output (I/O) examples for learning any given input-output mapping. This paper aims to address the problem of data and computation efficiency of program induction by leveraging information from related tasks. Specifically, we propose two novel approaches for cross-task knowledge transfer to improve program induction in limited-data scenarios. In our first proposal, portfolio adaptation, a set of induction models is pretrained on a set of related tasks, and the best model is adapted towards the new task using transfer learning. In our second approach, meta program induction, a $k$-shot learning approach is used to make a model generalize to new tasks without additional training. To test the efficacy of our methods, we constructed a new benchmark of programs written in the Karel programming language. Using an extensive experimental evaluation on the Karel benchmark, we demonstrate that our proposals dramatically outperform the baseline induction method that does not use knowledge transfer. We also analyze the relative performance of the two approaches and study conditions in which they perform best. In particular, meta induction outperforms all existing approaches under extreme data sparsity (when very few examples are available, i.e., fewer than ten). As the number of available I/O examples increases (i.e., a thousand or more), portfolio-adapted program induction becomes the best approach. For intermediate data sizes, we demonstrate that the combined method of adapted meta program induction has the strongest performance.
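The reported findings amount to a rule of thumb for picking an approach from the number of available I/O examples. The sketch below encodes that reading; the function name is hypothetical, and the cut-offs (fewer than ten, a thousand or more) are taken from the abstract's wording and should be treated as rough guides rather than exact boundaries.

```python
def choose_induction_strategy(num_examples: int) -> str:
    """Map the number of I/O examples to the approach the abstract reports as strongest."""
    if num_examples < 0:
        raise ValueError("num_examples must be non-negative")
    if num_examples < 10:
        return "meta induction"               # extreme data sparsity
    if num_examples >= 1000:
        return "portfolio-adapted induction"  # plentiful examples
    return "adapted meta induction"           # intermediate data sizes


for n in (5, 100, 5000):
    print(n, "->", choose_induction_strategy(n))
```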