AITopics

Country: Asia > South Korea (0.06)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence Mar-31-2016, 23:10:54 GMT

Baidu Translate: The Inside Story Slator

Artificial intelligence is on the rise in the world of machine translation. A string of recent news items about tech giants bolstering machine translation engines with deep learning underscores just how central deep learning has become to the machine translation products of companies like Google and Microsoft. Slator reached out to a representative of Beijing-based Baidu, who is authorized to speak for the company, to get an exclusive look at what the Chinese tech leader has in store for its translation technology. Baidu began R&D on Baidu Translate in 2010, launching the product in June 2011. The company felt that translation was in line with what its search users needed.

baidu translate, machine learning, natural language, (13 more...)

Country: Asia > China > Beijing > Beijing (0.27)

Industry: Information Technology (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

#artificialintelligence Mar-30-2016, 23:15:41 GMT

Microsoft Cognitive Services

Microsoft Translator is a cloud-based automatic text and speech translation (a.k.a.

microsoft cognitive service, natural language, speech recognition, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

#artificialintelligence Mar-25-2016, 20:20:54 GMT

HLTCon 2016

Everyone is looking for the next breakthrough in machine translation. No one believes that machine translation is a completely solved problem. Most people would like to see machine translation systems produce higher quality results. A good translation is one where the meaning of the source is preserved, and it is rendered correctly in the target language. Users expect accuracy on all of the various levels: grammar, syntax, semantics, and pragmatics.

artificial intelligence, hltcon 2016, natural language, (3 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

#artificialintelligence Mar-25-2016, 18:35:57 GMT

How IBM, Google, Microsoft, and Amazon do machine learning in the cloud

For any cloud to be taken seriously, it has to meet an ever rising bar of features. Machine learning seems to be on that list, as all the major cloud providers now feature it. But how they go about doing it is another story. Aside from the "curated API vs. open-ended algorithm marketplace" models, there are the "everything and then some vs. just enough" variants. Here's how the four big cloud providers -- IBM, Microsoft, Google, and Amazon -- stack up next to each other in machine learning. When IBM first announced it would turn its Watson AI system into a consumable service, the questions piled up.

cloud computing, machine learning, natural language, (15 more...)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.73)

#artificialintelligence Mar-22-2016, 12:40:49 GMT

Google open-sources machine learning to smarten up our apps

One day in the not-so-distant future, an app might make a dinner reservation for you before you realize you even want to go out, or your smartphone might suggest tourist sights you'd enjoy when you land in a new city. It's possible -- and it's really not so far away, say analysts, who were encouraged today by Google's announcement that it's open sourcing an enhanced machine learning system. The system, dubbed TensorFlow, is smarter, faster and more flexible machine-learning software than Google has ever had before, according to Sundar Pichai, Google's CEO, in a blog post. "Just a couple of years ago, you couldn't talk to the Google app through the noise of a city sidewalk, or read a sign in Russian using Google Translate, or instantly find pictures of your Labradoodle in Google Photos," wrote Pichai. "But in a short amount of time they've gotten much, much smarter. Now, thanks to machine learning, you can do all those things pretty easily, and a lot more."

artificial intelligence, machine learning, natural language, (10 more...)

Industry: Information Technology > Services (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.38)

Wołk, Krzysztof, Rejmund, Emilia, Marasek, Krzysztof

Multi-domain machine translation enhancements by parallel data extraction from comparable corpora

arXiv.org Machine Learning Mar-22-2016

Parallel texts are a relatively rare language resource; however, they constitute very useful research material with a wide range of applications. This study presents and analyses new methodologies we developed for obtaining such data from previously built comparable corpora. The methodologies are automatic and unsupervised, which makes them well suited to large-scale research. The task is highly practical, as non-parallel multilingual data occur much more frequently than parallel corpora and are easy to access, although parallel sentences are a considerably more useful resource. In this study, we propose a method of automatic web crawling in order to build topic-aligned comparable corpora, e.g. based on Wikipedia or Euronews.com. We also developed new methods of obtaining parallel sentences from comparable data and proposed methods of filtration of corpora capable of selecting inconsistent or only partially equivalent translations. Our methods are easily scalable to other languages. Evaluation of the quality of the created corpora was performed by analysing the impact of their use on statistical machine translation systems. Experiments were presented on the basis of the Polish-English language pair for texts from different domains, i.e. lectures, phrasebooks, film dialogues, European Parliament proceedings and texts contained in medicine leaflets. We also tested a second method of creating parallel corpora based on data from comparable corpora, which allows for automatically expanding the existing corpus of sentences about a given domain on the basis of analogies found between them. It does not require, therefore, having past parallel resources in order to train a classifier.

artificial intelligence, corpora, natural language, (16 more...)

arXiv.org Machine Learning

1603.06785

Country:

Asia > Middle East > Republic of Türkiye (0.14)
North America > United States (0.14)
Europe > Bulgaria (0.14)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

arXiv.org Machine Learning Mar-1-2016

Multi-task Sequence to Sequence Learning

Luong, Minh-Thang, Le, Quoc V., Sutskever, Ilya, Vinyals, Oriol, Kaiser, Lukasz

Sequence to sequence learning has recently emerged as a new paradigm in supervised learning. To date, most of its applications focused on only one task and not much work explored this framework for multiple tasks. This paper examines three multi-task learning (MTL) settings for sequence to sequence models: (a) the one-to-many setting - where the encoder is shared between several tasks such as machine translation and syntactic parsing, (b) the many-to-one setting - useful when only the decoder can be shared, as in the case of translation and image caption generation, and (c) the many-to-many setting - where multiple encoders and decoders are shared, which is the case with unsupervised objectives and translation. Our results show that training on a small amount of parsing and image caption data can improve the translation quality between English and German by up to 1.5 BLEU points over strong single-task baselines on the WMT benchmarks. Furthermore, we have established a new state-of-the-art result in constituent parsing with 93.0 F1. Lastly, we reveal interesting properties of the two unsupervised learning objectives, autoencoder and skip-thought, in the MTL context: autoencoder helps less in terms of perplexities but more on BLEU scores compared to skip-thought.

machine learning, natural language, translation, (19 more...)

arXiv.org Machine Learning

1511.06114

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Journal of Artificial Intelligence Research Feb-23-2016

Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures

Bernardi, Raffaella, Cakici, Ruket, Elliott, Desmond, Erdem, Aykut, Erdem, Erkut, Ikizler-Cinbis, Nazli, Keller, Frank, Muscat, Adrian, Plank, Barbara

Automatic description generation from natural images is a challenging problem that has recently received a large amount of interest from the computer vision and natural language processing communities. In this survey, we classify the existing approaches based on how they conceptualize this problem, viz., models that cast description either as a generation problem or as a retrieval problem over a visual or multimodal representational space. We provide a detailed review of existing models, highlighting their advantages and disadvantages. Moreover, we give an overview of the benchmark image datasets and the evaluation measures that have been developed to assess the quality of machine-generated image descriptions.

computer vision, dataset, image description, (11 more...)

doi: 10.1613/jair.4900

AI Access Foundation

10985

Country:

North America > United States > Illinois (0.04)
Asia > Middle East > Republic of Türkiye (0.04)
Europe > Middle East > Malta (0.04)
(7 more...)

Genre: Overview (1.00)

Industry:

Leisure & Entertainment (0.46)
Transportation > Passenger (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
(2 more...)

Tsvetkov, Yulia, Dyer, Chris

Cross-Lingual Bridges with Models of Lexical Borrowing

Journal of Artificial Intelligence Research Jan-13-2016

Linguistic borrowing is the phenomenon of transferring linguistic constructions (lexical, phonological, morphological, and syntactic) from a donor language to a recipient language as a result of contacts between communities speaking different languages. Borrowed words are found in all languages, and, in contrast to cognate relationships, borrowing relationships may exist across unrelated languages (for example, about 40% of Swahili's vocabulary is borrowed from the unrelated language Arabic). In this work, we develop a model of morpho-phonological transformations across languages. Its features are based on universal constraints from Optimality Theory (OT), and we show that compared to several standard (but linguistically more naïve) baselines, our OT-inspired model obtains good performance at predicting donor forms from borrowed forms with only a few dozen training examples, making this a cost-effective strategy for sharing lexical information across languages. We demonstrate applications of the lexical borrowing model in machine translation, using a resource-rich donor language to obtain translations of out-of-vocabulary loanwords in a lower resource language. Our framework obtains substantial improvements (up to 1.6 BLEU) over standard baselines.

constraint, proc, translation, (16 more...)

doi: 10.1613/jair.4786

AI Access Foundation

10975

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > Finland > Uusimaa > Helsinki (0.04)
Indian Ocean (0.04)
(7 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)