 Machine Translation


Automatic Language Identification in Texts: A Survey

Journal of Artificial Intelligence Research

Language identification ("LI") is the problem of determining the natural language that a document or part thereof is written in. Automatic LI has been extensively researched for over fifty years. Today, LI is a key part of many text processing pipelines, as text processing techniques generally assume that the language of the input text is known. Research in this area has recently been especially active. This article provides a brief history of LI research, and an extensive survey of the features and methods used in the LI literature. We describe the features and methods using a unified notation, to make the relationships between methods clearer. We discuss evaluation methods, applications of LI, as well as off-the-shelf LI systems that do not require training by the end user. Finally, we identify open issues, survey the work to date on each issue, and propose future directions for research in LI.


Cat-heavy puzzle game will take you to A.I. school Cult of Mac

#artificialintelligence

Do you want a fun iPhone game that combines cats with a stealth lesson in artificial intelligence and machine learning? And thanks to the oddly titled while True: learn(), you're about to get your chance. Check out the game's new trailer, which landed ahead of this week's release of while True: learn() on iOS. As can be seen from the above trailer, the game's story deals with a cat who's also a master programmer. You set about making a cat-to-human translation program, and wind up developing a whole bunch of other AI tools, too.


Adding multi-language support for Azure AI applications quickly

#artificialintelligence

There is a growing demand for applications that support speech, language identification, translation or transliteration from one language to another. Complex problems such as these can now be solved using advanced APIs that are readily available without having to reinvent the wheel – no machine learning expertise required! This blog starts off with a brief introduction to machine translation and then explores various topics like identifying the language and how to perform translation/transliteration of spoken or typed text using Microsoft's Translator Text API. In addition, we also discuss how translated or transliterated text can be integrated with LUIS. Machine Translation (MT) encompasses the various tasks involved in converting source text from one language to another.


De-mystifying AI and its potential for further application in a B2B context

#artificialintelligence

AI, or Artificial Intelligence, is often demonised and portrayed as some cyborg entity just about ready to take our jobs and eventually kill us all, but more and more businesses, martech and adtech providers are using different AI subsystems each day to advance their services. The term AI is contentiously used to describe a broad spectrum of systems and software; the controversy arises over where we can begin to describe a machine as being 'intelligent' as opposed to simply following complex but nonetheless human-reliant algorithms. Regardless of strict definition, there are helpful systems within the subsets of AI which already exist that B2B marketers need to utilise. Machine learning is a subset of AI that can help marketers to improve productivity by taking over mundane tasks, particularly work involving dissecting datasets (like our Argus platform for example). If you're not already using some forms of machine learning, it might be helpful to understand why some systems have been reported to increase business productivity by 40% (Source: Accenture) and how you can effectively incorporate machine learning into your marketing strategy.


Machine Learning – Introduction to Quick and Accurate Machine Translation Vinod Sharma's Blog

#artificialintelligence

This tool helps to translate one language to another with high accuracy. This post focuses only on high-level arguments around machine translation; you can find more details on Machine Learning Basics here. Machine translation (MT) is an automated translation process used by a computer application to translate a natural language text into another. In the translation process, the meaning of the source text must be preserved in the destination, i.e. target, language. It sounds simple on the surface, but it is far more complex.


Top 10 Applications of Machine Learning Daily Life Applications Edureka

#artificialintelligence

Machine Learning is a buzzword in the technology world right now, and for good reason: it represents a major step forward in how computers can learn. Machine Learning Engineers are in high demand, and this surge is due to evolving technology and the generation of huge amounts of data, aka Big Data. On average, an ML Engineer can expect a salary of ₹719,646 (IND) or $111,490 (US). So, let's discuss some of the Applications of Machine Learning. I'll be discussing the following Applications of Machine Learning one by one: Now, Google Maps is probably THE app we use whenever we go out and require assistance in directions and traffic.


A Survey of Cross-lingual Word Embedding Models

Journal of Artificial Intelligence Research

Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages. In this survey, we provide a comprehensive typology of cross-lingual word embedding models. We compare their data requirements and objective functions. The recurring theme of the survey is that many of the models presented in the literature optimize for the same objectives, and that seemingly different models are often equivalent, modulo optimization strategies, hyper-parameters, and such. We also discuss the different ways cross-lingual word embeddings are evaluated, as well as future challenges and research horizons.


Atlas: A Dataset and Benchmark for E-commerce Clothing Product Categorization

arXiv.org Machine Learning

In e-commerce, it is common practice to organize the product catalog using a product taxonomy. This enables the buyer to easily locate the item they are looking for and also to explore various items available under a category. A product taxonomy is a tree structure with 3 or more levels of depth and several leaf nodes. Product categorization is a large-scale classification task that assigns a category path to a particular product. Research in this area is restricted by the unavailability of good real-world datasets and the variations in taxonomy due to the absence of a standard across the different e-commerce stores. In this paper, we introduce a high-quality product taxonomy dataset focusing on clothing products, which contains 186,150 images under the clothing category with 3 levels and 52 leaf nodes in the taxonomy. We explain the methodology used to collect and label this dataset. Further, we establish the benchmark by comparing image classification and attention-based sequence models for predicting the category path. Our benchmark model reaches a micro f-score of 0.92 on the test set. The dataset, code and pre-trained models are publicly available at https://github.com/vumaasha/atlas. We invite the community to improve upon these baselines.
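For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a micro-averaged F-score can be computed for single-label predictions. It is not taken from the Atlas code base; the function name `micro_f1` and the category labels in the usage note are hypothetical and chosen only for illustration.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1 once.

    Illustrative sketch only; labels may be any hashable values.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct prediction for this class
        else:
            fp[pred] += 1           # predicted class gets a false positive
            fn[truth] += 1          # true class gets a false negative

    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())

    if total_tp == 0:
        return 0.0
    precision = total_tp / (total_tp + total_fp)
    recall = total_tp / (total_tp + total_fn)
    return 2 * precision * recall / (precision + recall)
```

With hypothetical labels `["shirt", "shirt", "jeans", "kurta"]` as ground truth and `["shirt", "jeans", "jeans", "kurta"]` as predictions, three of four items are correct. In this single-label setting every error counts once as a false positive and once as a false negative, so micro precision, recall and F1 all coincide with plain accuracy.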


On the Variance of the Adaptive Learning Rate and Beyond

arXiv.org Machine Learning

The learning rate warmup heuristic achieves remarkable success in stabilizing training, accelerating convergence and improving generalization for adaptive stochastic optimization algorithms like RMSprop and Adam. Here, we study its mechanism in detail. Pursuing the theory behind warmup, we identify a problem of the adaptive learning rate (i.e., it has problematically large variance in the early stage), suggest warmup works as a variance reduction technique, and provide both empirical and theoretical evidence to verify our hypothesis. We further propose RAdam, a new variant of Adam, by introducing a term to rectify the variance of the adaptive learning rate. Extensive experimental results on image classification, language modeling, and neural machine translation verify our intuition and demonstrate the effectiveness and robustness of our proposed method. All implementations are available at: https://github.com/LiyuanLucasLiu/RAdam.
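The abstract does not spell out the rectification term, so the following is only a sketch of it as commonly stated for RAdam, not the authors' reference implementation. The function name `rectification_term` is hypothetical, and the default `beta2=0.999` and the threshold of 4 are assumptions here. The function returns `None` for early steps where the variance is treated as intractable and the adaptive learning rate would be skipped.

```python
import math


def rectification_term(step, beta2=0.999):
    """Variance rectification factor r_t for a given step (1-indexed).

    Sketch under stated assumptions: returns None while the approximated
    SMA length rho_t is too small (<= 4) for the variance to be tractable.
    """
    # Maximum length of the approximated simple moving average.
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    beta2_t = beta2 ** step
    # Length of the approximated simple moving average at this step.
    rho_t = rho_inf - 2.0 * step * beta2_t / (1.0 - beta2_t)
    if rho_t <= 4.0:
        return None
    numerator = (rho_t - 4.0) * (rho_t - 2.0) * rho_inf
    denominator = (rho_inf - 4.0) * (rho_inf - 2.0) * rho_t
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    for step in (1, 10, 100, 1000, 100000):
        print(step, rectification_term(step))
```

The factor starts well below 1 and grows toward 1 as training proceeds, which is what makes it behave like an automatic warmup: early updates, where the adaptive learning rate has large variance, are scaled down.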


Tourists to Japan are fueling a boom in personal translation devices

The Japan Times

Takehiko Fujita wouldn't be able to do his job selling eye drops and pain relievers without his pocket translator. Instead of an app, language dictionary or call-in translation service, the clerk in a Japanese drugstore uses Pocketalk, a ¥25,000 ($230) device made by Sourcenext Corp. that looks like an oval puck. The gadget translates phrases to and from 74 languages, helping Fujita communicate with customers from Sweden, Vietnam and other countries. Tourists are flooding into Japan, with 31 million people visiting the archipelago in 2018, triple the number six years earlier, according to the Japan National Tourism Organization. Businesses are struggling with visitors looking to shop, eat and move around -- a situation that will probably worsen during next year's Tokyo Olympics.