Automatic Language Identification in Texts: A Survey

Journal of Artificial Intelligence Research

Language identification (“LI”) is the problem of determining the natural language that a document or part thereof is written in. Automatic LI has been extensively researched for over fifty years. Today, LI is a key part of many text processing pipelines, as text processing techniques generally assume that the language of the input text is known. Research in this area has recently been especially active. This article provides a brief history of LI research, and an extensive survey of the features and methods used in the LI literature. We describe the features and methods using a unified notation, to make the relationships between methods clearer. We discuss evaluation methods, applications of LI, as well as off-the-shelf LI systems that do not require training by the end user. Finally, we identify open issues, survey the work to date on each issue, and propose future directions for research in LI.


Thinking Deeply to Make Better Speech

Communications of the ACM

A humanoid robot, named Aiko Chihira by its creators at Toshiba and Osaka University, at a 2015 trial in Tokyo's Mitsukoshi department store. Toshiba says it will incorporate speech recognition and synthesis into the robot by 2020. Machines that speak are nothing new. Siri has been answering questions from iPhone users since 2011, and text-to-voice programs have been around even longer. People with speaking disabilities, most famously Stephen Hawking, have used computers to generate speech for decades.


Large expert-curated database for benchmarking document similarity detection in biomedical literature search

#artificialintelligence

Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations.


iPhone Siri prank tricks people into phoning emergency services

The Independent - Tech

A prank designed to trick iPhone users into calling emergency services is currently spreading online. Owners of Apple products are being encouraged to say '108' to Siri, unaware of the consequences. As 108 is the equivalent of 999 in India, the digital assistant will recognise it as a command to contact emergency services in the phone user's local area.


A Bayesian Additive Model for Understanding Public Transport Usage in Special Events

arXiv.org Machine Learning

Public special events, like sports games, concerts and festivals, are well known to create disruptions in transportation systems, often catching the operators by surprise. Although these are usually planned well in advance, their impact is difficult to predict, even when organisers and transportation operators coordinate. The problem grows considerably when several events happen concurrently. To solve these problems, costly processes, heavily reliant on manual search and personal experience, are usual practice in large cities like Singapore, London or Tokyo. This paper presents a Bayesian additive model with Gaussian process components that combines smart card records from public transport with context information about events that is continuously mined from the Web. We develop an efficient approximate inference algorithm using expectation propagation, which allows us to predict the total number of public transportation trips to the special event areas, thereby contributing to a more adaptive transportation system. Furthermore, for multiple concurrent event scenarios, the proposed algorithm is able to disaggregate gross trip counts into their most likely components related to specific events and routine behavior. Using real data from Singapore, we show that the presented model outperforms the best baseline model by up to 26% in R² and also has explanatory power for its individual components.