Natural Language

Now all your home's Alexa devices work like an intercom


Amazon Alexa users can now use the "Drop In" feature to talk with all of their Echo devices at once, Amazon announced on its blog. Previously, Drop In messages could only be sent to one other Alexa-enabled device at a time -- a user with an Alexa device in the bedroom could "drop in" on a device in the kitchen and have a two-way conversation. Now, you can use a device to send a message to all Echo devices in the house at once. This could be helpful with asking group questions like, "Does anyone want anything from the grocery store?" according to the Amazon blog. To start a group Drop In conversation, you can ask Alexa to "Drop In everywhere."

SDL Partners with DRUID to Power Multilingual Chatbot Conversations

SDL


SDL (LSE: SDL), the intelligent language and content company, announces it has entered into a technical partnership with DRUID, specialists in conversational AI, to launch multi-lingual virtual assistants for enterprise organizations that enable real-time communication through chatbots. By integrating SDL Machine Translation with DRUID virtual assistants, companies will be able to conduct chatbot conversations in different languages with employees, customers, partners and suppliers. The solution offers a real-time "interpreter mode" function, which can translate conversations along with "live chat" which can translate into multiple languages in real-time. This supports the need for customers to easily translate entire conversations as well as enable scenarios where an agent – human or virtual – needs to communicate across multiple languages simultaneously. Chatbots are commonly configured to undergo complicated question-and-answering activities in different languages, but language-specific customization can be complex, time-consuming and costly.

Binary classification with NLP discriminant power analysis


Natural language processing is a recurring topic in machine learning, and there are many ways to approach it. In this article I will focus on discriminant power analysis, a very interesting feature engineering method for binary classification. This method consists of finding the most discriminant words between two target classes. This morphological approach is interesting because, despite its low complexity, it gives good results. I will walk through a full example, from preprocessing to modelling and prediction, using the spam data set available on Kaggle.
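The idea of ranking words by how well they separate two classes can be sketched in a few lines. This is only an illustration under stated assumptions: the excerpt does not give the article's exact formula, so the score used here (the difference in per-class document frequency) is a hypothetical stand-in, and the function names and toy messages are made up for the example.

```python
from collections import Counter


def document_frequency(docs):
    """Fraction of documents in which each word appears at least once."""
    counts = Counter()
    for doc in docs:
        # Use a set so a word repeated inside one document counts once.
        counts.update(set(doc.lower().split()))
    total = len(docs)
    return {word: count / total for word, count in counts.items()}


def discriminant_words(class_a, class_b, top_k=3):
    """Rank words by an assumed score: doc frequency in A minus doc frequency in B.

    Positive scores lean toward class A, negative scores toward class B.
    This scoring rule is an assumption, not the article's definition.
    """
    df_a = document_frequency(class_a)
    df_b = document_frequency(class_b)
    vocabulary = set(df_a) | set(df_b)
    scores = {w: df_a.get(w, 0.0) - df_b.get(w, 0.0) for w in vocabulary}
    # Largest absolute difference first; alphabetical order breaks ties.
    ranked = sorted(scores.items(), key=lambda kv: (-abs(kv[1]), kv[0]))
    return ranked[:top_k]


# Toy data, invented for illustration only.
spam = ["win free prize now", "free cash offer", "claim your free prize"]
ham = ["see you at lunch", "are you free for lunch", "call me later"]

top4 = discriminant_words(spam, ham, top_k=4)
for word, score in top4:
    print(f"{word}: {score:+.2f}")
```

The selected words could then be used as binary features for any standard classifier; which classifier the article pairs with them is not stated in this excerpt.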

Hot papers on arXiv from the past month – May 2020


Here are the most tweeted papers that were uploaded onto arXiv during May 2020. Results are powered by Arxiv Sanity Preserver. Abstract: We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression procedure or anchor generation that explicitly encode our prior knowledge about the task. The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture.
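To make "unique predictions via bipartite matching" concrete, here is a minimal sketch of matching a set of predicted boxes one-to-one against ground-truth boxes by minimising total cost. The pairwise cost (L1 distance between boxes), the function names, and the toy boxes are all assumptions for illustration; the abstract does not specify the cost or the solver, and the exhaustive search below is only practical for very small sets.

```python
from itertools import permutations


def match_cost(pred, target):
    """Assumed pairwise cost: L1 distance between two (x, y, w, h) boxes."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def best_bipartite_match(preds, targets):
    """Find the one-to-one assignment of predictions to targets with minimum total cost.

    Returns (assignment, total_cost), where assignment[i] is the index of the
    prediction matched to targets[i]. Requires len(preds) >= len(targets).
    Exhaustive search over permutations, so only suitable for tiny inputs.
    """
    best_assignment, best_cost = None, float("inf")
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[p], targets[t]) for t, p in enumerate(perm))
        if cost < best_cost:
            best_assignment, best_cost = perm, cost
    return best_assignment, best_cost


# Toy boxes, invented for illustration only.
preds = [(0.9, 0.9, 0.2, 0.2), (0.1, 0.1, 0.2, 0.2), (0.5, 0.5, 0.3, 0.3)]
targets = [(0.1, 0.1, 0.2, 0.2), (0.5, 0.5, 0.3, 0.3)]

assignment, cost = best_bipartite_match(preds, targets)
print(assignment, cost)
```

Because each prediction can be matched to at most one target, duplicate predictions of the same object cannot both be rewarded, which is the intuition behind dropping a separate non-maximum suppression step. A real implementation would replace the exhaustive search with a polynomial-time assignment solver.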

OpenAI's gigantic GPT-3 hints at the limits of language models for AI


A little over a year ago, OpenAI, an artificial intelligence company based in San Francisco, stunned the world by showing a dramatic leap in what appeared to be the power of computers to form natural-language sentences, and even to solve questions, such as completing a sentence, and formulating long passages of text people found fairly human. The latest work from that team shows how OpenAI's thinking has matured in some respects. GPT-3, as the newest creation is called, emerged last week, with more bells and whistles, created by some of the same authors as the last version, including Alec Radford and Ilya Sutskever, along with several additional collaborators, including scientists from Johns Hopkins University. It is now a truly monster language model, as it's called, gobbling two orders of magnitude more text than its predecessor. But within that bigger-is-better stunt, the OpenAI team seem to be approaching some truths much the way Dr. David Bowman seems to approach the limits of the known at the end of the movie 2001.

Arlo Video Doorbell now takes commands from Google Assistant


You no longer have to live in an Amazon-focused household for Arlo's Video Doorbell to make the most sense. Arlo has introduced Google Assistant support to deliver notifications and send commands. If you're worried about the ruckus outside, you can ask Google to "show me the front door" and get a video feed sent to a smart display like the Nest Hub Max. The Video Doorbell normally sells for $150. That's more than rivals like Ring's new starter doorbell, but it gives you the choice of both Alexa and Google Assistant.

Language-Based Interfaces and Their Application for Cultural Tourism

AI Magazine

Language processing has a large practical potential in intelligent interfaces if we take into account multiple modalities of communication. Multi-modality refers to the perception of different coordinated media used in delivering a message as well as the combination of various attitudes in relation to communication. In particular, the integration of natural language processing and hypermedia allows each modality to overcome the constraints of the other, resulting in a novel class of integrated environments for complex exploration and information access. Information presentation is a key element of such environments; generation techniques can contribute to their quality by producing texts ex novo or flexibly adapting existing material to the current situation. A great opportunity arises for intelligent interfaces and language technology of this kind to play an important role for individual-oriented cultural tourism.

LifeCode: A Deployed Application for Automated Medical Coding

AI Magazine

LifeCode is a natural language processing (NLP) and expert system that extracts demographic and clinical information from free-text clinical records. The initial application of LifeCode is for the emergency medicine clinical specialty. An application for diagnostic radiology went into production in October 2000. The LifeCode NLP engine uses a large number of specialist readers whose particular outputs are combined at various levels to form an integrated picture of the patient's medical condition(s), course of treatment, and disposition. The LifeCode expert system performs the tasks of combining complementary information, deleting redundant information, assessing the level of medical risk and level of service represented in the clinical record, and producing an output that is appropriate for input to an electronic medical record (EMR) system or a hospital information system.

Embodied Conversational Agents: Representation and Intelligence in User Interfaces

AI Magazine

How do we decide how to represent an intelligent system in its interface, and how do we decide how the interface represents information about the world and about its own workings to a user? The rubric representation covers at least three topics in this context: (1) how a computational system is represented in its user interface, (2) how the interface conveys its representations of information and the world to human users, and (3) how the system's internal representation affects the human user's interaction with the system. I argue that each of these kinds of representation (of the system, information and the world, the interaction) is key to how users make the kind of attributions of intelligence that facilitate their interactions with intelligent systems. In this vein, it makes sense to represent a system as a human in those cases where social collaborative behavior is key and for the system to represent its knowledge to humans in multiple ways on multiple modalities. I demonstrate these claims by discussing issues of representation and intelligence in an embodied conversational agent -- an interface in which the system is represented as a person, information is conveyed to human users by multiple modalities such as voice and hand gestures, and the internal representation is modality independent and both propositional and nonpropositional.

Machine Translation for Manufacturing: A Case Study at Ford Motor Company

AI Magazine

Machine translation (MT) was one of the first applications of artificial intelligence technology that was deployed to solve real-world problems. Since the early 1960s, researchers have been building and utilizing computer systems that can translate from one language to another without requiring extensive human intervention. In the late 1990s, Ford Vehicle Operations began working with Systran Software Inc. to adapt and customize its machine-translation technology in order to translate Ford's vehicle assembly build instructions from English to German, Spanish, Dutch, and Portuguese. The use of machine translation was made necessary by the vast amount of dynamic information that needed to be translated in a timely fashion. The assembly build instructions at Ford contain text written in a controlled language as well as unstructured remarks and comments.