sarikaya
Amazon Alexa scientists Yang Liu and Ruhi Sarikaya named IEEE Fellows
Yang Liu, an Alexa AI principal scientist, and Ruhi Sarikaya, director of applied science, Alexa AI, have recently been named IEEE Fellows. The designation takes effect January 1, 2021. Liu is being honored for her "contributions to speech understanding and language-learning technology", while Sarikaya is being recognized for his "leadership in spoken language processing, and conversational understanding systems". Both currently lead research initiatives focused on making Alexa more natural and conversational, perceptive and context-aware, and capable of self-learning. The IEEE Fellow designation is conferred by the IEEE board of directors upon individuals with outstanding records of accomplishment in any of the IEEE fields of interest.
Insightful Assistant: AI-compatible Operation Graph Representations for Enhancing Industrial Conversational Agents
Bayrak, Bekir, Giger, Florian, Meurisch, Christian
Advances in voice-controlled assistants paved the way into the consumer market. For professional or industrial use, the capabilities of such assistants are too limited or too time-consuming to implement due to the higher complexity of data, possible AI-based operations, and requests. In light of these deficits, this paper presents Insightful Assistant, a pipeline concept based on a novel operation graph representation derived from the detected intents. Using a predefined set of semantically annotated (executable) functions, each node of the operation graph is assigned to a function for execution. Besides basic operations, such functions can contain artificial intelligence (AI) based operations (e.g., anomaly detection). The result is then visualized to the user according to type and extracted user preferences in an automated way. We further collected a unique crowd-sourced set of 869 requests, each with four different variants and an expected visualization, for an industrial dataset. The evaluation of our proof-of-concept prototype on this dataset shows its feasibility: it achieves an accuracy of up to 95.0% (74.5%) for simple (complex) request detection with different variants and a top-3 accuracy of up to 95.4% for data-/user-adaptive visualization.
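The idea of an operation graph whose nodes are each assigned to a predefined executable function can be illustrated with a minimal sketch. Everything named here (the `FUNCTIONS` registry, the graph layout, the z-score anomaly rule, the sample sensor data) is an assumption made for illustration and is not taken from the paper.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from statistics import mean, pstdev

# Hypothetical registry of annotated, executable functions.
def select_column(rows, column):
    return [row[column] for row in rows]

def detect_anomalies(values, threshold=2.0):
    # Simple z-score rule standing in for an "AI-based operation".
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

def count(values):
    return len(values)

FUNCTIONS = {"select": select_column, "anomalies": detect_anomalies, "count": count}

def run_operation_graph(graph, data):
    """Execute each node after its inputs; nodes without inputs read the raw data."""
    deps = {name: set(spec["inputs"]) for name, spec in graph.items()}
    results = {}
    for name in TopologicalSorter(deps).static_order():
        spec = graph[name]
        args = [results[i] for i in spec["inputs"]] or [data]
        results[name] = FUNCTIONS[spec["op"]](*args, **spec.get("params", {}))
    return results

# Hypothetical request: "How many temperature anomalies were there?"
graph = {
    "temps": {"op": "select", "inputs": [], "params": {"column": "temp"}},
    "anomalies": {"op": "anomalies", "inputs": ["temps"]},
    "n_anomalies": {"op": "count", "inputs": ["anomalies"]},
}
rows = [{"temp": t} for t in (20, 21, 19, 20, 22, 20, 21, 95)]
results = run_operation_graph(graph, rows)
print(results["anomalies"], results["n_anomalies"])
```

Keeping the functions in a registry separate from the graph means a new operation can be added without touching the executor, which is one plausible reading of why the abstract emphasizes a predefined, annotated function set.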
Locale-agnostic Universal Domain Classification Model in Spoken Language Understanding
Lee, Jihwan, Sarikaya, Ruhi, Kim, Young-Bum
In this paper, we introduce an approach for leveraging available data across multiple locales sharing the same language to 1) improve domain classification model accuracy in Spoken Language Understanding and user experience even if new locales do not have sufficient data and 2) reduce the cost of scaling the domain classifier to a large number of locales. We propose a locale-agnostic universal domain classification model based on selective multi-task learning that learns a joint representation of an utterance over locales with different sets of domains and allows locales to share knowledge selectively depending on the domains. The experimental results demonstrate the effectiveness of our approach on the domain classification task in the scenario of multiple locales with imbalanced data and disparate domain sets. The proposed approach outperforms other baseline models, especially when classifying locale-specific domains and low-resourced domains.
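One way to picture locales with different domain sets sharing a single classifier is to restrict a shared score vector to the domains a locale supports before normalizing. The sketch below is an illustrative assumption only: it is not the selective multi-task learning model the abstract describes, and the locale names, domain names, and scores are hypothetical.

```python
import math

# Hypothetical per-locale domain sets; some domains are shared, some locale-specific.
LOCALE_DOMAINS = {
    "en-US": {"Music", "Weather", "Shopping"},
    "en-GB": {"Music", "Weather", "Football"},
}

def locale_masked_softmax(scores, locale):
    """Normalize shared domain scores over only the domains valid for the locale."""
    allowed = LOCALE_DOMAINS[locale]
    kept = {d: s for d, s in scores.items() if d in allowed}
    top = max(kept.values())  # subtract the max for numerical stability
    exps = {d: math.exp(s - top) for d, s in kept.items()}
    total = sum(exps.values())
    return {d: e / total for d, e in exps.items()}

# Scores as they might come from a shared utterance encoder (made-up numbers).
shared_scores = {"Music": 1.2, "Weather": 0.3, "Shopping": 0.1, "Football": 2.0}
us_probs = locale_masked_softmax(shared_scores, "en-US")
gb_probs = locale_masked_softmax(shared_scores, "en-GB")
print(max(us_probs, key=us_probs.get), max(gb_probs, key=gb_probs.get))
```

With these made-up scores the same utterance resolves to a different domain per locale, because the locale-specific domain is only a candidate where it exists.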
A Bi-model based RNN Semantic Frame Parsing Model for Intent Detection and Slot Filling
Wang, Yu, Shen, Yilin, Jin, Hongxia
Intent detection and slot filling are two main tasks for building a spoken language understanding (SLU) system. Multiple deep learning based models have demonstrated good results on these tasks. The most effective algorithms are based on the structures of sequence to sequence models (or "encoder-decoder" models), and generate the intents and semantic tags either using separate models or a joint model. Most of the previous studies, however, either treat the intent detection and slot filling as two separate parallel tasks, or use a sequence to sequence model to generate both semantic tags and intent. Most of these approaches use one (joint) NN based model (including encoder-decoder structure) to model two tasks, and hence may not fully take advantage of the cross-impact between them. In this paper, new Bi-model based RNN semantic frame parsing network structures are designed to perform the intent detection and slot filling tasks jointly, by considering their cross-impact on each other using two correlated bidirectional LSTMs (BLSTM). Our Bi-model structure with a decoder achieves state-of-the-art results on the benchmark ATIS data, with about 0.5% intent accuracy improvement and 0.9% slot filling improvement.
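To make the two tasks concrete, the sketch below shows how one utterance-level intent and a sequence of per-token slot tags combine into a semantic frame. It covers only the output format, not the Bi-model network itself. The BIO tagging scheme, the ATIS-style label names, and the helper `build_semantic_frame` are assumptions made for illustration.

```python
def build_semantic_frame(tokens, tags, intent):
    """Merge BIO slot tags into slot values and attach the utterance-level intent.

    If a slot name occurs more than once, the later value overwrites the earlier one.
    """
    slots = {}
    current = None  # slot name of the span being extended, if any
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = token
        elif tag.startswith("I-") and current == tag[2:]:
            slots[current] += " " + token
        else:
            current = None
    return {"intent": intent, "slots": slots}

tokens = ["show", "flights", "from", "new", "york", "to", "boston"]
tags = ["O", "O", "O", "B-fromloc.city_name", "I-fromloc.city_name",
        "O", "B-toloc.city_name"]
frame = build_semantic_frame(tokens, tags, intent="atis_flight")
print(frame)
```

Slot filling yields one label per token while intent detection yields one label per utterance. That difference in granularity is why the two tasks can be handled by separate but interacting networks.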
Alexa can now analyze random requests to use skills you didn't know existed
Amazon wants to minimize the impediments its customers face in using its services. A new tool for Alexa skills that is described in Amazon's developer blog enables the smart voice assistant to answer customers' natural language questions. Soon users won't need to query using the prescribed "Alexa-speak." The company's goal is to make Alexa friction-free. Similar to one-click ordering, Amazon Prime, and Amazon Go, removing barriers to customer interaction with Alexa will encourage more engagement.
Amazon Vows To Make Alexa Smarter, Kills Off Eavesdropping Bug
Amazon is about to make its Alexa voice assistant smarter. It has also killed off a bug that allowed Echo smart speakers to eavesdrop even when the devices have been switched off. Earlier this week, the head of the Alexa Machine Learning team, Ruhi Sarikaya, wrote a blog post about the e-commerce giant's plan of making Alexa more friction-free. According to Sarikaya, friction refers to any variable that impedes progress toward a goal and Amazon is "obsessively focused" on eliminating friction from the interactions of customers with the physical and digital world. "[We're] focused on making Alexa smarter and more natural to engage with," Sarikaya said.
Alexa will soon gain a memory, converse more naturally, and automatically launch skills
Alexa will soon be able to recall information you've directed her to remember, as well as have more natural conversations that don't require every command to begin with "Alexa." She'll also be able to launch skills in response to questions you ask, without explicit instructions to do so. The features are the first of what Amazon says are many launches this year that will make its virtual assistant more personalized, smarter, and more engaging. The news was announced this morning in a keynote presentation from the head of the Alexa Brain group, Ruhi Sarikaya, speaking at the World Wide Web Conference in Lyon, France. He explained that the Alexa Brain initiative is focused on improving Alexa's ability to track context and memory within and across dialog sessions, as well as make it easier for users to discover and interact with Alexa's now over 40,000 third-party skills. With the memory update, arriving soon to U.S. users, Alexa will be able to remember any information you ask her to, and retrieve it later.