Results


Conversational Agents: Theory and Applications

arXiv.org Artificial Intelligence

In this chapter, we provide a review of conversational agents (CAs), discussing chatbots, intended for casual conversation with a user, as well as task-oriented agents that generally engage in discussions intended to reach one or several specific goals, often (but not always) within a specific domain. We also consider the concept of embodied conversational agents, briefly reviewing aspects such as character animation and speech processing. The many different approaches for representing dialogue in CAs are discussed in some detail, along with methods for evaluating such agents, emphasizing the important topics of accountability and interpretability. A brief historical overview is given, followed by an extensive overview of various applications, especially in the fields of health and education. We end the chapter by discussing benefits and potential risks regarding the societal impact of current and future CA technology.
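
As a concrete illustration of one common way of representing dialogue in task-oriented agents, the sketch below keeps a simple frame of an intent plus slot-value pairs and asks for whatever is still missing. The domain, slot names, and update logic are assumptions made for illustration only, not taken from the chapter.

from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class DialogueState:
    """Frame-based state: the current intent plus the slot values gathered so far."""
    intent: Optional[str] = None
    slots: Dict[str, str] = field(default_factory=dict)

    def update(self, intent: Optional[str], new_slots: Dict[str, str]) -> None:
        # Merge the NLU output of the latest user turn into the state.
        if intent is not None:
            self.intent = intent
        self.slots.update(new_slots)

    def missing(self, required: List[str]) -> List[str]:
        # Slots the agent still needs to ask for before it can act on the goal.
        return [s for s in required if s not in self.slots]

# One turn of a hypothetical flight-booking dialogue.
state = DialogueState()
state.update("book_flight", {"destination": "Oslo"})
print(state.missing(["destination", "date"]))  # -> ['date']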


Neural Speech Synthesis using ForwardTacotron and WaveRNN

#artificialintelligence

Note: this article is intended as a broad, high-level overview of the process of creating a custom speech synthesis pipeline. An in-depth tutorial is planned for the near future. It was a cold October day as I was casually browsing the web. Not yet sure what my next project would be, I stumbled upon a speech synthesis idea. The last time I tried messing around with such technology was probably in 2013.
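
To make the high-level pipeline concrete, here is a minimal sketch of the two-stage setup the article builds on: an acoustic model (ForwardTacotron-style) maps text to a mel spectrogram, and a neural vocoder (WaveRNN-style) turns the spectrogram into a waveform. The classes and methods below are placeholder stand-ins invented for illustration; the real repositories expose their own loading and inference code.

import numpy as np

class AcousticModel:
    """Placeholder for a ForwardTacotron-style text-to-mel model."""
    def text_to_mel(self, text: str) -> np.ndarray:
        # A trained model would map characters/phonemes to an (n_mels, frames) spectrogram;
        # here we return zeros just to keep the sketch runnable.
        return np.zeros((80, 10 * max(len(text), 1)), dtype=np.float32)

class Vocoder:
    """Placeholder for a WaveRNN-style neural vocoder."""
    def mel_to_audio(self, mel: np.ndarray, hop_length: int = 256) -> np.ndarray:
        # A trained vocoder generates the waveform (often autoregressively) from the mel frames.
        return np.zeros(mel.shape[1] * hop_length, dtype=np.float32)

def synthesize(text: str) -> np.ndarray:
    mel = AcousticModel().text_to_mel(text)   # stage 1: text -> mel spectrogram
    return Vocoder().mel_to_audio(mel)        # stage 2: mel spectrogram -> waveform

audio = synthesize("Hello world")
print(audio.shape)  # number of audio samples; write to a WAV file to listen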


Challenges of Artificial Intelligence -- From Machine Learning and Computer Vision to Emotional Intelligence

arXiv.org Artificial Intelligence

Artificial intelligence (AI) has become a part of everyday conversation and of our lives. It is considered the new electricity that is revolutionizing the world. AI attracts heavy investment in both industry and academia. However, there is also a lot of hype in the current AI debate. AI based on so-called deep learning has achieved impressive results on many problems, but its limits are already visible. AI has been under research since the 1940s, and the field has seen many ups and downs due to over-expectations and the disappointments that followed. The purpose of this book is to give a realistic picture of AI, its history, its potential, and its limitations. We believe that AI is a helper, not a ruler of humans. We begin by describing what AI is and how it has evolved over the decades. After the fundamentals, we explain the importance of massive data for the current mainstream of artificial intelligence. The most common representations for AI, methods, and machine learning are covered. In addition, the main application areas are introduced. Computer vision has been central to the development of AI. The book provides a general introduction to computer vision and includes an exposure to the results and applications of our own research. Emotions are central to human intelligence, but they have seen little use in AI. We present the basics of emotional intelligence and our own research on the topic. We discuss super-intelligence that transcends human understanding, explaining why such an achievement seems impossible on the basis of present knowledge, and how AI could be improved. Finally, we summarize the current state of AI and what should be done in the future. In the appendix, we look at the development of AI education, especially from the perspective of the contents at our own university.


Master Artificial Intelligence

#artificialintelligence

Welcome to the comprehensive course Master Artificial Intelligence: Step-by-Step Guide for 2021. R Tutor is a team of software-applications training professionals who explain complex information in the simplest form, with relevant examples. Artificial intelligence is the simulation of human intelligence processes by machines, especially computer systems. Specific applications of AI include expert systems, natural language processing, speech recognition, and machine vision. Artificial intelligence can relieve humans of many repetitive tasks.


Task-oriented Dialogue Systems: performance vs. quality-optima, a review

arXiv.org Artificial Intelligence

Task-oriented dialogue systems (TODS) continue to rise in popularity as various industries find ways to harness their capabilities effectively, saving both time and money. However, even state-of-the-art TODS are not yet reaching their full potential. TODS typically have a primary design focus on completing the task at hand, so the metric of task resolution is given priority. Other conversational quality attributes that may indicate the success, or otherwise, of the dialogue may be ignored. This can lead to interactions between human and dialogue system that leave the user dissatisfied or frustrated. This paper reviews the literature on evaluation frameworks for dialogue systems and the role of conversational quality attributes in dialogue systems, examining whether, how, and where they are used and how they correlate with the performance of the dialogue system.
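
For readers unfamiliar with the distinction, the toy sketch below contrasts the task-resolution metric that typically takes priority with one example of a conversational quality attribute (dialogue length as a crude efficiency proxy). The dialogue log format is an assumption made purely for illustration.

from statistics import mean

# Hypothetical dialogue logs: whether the task was resolved and how many turns it took.
dialogues = [
    {"turns": 6, "task_completed": True},
    {"turns": 14, "task_completed": True},
    {"turns": 9, "task_completed": False},
]

task_success_rate = mean(d["task_completed"] for d in dialogues)
avg_turns_success = mean(d["turns"] for d in dialogues if d["task_completed"])

print(f"task success rate: {task_success_rate:.2f}")           # task-resolution metric
print(f"avg. turns when successful: {avg_turns_success:.1f}")  # one quality-related attribute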


Machine learning improves Arabic speech transcription capabilities

MIT Technology Review

Thanks to advancements in speech and natural language processing, there is hope that one day you may be able to ask your virtual assistant what the best salad ingredients are. Currently, it is possible to ask your home gadget to play music or to respond to simple voice commands, a feature already found in many devices. If you speak Moroccan, Algerian, Egyptian, Sudanese, or any of the other dialects of the Arabic language, which vary immensely from region to region and are in some cases mutually unintelligible, it is a different story. If your native tongue is Arabic, Finnish, Mongolian, Navajo, or any other language with a high level of morphological complexity, you may feel left out. These complex constructs intrigued Ahmed Ali and spurred him to find a solution.


Viewpoint: Can AI tutors help students learn?

#artificialintelligence

If nothing else, the past two years have shown us that teaching, learning, and education can take different forms, and the pandemic may have altered how students, from kindergarten through college, learn in the future. With students returning to the classroom, educators and administrators alike continue to examine new ways that technology can be used not to replace, but to augment, the teaching and learning experiences in our schools. Conversing with human-like AI agents has long been a feature of science fiction, but it is rapidly becoming a reality, particularly in customer service and experience settings as well as in education. A realized future with AI is fast approaching. Artificial intelligence is the simulation of human intelligence processes by machines, especially computer systems.


An Explicit-Joint and Supervised-Contrastive Learning Framework for Few-Shot Intent Classification and Slot Filling

arXiv.org Artificial Intelligence

Intent classification (IC) and slot filling (SF) are critical building blocks in task-oriented dialogue systems. The two tasks are closely related and can benefit from each other. Since only a few utterances are available for identifying fast-emerging new intents and slots, data scarcity is a frequent issue when implementing IC and SF. However, few IC/SF models perform well when the number of training samples per class is very small. In this paper, we propose a novel explicit-joint and supervised-contrastive learning framework for few-shot intent classification and slot filling. Its highlights are as follows. (i) The model extracts intent and slot representations via bidirectional interactions and extends the prototypical network to achieve explicit-joint learning, which guarantees that the IC and SF tasks can mutually reinforce each other. (ii) The model integrates supervised contrastive learning, which pulls samples from the same class together and pushes samples from different classes apart. In addition, the model constructs episodes in an uncommon but practical way, abandoning the traditional fixed-way, fixed-shot setting and allowing for unbalanced datasets. Extensive experiments on three public datasets show that our model achieves promising performance.
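
To make point (ii) concrete, the sketch below shows a generic supervised contrastive loss of the kind the paper integrates: embeddings with the same intent label are pulled together and embeddings with different labels are pushed apart. This is a standard formulation written for illustration, not the authors' exact implementation.

import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """embeddings: (batch, dim) utterance representations; labels: (batch,) intent ids."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.T / temperature                      # pairwise cosine similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))  # never contrast a sample with itself
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_log_prob = torch.where(pos_mask, log_prob, torch.zeros_like(log_prob))
    has_pos = pos_mask.any(dim=1)                    # anchors with at least one positive
    loss = -pos_log_prob.sum(dim=1)[has_pos] / pos_mask.sum(dim=1)[has_pos]
    return loss.mean()

# Toy usage: four utterance embeddings belonging to two intents.
emb = torch.randn(4, 16)
intents = torch.tensor([0, 0, 1, 1])
print(supervised_contrastive_loss(emb, intents))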


Natural Language Processing for Smart Healthcare

arXiv.org Artificial Intelligence

Smart healthcare has achieved significant progress in recent years. Emerging artificial intelligence (AI) technologies enable a range of smart applications across diverse healthcare scenarios. As an essential technology powered by AI, natural language processing (NLP) plays a key role in smart healthcare thanks to its capability of analysing and understanding human language. In this work, we review existing studies on NLP for smart healthcare from the perspectives of technique and application. From a technical point of view, we focus on feature extraction and modelling for the various NLP tasks encountered in smart healthcare. On the application side, we concentrate on representative smart healthcare scenarios that employ NLP techniques, including clinical practice, hospital management, personal care, public health, and drug development. We further discuss the limitations of current work and identify directions for future work.
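
As a small illustration of the feature-extraction-plus-modelling pattern such reviews cover, the sketch below runs a classic TF-IDF bag-of-words baseline over toy clinical notes. The notes, labels, and model choice are invented for this example and are not drawn from the surveyed studies.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy clinical notes and triage labels, invented for illustration only.
notes = [
    "patient reports chest pain and shortness of breath",
    "routine follow-up, blood pressure well controlled",
    "persistent cough and fever for three days",
    "annual physical, no complaints",
]
labels = ["urgent", "routine", "urgent", "routine"]

# Feature extraction (TF-IDF over unigrams and bigrams) followed by a linear classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(notes, labels)
print(clf.predict(["severe chest pain after exercise"]))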


Pretext Tasks selection for multitask self-supervised speech representation learning

arXiv.org Machine Learning

Through solving pretext tasks, self-supervised learning leverages unlabeled data to extract useful latent representations that replace traditional input features in the downstream task. In various application domains, including computer vision, natural language processing, and audio/speech signal processing, a wide range of features were engineered through decades of research effort. As it turns out, learning to predict such features has proven to be a particularly relevant pretext task, leading to self-supervised representations that are effective for downstream tasks. However, methods and common practices for combining such pretext tasks, where each task targets a different group of features to improve performance on the downstream task, have not been properly explored and understood. In practice, the process relies almost exclusively on a computationally heavy experimental procedure, which becomes intractable as the number of pretext tasks increases. This paper introduces a method for selecting a group of pretext tasks from a set of candidates. The proposed method estimates properly calibrated weights for the partial losses corresponding to the considered pretext tasks during the self-supervised training process. Experiments on speaker recognition and automatic speech recognition validate our approach: the groups selected and weighted with our method outperform classic baselines, thus facilitating the selection and combination of relevant pseudo-labels for self-supervised representation learning.
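
To illustrate the general idea of weighting partial pretext-task losses during training, the sketch below combines several placeholder losses with learnable weights. The weighting scheme shown (learned log-variances in the style of uncertainty weighting) is a generic stand-in, not the calibration procedure proposed in the paper, and the pretext losses themselves are placeholders.

import torch
import torch.nn as nn

class WeightedPretextLoss(nn.Module):
    """Combine several pretext-task losses with one learnable weight per task."""
    def __init__(self, num_tasks: int):
        super().__init__()
        # One log-variance per pretext task; a lower learned variance gives the task more weight.
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, partial_losses):
        total = torch.zeros((), device=self.log_vars.device)
        for i, loss in enumerate(partial_losses):
            precision = torch.exp(-self.log_vars[i])
            total = total + precision * loss + self.log_vars[i]  # penalty term discourages zero weights
        return total

# Toy usage with three placeholder pretext losses (e.g. predicting pitch, energy, MFCCs).
criterion = WeightedPretextLoss(num_tasks=3)
partial = [torch.tensor(0.8), torch.tensor(1.3), torch.tensor(0.4)]
print(criterion(partial))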