ScopeIt: Scoping Task Relevant Sentences in Documents
Suryanarayanan, Vishwas, Patra, Barun, Bhattacharya, Pamela, Fufa, Chala, Lee, Charles
Intelligent assistants like Cortana, Siri, Alexa, and Google Assistant are trained to parse information when the conversation is synchronous and short; however, for email-based conversational agents, the communication is asynchronous and often contains information irrelevant to the assistant. This makes it harder for the system to accurately detect intents, extract entities relevant to those intents, and thereby perform the desired action. We present a neural model for scoping relevant information for the agent from a large query. We show that, when used as a preprocessing step, the model improves the performance of both intent detection and entity extraction. We demonstrate the model's impact on Scheduler, a virtual conversational meeting scheduling assistant that interacts asynchronously with users through email. (Cortana is the persona of the agent, while Scheduler is the name of the service; we use the two names interchangeably in this paper.) The model helps the entity extraction and intent detection tasks required by Scheduler achieve an average gain of 35% in precision without any drop in recall. Additionally, we demonstrate that the same approach can be used for component-level analysis in large documents, such as signature block identification.
Development of an Expert System for Diabetic Type-2 Diet
Ahmed, Ibrahim M., Mahmoud, Abeer M.
Successful intelligent control of a patient's diet for treatment purposes must combine the patient's list of preferred foods with the doctor's list of foods that are effective for treatment. Many rural communities in Sudan have extremely limited access to diabetic diet centers. People travel long distances to clinics or medical facilities, and there is a shortage of medical experts in most of these facilities. This results in slow service, and patients end up waiting long hours without receiving any attention. Hence, diabetic diet expert systems can play a significant role in cases where medical experts are not readily available. This paper presents the design and implementation of an intelligent medical expert system for diabetes diet that is intended to be used in Sudan. The development of the proposed expert system went through a number of stages, such as problem and need identification, requirements analysis, knowledge acquisition, formalization, design, and implementation. Visual Prolog was used for designing the graphical user interface and implementing the system. The proposed expert system is a promising tool that reduces the workload of physicians and provides diabetics with simple and valuable assistance.
A Multi-view Perspective of Self-supervised Learning
Geng, Chuanxing, Tan, Zhenghao, Chen, Songcan
As a newly emerging unsupervised learning paradigm, self-supervised learning (SSL) has recently gained widespread attention. It usually introduces a pretext task that requires no manual annotation of data; with the help of this task, SSL effectively learns feature representations beneficial for downstream tasks. The pretext task thus plays a key role. However, the study of its design, and especially of its essence, is still open. In this paper, we borrow a multi-view perspective to decouple a class of popular pretext tasks into a combination of view data augmentation (VDA) and view label classification (VLC), attempting to explore the essence of such pretext tasks while providing some insights into their design. Specifically, we design a simple multi-view learning framework (SSL-MV), which assists the feature learning of downstream tasks (original view) through the same tasks on the augmented views. SSL-MV focuses on VDA while abandoning VLC, empirically uncovering that it is VDA, rather than the generally considered VLC, that dominates the performance of such SSL. Additionally, thanks to replacing VLC with VDA tasks, SSL-MV also enables an integrated inference that combines the predictions from the augmented views, further improving performance. Experiments on several benchmark datasets demonstrate its advantages.
A Novel Decision Tree for Depression Recognition in Speech
Liu, Zhenyu, Wang, Dongyu, Zhang, Lan, Hu, Bin
Depression is a common mental disorder worldwide that causes a range of serious outcomes. The diagnosis of depression relies on patient-reported scales and psychiatrist interviews, which may lead to subjective bias. In recent years, more and more researchers have devoted themselves to depression recognition in speech, which may be an effective and objective indicator. This study proposes a new speech segment fusion method based on a decision tree to improve depression recognition accuracy and validates it on a sample of 52 subjects (23 depressed patients and 29 healthy controls). The recognition accuracies are 75.8% and 68.5% for males and females, respectively, on gender-dependent models. It can be concluded from the data that the proposed decision tree model can improve depression classification performance.
Effective End-to-End Learning Framework for Economic Dispatch
Lu, Chenbei, Wang, Kui, Wu, Chenye
The conventional wisdom for improving the effectiveness of economic dispatch is to make the load forecasting method as accurate as possible. However, this approach can be problematic due to the temporal and spatial correlations between system cost and load prediction errors. This motivates us to adopt the notion of end-to-end machine learning and to propose a task-specific learning criterion for conducting economic dispatch. Specifically, to maximize data utilization, we design an efficient optimization kernel for the learning process. We provide both theoretical analysis and empirical insights to highlight the effectiveness and efficiency of the proposed learning framework.
Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks
She, Changyang, Dong, Rui, Gu, Zhouyou, Hou, Zhanwei, Li, Yonghui, Hardjawana, Wibowo, Yang, Chenyang, Song, Lingyang, Vucetic, Branka
In the future 6th generation networks, ultra-reliable and low-latency communications (URLLC) will lay the foundation for emerging mission-critical applications that have stringent requirements on end-to-end delay and reliability. Existing works on URLLC are mainly based on theoretical models and assumptions. The model-based solutions provide useful insights, but cannot be directly implemented in practice. In this article, we first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC, and discuss some open problems of these methods. To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC. The basic idea is to merge theoretical models and real-world data in analyzing the latency and reliability and training deep neural networks (DNNs). Deep transfer learning is adopted in the architecture to fine-tune the pre-trained DNNs in non-stationary networks. Further considering that the computing capacity at each user and each mobile edge computing server is limited, federated learning is applied to improve the learning efficiency. Finally, we provide some experimental and simulation results and discuss some future directions.
Regression with Deep Learning for Sensor Performance Optimization
Vaila, Ruthvik, Lloyd, Denver, Tetz, Kevin
Neural networks with at least two hidden layers are called deep networks. Recent developments in AI, and in computer programming in general, have led to the development of tools such as TensorFlow, Keras, and NumPy, making it easier to model and draw conclusions from data. In this work, we re-approach non-linear regression with deep learning enabled by Keras and TensorFlow. In particular, we use deep learning to parametrize a non-linear multivariate relationship between the inputs and outputs of an industrial sensor, with the intent of optimizing the sensor's performance based on selected key metrics.
Performance Analysis of Combine Harvester using Hybrid Model of Artificial Neural Networks Particle Swarm Optimization
Nadai, Laszlo, Imre, Felde, Ardabili, Sina, Gundoshmian, Tarahom Mesri, Gergo, Pinter, Mosavi, Amir
Novel applications of artificial intelligence for tuning the parameters of industrial machines for optimal performance are emerging at a fast pace. Tuning combine harvesters and improving machine performance can dramatically reduce waste during harvesting, and it is also beneficial to machine maintenance. The literature includes several soft computing, machine learning, and optimization methods that have been used to model the function of harvesters of various crops. Due to the complexity of the problem, machine learning methods have recently been proposed to predict the optimal performance, with promising results. In this paper, we present a performance analysis of a common combine harvester by proposing a novel hybrid machine learning model based on artificial neural networks integrated with particle swarm optimization (ANN-PSO). The hybridization of machine learning methods with soft computing techniques has recently shown promising results in improving the performance of combine harvesters. This research aims to improve the results further by providing more stable models with higher accuracy.
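For readers unfamiliar with the optimizer named in this abstract, the following is a minimal sketch of particle swarm optimization on its own. It is not the authors' ANN-PSO model: the objective here is a stand-in sphere function rather than a neural network's training error, and the function names (`sphere`, `pso`), the swarm size, the inertia weight `w`, and the acceleration coefficients `c1` and `c2` are all assumptions chosen for illustration. One common way to combine the two methods is to let each particle's position hold a candidate set of network weights, but the abstract does not say how the hybrid is built.

```python
import random


def sphere(position):
    # Stand-in objective (assumption): sum of squares, minimized at the origin.
    return sum(v * v for v in position)


def pso(objective, dim=2, num_particles=20, num_iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; all hyperparameters are illustrative."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                 for _ in range(num_particles)]
    velocities = [[0.0] * dim for _ in range(num_particles)]

    # Each particle remembers its own best position; the swarm shares one global best.
    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]
    best_idx = min(range(num_particles), key=lambda i: personal_best_val[i])
    global_best = personal_best[best_idx][:]
    global_best_val = personal_best_val[best_idx]

    for _ in range(num_iters):
        for i in range(num_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term plus pulls toward the personal and global bests.
                velocities[i][d] = (
                    w * velocities[i][d]
                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                    + c2 * r2 * (global_best[d] - positions[i][d])
                )
                positions[i][d] += velocities[i][d]
            val = objective(positions[i])
            if val < personal_best_val[i]:
                personal_best[i] = positions[i][:]
                personal_best_val[i] = val
                if val < global_best_val:
                    global_best = positions[i][:]
                    global_best_val = val
    return global_best, global_best_val


best_position, best_value = pso(sphere)
print(best_position, best_value)
```

Replacing `sphere` with a function that evaluates a model's validation error for a given parameter vector would turn this into a parameter-tuning loop, under the same assumptions.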
Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games
Lin, Tianyi, Zhou, Zhengyuan, Mertikopoulos, Panayotis, Jordan, Michael I.
We consider multi-agent learning via online gradient descent (OGD) in a class of games called $\lambda$-cocoercive games, a broad class of games that admits many Nash equilibria and that properly includes strongly monotone games. We characterize the finite-time last-iterate convergence rate for joint OGD learning on $\lambda$-cocoercive games; further, building on this result, we develop a fully adaptive OGD learning algorithm that does not require any knowledge of the problem parameter (e.g., the cocoercive constant $\lambda$) and show, via a novel double-stopping-time technique, that this adaptive algorithm achieves the same finite-time last-iterate convergence rate as its non-adaptive counterpart. Subsequently, we extend OGD learning to the noisy gradient feedback case and establish last-iterate convergence results---first qualitative almost sure convergence, then quantitative finite-time convergence rates---all under non-decreasing step-sizes. These results fill in several gaps in the existing multi-agent online learning literature, where three aspects---finite-time convergence rates, non-decreasing step-sizes, and fully adaptive algorithms---have not been previously explored.
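As a concrete illustration of joint OGD with a constant step size, the sketch below runs two players on a small quadratic game. The game, the step size, and the function names are assumptions made for this example and are not taken from the paper. Player 1 minimizes $x^2 + xy - 2x$ over $x$ and player 2 minimizes $y^2 + xy - 2y$ over $y$. The resulting gradient field is linear with a symmetric positive definite Jacobian, so the game is strongly monotone (hence cocoercive), and its unique Nash equilibrium is $x = y = 2/3$. Action sets are taken to be unconstrained, so no projection step appears.

```python
def grad_player_1(x, y):
    # Partial derivative in x of player 1's cost x^2 + x*y - 2x (hypothetical game).
    return 2.0 * x + y - 2.0


def grad_player_2(x, y):
    # Partial derivative in y of player 2's cost y^2 + x*y - 2y (hypothetical game).
    return 2.0 * y + x - 2.0


def joint_ogd(x=0.0, y=0.0, step=0.1, num_rounds=300):
    """Both players take simultaneous gradient steps with a constant step size."""
    for _ in range(num_rounds):
        gx = grad_player_1(x, y)
        gy = grad_player_2(x, y)
        # Each player uses only the gradient of its own cost in its own action.
        x, y = x - step * gx, y - step * gy
    return x, y


x_last, y_last = joint_ogd()
print(x_last, y_last)  # last iterate, expected to be close to (2/3, 2/3)
```

The quantity tracked here is the last iterate itself, not a time average of the iterates, which is the notion of convergence the abstract refers to.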
Periodic Q-Learning
The use of target networks is a common practice in deep reinforcement learning for stabilizing training; however, theoretical understanding of this technique is still limited. In this paper, we study the so-called periodic Q-learning algorithm (PQ-learning for short), which resembles the technique used in deep Q-learning, for solving infinite-horizon discounted Markov decision processes (DMDP) in the tabular setting. PQ-learning maintains two separate Q-value estimates: the online estimate and the target estimate. The online estimate follows the standard Q-learning update, while the target estimate is updated periodically. In contrast to standard Q-learning, PQ-learning enjoys a simple finite-time analysis and achieves better sample complexity for finding an epsilon-optimal policy. Our result provides a preliminary justification of the effectiveness of utilizing target estimates or networks in Q-learning algorithms.
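The two-estimate scheme described in this abstract can be sketched on a tiny tabular problem. This is an illustrative sketch under stated assumptions, not the paper's exact algorithm: the two-state deterministic MDP, the step size, the period length, and the name `periodic_q_learning` are all invented for the example. It also assumes, by analogy with deep Q-learning, that the bootstrap term of the online update is computed from the target estimate, and it sweeps over every state-action pair in each step instead of sampling transitions.

```python
import copy

GAMMA = 0.9  # discount factor (assumption)

# Hypothetical deterministic MDP: (state, action) -> (next_state, reward).
TRANSITIONS = {
    (0, 0): (0, 0.0),
    (0, 1): (1, 1.0),
    (1, 0): (0, 0.0),
    (1, 1): (1, 2.0),
}


def periodic_q_learning(num_periods=200, steps_per_period=20, alpha=0.5):
    """Tabular Q-learning with a periodically synchronized target estimate."""
    online = [[0.0, 0.0], [0.0, 0.0]]
    target = copy.deepcopy(online)
    for _ in range(num_periods):
        # Inner loop: the target estimate stays frozen while the online one moves.
        for _ in range(steps_per_period):
            for (s, a), (s_next, r) in TRANSITIONS.items():
                td_target = r + GAMMA * max(target[s_next])
                online[s][a] += alpha * (td_target - online[s][a])
        # Periodic update: copy the online estimate into the target estimate.
        target = copy.deepcopy(online)
    return online


q = periodic_q_learning()
print(q)
```

For this MDP the optimal values can be computed by hand: staying in state 1 earns 2 per step, so $Q^*(1,1) = 2/(1-0.9) = 20$, $Q^*(0,1) = 1 + 0.9 \cdot 20 = 19$, and $Q^*(0,0) = Q^*(1,0) = 0.9 \cdot 19 = 17.1$. With the target frozen, each period behaves approximately like one step of value iteration, which is one intuition for why such a scheme is easier to analyze.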