From jobs to superjobs

#artificialintelligence

The use of artificial intelligence (AI), cognitive technologies, and robotics to automate and augment work is on the rise, prompting the redesign of jobs in a growing number of domains. The jobs of today are more machine-powered and data-driven than in the past, and they also require more human skills in problem-solving, communication, interpretation, and design. As machines take over repeatable tasks and the work people do becomes less routine, many jobs will rapidly evolve into what we call "superjobs"--the newest job category that changes the landscape of how organizations think about work. During the last few years, many have been alarmed by studies predicting that AI and robotics will do away with jobs. In 2019, this topic remains very much a concern among our Global Human Capital Trends survey respondents.


Multiscale Principle of Relevant Information for Hyperspectral Image Classification

arXiv.org Machine Learning

This paper proposes a novel architecture, termed multiscale principle of relevant information (MPRI), to learn discriminative spectral-spatial features for hyperspectral image (HSI) classification. MPRI inherits the merits of the principle of relevant information (PRI) to effectively extract multiscale information embedded in the given data, and also takes advantage of the multilayer structure to learn representations in a coarse-to-fine manner. Specifically, MPRI performs spectral-spatial pixel characterization (using PRI) and feature dimensionality reduction (using regularized linear discriminant analysis) iteratively and successively. Extensive experiments on four benchmark data sets demonstrate that MPRI outperforms existing state-of-the-art HSI classification methods (including deep learning based ones) qualitatively and quantitatively, especially in the scenario of limited training samples.

I. INTRODUCTION

With the rapid development of hyperspectral imaging techniques, current sensors always have high spectral and spatial resolution [1].

This work was supported in part by the National Natural Science Foundation of China (Grant No. 61502195), and in part by the Office of Naval Research Science of Autonomy (Grant No. N000141812306). Yantao Wei is with School of Educational Information Technology, Central China Normal University, Wuhan 430079, China (email: yantaowei@mail.ccnu.edu.cn).


Point-Based Value Iteration for Finite-Horizon POMDPs

Journal of Artificial Intelligence Research

Partially Observable Markov Decision Processes (POMDPs) are a popular formalism for sequential decision making in partially observable environments. Since solving POMDPs to optimality is a difficult task, point-based value iteration methods are widely used. These methods compute an approximate POMDP solution, and in some cases they even provide guarantees on the solution quality, but these algorithms have been designed for problems with an infinite planning horizon. In this paper we discuss why state-of-the-art point-based algorithms cannot be easily applied to finite-horizon problems that do not include discounting. Subsequently, we present a general point-based value iteration algorithm for finite-horizon problems which provides solutions with guarantees on solution quality. Furthermore, we introduce two heuristics to reduce the number of belief points considered during execution, which lowers the computational requirements. In experiments we demonstrate that the algorithm is an effective method for solving finite-horizon POMDPs.
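
To make the idea of a point-based backup without discounting concrete, the following is a minimal sketch of generic point-based value iteration that keeps one set of alpha-vectors per number of remaining steps. It is not the algorithm proposed in the paper: it has no quality guarantees and no belief-point heuristics. The array layouts `T[s][a][s2]`, `O[a][s2][o]`, `R[s][a]` and all function names are assumptions made for illustration.

```python
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def backup(belief, next_vectors, T, O, R):
    """One undiscounted point-based backup at `belief`.

    Assumed layouts: T[s][a][s2] transition probability,
    O[a][s2][o] observation probability, R[s][a] immediate reward.
    """
    n_states, n_actions, n_obs = len(T), len(R[0]), len(O[0][0])
    best_vec, best_val = None, float("-inf")
    for a in range(n_actions):
        vec = [R[s][a] for s in range(n_states)]  # immediate reward, no discount
        for o in range(n_obs):
            # Project every next-stage vector back through (a, o).
            projected = [
                [sum(T[s][a][s2] * O[a][s2][o] * alpha[s2]
                     for s2 in range(n_states))
                 for s in range(n_states)]
                for alpha in next_vectors
            ]
            # Keep the projection that is best at this particular belief.
            best = max(projected, key=lambda p: dot(belief, p))
            vec = [v + g for v, g in zip(vec, best)]
        val = dot(belief, vec)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec


def finite_horizon_pbvi(beliefs, T, O, R, horizon):
    """Return stages, where stages[k] holds the vectors for k steps to go."""
    n_states = len(T)
    stages = [[[0.0] * n_states]]  # zero steps to go: nothing left to collect
    for _ in range(horizon):
        stages.append([backup(b, stages[-1], T, O, R) for b in beliefs])
    return stages


def value(vectors, belief):
    return max(dot(belief, alpha) for alpha in vectors)
```

Because the vector sets are indexed by steps to go, the value function is allowed to differ between time steps, which is the main structural difference from the stationary value function used in the infinite-horizon setting.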


Artificial Intelligence as a Services (AI-aaS) on Software-Defined Infrastructure

arXiv.org Artificial Intelligence

This paper investigates a paradigm for offering artificial intelligence as a service (AI-aaS) on software-defined infrastructures (SDIs). The increasing complexity of networking and computing infrastructures is already driving the introduction of automation in networking and cloud computing management systems. Here we consider how these automation mechanisms can be leveraged to offer AI-aaS. Use cases for AI-aaS are easily found in addressing smart applications in sectors such as transportation, manufacturing, energy, water, air quality, and emissions. We propose an architectural scheme based on SDIs where each AI-aaS application comprises a monitoring, analysis, policy, execution plus knowledge (MAPE-K) loop (MKL). Each application is composed as one or more specific service chains embedded in SDI, some of which will include a Machine Learning (ML) pipeline. Our model includes a new training plane and an AI-aaS plane to deal with the model-development and operational phases of AI applications. We also consider the role of an ML/MKL sandbox in ensuring coherency and consistency in the operation of multiple parallel MKL loops. We present experimental measurement results for three AI-aaS applications deployed on the SAVI testbed: (1) compressing monitored data in SDI using autoencoders; (2) traffic monitoring to allocate CPU resources to VNFs; and (3) highway segment classification in smart transportation.
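
As a rough illustration of the MAPE-K loop structure described above, the sketch below runs monitoring, analysis, policy, and execution steps over a shared knowledge store to scale the CPUs of a single VNF. Everything in it is hypothetical: the function names, the `knowledge` dictionary keys, the thresholds, and the moving-average analysis are assumptions for illustration, not the paper's architecture or the SAVI testbed API.

```python
def monitor(knowledge, load_sample):
    """Monitoring: record the latest load measurement (0.0 to 1.0)."""
    knowledge["history"].append(load_sample)


def analyze(knowledge, window=3):
    """Analysis: smooth the recent measurements with a moving average."""
    recent = knowledge["history"][-window:]
    return sum(recent) / len(recent)


def policy(knowledge, load):
    """Policy: decide how many CPUs to add or remove (threshold rule)."""
    if load > knowledge["high"]:
        return 1
    if load < knowledge["low"]:
        return -1
    return 0


def execute(knowledge, delta):
    """Execution: apply the decision, clamped to the allowed CPU range."""
    cpus = knowledge["cpus"] + delta
    knowledge["cpus"] = max(knowledge["min_cpus"],
                            min(knowledge["max_cpus"], cpus))


def mape_k_step(knowledge, load_sample):
    monitor(knowledge, load_sample)
    execute(knowledge, policy(knowledge, analyze(knowledge)))
    return knowledge["cpus"]


knowledge = {"history": [], "cpus": 2, "min_cpus": 1, "max_cpus": 4,
             "high": 0.8, "low": 0.3}
for sample in [0.9, 0.95, 0.9, 0.1, 0.1, 0.1]:
    mape_k_step(knowledge, sample)
```

In an ML-backed variant, the `analyze` step is where a trained model would replace the moving average.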


DeepXDE: A deep learning library for solving differential equations

arXiv.org Machine Learning

Deep learning has achieved remarkable success in diverse applications; however, its use in solving partial differential equations (PDEs) has emerged only recently. Here, we present an overview of physics-informed neural networks (PINNs), which embed a PDE into the loss of the neural network using automatic differentiation. The PINN algorithm is simple, and it can be applied to different types of PDEs, including integro-differential equations, fractional PDEs, and stochastic PDEs. Moreover, PINNs solve inverse problems as easily as forward problems. We propose a new residual-based adaptive refinement (RAR) method to improve the training efficiency of PINNs. For pedagogical reasons, we compare the PINN algorithm to a standard finite element method. We also present a Python library for PINNs, DeepXDE, which is designed to serve both as an education tool to be used in the classroom as well as a research tool for solving problems in computational science and engineering. DeepXDE supports complex-geometry domains based on the technique of constructive solid geometry, and enables the user code to be compact, resembling closely the mathematical formulation. We introduce the usage of DeepXDE and its customizability, and we also demonstrate the capability of PINNs and the user-friendliness of DeepXDE for five different examples. More broadly, DeepXDE contributes to the more rapid development of the emerging Scientific Machine Learning field.
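
The point-selection step behind residual-based adaptive refinement can be sketched as follows: estimate the mean residual on random candidate points and, if it is too large, add the candidates with the largest residuals to the training set. This is a minimal illustration under stated assumptions, not DeepXDE code: `toy_residual` is a hypothetical stand-in for the PDE residual of a trained network, and the function names, the one-dimensional domain [0, 1], and the default parameters are all invented for the example.

```python
import math
import random


def toy_residual(x):
    """Hypothetical stand-in for a trained network's PDE residual at x."""
    return math.exp(-50.0 * (x - 0.7) ** 2)


def rar_step(train_points, residual_fn, n_candidates=1000, m=5,
             tol=1e-3, seed=0):
    """Return (points, mean_residual, converged) after one refinement step."""
    rng = random.Random(seed)
    candidates = [rng.random() for _ in range(n_candidates)]
    errors = [abs(residual_fn(x)) for x in candidates]
    mean_residual = sum(errors) / len(errors)  # Monte Carlo estimate
    if mean_residual < tol:
        return train_points, mean_residual, True
    # Add the m candidates where the residual is largest.
    ranked = sorted(zip(errors, candidates), reverse=True)
    new_points = [x for _, x in ranked[:m]]
    return train_points + new_points, mean_residual, False


points, mean_residual, converged = rar_step([0.0, 0.5, 1.0], toy_residual)
```

In a full training loop the network would be retrained on the enlarged point set after each step, and the step repeated until the mean residual falls below the tolerance.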


Improving the Performance of the LSTM and HMM Models via Hybridization

arXiv.org Machine Learning

Language modelling has been an integral part of understanding the nature of language and capturing its meaning. In order to improve the machine understanding of language using sequential models, we seek to explore two prominent areas of statistical language models: the Hidden Markov Model (HMM), and a Recurrent Neural Network (RNN) architecture known commonly as Long Short-Term Memory (LSTM). Under a discrete stochastic modelling framework, HMMs were first introduced in Rabiner [1] to classify speech signals. First used to automate AT&T's voice-activated call center, the revolutionary technology allowed computers to robustly characterise speech and form a basic understanding of spoken words. HMMs have since become a definitive benchmark for the state of the art in speech and text recognition. Around the same period, RNNs were introduced by Rumelhart et al. [2]; however, the training complexity of the model was far too high and not commensurate with the hardware capabilities at the time. In the 21st century, the introduction of more advanced hardware for deep learning model training brought a wave of applications of the RNN to voice and text recognition [3], [4], [5] and machine translation [6]. In parallel, an early form of neural language model was developed in Bengio et al. [7], displaying promising results in statistical language modelling. LSTMs were first introduced in Hochreiter and Schmidhuber [8], specifically to combat the vanishing gradient problem, which will be further addressed in Section 1.2.


Goal Recognition Design in Deterministic Environments

Journal of Artificial Intelligence Research

Goal recognition design (GRD) facilitates understanding the goals of acting agents through the analysis and redesign of goal recognition models, thus offering a solution for assessing and minimizing the maximal progress of any agent in the model before goal recognition is guaranteed. In a nutshell, given a model of a domain and a set of possible goals, a solution to a GRD problem determines (1) the extent to which actions performed by an agent within the model reveal the agent's objective; and (2) how best to modify the model so that the objective of an agent can be detected as early as possible. This approach is relevant to any domain in which rapid goal recognition is essential and the model design can be controlled. Applications include intrusion detection, assisted cognition, computer games, and human-robot collaboration. A GRD problem has two components: the analyzed goal recognition setting, and a design model specifying the possible ways the environment in which agents act can be modified so as to facilitate recognition. This work formulates a general framework for GRD in deterministic and partially observable environments, and offers a toolbox of solutions for evaluating and optimizing model quality for various settings. For the purpose of evaluation we suggest the worst case distinctiveness (WCD) measure, which represents the maximal cost of a path an agent may follow before its goal can be inferred by a goal recognition system. We offer novel compilations to classical planning for calculating WCD in settings where agents are bounded-suboptimal. We then suggest methods for minimizing WCD by searching for an optimal redesign strategy within the space of possible modifications, and using pruning to increase efficiency. We support our approach with an empirical evaluation that measures WCD in a variety of GRD settings and tests the efficiency of our compilation-based methods for computing it. We also examine the effectiveness of reducing WCD via redesign and the performance gain brought about by our proposed pruning strategy.
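
For intuition about the WCD measure, the sketch below computes it in the simplest setting only: a fully observable, unit-cost directed graph with agents that follow optimal (shortest) paths. Under those assumptions, WCD is the largest distance from the start at which a node still lies on shortest paths to at least two goals. This is a brute-force illustration, not the compilation to classical planning used in the paper, and the graph encoding and function names are assumptions.

```python
from collections import deque


def bfs_dist(graph, source):
    """Unit-cost shortest distances from `source` in an adjacency dict."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def wcd(graph, start, goals):
    """Worst case distinctiveness for optimal agents on a unit-cost graph."""
    from_start = bfs_dist(graph, start)
    reverse = {}
    for node, successors in graph.items():
        for nxt in successors:
            reverse.setdefault(nxt, []).append(node)
    to_goal = {g: bfs_dist(reverse, g) for g in goals}
    worst = 0
    for node, d in from_start.items():
        # Count the goals whose optimal paths can pass through this node.
        shared = sum(
            1 for g in goals
            if g in from_start and node in to_goal[g]
            and d + to_goal[g][node] == from_start[g]
        )
        if shared >= 2:
            worst = max(worst, d)
    return worst


original = {"S": ["A", "C"], "A": ["G1", "G2"], "C": ["G2"]}
redesigned = {"S": ["A", "C"], "A": ["G1"], "C": ["G2"]}  # edge A -> G2 removed
```

In this toy model, an agent standing at `A` in `original` could still be heading to either goal, so its first step is not distinctive. Removing the edge from `A` to `G2` is a redesign that makes the very first step reveal the goal.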


Procedural Content Generation through Quality Diversity

arXiv.org Artificial Intelligence

1984) in the 1980s, certain genres of digital games have relied on algorithmic processes to generate content such as levels, weapons, personalities, quests, etc. Throughout its long history, procedural content generation (PCG) has aimed to provide content that is playable, of a high quality, and yet different from other content that came before or after. On the one hand, most game content need to satisfy certain minimal criteria on playability (such as the exit in a dungeon being reachable by the player) while they also need to be entertaining and challenging (which are softer and often subjective quality We propose, therefore, procedural content generation through quality-diversity (PCG-QD) as a subset of search-based procedural content generation [4] which is perfectly suited for generating content autonomously (as it can produce a large set of diverse and high-quality artifacts in one run, even in search spaces which are not well-defined) or with a human designer (as it can explain and express its artifacts' desirable properties). This paper presents the components of quality-diversity algorithms, identifies the strengths of PCG-QD over popular alternatives, surveys recent work in this vein and attempts to map out the road ahead.
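
To show how a quality-diversity search yields a large set of diverse, high-quality artifacts in one run, here is a minimal sketch in the style of MAP-Elites, one well-known quality-diversity algorithm. The toy "level" encoding (a list of floor and wall tiles), the fitness function, and the behaviour descriptor are all invented for illustration and are not taken from the paper.

```python
import random


def evaluate(level):
    """Score a toy level given as a list of 0 (floor) and 1 (wall) tiles.

    Returns (fitness, cell): fitness is the longest run of floor tiles,
    a stand-in for quality; cell is the wall count, a stand-in for a
    behaviour descriptor that spreads levels across the archive.
    """
    longest = run = 0
    for tile in level:
        run = run + 1 if tile == 0 else 0
        longest = max(longest, run)
    return longest, sum(level)


def map_elites(length=12, iterations=2000, seed=0):
    rng = random.Random(seed)
    archive = {}  # cell -> (fitness, level); one elite per cell
    for _ in range(iterations):
        if archive and rng.random() < 0.9:
            # Mutate a randomly chosen elite by flipping one tile.
            _, parent = rng.choice(list(archive.values()))
            child = parent[:]
            i = rng.randrange(length)
            child[i] = 1 - child[i]
        else:
            child = [rng.randint(0, 1) for _ in range(length)]
        fitness, cell = evaluate(child)
        # Keep the child only if its cell is empty or it beats the elite.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, child)
    return archive


archive = map_elites()
```

The result is not one best level but an archive holding the best level found for each wall count, which a designer could browse or filter.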


Computer-Aided Data Mining: Automating a Novel Knowledge Discovery and Data Mining Process Model for Metabolomics

arXiv.org Machine Learning

This work presents MeKDDaM-SAGA, computer-aided automation software for implementing a novel knowledge discovery and data mining process model that was designed for performing justifiable, traceable and reproducible metabolomics data analysis. The process model focuses on achieving metabolomics analytical objectives and on considering the nature of its involved data. MeKDDaM-SAGA was successfully used for guiding the process model execution in a number of metabolomics applications. It satisfies the requirements of the proposed process model design and execution. The software realises the process model layout, structure and flow and it enables its execution externally using various data mining and machine learning tools or internally using a number of embedded facilities that were built for performing a number of automated activities such as data preprocessing, data exploration, data acclimatization, modelling, evaluation and visualization. MeKDDaM-SAGA was developed using object-oriented software engineering methodology and was constructed in Java. It consists of 241 design classes that were designed to implement 27 use-cases. The software uses an XML database to guarantee portability and uses a GUI interface to ensure its user-friendliness. It implements an internal embedded version control system that is used to realise and manage the process flow, feedback and iterations and to enable undoing and redoing the execution of the process phases, activities, and the internal tasks within its phases.


Implementation of batched Sinkhorn iterations for entropy-regularized Wasserstein loss

arXiv.org Machine Learning

In this report, we review the calculation of entropy-regularised Wasserstein loss introduced by Cuturi and document a practical implementation in PyTorch. Recently the Wasserstein distance has seen new applications in machine learning and deep learning. It commonly replaces the Kullback-Leibler divergence (also often dubbed cross-entropy loss in the Deep Learning context). In contrast to the latter, Wasserstein distances consider not only the values of the probability distribution or density at any given point, but also incorporate spatial information about these differences in terms of the underlying metric. Intuitively, it yields a smaller distance if probability mass moved to a nearby point or region and a larger distance if probability mass moved far away.
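
The intuition above can be checked numerically with a minimal, unbatched sketch of Sinkhorn iterations for two histograms. It is written in plain Python rather than PyTorch so that it is self-contained, and it is not the report's implementation: the function name, the default parameters, and the choice to return the transport cost of the regularised plan are assumptions. It also works in the probability domain, so very small `eps` values may underflow.

```python
import math


def sinkhorn_loss(a, b, cost, eps=0.5, n_iters=500):
    """Entropy-regularised transport cost between histograms a and b.

    cost[i][j] is the ground cost of moving mass from bin i to bin j.
    Returns (loss, plan) where plan is the regularised transport plan.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    loss = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return loss, plan


# Three bins on a line at positions 0, 1, 2 with squared-distance cost.
positions = [0.0, 1.0, 2.0]
cost = [[(x - y) ** 2 for y in positions] for x in positions]
a = [0.6, 0.3, 0.1]
near, _ = sinkhorn_loss(a, [0.5, 0.3, 0.2], cost)  # mass shifted slightly
far, _ = sinkhorn_loss(a, [0.1, 0.3, 0.6], cost)   # mass shifted far away
```

Here `near` comes out smaller than `far`, matching the description: moving a little mass to a neighbouring bin is cheaper than moving most of it across the domain.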