Goto

Collaborating Authors

 Undirected Networks


Optimal quantum mixing for slowly evolving sequences of Markov chains

arXiv.org Artificial Intelligence

Quantum walks are the quantum counterpart of classical random walks and, correspondingly, many classical algorithms which are based upon classical random walks naturally lend themselves to be quantized. This quantization, either in the form of coined quantum walks [1-4] or using the so called Szegedy walk operator [5], allows to exploit coherence and entanglement to speedup many computational tasks, with improvements with respect to classical algorithms that are typically quadratic or, in some special circumstances, exponential [6, 7]. Henceforth quantum walks have been employed in a variety of quantum algorithms with applications ranging from statistical physics [8, 9], to combinatorial optimization problems [10], to machine learning [11]. The primary motivation for the quesiton we study in this work is the theory of quantum walks originated from their applicability to quantum machine learning, as demonstrated by the results of [11]. Therein it was shown that it is possible to achieve quadratic speedups in the deliberation time of a learning agent by exploitation of Szegedy quantum walks in conjunction with amplitude amplification. However, together with the certifiable speedups, quantum algorithms usually come together with a plethora of caveats and constraints that can impair their general applicability. In particular, the problem that spurred the present investigation is the following: the algorithm presented in [11] required that a quantum state encoding the stationary distribution is provided at the beginning of the algorithm. While the original paper showed how this assumption can be justified under certain assumptions on the nature of the learning algorithm, the question of other scenarios which provide means to generate such initial states efficiently remained open.


Boltzmann Encoded Adversarial Machines

arXiv.org Machine Learning

Restricted Boltzmann Machines (RBMs) are a class of generative neural network that are typically trained to maximize a log-likelihood objective function. We argue that likelihood-based training strategies may fail because the objective does not sufficiently penalize models that place a high probability in regions where the training data distribution has low probability. To overcome this problem, we introduce Boltzmann Encoded Adversarial Machines (BEAMs). A BEAM is an RBM trained against an adversary that uses the hidden layer activations of the RBM to discriminate between the training data and the probability distribution generated by the model. We present experiments demonstrating that BEAMs outperform RBMs and GANs on multiple benchmarks.


Spatio-Temporal Neural Networks for Space-Time Series Forecasting and Relations Discovery

arXiv.org Machine Learning

We introduce a dynamical spatio-temporal model formalized as a recurrent neural network for forecasting time series of spatial processes, i.e. series of observations sharing temporal and spatial dependencies. The model learns these dependencies through a structured latent dynamical component, while a decoder predicts the observations from the latent representations. We consider several variants of this model, corresponding to different prior hypothesis about the spatial relations between the series. The model is evaluated and compared to state-of-the-art baselines, on a variety of forecasting problems representative of different application areas: epidemiology, geo-spatial statistics and car-traffic prediction. Besides these evaluations, we also describe experiments showing the ability of this approach to extract relevant spatial relations.


Discovery of Driving Patterns by Trajectory Segmentation

arXiv.org Artificial Intelligence

Telematics data is becoming increasingly available due to the ubiquity of devices that collect data during drives, for different purposes, such as usage based insurance (UBI), fleet management, navigation of connected vehicles, etc. Consequently, a variety of data-analytic applications have become feasible that extract valuable insights from the data. In this paper, we address the especially challenging problem of discovering behavior-based driving patterns from only externally observable phenomena (e.g. vehicle's speed). We present a trajectory segmentation approach capable of discovering driving patterns as separate segments, based on the behavior of drivers. This segmentation approach includes a novel transformation of trajectories along with a dynamic programming approach for segmentation. We apply the segmentation approach on a real-word, rich dataset of personal car trajectories provided by a major insurance company based in Columbus, Ohio. Analysis and preliminary results show the applicability of approach for finding significant driving patterns.


Towards Symbolic Reinforcement Learning with Common Sense

arXiv.org Artificial Intelligence

Deep Reinforcement Learning (deep RL) has made several breakthroughs in recent years in applications ranging from complex control tasks in unmanned vehicles to game playing. Despite their success, deep RL still lacks several important capacities of human intelligence, such as transfer learning, abstraction and interpretability. Deep Symbolic Reinforcement Learning (DSRL) seeks to incorporate such capacities to deep Q-networks (DQN) by learning a relevant symbolic representation prior to using Q-learning. In this paper, we propose a novel extension of DSRL, which we call Symbolic Reinforcement Learning with Common Sense (SRL CS), offering a better balance between generalization and specialization, inspired by principles of common sense when assigning rewards and aggregating Q-values. Experiments reported in this paper show that SRL CS learns consistently faster than Q-learning and DSRL, achieving also a higher accuracy. In the hardest case, where agents were trained in a deterministic environment and tested in a random environment, SRL CS achieves nearly 100% average accuracy compared to DSRL's 70% and DQN's 50% accuracy. To the best of our knowledge, this is the first case of near perfect zero-shot transfer learning using Reinforcement Learning.


State-Space Abstractions for Probabilistic Inference: A Systematic Review

arXiv.org Artificial Intelligence

Tasks such as social network analysis, human behavior recognition, or modeling biochemical reactions, can be solved elegantly by using the probabilistic inference framework. However, standard probabilistic inference algorithms work at a propositional level, and thus cannot capture the symmetries and redundancies that are present in these tasks. Algorithms that exploit those symmetries have been devised in different research fields, for example by the lifted inference-, multiple object tracking-, and modeling and simulation-communities. The common idea, that we call state space abstraction, is to perform inference over compact representations of sets of symmetric states. Although they are concerned with a similar topic, the relationship between these approaches has not been investigated systematically. This survey provides the following contributions. We perform a systematic literature review to outline the state of the art in probabilistic inference methods exploiting symmetries. From an initial set of more than 4,000 papers, we identify 116 relevant papers. Furthermore, we provide new high-level categories that classify the approaches, based on the problem classes the different approaches can solve. Researchers from different fields that are confronted with a state space explosion problem in a probabilistic system can use this classification to identify possible solutions. Finally, based on this conceptualization, we identify potentials for future research, as some relevant application domains are not addressed by current approaches.


Entropy production rate as a criterion for inconsistency in decision theory

arXiv.org Artificial Intelligence

Individual and group decisions are complex, often involving choosing an apt alternative from a multitude of options. Evaluating pairwise comparisons breaks down such complex decision problems into tractable ones. Pairwise comparison matrices (PCMs) are regularly used to solve multiple-criteria decision-making (MCDM) problems, for example, using Saaty's analytic hierarchy process (AHP) framework. However, there are two significant drawbacks of using PCMs. First, humans evaluate PCMs in an inconsistent manner. Second, not all entries of a large PCM can be reliably filled by human decision makers. We address these two issues by first establishing a novel connection between PCMs and time-irreversible Markov processes. Specifically, we show that every PCM induces a family of dissipative maximum path entropy random walks (MERW) over the set of alternatives. We show that only `consistent' PCMs correspond to detailed balanced MERWs. We identify the non-equilibrium entropy production in the induced MERWs as a metric of inconsistency of the underlying PCMs. Notably, the entropy production satisfies all of the recently laid out criteria for reasonable consistency indices. We also propose an approach to use incompletely filled PCMs in AHP. Potential future avenues are discussed as well. keywords: analytic hierarchy process, markov chains, maximum entropy


Deep Learning in Spiking Neural Networks

arXiv.org Artificial Intelligence

Deep learning approaches have shown remarkable performance in many areas of pattern recognition recently. In spite of their power in hierarchical feature extraction and classification, this type of neural network is computationally expensive and difficult to implement on hardware for portable devices. In an other vein of research on neural network architectures, spiking neural networks (SNNs) have been described as power-efficient models because of their sparse, spike-based communication framework. SNNs are brain-inspired such that they seek to mimic the accurate and efficient functionality of the brain. Recent studies try to take advantages of the both frameworks (deep learning and SNNs) to develop a deep architecture of SNNs to achieve high performance of recently proved deep networks while implementing bio-inspired, power-efficient platforms. Additionally, As the brain process different stimuli patterns through multi-layer SNNs that are communicating by spike trains via adaptive synapses, developing artificial deep SNNs can also be very helpful for understudying the computations done by biological neural circuits. Having both computational and experimental backgrounds, we are interested in including a comprehensive summary of recent advances in developing deep SNNs that may assist computer scientists interested in developing more advanced and efficient networks and help experimentalists to frame new hypotheses for neural information processing in the brain using a more realistic model.


MQGrad: Reinforcement Learning of Gradient Quantization in Parameter Server

arXiv.org Machine Learning

One of the most significant bottleneck in training large scale machine learning models on parameter server (PS) is the communication overhead, because it needs to frequently exchange the model gradients between the workers and servers during the training iterations. Gradient quantization has been proposed as an effective approach to reducing the communication volume. One key issue in gradient quantization is setting the number of bits for quantizing the gradients. Small number of bits can significantly reduce the communication overhead while hurts the gradient accuracies, and vise versa. An ideal quantization method would dynamically balance the communication overhead and model accuracy, through adjusting the number bits according to the knowledge learned from the immediate past training iterations. Existing methods, however, quantize the gradients either with fixed number of bits, or with predefined heuristic rules. In this paper we propose a novel adaptive quantization method within the framework of reinforcement learning. The method, referred to as MQGrad, formalizes the selection of quantization bits as actions in a Markov decision process (MDP) where the MDP states records the information collected from the past optimization iterations (e.g., the sequence of the loss function values). During the training iterations of a machine learning algorithm, MQGrad continuously updates the MDP state according to the changes of the loss function. Based on the information, MDP learns to select the optimal actions (number of bits) to quantize the gradients. Experimental results based on a benchmark dataset showed that MQGrad can accelerate the learning of a large scale deep neural network while keeping its prediction accuracies.


Unsupervised Discrete Sentence Representation Learning for Interpretable Neural Dialog Generation

arXiv.org Artificial Intelligence

The encoder-decoder dialog model is one of the most prominent methods used to build dialog systems in complex domains. Yet it is limited because it cannot output interpretable actions as in traditional systems, which hinders humans from understanding its generation process. We present an unsupervised discrete sentence representation learning method that can integrate with any existing encoder-decoder dialog models for interpretable response generation. Building upon variational autoencoders (VAEs), we present two novel models, DI-VAE and DI-VST that improve VAEs and can discover interpretable semantics via either auto encoding or context predicting. Our methods have been validated on real-world dialog datasets to discover semantic representations and enhance encoder-decoder models with interpretable generation.