AITopics

1503.01334

Country:

Europe (0.46)
North America > United States (0.46)

Genre:

Research Report (0.50)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.43)

Fisher, Charles K., Smith, Aaron M., Walsh, Jonathan R.

Boltzmann Encoded Adversarial Machines

arXiv.org Machine LearningApr-23-2018

Restricted Boltzmann Machines (RBMs) are a class of generative neural network that are typically trained to maximize a log-likelihood objective function. We argue that likelihood-based training strategies may fail because the objective does not sufficiently penalize models that place a high probability in regions where the training data distribution has low probability. To overcome this problem, we introduce Boltzmann Encoded Adversarial Machines (BEAMs). A BEAM is an RBM trained against an adversary that uses the hidden layer activations of the RBM to discriminate between the training data and the probability distribution generated by the model. We present experiments demonstrating that BEAMs outperform RBMs and GANs on multiple benchmarks.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Machine Learning

1804.08682

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Ziat, Ali, Delasalles, Edouard, Denoyer, Ludovic, Gallinari, Patrick

Spatio-Temporal Neural Networks for Space-Time Series Forecasting and Relations Discovery

arXiv.org Machine LearningApr-23-2018

We introduce a dynamical spatio-temporal model formalized as a recurrent neural network for forecasting time series of spatial processes, i.e. series of observations sharing temporal and spatial dependencies. The model learns these dependencies through a structured latent dynamical component, while a decoder predicts the observations from the latent representations. We consider several variants of this model, corresponding to different prior hypothesis about the spatial relations between the series. The model is evaluated and compared to state-of-the-art baselines, on a variety of forecasting problems representative of different application areas: epidemiology, geo-spatial statistics and car-traffic prediction. Besides these evaluations, we also describe experiments showing the ability of this approach to extract relevant spatial relations.

data mining, machine learning, relation, (19 more...)

arXiv.org Machine Learning

doi: 10.1109/ICDM.2017.80

1804.08562

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.70)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Moosavi, Sobhan, Nandi, Arnab, Ramnath, Rajiv

Discovery of Driving Patterns by Trajectory Segmentation

Telematics data is becoming increasingly available due to the ubiquity of devices that collect data during drives, for different purposes, such as usage based insurance (UBI), fleet management, navigation of connected vehicles, etc. Consequently, a variety of data-analytic applications have become feasible that extract valuable insights from the data. In this paper, we address the especially challenging problem of discovering behavior-based driving patterns from only externally observable phenomena (e.g. vehicle's speed). We present a trajectory segmentation approach capable of discovering driving patterns as separate segments, based on the behavior of drivers. This segmentation approach includes a novel transformation of trajectories along with a dynamic programming approach for segmentation. We apply the segmentation approach on a real-word, rich dataset of personal car trajectories provided by a major insurance company based in Columbus, Ohio. Analysis and preliminary results show the applicability of approach for finding significant driving patterns.

artificial intelligence, machine learning, trajectory, (15 more...)

1804.08748

Country: North America > United States > Ohio > Franklin County > Columbus (0.24)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)

Garcez, Artur d'Avila, Dutra, Aimore Resende Riquetti, Alonso, Eduardo

Towards Symbolic Reinforcement Learning with Common Sense

Deep Reinforcement Learning (deep RL) has made several breakthroughs in recent years in applications ranging from complex control tasks in unmanned vehicles to game playing. Despite their success, deep RL still lacks several important capacities of human intelligence, such as transfer learning, abstraction and interpretability. Deep Symbolic Reinforcement Learning (DSRL) seeks to incorporate such capacities to deep Q-networks (DQN) by learning a relevant symbolic representation prior to using Q-learning. In this paper, we propose a novel extension of DSRL, which we call Symbolic Reinforcement Learning with Common Sense (SRL CS), offering a better balance between generalization and specialization, inspired by principles of common sense when assigning rewards and aggregating Q-values. Experiments reported in this paper show that SRL CS learns consistently faster than Q-learning and DSRL, achieving also a higher accuracy. In the hardest case, where agents were trained in a deterministic environment and tested in a random environment, SRL CS achieves nearly 100% average accuracy compared to DSRL's 70% and DQN's 50% accuracy. To the best of our knowledge, this is the first case of near perfect zero-shot transfer learning using Reinforcement Learning.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1804.08597

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Lüdtke, Stefan, Schröder, Max, Krüger, Frank, Bader, Sebastian, Kirste, Thomas

State-Space Abstractions for Probabilistic Inference: A Systematic Review

Tasks such as social network analysis, human behavior recognition, or modeling biochemical reactions, can be solved elegantly by using the probabilistic inference framework. However, standard probabilistic inference algorithms work at a propositional level, and thus cannot capture the symmetries and redundancies that are present in these tasks. Algorithms that exploit those symmetries have been devised in different research fields, for example by the lifted inference-, multiple object tracking-, and modeling and simulation-communities. The common idea, that we call state space abstraction, is to perform inference over compact representations of sets of symmetric states. Although they are concerned with a similar topic, the relationship between these approaches has not been investigated systematically. This survey provides the following contributions. We perform a systematic literature review to outline the state of the art in probabilistic inference methods exploiting symmetries. From an initial set of more than 4,000 papers, we identify 116 relevant papers. Furthermore, we provide new high-level categories that classify the approaches, based on the problem classes the different approaches can solve. Researchers from different fields that are confronted with a state space explosion problem in a probabilistic system can use this classification to identify possible solutions. Finally, based on this conceptualization, we identify potentials for future research, as some relevant application domains are not addressed by current approaches.

artificial intelligence, bayesian inference, machine learning, (16 more...)

1804.06748

Country:

Europe (1.00)
North America > United States (0.45)
North America > Canada (0.28)

Genre:

Research Report (1.00)
Overview (1.00)
Workflow (0.92)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Entropy production rate as a criterion for inconsistency in decision theory

Dixit, Purushottam D.

Individual and group decisions are complex, often involving choosing an apt alternative from a multitude of options. Evaluating pairwise comparisons breaks down such complex decision problems into tractable ones. Pairwise comparison matrices (PCMs) are regularly used to solve multiple-criteria decision-making (MCDM) problems, for example, using Saaty's analytic hierarchy process (AHP) framework. However, there are two significant drawbacks of using PCMs. First, humans evaluate PCMs in an inconsistent manner. Second, not all entries of a large PCM can be reliably filled by human decision makers. We address these two issues by first establishing a novel connection between PCMs and time-irreversible Markov processes. Specifically, we show that every PCM induces a family of dissipative maximum path entropy random walks (MERW) over the set of alternatives. We show that only `consistent' PCMs correspond to detailed balanced MERWs. We identify the non-equilibrium entropy production in the induced MERWs as a metric of inconsistency of the underlying PCMs. Notably, the entropy production satisfies all of the recently laid out criteria for reasonable consistency indices. We also propose an approach to use incompletely filled PCMs in AHP. Potential future avenues are discussed as well. keywords: analytic hierarchy process, markov chains, maximum entropy

artificial intelligence, machine learning, pcm, (17 more...)

1801.01733

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Sports > Tennis (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.56)

Tavanaei, Amirhossein, Ghodrati, Masoud, Kheradpisheh, Saeed Reza, Masquelier, Timothee, Maida, Anthony S.

Deep Learning in Spiking Neural Networks

arXiv.org Artificial IntelligenceApr-22-2018

Deep learning approaches have shown remarkable performance in many areas of pattern recognition recently. In spite of their power in hierarchical feature extraction and classification, this type of neural network is computationally expensive and difficult to implement on hardware for portable devices. In an other vein of research on neural network architectures, spiking neural networks (SNNs) have been described as power-efficient models because of their sparse, spike-based communication framework. SNNs are brain-inspired such that they seek to mimic the accurate and efficient functionality of the brain. Recent studies try to take advantages of the both frameworks (deep learning and SNNs) to develop a deep architecture of SNNs to achieve high performance of recently proved deep networks while implementing bio-inspired, power-efficient platforms. Additionally, As the brain process different stimuli patterns through multi-layer SNNs that are communicating by spike trains via adaptive synapses, developing artificial deep SNNs can also be very helpful for understudying the computations done by biological neural circuits. Having both computational and experimental backgrounds, we are interested in including a comprehensive summary of recent advances in developing deep SNNs that may assist computer scientists interested in developing more advanced and efficient networks and help experimentalists to frame new hypotheses for neural information processing in the brain using a more realistic model.

artificial intelligence, machine learning, neural network, (16 more...)

1804.0815

Country: North America > United States (0.93)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

arXiv.org Machine LearningApr-22-2018

MQGrad: Reinforcement Learning of Gradient Quantization in Parameter Server

Cui, Guoxin, Xu, Jun, Zeng, Wei, Lan, Yanyan, Guo, Jiafeng, Cheng, Xueqi

One of the most significant bottleneck in training large scale machine learning models on parameter server (PS) is the communication overhead, because it needs to frequently exchange the model gradients between the workers and servers during the training iterations. Gradient quantization has been proposed as an effective approach to reducing the communication volume. One key issue in gradient quantization is setting the number of bits for quantizing the gradients. Small number of bits can significantly reduce the communication overhead while hurts the gradient accuracies, and vise versa. An ideal quantization method would dynamically balance the communication overhead and model accuracy, through adjusting the number bits according to the knowledge learned from the immediate past training iterations. Existing methods, however, quantize the gradients either with fixed number of bits, or with predefined heuristic rules. In this paper we propose a novel adaptive quantization method within the framework of reinforcement learning. The method, referred to as MQGrad, formalizes the selection of quantization bits as actions in a Markov decision process (MDP) where the MDP states records the information collected from the past optimization iterations (e.g., the sequence of the loss function values). During the training iterations of a machine learning algorithm, MQGrad continuously updates the MDP state according to the changes of the loss function. Based on the information, MDP learns to select the optimal actions (number of bits) to quantize the gradients. Experimental results based on a benchmark dataset showed that MQGrad can accelerate the learning of a large scale deep neural network while keeping its prediction accuracies.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Machine Learning

1804.08066

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Zhao, Tiancheng, Lee, Kyusong, Eskenazi, Maxine

Unsupervised Discrete Sentence Representation Learning for Interpretable Neural Dialog Generation

arXiv.org Artificial IntelligenceApr-22-2018

The encoder-decoder dialog model is one of the most prominent methods used to build dialog systems in complex domains. Yet it is limited because it cannot output interpretable actions as in traditional systems, which hinders humans from understanding its generation process. We present an unsupervised discrete sentence representation learning method that can integrate with any existing encoder-decoder dialog models for interpretable response generation. Building upon variational autoencoders (VAEs), we present two novel models, DI-VAE and DI-VST that improve VAEs and can discover interpretable semantics via either auto encoding or context predicting. Our methods have been validated on real-world dialog datasets to discover semantic representations and enhance encoder-decoder models with interpretable generation.

artificial intelligence, machine learning, natural language, (18 more...)

1804.08069

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)