AITopics

1809.09147

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine (0.69)
Leisure & Entertainment > Games > Computer Games (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Yurochkin, Mikhail, Fan, Zhiwei, Guha, Aritra, Koutris, Paraschos, Nguyen, XuanLong

Streaming dynamic and distributed inference of latent geometric structures

arXiv.org Machine LearningSep-23-2018

The topic or population polytope (Nguyen, 2015; Tang et al., 2014) is a fundamental geometric object that underlies the presence of latent topic variables in topic and admixture models (Blei et al., 2003; Pritchard et al., 2000). When data and the associated topics are indexed by time dimension, it is of interest to study the temporal dynamics of such latent geometric structures. In this paper, we will study the modeling and algorithms for learning the temporal dynamics of topic polytope that arises in the analysis of text corpora. The convex geometry of topic models provides the theoretical basis for posterior contraction analysis of latent topics (Nguyen, 2015; Tang et al., 2014). Furthermore, Yurochkin & Nguyen (2016); Yurochkin et al. (2017) exploited convex geometry to develop fast and quite accurate inference algorithms in a number of parametric and nonparametric settings.

data mining, machine learning, natural language, (18 more...)

1809.08738

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

arXiv.org Artificial IntelligenceSep-22-2018

Medical Knowledge Embedding Based on Recursive Neural Network for Multi-Disease Diagnosis

Jiang, Jingchi, Wang, Huanzheng, Xie, Jing, Guo, Xitong, Guan, Yi, Yu, Qiubin

The representation of knowledge based on first-order logic captures the richness of natural language and supports multiple probabilistic inference models. Although symbolic representation enables quantitative reasoning with statistical probability, it is difficult to utilize with machine learning models as they perform numerical operations. In contrast, knowledge embedding (i.e., high-dimensional and continuous vectors) is a feasible approach to complex reasoning that can not only retain the semantic information of knowledge but also establish the quantifiable relationship among them. In this paper, we propose recursive neural knowledge network (RNKN), which combines medical knowledge based on first-order logic with recursive neural network for multi-disease diagnosis. After RNKN is efficiently trained from manually annotated Chinese Electronic Medical Records (CEMRs), diagnosis-oriented knowledge embeddings and weight matrixes are learned. Experimental results verify that the diagnostic accuracy of RNKN is superior to that of some classical machine learning models and Markov logic network (MLN). The results also demonstrate that the more explicit the evidence extracted from CEMRs is, the better is the performance achieved. RNKN gradually exhibits the interpretation of knowledge embeddings as the number of training epochs increases.

artificial intelligence, machine learning, natural language, (21 more...)

1809.08422

Country: North America > United States > California (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(3 more...)

arXiv.org Artificial IntelligenceSep-21-2018

Finite Sample Analysis of the GTD Policy Evaluation Algorithms in Markov Setting

Wang, Yue, Chen, Wei, Liu, Yuting, Ma, Zhi-Ming, Liu, Tie-Yan

In reinforcement learning (RL) , one of the key components is policy evaluation, which aims to estimate the value function (i.e., expected long-term accumulated reward) of a policy. With a good policy evaluation method, the RL algorithms will estimate the value function more accurately and find a better policy. When the state space is large or continuous \emph{Gradient-based Temporal Difference(GTD)} policy evaluation algorithms with linear function approximation are widely used. Considering that the collection of the evaluation data is both time and reward consuming, a clear understanding of the finite sample performance of the policy evaluation algorithms is very important to reinforcement learning. Under the assumption that data are i.i.d. generated, previous work provided the finite sample analysis of the GTD algorithms with constant step size by converting them into convex-concave saddle point problems. However, it is well-known that, the data are generated from Markov processes rather than i.i.d. in RL problems.. In this paper, in the realistic Markov setting, we derive the finite sample bounds for the general convex-concave saddle point problems, and hence for the GTD algorithms. We have the following discussions based on our bounds. (1) With variants of step size, GTD algorithms converge. (2) The convergence rate is determined by the step size, with the mixing time of the Markov process as the coefficient. The faster the Markov processes mix, the faster the convergence. (3) We explain that the experience replay trick is effective by improving the mixing property of the Markov process. To the best of our knowledge, our analysis is the first to provide finite sample bounds for the GTD algorithms in Markov setting.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1809.08926

Country:

Asia (0.46)
North America (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Hasanbeig, Mohammadhosein, Abate, Alessandro, Kroening, Daniel

Logically-Constrained Neural Fitted Q-Iteration

arXiv.org Machine LearningSep-20-2018

This paper proposes a method for efficient training of the Q-function for continuous-state Markov Decision Processes (MDP), such that the traces of the resulting policies satisfy a Linear Temporal Logic (LTL) property. The logical property is converted into a limit deterministic Buchi automaton with which a product MDP is constructed. The control policy is then synthesized by a reinforcement learning algorithm assuming that no prior knowledge is available from the MDP. The proposed method is evaluated in a numerical study to test the quality of the generated control policy and is compared against conventional methods for policy synthesis such as MDP abstraction (Voronoi quantizer) and approximate dynamic programming (fitted value iteration).

algorithm, mdp, state space, (14 more...)

1809.07823

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Arizona (0.04)

Genre: Research Report (0.82)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Davis, Trevor, Waugh, Kevin, Bowling, Michael

Solving Large Extensive-Form Games with Strategy Constraints

arXiv.org Artificial IntelligenceSep-20-2018

Extensive-form games are a common model for multiagent interactions with imperfect information. In two-player zero-sum games, the typical solution concept is a Nash equilibrium over the unconstrained strategy set for each player. In many situations, however, we would like to constrain the set of possible strategies. For example, constraints are a natural way to model limited resources, risk mitigation, safety, consistency with past observations of behavior, or other secondary objectives for an agent. In small games, optimal strategies under linear constraints can be found by solving a linear program; however, state-of-the-art algorithms for solving large games cannot handle general constraints. In this work we introduce a generalized form of Counterfactual Regret Minimization that provably finds optimal strategies under any feasible set of convex constraints. We demonstrate the effectiveness of our algorithm for finding strategies that mitigate risk in security games, and for opponent modeling in poker games when given only partial observations of private information.

artificial intelligence, constraint, machine learning, (18 more...)

1809.07893

Country: North America (0.28)

Genre:

Research Report (0.50)
Workflow (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.55)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Butz, Martin V., Bilkey, David, Humaidan, Dania, Knott, Alistair, Otte, Sebastian

Learning, Planning, and Control in a Monolithic Neural Event Inference Architecture

arXiv.org Artificial IntelligenceSep-19-2018

We introduce a dynamic artificial neural network-based (ANN) adaptive inference process, which learns temporal predictive models of dynamical systems. We term the process REPRISE, a REtrospective and PRospective Inference SchEme. REPRISE infers the unobservable contextual state that best explains its recently encountered sensorimotor experiences as well as accompanying, context-dependent temporal predictive models retrospectively. Meanwhile, it executes prospective inference, optimizing upcoming motor activities in a goal-directed manner. In a first implementation, a recurrent neural network (RNN) is trained to learn a temporal forward model, which predicts the sensorimotor contingencies of different simulated dynamic vehicles. The RNN is augmented with contextual neurons, which enable the compact encoding of distinct, but related sensorimotor dynamics. We show that REPRISE is able to concurrently learn to separate and approximate the encountered sensorimotor dynamics. Moreover, we show that REPRISE can exploit the learned model to induce goal-directed, model-predictive control, that is, approximate active inference: Given a goal state, the system imagines a motor command sequence optimizing it with the prospective objective to minimize the distance to a given goal. Meanwhile, the system evaluates the encountered sensorimotor contingencies retrospectively, adapting its neural hidden states for maintaining model coherence. The RNN activities thus continuously imagine the upcoming future and reflect on the recent past, optimizing both, hidden state and motor activities. In conclusion, the combination of temporal predictive structures with modulatory, generative encodings offers a way to develop compact event codes, which selectively activate particular types of sensorimotor event-specific dynamics.

deep learning, upstream oil & gas, vehicle, (23 more...)

1809.07412

Country:

Europe > Germany (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.15)
Oceania > New Zealand (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.46)
Leisure & Entertainment > Games (0.46)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Dornheim, Johannes, Link, Norbert, Gumbsch, Peter

Model-Free Adaptive Optimal Control of Sequential Manufacturing Processes using Reinforcement Learning

arXiv.org Artificial IntelligenceSep-18-2018

A self-learning optimal control algorithm for sequential manufacturing processes with time-discrete control actions is proposed and evaluated with simulated deep drawing processes. The necessary control model is built during consecutive process executions under optimal control via Reinforcement Learning, using the measured product quality as reward after each process execution. Prior model formation, which is required by state-of-the-art algorithms like Model Predictive Control and Approximate Dynamic Programming, is therefore obsolete. This avoids the difficulties in system identification and accurate modelling, which arise with processes subject to non-linear dynamics and stochastic influences. Also runtime complexity problems of these approaches are avoided, which arise when more complex models and larger control prediction horizons are employed. Instead of using pre-created process- and observation-models, Reinforcement Learning algorithms build functions of expected future reward during processing, which are then used for optimal process control decisions. The learning of such expectation functions is realized online by interacting with the process. The proposed algorithm also takes stochastic variations of the process conditions into consideration and is able to cope with partial observability. A method for the adaptive optimal control of partially observable fixed-horizon manufacturing processes, based on Q-learning is developed and studied. The resulting algorithm is instantiated and then evaluated by application to a time-stochastic optimal control problem in metal sheet deep drawing, where the experiments use FEM-simulated processes. The Reinforcement Learning based control shows superior results over the model-based Model Predictive Control and Approximate Dynamic Programming approaches.

deep learning, optimal control, upstream oil & gas, (18 more...)

1809.06646

Country: Europe > Germany (0.28)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Vasishta, Pavan, Vaufreydaz, Dominique, Spalanzani, Anne

Building Prior Knowledge: A Markov Based Pedestrian Prediction Model Using Urban Environmental Data

arXiv.org Machine LearningSep-17-2018

Autonomous Vehicles navigating in urban areas have a need to understand and predict future pedestrian behavior for safer navigation. This high level of situational awareness requires observing pedestrian behavior and extrapolating their positions to know future positions. While some work has been done in this field using Hidden Markov Models (HMMs), one of the few observed drawbacks of the method is the need for informed priors for learning behavior. In this work, an extension to the Growing Hidden Markov Model (GHMM) method is proposed to solve some of these drawbacks. This is achieved by building on existing work using potential cost maps and the principle of Natural Vision. As a consequence, the proposed model is able to predict pedestrian positions more precisely over a longer horizon compared to the state of the art. The method is tested over "legal" and "illegal" behavior of pedestrians, having trained the model with sparse observations and partial trajectories. The method, with no training data, is compared against a trained state of the art model. It is observed that the proposed method is robust even in new, previously unseen areas.

artificial intelligence, machine learning, trajectory, (15 more...)

1809.06045

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
Europe > Switzerland (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Transportation (0.69)
Government > Military (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Markovic, Romana, Frisch, Jérôme, van Treeck, Christoph

Learning short-term past as predictor of human behavior in commercial buildings

arXiv.org Machine LearningSep-17-2018

This paper addresses the question of identifying the time-window in short-term past from which the information regarding the future occupant's window opening actions and resulting window states in buildings can be predicted. The addressed sequence duration was in the range between 30 and 240 time-steps of indoor climate data, where the applied temporal discretization was one minute. For that purpose, a deep neural network is trained to predict the window states, where the input sequence duration is handled as an additional hyperparameter. Eventually, the relationship between the prediction accuracy and the time-lag of the predicted window state in future is analyzed. The results pointed out, that the optimal predictive performance was achieved for the case where 60 time-steps of the indoor climate data were used as input. Additionally, the results showed that very long sequences (120-240 time-steps) could be addressed efficiently, given the right hyperprameters. Hence, the use of the memory over previous hours of high-resolution indoor climate data did not improve the predictive performance, when compared to the case where 30/60 minutes indoor sequences were used. The analysis of the prediction accuracy in the form of F1 score for the different time-lag of future window states dropped from 0.51 to 0.27, when shifting the prediction target from 10 to 60 minutes in future.

artificial intelligence, machine learning, neural network, (19 more...)

1809.1002

Country:

Europe > Germany (0.28)
North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Energy (1.00)
Banking & Finance > Real Estate (0.50)
Construction & Engineering > HVAC (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)