AITopics

1909.0595

Country: Asia (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)

Hausknecht, Matthew, Ammanabrolu, Prithviraj, Côté, Marc-Alexandre, Yuan, Xingdi

Interactive Fiction Games: A Colossal Adventure

arXiv.org Artificial Intelligence Sep-11-2019

A hallmark of human intelligence is the ability to understand and communicate with language. Interactive Fiction (IF) games are fully text-based simulation environments where a player issues text commands to effect change in the environment and progress through the story. We argue that IF games are an excellent testbed for studying language-based autonomous agents. In particular, IF games combine challenges of combinatorial action spaces, language understanding, and commonsense reasoning. To facilitate rapid development of language-based agents, we introduce Jericho, a learning environment for man-made IF games, and conduct a comprehensive study of text-agents across a rich set of games, highlighting directions in which agents can improve.

machine learning, natural language, reinforcement learning, (18 more...)

1909.05398

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

Alt, Bastian, Šošić, Adrian, Koeppl, Heinz

Correlation Priors for Reinforcement Learning

arXiv.org Artificial Intelligence Sep-11-2019

Many decision-making problems naturally exhibit pronounced structures inherited from the underlying characteristics of the environment. In a Markov decision process model, for example, two distinct states can have inherently related semantics or encode resembling physical state configurations, often implying locally correlated transition dynamics among the states. In order to complete a certain task, an agent acting in such environments needs to execute a series of temporally and spatially correlated actions. Though there exists a variety of approaches to account for correlations in continuous state-action domains, a principled solution for discrete environments is missing. In this work, we present a Bayesian learning framework based on Pólya-Gamma augmentation that enables an analogous reasoning in such cases. We demonstrate the framework on a number of common decision-making related tasks, such as reinforcement learning, imitation learning and system identification. By explicitly modeling the underlying correlation structures, the proposed approach yields superior predictive performance compared to correlation-agnostic models, even when trained on data sets that are up to an order of magnitude smaller in size.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

1909.05106

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

#artificialintelligence Sep-10-2019, 19:42:03 GMT

Conditional Random Fields Explained

Conditional Random Fields (CRFs) are a class of discriminative models best suited to prediction tasks where contextual information or the state of neighboring elements affects the current prediction. CRFs are applied in named entity recognition, part-of-speech tagging, gene prediction, noise reduction, and object detection, to name a few. In this article, I will first introduce the basic math and jargon of Markov Random Fields, the abstraction that CRFs are built upon. I will then introduce and explain a simple Conditional Random Fields model in detail, which will show why CRFs are well suited to sequential prediction problems. After that, I will go over the likelihood maximization problem and the related derivations in the context of that CRF model.
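To make the likelihood mentioned in this summary concrete, here is a minimal sketch of the log-likelihood of a linear-chain CRF, computed with the forward algorithm. It assumes the common linear-chain form with per-position emission scores and label-to-label transition scores; the function names and the toy scores are illustrative assumptions and are not taken from the article.

```python
import itertools
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def path_score(labels, emission, transition):
    """Unnormalized score of one label sequence (hypothetical scoring form)."""
    score = emission[0][labels[0]]
    for t in range(1, len(labels)):
        score += transition[labels[t - 1]][labels[t]] + emission[t][labels[t]]
    return score


def log_partition(emission, transition):
    """Forward algorithm: log of the sum of exp(score) over all label sequences."""
    n_labels = len(transition)
    alpha = list(emission[0])
    for t in range(1, len(emission)):
        alpha = [
            emission[t][b]
            + logsumexp([alpha[a] + transition[a][b] for a in range(n_labels)])
            for b in range(n_labels)
        ]
    return logsumexp(alpha)


def log_likelihood(labels, emission, transition):
    """Conditional log-probability of a label sequence given the scores."""
    return path_score(labels, emission, transition) - log_partition(emission, transition)


# Toy scores (assumed values): 3 positions, 2 labels.
emission = [[1.0, 0.2], [0.3, 0.9], [0.5, 0.5]]
transition = [[0.8, -0.4], [-0.2, 0.6]]

# Brute-force check: the probabilities of all label sequences should sum to 1.
total = sum(
    math.exp(log_likelihood(list(seq), emission, transition))
    for seq in itertools.product(range(2), repeat=3)
)
```

Maximizing this log-likelihood over the parameters that produce the scores is the training problem the summary refers to; the forward pass keeps the normalizer linear in sequence length rather than exponential.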

artificial intelligence, machine learning, natural language, (16 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.38)

Pandey, Venktesh, Wang, Evana, Boyles, Stephen D.

Deep Reinforcement Learning Algorithm for Dynamic Pricing of Express Lanes with Multiple Access Locations

arXiv.org Artificial Intelligence Sep-10-2019

This article develops a deep reinforcement learning (Deep-RL) framework for dynamic pricing on managed lanes with multiple access locations and heterogeneity in travelers' value of time, origin, and destination. This framework relaxes assumptions in the literature by considering multiple origins and destinations, multiple access locations to the managed lane, en route diversion of travelers, partial observability of the sensor readings, and stochastic demand and observations. The problem is formulated as a partially observable Markov decision process (POMDP) and policy gradient methods are used to determine tolls as a function of real-time observations. Tolls are modeled as continuous and stochastic variables, and are determined using a feedforward neural network. The method is compared against a feedback control method used for dynamic pricing. We show that Deep-RL is effective in learning toll policies for maximizing revenue, minimizing total system travel time, and other joint weighted objectives, when tested on real-world transportation networks. The Deep-RL toll policies outperform the feedback control heuristic for the revenue maximization objective by generating revenues up to 9.5% higher than the heuristic and for the objective minimizing total system travel time (TSTT) by generating TSTT up to 10.4% lower than the heuristic. We also propose reward shaping methods for the POMDP to overcome the undesired behavior of toll policies, like the jam-and-harvest behavior of revenue-maximizing policies. Additionally, we test transferability of the algorithm trained on one set of inputs for new input distributions and offer recommendations on real-time implementations of Deep-RL algorithms. The source code for our experiments is available online at https://github.com/venktesh22/ExpressLanes_Deep-RL

algorithm, ground transportation, survey article, (18 more...)

1909.0476

Country: North America > United States > Texas > Travis County > Austin (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (0.68)
Energy > Oil & Gas > Upstream (0.46)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Gagne, David John II, Christensen, Hannah M., Subramanian, Aneesh C., Monahan, Adam H.

Machine Learning for Stochastic Parameterization: Generative Adversarial Networks in the Lorenz '96 Model

Stochastic parameterizations account for uncertainty in the representation of unresolved sub-grid processes by sampling from the distribution of possible sub-grid forcings. Some existing stochastic parameterizations utilize data-driven approaches to characterize uncertainty, but these approaches require significant structural assumptions that can limit their scalability. Machine learning models, including neural networks, are able to represent a wide range of distributions and build optimized mappings between a large number of inputs and sub-grid forcings. Recent research on machine learning parameterizations has focused only on deterministic parameterizations. In this study, we develop a stochastic parameterization using the generative adversarial network (GAN) machine learning framework. The GAN stochastic parameterization is trained and evaluated on output from the Lorenz '96 model, which is a common baseline model for evaluating both parameterization and data assimilation techniques. We evaluate different ways of characterizing the input noise for the model and perform model runs with the GAN parameterization at weather and climate timescales. Some of the GAN configurations perform better than a baseline bespoke parameterization at both timescales, and the networks closely reproduce the spatio-temporal correlations and regimes of the Lorenz '96 system. We also find that in general those models which produce skillful forecasts are also associated with the best climate simulations.

gan, modeling earth system, parameterization, (16 more...)

1909.04711

Country:

North America > United States > Colorado > Boulder County > Boulder (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Southern Ocean (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Ghanavi, Rozhina, Sabbaghian, Maryam, Yanikomeroglu, Halim

Q-Learning Based Aerial Base Station Placement for Fairness Enhancement in Mobile Networks

In this paper, we use an aerial base station (aerial-BS) to enhance fairness in a dynamic environment with user mobility. The problem of optimally placing the aerial-BS is a non-deterministic polynomial-time hard (NP-hard) problem. Moreover, the network topology is subject to continuous changes due to the user mobility. These issues intensify the quest to develop an adaptive and fast algorithm for 3D placement of the aerial-BS. To this end, we propose a method based on reinforcement learning to achieve these goals. Simulation results show that our method increases fairness among users in a reasonable computing time, while the solution is comparatively close to the optimal solution obtained by exhaustive search.
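For readers unfamiliar with the technique named in the title, the following is a minimal sketch of a generic tabular Q-learning update with epsilon-greedy action selection. The abstract does not specify the state, action, or reward design, so treating states as candidate aerial-BS positions and the reward as a fairness measure is only an assumption; the function names and default parameters are likewise hypothetical.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update in place and return the new value.

    q is a list of rows, one per state, each row holding one value per action.
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    row = q[state]
    return row.index(max(row))


# Toy table (assumed values): 2 states, 2 actions.
q_table = [[0.0, 0.0], [1.0, 2.0]]
new_value = q_update(q_table, state=0, action=1, reward=1.0, next_state=1,
                     alpha=0.5, gamma=0.9)
greedy_action = epsilon_greedy(q_table, state=1, epsilon=0.0, rng=random.Random(0))
```

Because the table is updated from experience, the same loop can keep running as users move, which is the kind of adaptivity the abstract asks for.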

artificial intelligence, machine learning, reinforcement learning, (15 more...)

1909.08093

Country:

North America > Canada (0.28)
Asia > Middle East > Iran (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Telecommunications (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Static Analysis for Probabilistic Programs

Bernstein, Ryan

Probabilistic programming is a powerful abstraction for statistical machine learning. Applying static analysis methods to probabilistic programs could serve to optimize the learning process, automatically verify properties of models, and improve the programming interface for users. This field of static analysis for probabilistic programming (SAPP) is young and unorganized, consisting of a constellation of techniques with various goals and limitations. The primary aim of this work is to synthesize the major contributions of the SAPP field within an organizing structure and context. We provide technical background for static analysis and probabilistic programming, suggest a functional taxonomy for probabilistic programming languages, and analyze the applicability of major ideas in the SAPP field. We conclude that, while current static analysis techniques for probabilistic programs have practical limitations, there are a number of future directions with high potential to improve the state of statistical machine learning.

logic & formal reasoning, machine learning, programming language, (20 more...)

1909.05076

Genre: Research Report (0.50)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
(2 more...)

Boltzmann machine learning and regularization methods for inferring evolutionary fields and couplings from a multiple sequence alignment

Miyazawa, Sanzo

The inverse Potts problem, which is to infer the Boltzmann distribution for homologous protein sequences from their single-site and pairwise frequencies, has recently attracted a great deal of attention because of its capacity to accurately predict residue-residue contacts in a 3D protein complex. A Boltzmann machine for the accurate estimation of the field and coupling interactions, which is required for other studies in protein evolution and folding, is studied with respect to learning methods, regularization models, and a method of tuning regularization parameters, in order to infer interactions with reasonable characteristics. With $L_2$ regularization for fields, group $L_1$ regularization for couplings is shown to be very effective for sparse couplings in comparison with $L_2$ and $L_1$. The two regularization parameters for fields and couplings are tuned to yield equal values for the sample average and the ensemble average of the evolutionary energies of natural proteins. Both averages change smoothly and converge along a learning process, but their profiles differ greatly between learning methods. Most per-parameter adaptive learning methods invented for machine learning cannot learn reasonable parameters for sparse-interaction systems. A modified Adam (ModAdam) method is devised to make the step size proportional to the partial derivative for sparse couplings and to use a soft thresholding function for group $L_1$. By first inferring interactions from protein sequences and then from Monte Carlo samples, it is shown that the fields and couplings can be well recovered by group $L_1$ and the ModAdam method. However, the distribution of evolutionary energies over natural proteins is shifted towards lower energies relative to that of Monte Carlo samples, indicating that there may be higher-order interactions that favor natural sequences.
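As an illustration of the soft thresholding mentioned for group $L_1$, here is a minimal sketch of the standard group soft-thresholding operator (the proximal operator of the group $L_1$ penalty). This is the textbook form and may differ from the exact function used in ModAdam; the function name and the example values are assumptions.

```python
import math


def group_soft_threshold(group, threshold):
    """Shrink a group of weights toward zero by `threshold` in Euclidean norm.

    The whole group is set to zero when its norm does not exceed the
    threshold, which is what produces group-level sparsity.
    """
    norm = math.sqrt(sum(w * w for w in group))
    if norm <= threshold:
        return [0.0] * len(group)
    scale = 1.0 - threshold / norm
    return [scale * w for w in group]


# Example values (assumed): a strong group is shrunk, a weak group is zeroed.
shrunk = group_soft_threshold([3.0, 4.0], 1.0)
zeroed = group_soft_threshold([0.3, 0.4], 1.0)
```

Treating all couplings of one site pair as a single group means the pair is either kept, with shrunken values, or removed entirely, which matches the sparse-coupling setting described in the abstract.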

artificial intelligence, machine learning, sequence, (15 more...)

1909.05006

Genre: Research Report (0.81)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.74)

Jo, Junghyo, Hoang, Danh-Tai, Periwal, Vipul

Inverse Ising inference from high-temperature re-weighting of observations

Maximum Likelihood Estimation (MLE) is the bread and butter of system inference for stochastic systems. In some generality, MLE will converge to the correct model in the infinite data limit. In the context of physical approaches to system inference, such as Boltzmann machines, MLE requires the arduous computation of partition functions summing over all configurations, both observed and unobserved. We present here a conceptually and computationally transparent data-driven approach to system inference that is based on the simple question: How should the Boltzmann weights of observed configurations be modified to make the probability distribution of observed configurations close to a flat distribution? This algorithm gives accurate inference by using only observed configurations for systems with a large number of degrees of freedom where other approaches are intractable.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1909.04305

Country:

North America > United States (0.70)
Asia > Vietnam > Quảng Bình Province (0.14)

Genre: Research Report (0.83)

Industry: Health & Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)