AITopics

1811.01533

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Foerster, Jakob N., de Witt, Christian A. Schroeder, Farquhar, Gregory, Torr, Philip H. S., Boehmer, Wendelin, Whiteson, Shimon

Multi-Agent Common Knowledge Reinforcement Learning

arXiv.org Artificial IntelligenceNov-5-2018

In multi-agent reinforcement learning, centralised policies can only be executed if agents have access to either the global state or an instantaneous communication channel. An alternative approach that circumvents this limitation is to use centralised training of a set of decentralised policies. However, such policies severely limit the agents' ability to coordinate. We propose multi-agent common knowledge reinforcement learning (MACKRL), which strikes a middle ground between these two extremes. Our approach is based on the insight that, even in partially observable settings, subsets of agents often have some common knowledge that they can exploit to coordinate their behaviour. Common knowledge can arise, e.g., if all agents can reliably observe things in their own field of view and know the field of view of other agents. Using this additional information, it is possible to find a centralised policy that conditions only on agents' common knowledge and that can be executed in a decentralised fashion. A resulting challenge is then to determine at what level agents should coordinate. While the common knowledge shared among all agents may not contain much valuable information, there may be subgroups of agents that share common knowledge useful for coordination. MACKRL addresses this challenge using a hierarchical approach: at each level, a controller can either select a joint action for the agents in a given subgroup, or propose a partition of the agents into smaller subgroups whose actions are then selected by controllers at the next level. While action selection involves sampling hierarchically, learning updates are based on the probability of the joint action, calculated by marginalising across the possible decisions of the hierarchy. We show promising results on both a proof-of-concept matrix game and a multi-agent version of StarCraft II Micromanagement.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1810.11702

Country: Europe (0.46)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

arXiv.org Machine LearningNov-3-2018

Variational Bayes Inference in Digital Receivers

Tran, Viet Hung

The digital telecommunications receiver is an important context for inference methodology, the key objective being to minimize the expected loss function in recovering the transmitted information. For that criterion, the optimal decision is the Bayesian minimum-risk estimator. However, the computational load of the Bayesian estimator is often prohibitive and, hence, efficient computational schemes are required. The design of novel schemes, striking new balances between accuracy and computational load, is the primary concern of this thesis. Two popular techniques, one exact and one approximate, will be studied. The exact scheme is a recursive one, namely the generalized distributive law (GDL), whose purpose is to distribute all operators across the conditionally independent (CI) factors of the joint model, so as to reduce the total number of operators required. In a novel theorem derived in this thesis, GDL, if applicable, will be shown to guarantee such a reduction in all cases. An associated lemma also quantifies this reduction. For practical use, two novel algorithms, namely the no-longer-needed (NLN) algorithm and the generalized form of the Markovian Forward-Backward (FB) algorithm, recursively factorizes and computes the CI factors of an arbitrary model, respectively. The approximate scheme is an iterative one, namely the Variational Bayes (VB) approximation, whose purpose is to find the independent (i.e. zero-order Markov) model closest to the true joint model in the minimum Kullback-Leibler divergence (KLD) sense. Despite being computationally efficient, this naive mean field approximation confers only modest performance for highly correlated models. A novel approximation, namely Transformed Variational Bayes (TVB), will be designed in the thesis in order to relax the zero-order constraint in the VB approximation, further reducing the KLD of the optimal approximation.

deterministic approximation, télécommunications, upstream oil & gas, (25 more...)

1811.02506

Country:

Asia (0.67)
Europe > Netherlands (0.45)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(3 more...)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Telecommunications (1.00)
Information Technology (1.00)
Leisure & Entertainment (0.92)
(2 more...)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Communications > Mobile (1.00)
(8 more...)

Hadfield-Menell, Dylan, Andrus, McKane, Hadfield, Gillian K.

Legible Normativity for AI Alignment: The Value of Silly Rules

arXiv.org Artificial IntelligenceNov-3-2018

It has become commonplace to assert that autonomous agents will have to be built to follow human rules of behavior-social norms and laws. But human laws and norms are complex and culturally varied systems; in many cases agents will have to learn the rules. This requires autonomous agents to have models of how human rule systems work so that they can make reliable predictions about rules. In this paper we contribute to the building of such models by analyzing an overlooked distinction between important rules and what we call silly rules --rules with no discernible direct impact on welfare. We show that silly rules render a normative system both more robust and more adaptable in response to shocks to perceived stability. They make normativity more legible for humans, and can increase legibility for AI systems as well. For AI systems to integrate into human normative systems, we suggest, it may be important for them to have models that include representations of silly rules.

agent, artificial intelligence, machine learning, (19 more...)

1811.01267

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry: Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Machine LearningNov-2-2018

Effective Learning of Probabilistic Models for Clinical Predictions from Longitudinal Data

Yang, Shuo

Such information includes: the database in modern hospital systems, usually known as Electronic Health Records (EHR), which store the patients' diagnosis, medication, laboratory test results, medical image data, etc.; information on various health behaviors tracked and stored by wearable devices, ubiquitous sensors and mobile applications, such as the smoking status, alcoholism history, exercise level, sleeping conditions, etc.; information collected by census or various surveys regarding sociodemographic factors of the target cohort; and information on people's mental health inferred from their social media activities or social networks such as Twitter, Facebook, etc. These health-related data come from heterogeneous sources, describe assorted aspects of the individual's health conditions. Such data is rich in structure and information which has great research potentials for revealing unknown medical knowledge about genomic epidemiology, disease developments and correlations, drug discoveries, medical diagnosis, mental illness prevention, health behavior adaption, etc. In real-world problems, the number of features relating to a certain health condition could grow exponentially with the development of new information techniques for collecting and measuring data. To reveal the causal influence between various factors and a certain disease or to discover the correlations among diseases from data at such a tremendous scale, requires the assistance of advanced information technology such as data mining, machine learning, text mining, etc. Machine learning technology not only provides a way for learning qualitative relationships among features and patients, but also the quantitative parameters regarding the strength of such correlations.

data mining, logic & formal reasoning, machine learning, (20 more...)

1811.00749

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
(4 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
(6 more...)

Madan, Dhiraj, Raghu, Dinesh, Pandey, Gaurav, Joshi, Sachindra

Unsupervised Learning of Interpretable Dialog Models

arXiv.org Artificial IntelligenceNov-2-2018

Recently several deep learning based models have been proposed for end-to-end learning of dialogs. While these models can be trained from data without the need for any additional annotations, it is hard to interpret them. On the other hand, there exist traditional state based dialog systems, where the states of the dialog are discrete and hence easy to interpret. However these states need to be handcrafted and annotated in the data. To achieve the best of both worlds, we propose Latent State Tracking Network (LSTN) using which we learn an interpretable model in unsupervised manner. The model defines a discrete latent variable at each turn of the conversation which can take a finite set of values. Since these discrete variables are not present in the training data, we use EM algorithm to train our model in unsupervised manner. In the experiments, we show that LSTN can help achieve interpretability in dialog models without much decrease in performance compared to end-to-end approaches.

artificial intelligence, machine learning, natural language, (15 more...)

1811.01012

Country: North America > United States (0.46)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Thodoroff, Pierre, Durand, Audrey, Pineau, Joelle, Precup, Doina

Temporal Regularization in Markov Decision Process

arXiv.org Machine LearningNov-1-2018

Several applications of Reinforcement Learning suffer from instability due to high variance. This is especially prevalent in high dimensional domains. Regularization is a commonly used technique in machine learning to reduce variance, at the cost of introducing some bias. Most existing regularization techniques focus on spatial (perceptual) regularization. Yet in reinforcement learning, due to the nature of the Bellman equation, there is an opportunity to also exploit temporal regularization based on smoothness in value estimates over trajectories. This paper explores a class of methods for temporal regularization. We formally characterize the bias induced by this technique using Markov chain concepts. We illustrate the various characteristics of temporal regularization via a sequence of simple discrete and continuous MDPs, and show that the technique provides improvement even in high-dimensional Atari games.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

1811.00429

Country: North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games > Computer Games (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.89)

Galagali, Nikhil, Xu-Wilson, Minnan

Patient Subtyping with Disease Progression and Irregular Observation Trajectories

arXiv.org Machine LearningNov-1-2018

Patient subtyping based on temporal observations can lead to significantly nuanced subtyping that acknowledges the dynamic characteristics of diseases. Existing methods for subtyping trajectories treat the evolution of clinical observations as a homogeneous process or employ data available at regular intervals. In reality, diseases may have transient underlying states and a state-dependent observation pattern. In our paper, we present an approach to subtype irregular patient data while acknowledging the underlying progression of disease states. Our approach consists of two components: a probabilistic model to determine the likelihood of a patient's observation trajectory and a mixture model to measure similarity between asynchronous patient trajectories. We demonstrate our model by discovering subtypes of progression to hemodynamic instability (requiring cardiovascular intervention) in a patient cohort from a multi-institution ICU dataset. We find three primary patterns: two of which show classic signs of decompensation (rising heart rate with dropping blood pressure), with one of these showing a faster course of decompensation than the other. The third pattern has transient period of low heart rate and blood pressure. We also show that our model results in a 13% reduction in average cross-entropy error compared to a model with no state progression when forecasting vital signs.

artificial intelligence, disease state, machine learning, (17 more...)

1810.09043

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)

Hoel, Carl-Johan, Wolff, Krister, Laine, Leo

Automated Speed and Lane Change Decision Making using Deep Reinforcement Learning

arXiv.org Artificial IntelligenceNov-1-2018

This paper introduces a method, based on deep reinforcement learning, for automatically generating a general purpose decision making function. A Deep Q-Network agent was trained in a simulated environment to handle speed and lane change decisions for a truck-trailer combination. In a highway driving case, it is shown that the method produced an agent that matched or surpassed the performance of a commonly used reference model. To demonstrate the generality of the method, the exact same algorithm was also tested by training it for an overtaking case on a road with oncoming traffic. Furthermore, a novel way of applying a convolutional neural network to high level input that represents interchangeable objects is also introduced.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1803.10056

Genre: Research Report (0.64)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Kasnesis, Panagiotis, Patrikakis, Charalampos Z., Venieris, Iakovos S.

PerceptionNet: A Deep Convolutional Neural Network for Late Sensor Fusion

arXiv.org Machine LearningOct-31-2018

Abstract-- Human Activity Recognition (HAR) based on motion sensors has drawn a lot of attention over the last few years, since perceiving the human status enables context-aware applications to adapt their services on users' needs. However, motion sensor fusion and feature extraction have not reached their full potentials, remaining still an open issue. In this paper, we introduce PerceptionNet, a deep Convolutional Neural Network (CNN) that applies a late 2D convolution to multimodal time-series sensor data, in order to extract automatically efficient features for HAR. We evaluate our approach on two public available HAR datasets to demonstrate that the proposed model fuses effectively multimodal sensors and improves the performance of HAR. In particular, PerceptionNet surpasses the performance of state-of-the-art HAR methods based on: (i) features extracted from humans, (ii) deep CNNs exploiting early fusion approaches, and (iii) Long Short-Term Memory (LSTM), by an average accuracy of more than 3%. The proliferation of the Internet of Things (IoT) over the last few years, has contributed to the collection of huge amounts of time-series data. An IoT device with high sampling rates, such as a wearable, produces hundreds of data every second, resulting to a data explosion, considering the vast number of such devices connected over the internet. Through real-time or batch data processing, meaningful information is extracted, revealing daily patterns of individual owners or social groups.

artificial intelligence, machine learning, neural network, (13 more...)

1811.0017

Country: Europe > Greece (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Smart Houses & Appliances (0.88)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)