AITopics | Undirected Networks

Undirected Networks

Rethinking Action Spaces for Reinforcement Learning in End-to-end Dialog Agents with Latent Variable Models

Zhao, Tiancheng, Xie, Kaige, Eskenazi, Maxine

arXiv.org Artificial Intelligence, Feb-23-2019

Defining action spaces for conversational agents and optimizing their decision-making process with reinforcement learning is an enduring challenge. Common practice has been to use handcrafted dialog acts, or the output vocabulary, e.g., in neural encoder-decoders, as the action spaces. Both have their own limitations. This paper proposes a novel latent action framework that treats the action spaces of an end-to-end dialog agent as latent variables and develops unsupervised methods to induce its own action space from the data. Comprehensive experiments are conducted examining both continuous and discrete action types and two different optimization methods based on stochastic variational inference. Results show that the proposed latent actions achieve superior empirical performance over previous word-level policy gradient methods on both DealOrNoDeal and MultiWoz dialogs. Our detailed analysis also provides insights about various latent variable approaches for policy learning and can serve as a foundation for developing better latent actions in future research.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

1902.08858

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Transfer Learning for Non-Intrusive Load Monitoring

D'Incecco, Michele, Squartini, Stefano, Zhong, Mingjun

arXiv.org Machine Learning, Feb-23-2019

Non-intrusive load monitoring (NILM) is a technique to recover source appliances from only the recorded mains in a household. NILM is unidentifiable and thus a challenging problem, because the inferred power value of an appliance given only the mains may not be unique. To mitigate the unidentifiability problem, various methods incorporating domain knowledge into NILM have been proposed and shown to be effective experimentally. Among these methods, deep neural networks have recently been shown to perform best. Arguably, the recently proposed sequence-to-point (seq2point) learning is promising for NILM. However, the experiments were only carried out on the same data domain. It is not clear whether the method can be generalised or transferred to different domains, e.g., when the test data are drawn from a different country than the training data. We address this issue in the paper and propose two transfer learning schemes: appliance transfer learning (ATL) and cross-domain transfer learning (CTL). For ATL, our results show that the latent features learnt by a 'complex' appliance, e.g., a washing machine, can be transferred to a 'simple' appliance, e.g., a kettle. For CTL, our conclusion is that seq2point learning is transferable. More precisely, when the training and test data are in a similar domain, seq2point learning can be applied directly to the test data without fine-tuning; when the training and test data are in different domains, seq2point learning needs fine-tuning before being applied to the test data. Interestingly, we show that only the fully connected layers need fine-tuning for transfer learning.

appliance, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

1902.08835

Country:

Europe > United Kingdom (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Power Industry (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Probabilistic Inference of Binary Markov Random Fields in Spiking Neural Networks through Mean-field Approximation

Zheng, Yajing, Yu, Zhaofei, Jia, Shanshan, Liu, Jian K., Huang, Tiejun, Tian, Yonghong

arXiv.org Machine Learning, Feb-22-2019

Recent studies have suggested that the cognitive process of the human brain is realized as probabilistic inference and can be further modeled by probabilistic graphical models such as Markov random fields. Nevertheless, it remains unclear how probabilistic inference can be implemented by a network of spiking neurons in the brain. Previous studies tried to relate the inference equation of binary Markov random fields to the dynamic equation of spiking neural networks through the belief propagation algorithm and reparameterization, but they are valid only for Markov random fields with limited network structure. In this paper, we propose a spiking neural network model that can implement inference of arbitrary binary Markov random fields. Specifically, we design a spiking recurrent neural network and prove that its neuronal dynamics are mathematically equivalent to the inference process of Markov random fields by adopting mean-field theory.
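For orientation, the classical mean-field update for a binary Markov random field can be sketched in a few lines. This is only the textbook fixed-point iteration m_i = tanh(h_i + sum_j J_ij m_j) for variables x_i in {-1, +1}, not the spiking network the paper proposes; the couplings `J`, biases `h`, the damping factor, and the function name are hypothetical values chosen for illustration.

```python
import math

def mean_field(J, h, iters=200, damping=0.5):
    """Approximate the means m_i = E[x_i] of a binary MRF with x_i in {-1, +1}.

    J is a symmetric coupling matrix (list of lists), h a list of biases.
    Each sweep applies a damped version of m_i = tanh(h_i + sum_j J_ij * m_j).
    """
    n = len(h)
    m = [0.0] * n
    for _ in range(iters):
        for i in range(n):
            field = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
            m[i] = (1 - damping) * m[i] + damping * math.tanh(field)
    return m

# Toy 3-node chain with weak positive couplings (assumed values).
J = [[0.0, 0.5, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.5, 0.0]]
h = [0.3, 0.0, -0.1]

means = mean_field(J, h)
marginals = [(1 + m) / 2 for m in means]  # approximate P(x_i = +1)
print(means, marginals)
```

The damping term is a common stabilising choice rather than part of the mean-field equations themselves; with these weak couplings the iteration settles to a fixed point quickly.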

inference, neural network, recurrent neural network, (13 more...)

arXiv.org Machine Learning

1902.08411

Country:

Europe > United Kingdom > England > Leicestershire > Leicester (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Bayesian optimisation under uncertain inputs

Oliveira, Rafael, Ott, Lionel, Ramos, Fabio

arXiv.org Machine Learning, Feb-21-2019

Bayesian optimisation (BO) has been a successful approach to optimise functions which are expensive to evaluate and whose observations are noisy. Classical BO algorithms, however, do not account for errors about the location where observations are taken, which is a common issue in problems with physical components. In these cases, the estimation of the actual query location is also subject to uncertainty. In this context, we propose an upper confidence bound (UCB) algorithm for BO problems where both the outcome of a query and the true query location are uncertain. The algorithm employs a Gaussian process model that takes probability distributions as inputs. Theoretical results are provided for both the proposed algorithm and a conventional UCB approach within the uncertain-inputs setting. Finally, we evaluate each method's performance experimentally, comparing them to other input noise aware BO approaches on simulated scenarios involving synthetic and real data.

algorithm, kernel, noise, (15 more...)

arXiv.org Machine Learning

1902.07908

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
South America > Chile (0.04)
(8 more...)

Genre:

Research Report > New Finding (0.67)
Instructional Material > Course Syllabus & Notes (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

World Discovery Models

Azar, Mohammad Gheshlaghi, Piot, Bilal, Pires, Bernardo Avila, Grill, Jean-Bastien, Altché, Florent, Munos, Rémi

arXiv.org Machine Learning, Feb-21-2019

As humans, we are driven by a strong desire to seek novelty in our world. Moreover, upon observing a novel pattern, we are capable of refining our understanding of the world based on the new information: humans can discover their world. The outstanding ability of the human mind for discovery has led to many breakthroughs in science, art and technology. Here we investigate the possibility of building an agent capable of discovering its world using modern AI technology. In particular, we introduce NDIGO, Neural Differential Information Gain Optimisation, a self-supervised discovery model that aims at seeking new information to construct a global view of its world from partial and noisy observations. Our experiments on some controlled 2-D navigation tasks show that NDIGO outperforms state-of-the-art information-seeking methods in terms of the quality of the learned representation. The improvement in performance is particularly significant in the presence of white or structured noise, where other information-seeking methods follow the noise instead of discovering their world.

agent, information, representation, (12 more...)

arXiv.org Machine Learning

1902.07685

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
(2 more...)

Hyperbolic Discounting and Learning over Multiple Horizons

Fedus, William, Gelada, Carles, Bengio, Yoshua, Bellemare, Marc G., Larochelle, Hugo

arXiv.org Machine Learning, Feb-20-2019

Reinforcement learning (RL) typically defines a discount factor as part of the Markov Decision Process. The discount factor values future rewards by an exponential scheme that leads to theoretical convergence guarantees of the Bellman equation. However, evidence from psychology, economics and neuroscience suggests that humans and animals instead have hyperbolic time-preferences. In this work we revisit the fundamentals of discounting in RL and bridge this disconnect by implementing an RL agent that acts via hyperbolic discounting. We demonstrate that a simple approach approximates hyperbolic discount functions while still using familiar temporal-difference learning techniques in RL. Additionally, and independent of hyperbolic discounting, we make a surprising discovery that simultaneously learning value functions over multiple time-horizons is an effective auxiliary task which often improves over a strong value-based RL agent, Rainbow.
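One way to see why hyperbolic discounting is compatible with exponential-discount machinery is the identity 1/(1 + kt) = ∫₀¹ γ^(kt) dγ: the integrand (γ^k)^t is an ordinary exponential discount with factor γ^k, so a hyperbolic discount can be approximated by averaging exponential discounts over a grid of γ values. The sketch below checks this numerically. The function names, the choice k = 0.1, and the grid size are illustrative assumptions, and the abstract does not say whether this is the exact construction used in the paper.

```python
def hyperbolic(t, k):
    """Hyperbolic discount 1 / (1 + k * t)."""
    return 1.0 / (1.0 + k * t)

def hyperbolic_from_exponentials(t, k, n=1000):
    """Midpoint Riemann sum of gamma ** (k * t) over gamma in (0, 1).

    Each term is an exponential discount with factor gamma ** k evaluated
    at delay t, so the sum is an average of exponential discounts.
    """
    width = 1.0 / n
    return sum(((i + 0.5) * width) ** (k * t) for i in range(n)) * width

k = 0.1  # assumed hyperbolic coefficient
for t in (0, 5, 10, 50):
    exact = hyperbolic(t, k)
    approx = hyperbolic_from_exponentials(t, k)
    print(t, round(exact, 4), round(approx, 4))
```

In an RL setting the same averaging would be applied to value estimates learned for several discount factors, which is also what makes learning over multiple horizons a natural by-product.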

agent, discount function, discounting, (13 more...)

arXiv.org Machine Learning

1902.06865

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Sequential Learning over Implicit Feedback for Robust Large-Scale Recommender Systems

Burashnikova, Alexandra, Maximov, Yury, Amini, Massih-Reza

arXiv.org Machine Learning, Feb-20-2019

In this paper, we propose a robust sequential learning strategy for training large-scale Recommender Systems (RS) over implicit feedback, mainly in the form of clicks. Our approach relies on the minimization of a pairwise ranking loss over blocks of consecutive items, each constituted by a sequence of non-clicked items followed by a clicked one, for each user. Parameter updates are discarded if, for a given user, the number of sequential blocks is below or above given thresholds estimated over the distribution of the number of blocks in the training set. This guards against both an abnormal number of clicks on some targeted items, mainly due to bots, and very few user interactions. Both scenarios affect the decision of the RS and imply a shift in the distribution of items that are shown to the users. We provide a theoretical analysis showing that, when the ranking loss is convex, the deviation between the loss with respect to the sequence of weights found by the proposed algorithm and its minimum is bounded. Furthermore, experimental results on five large-scale collections demonstrate the efficiency of the proposed algorithm compared with state-of-the-art approaches, regarding both different ranking measures and computation time.
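The block construction described in the abstract can be illustrated with a small sketch. It assumes that a block is closed by each click, that clicks with no preceding non-clicked item are skipped, and that the pairwise loss is logistic; the abstract fixes none of these details, and all function names, scores, and thresholds are hypothetical.

```python
import math

def split_into_blocks(interactions):
    """Split one user's display-ordered (item, clicked) pairs into blocks.

    Each block is (non_clicked_items, clicked_item). Trailing non-clicked
    items without a following click are dropped (an assumption).
    """
    blocks, negatives = [], []
    for item, clicked in interactions:
        if clicked:
            if negatives:  # a block needs at least one non-clicked item
                blocks.append((negatives, item))
            negatives = []
        else:
            negatives.append(item)
    return blocks

def block_loss(scores, block):
    """Mean pairwise logistic loss: the clicked item should outscore each non-clicked one."""
    negatives, positive = block
    return sum(math.log1p(math.exp(-(scores[positive] - scores[n])))
               for n in negatives) / len(negatives)

def should_update(num_blocks, low, high):
    """Keep a user's updates only if the block count lies within the thresholds."""
    return low <= num_blocks <= high

interactions = [("a", False), ("b", False), ("c", True),
                ("d", False), ("e", True), ("f", False)]
scores = {"a": 0.1, "b": -0.2, "c": 0.9, "d": 0.4, "e": 0.3, "f": 0.0}

blocks = split_into_blocks(interactions)
if should_update(len(blocks), low=1, high=50):
    losses = [block_loss(scores, b) for b in blocks]
    print(blocks, losses)
```

In the setting the abstract describes, the thresholds would be estimated from the distribution of block counts in the training set rather than fixed by hand as they are here.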

artificial intelligence, machine learning, recommendation, (18 more...)

arXiv.org Machine Learning

1902.08495

Country: North America > United States (0.93)

Genre: Research Report > Promising Solution (0.34)

Industry:

Energy (0.67)
Media (0.49)
Information Technology > Services (0.47)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Towards the Next Generation Airline Revenue Management: A Deep Reinforcement Learning Approach to Seat Inventory Control and Overbooking

Shihab, Syed Arbab Mohd, Logemann, Caleb, Thomas, Deepak-George, Wei, Peng

arXiv.org Artificial Intelligence, Feb-18-2019

Revenue management can enable airline corporations to maximize the revenue generated from each scheduled flight departing in their transportation network by finding the optimal policies for differential pricing, seat inventory control and overbooking. As different demand segments in the market have different Willingness-To-Pay (WTP), airlines use differential pricing, booking restrictions, and service amenities to determine different fare classes or products targeted at each of these demand segments. Because seats are limited for each flight, airlines also need to allocate seats for each of these fare classes to prevent lower fare class passengers from displacing higher fare class ones, and to set overbooking limits in anticipation of cancellations and no-shows, such that revenue is maximized. Previous work addresses these problems using optimization techniques or classical Reinforcement Learning methods. This paper focuses on the seat inventory control problem, casting it as a Markov Decision Process in order to find the optimal policy. The model considers multiple fare classes, concurrent continuous arrival of passengers of different fare classes, overbooking, and random cancellations that are independent of class. We address this problem using Deep Q-Learning with the goal of maximizing the reward for each flight departure. This technique not only allows us to employ a large continuous state space but also presents the opportunity to test on real-time airline data. To generate data and train the agent, a basic air-travel market simulator was developed. The performance of the agent in different simulated market scenarios was compared against theoretically optimal solutions and was found to be close to the expected optimal revenue.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

1902.06824

Country:

North America > United States > Iowa > Story County > Ames (0.05)
North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.40)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

A semi-supervised deep residual network for mode detection in Wi-Fi signals

Kalatian, Arash, Farooq, Bilal

arXiv.org Machine Learning, Feb-17-2019

Due to their ubiquitous and pervasive nature, Wi-Fi networks have the potential to collect large-scale, low-cost, and disaggregate data on multimodal transportation. In this study, we develop a semi-supervised deep residual network (ResNet) framework to utilize Wi-Fi communications obtained from smartphones for the purpose of transportation mode detection. This framework is evaluated on data collected by Wi-Fi sensors located in a congested urban area in downtown Toronto. To tackle the intrinsic difficulties and costs associated with labelled data collection, we utilize an ample amount of easily collected, low-cost unlabelled data by implementing the semi-supervised part of the framework. By incorporating a ResNet architecture as the core of the framework, we take advantage of high-level features not considered in traditional machine learning frameworks. The proposed framework shows promising performance on the collected data, with a prediction accuracy of 81.8% for walking, 82.5% for biking and 86.0% for the driving mode.

accuracy, architecture, configuration, (14 more...)

arXiv.org Machine Learning

1902.06284

Country:

North America > Canada > Ontario > Toronto (0.26)
North America > United States > New York (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.50)

Industry:

Transportation > Infrastructure & Services (0.93)
Information Technology (0.67)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Shepherding Hordes of Markov Chains

Ceska, Milan, Jansen, Nils, Junges, Sebastian, Katoen, Joost-Pieter

arXiv.org Artificial Intelligence, Feb-15-2019

This paper considers large families of Markov chains (MCs) that are defined over a set of parameters with finite discrete domains. Such families occur in software product lines, planning under partial observability, and sketching of probabilistic programs. Simple questions, like 'does at least one family member satisfy a property?', are NP-hard. We tackle two problems: distinguish family members that satisfy a given quantitative property from those that do not, and determine a family member that satisfies the property optimally, i.e., with the highest probability or reward. We show that combining two well-known techniques, MDP model checking and abstraction refinement, mitigates the computational complexity. Experiments on a broad set of benchmarks show that in many situations, our approach is able to handle families of millions of MCs, providing superior scalability compared to existing solutions.
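To make the two problems concrete, the sketch below enumerates a tiny family of Markov chains by brute force. This is the naive baseline whose cost grows with the number of parameter combinations, not the MDP-based abstraction refinement the paper proposes. The three-state chain, the parameter domains, and the threshold are all made up for illustration.

```python
from itertools import product

def reach_probability(a, b, sweeps=500):
    """Probability of eventually reaching 'goal' from 'start'.

    Toy chain: start -> goal with probability a, start -> start with
    probability b, start -> fail otherwise. Solved by fixed-point iteration
    on p = a + b * p (goal contributes 1, fail contributes 0).
    """
    p = 0.0
    for _ in range(sweeps):
        p = a + b * p
    return p

# Each parameter has a finite discrete domain; every combination is one family member.
domains = {"a": [0.1, 0.3], "b": [0.2, 0.6]}
threshold = 0.4  # property: reach 'goal' with probability at least 0.4

family = [dict(zip(domains, values)) for values in product(*domains.values())]
results = {(m["a"], m["b"]): reach_probability(m["a"], m["b"]) for m in family}

satisfying = [member for member, prob in results.items() if prob >= threshold]
best = max(results, key=results.get)
print(satisfying, best)
```

With d parameters of n values each, this loop visits n^d chains, which is the blow-up that motivates analysing the whole family at once.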

artificial intelligence, machine learning, scheduler, (17 more...)

arXiv.org Artificial Intelligence

1902.05727

Country:

Europe > Czechia > South Moravian Region > Brno (0.04)
Europe > Netherlands > Gelderland > Nijmegen (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
