Undirected Networks
Arcades: A deep model for adaptive decision making in voice controlled smart-home
Brenon, Alexis, Portet, François, Vacher, Michel
The smart-home is an application domain that brings together home automation and ambient intelligence to ease the life of dwellers and to support people with loss of autonomy. The development of smart-homes is not only a cultural and technological evolution but is also recognized as one way to address the challenges created by an aging population in developed countries [42]. Whereas home automation is concerned with sensing (sensors, actuators, middleware) and low-level automation (heating control, lighting control), ambient intelligence should bring perception and reasoning capabilities to the smart-home ecosystem. However, although the development of smart-homes is supported by a large number of research and industrial projects, it has not reached a wide public because many challenges remain to be addressed. One of the main challenges is the complexity of setting up the smart-home system for new situations (new devices, house, or dwellers, changes after an accident, etc.).
Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding
Hou, Yutai, Liu, Yijia, Che, Wanxiang, Liu, Ting
In this paper, we study the problem of data augmentation for language understanding in task-oriented dialogue systems. In contrast to previous work, which augments an utterance without considering its relation to other utterances, we propose a sequence-to-sequence generation based data augmentation framework that leverages the alternatives in the training data that share an utterance's semantics. A novel diversity rank is incorporated into the utterance representation so that the model produces diverse utterances, and these diversely augmented utterances help to improve the language understanding module. Experimental results on the Airline Travel Information System dataset and on a newly created semantic frame annotation of the Stanford Multi-turn, Multi-domain Dialogue Dataset show that our framework achieves significant improvements of 6.38 and 10.04 F-score points respectively when the training set contains only hundreds of utterances. Case studies also confirm that our method generates diverse utterances.
Supervised Reinforcement Learning with Recurrent Neural Network for Dynamic Treatment Recommendation
Wang, Lu, Zhang, Wei, He, Xiaofeng, Zha, Hongyuan
Dynamic treatment recommendation systems based on large-scale electronic health records (EHRs) have become key to improving practical clinical outcomes. Prior studies recommend treatments using either supervised learning (e.g. matching the indicator signal that denotes doctor prescriptions) or reinforcement learning (e.g. maximizing the evaluation signal that indicates cumulative reward from survival rates). However, none of these studies has considered combining the benefits of supervised learning and reinforcement learning. In this paper, we propose Supervised Reinforcement Learning with Recurrent Neural Network (SRL-RNN), which fuses them into a synergistic learning framework. Specifically, SRL-RNN applies an off-policy actor-critic framework to handle complex relations among multiple medications, diseases and individual characteristics. The "actor" in the framework is adjusted by both the indicator signal and the evaluation signal to ensure effective prescription and low mortality. An RNN is further used to address the Partially-Observed Markov Decision Process (POMDP) problem that arises from the lack of fully observed states in real-world applications. Experiments on a publicly available real-world dataset, MIMIC-3, show that our model can reduce the estimated mortality while providing promising accuracy in matching doctors' prescriptions.
Can Markov Logic Take Machine Learning to the Next Level?
Advances in machine learning, including deep learning, have propelled artificial intelligence (AI) into the public consciousness and forced executives to create new business plans based on data. However, the scarcity of highly trained data scientists has stymied many machine learning implementations, potentially blocking future AI development. Now a group of academics and technologists say the emerging fields of Markov Logic and probabilistic programming could lower the bar for implementing machine learning. Markov Logic is a language first described by two professors in the University of Washington's Department of Computer Science and Engineering, Pedro Domingos and Matthew Richardson, in their seminal 2006 paper "Markov Logic Networks." The work is based on mathematical discoveries made by Andrey Markov Jr., the Soviet mathematician who died in 1979 (his father, who had the same name, is associated with a related field, dubbed Markov chains).
Markov Logic Networks with Statistical Quantifiers
Gutiérrez-Basulto, Víctor, Jung, Jean Christoph, Kuzelka, Ondrej
Markov Logic Networks (MLNs) are well-suited for expressing statistics such as "with high probability a smoker knows another smoker" but not for expressing statements such as "there is a smoker who knows most other smokers", which is necessary for modeling, e.g., influencers in social networks. To overcome this shortcoming, we investigate quantified MLNs, which generalize MLNs by introducing statistical universal quantifiers, so that the latter type of statistics can also be expressed in a principled way. Our main technical contribution is to show that the standard reasoning tasks in quantified MLNs, maximum a posteriori and marginal inference, can be reduced to their respective MLN counterparts in polynomial time.
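As background for the abstract above, the following is a minimal sketch of standard MLN semantics only, not of the quantified MLNs the paper introduces. In a standard MLN, the probability of a possible world is proportional to exp(sum of weight times number of true groundings) over the weighted formulas. The two-person domain, the fixed friendship evidence, the single formula "Friends(x, y) and Smokes(x) implies Smokes(y)", and the weight 1.5 are all illustrative assumptions, not taken from the paper.

```python
import math
from itertools import product

# Hypothetical toy domain and evidence (assumptions for illustration only).
PEOPLE = ("anna", "bob")
FRIENDS = {("anna", "bob"), ("bob", "anna")}
WEIGHT = 1.5  # hypothetical weight of the single formula


def true_groundings(smokes):
    """Count groundings of Friends(x, y) & Smokes(x) => Smokes(y) that hold."""
    count = 0
    for x, y in product(PEOPLE, repeat=2):
        premise = (x, y) in FRIENDS and smokes[x]
        if (not premise) or smokes[y]:
            count += 1
    return count


def world_distribution(weight=WEIGHT):
    """Return (world, probability) pairs with P(world) proportional to exp(w * n(world))."""
    worlds = [
        dict(zip(PEOPLE, values))
        for values in product((False, True), repeat=len(PEOPLE))
    ]
    scores = [math.exp(weight * true_groundings(world)) for world in worlds]
    normalizer = sum(scores)
    return [(world, score / normalizer) for world, score in zip(worlds, scores)]


for world, probability in world_distribution():
    print(world, round(probability, 4))
```

Worlds in which both friends share the same smoking status satisfy every grounding and therefore receive more probability mass than worlds in which only one of them smokes; enumerating all worlds like this is only feasible for toy domains.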
Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning
Rabusseau, Guillaume, Li, Tianyu, Precup, Doina
In this paper, we unravel a fundamental connection between weighted finite automata (WFAs) and second-order recurrent neural networks (2-RNNs): in the case of sequences of discrete symbols, WFAs and 2-RNNs with linear activation functions are expressively equivalent. Motivated by this result, we build upon a recent extension of the spectral learning algorithm to vector-valued WFAs and propose the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors. This algorithm relies on estimating low-rank sub-blocks of the so-called Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performance of the proposed method is assessed in a simulation study.
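The equivalence claimed above can be illustrated in one direction with a small sketch: a WFA computes alpha^T A_{s1} ... A_{sk} omega, and stacking its transition matrices into a third-order tensor gives a linear second-order RNN that produces the same value on one-hot encoded symbols. The toy automaton, the function names, and the exact update rule h_t[j] = sum over i, k of T[i][k][j] * h_{t-1}[i] * x_t[k] are assumptions made for illustration; the paper's own formulation may differ in detail.

```python
from itertools import product


def wfa_value(alpha, transitions, omega, word):
    """Compute alpha^T A_{s1} ... A_{sk} omega for a weighted finite automaton."""
    state = list(alpha)
    for symbol in word:
        matrix = transitions[symbol]
        state = [
            sum(state[i] * matrix[i][j] for i in range(len(state)))
            for j in range(len(matrix[0]))
        ]
    return sum(s * o for s, o in zip(state, omega))


def linear_2rnn_value(h0, tensor, omega, inputs):
    """Linear 2-RNN: h_t[j] = sum_{i,k} tensor[i][k][j] * h_{t-1}[i] * x_t[k]."""
    h = list(h0)
    for x in inputs:
        h = [
            sum(tensor[i][k][j] * h[i] * x[k]
                for i in range(len(h)) for k in range(len(x)))
            for j in range(len(h))
        ]
    return sum(hj * oj for hj, oj in zip(h, omega))


def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]


# Hypothetical 2-state WFA over the alphabet {0, 1}.
alpha = [1.0, 0.0]
omega = [0.5, 1.0]
transitions = [
    [[0.5, 0.5], [0.0, 1.0]],   # A_0
    [[0.25, 0.0], [0.5, 0.5]],  # A_1
]
# Stack the matrices so that tensor[i][k][j] == A_k[i][j].
tensor = [[[transitions[k][i][j] for j in range(2)] for k in range(2)]
          for i in range(2)]


def max_gap(max_len=4):
    """Largest |WFA - 2-RNN| difference over all words up to max_len."""
    gap = 0.0
    for length in range(max_len + 1):
        for word in product(range(2), repeat=length):
            inputs = [one_hot(s, 2) for s in word]
            diff = abs(wfa_value(alpha, transitions, omega, word)
                       - linear_2rnn_value(alpha, tensor, omega, inputs))
            gap = max(gap, diff)
    return gap


print(max_gap())
```

With a one-hot input, the sum over k selects exactly one transition matrix, which is why the two computations coincide. Feeding continuous vectors instead mixes the matrices linearly, which is the setting the abstract's learning algorithm targets.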
Scalable Structure Learning for Probabilistic Soft Logic
Embar, Varun, Sridhar, Dhanya, Farnadi, Golnoosh, Getoor, Lise
Statistical relational frameworks such as Markov logic networks and probabilistic soft logic (PSL) encode model structure with weighted first-order logical clauses. Learning these clauses from data is referred to as structure learning. Structure learning alleviates the manual cost of specifying models. However, this benefit comes with high computational costs; structure learning typically requires an expensive search over the space of clauses, which involves repeated optimization of clause weights. In this paper, we propose the first two approaches to structure learning for PSL. We introduce a greedy search-based algorithm and a novel optimization method that trade off scalability and approximations to the structure learning problem in different ways. The highly scalable optimization method combines data-driven generation of clauses with a piecewise pseudolikelihood (PPLL) objective that learns model structure by optimizing clause weights only once. We compare both methods across five real-world tasks, showing that PPLL achieves an order of magnitude runtime speedup and AUC gains of up to 15% over greedy search.
Structure Learning of Markov Random Fields through Grow-Shrink Maximum Pseudolikelihood Estimation
Takashina, Yuya, Nakatani, Shuyo, Inoue, Masato
Learning the structure of Markov random fields (MRFs) plays an important role in multivariate analysis. Its importance has been increasing with the recent rise of statistical relational models, since the MRF serves as a building block of models such as Markov logic networks. There are two fundamental ways to learn structures of MRFs: methods based on parameter learning and methods based on independence tests. Methods of the former kind assume, to varying degrees, a particular form of distribution, so they may perform poorly when that assumption is not satisfied. Methods of the latter kind can learn an MRF structure without a strong distributional assumption, but it is sometimes unclear what objective function they maximize or minimize. In this paper, we follow the latter approach, but we explicitly define the optimization problem of MRF structure learning as maximum pseudolikelihood estimation (MPLE) with respect to the edge set. As a result, the proposed solution successfully deals with the symmetricity in MRFs, whereas such symmetricity is not explicitly taken into account in most existing independence test techniques. In our experiments, the proposed method achieved higher accuracy than previous methods when asymmetric dependencies were present.
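To make the pseudolikelihood objective mentioned above concrete, here is a minimal sketch that scores two candidate edge sets by their log pseudolikelihood, i.e. the sum over samples and variables of log p(x_i | neighbors of i). It assumes an Ising-style pairwise model over values in {-1, +1} with hand-picked weights, which is purely an illustrative choice; it is not the paper's grow-shrink procedure, and the paper's estimator need not use this parametric form.

```python
import math


def log_pseudolikelihood(samples, edges, weights, biases):
    """Sum of log p(x_i | neighbors) for an Ising-style MRF with x_i in {-1, +1}."""
    neighbors = {i: [] for i in range(len(biases))}
    for (i, j), w in zip(edges, weights):
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))
    total = 0.0
    for x in samples:
        for i, b in enumerate(biases):
            field = b + sum(w * x[j] for j, w in neighbors[i])
            # p(x_i | rest) = sigmoid(2 * x_i * field), written in log form.
            total += -math.log1p(math.exp(-2.0 * x[i] * field))
    return total


# Hypothetical data: variables 0 and 1 always agree, variable 2 is unrelated.
samples = [(1, 1, 1), (1, 1, -1), (-1, -1, 1), (-1, -1, -1)]
biases = [0.0, 0.0, 0.0]

without_edge = log_pseudolikelihood(samples, [], [], biases)
with_edge = log_pseudolikelihood(samples, [(0, 1)], [1.0], biases)
print(without_edge, with_edge)
```

On this toy data the edge set containing (0, 1) scores higher than the empty edge set, because the edge lets each of the two coupled variables be predicted from the other. A structure learner built on this objective would search over edge sets, and would normally fit the weights rather than fix them by hand.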
Block-Value Symmetries in Probabilistic Graphical Models
Madan, Gagan, Anand, Ankit, Mausam, Singla, Parag
Several lifted inference algorithms for probabilistic graphical models first merge symmetric states into a single cluster (orbit) and then use these orbits for downstream inference, via variations of orbital MCMC [Niepert, 2012]. These orbits are represented compactly using permutations over variables and over variable-value (VV) pairs, but such permutations can miss several state symmetries in a domain. We define the notion of permutations over block-value (BV) pairs, where a block is a set of variables. BV symmetries strictly generalize VV symmetries, and many more symmetries can be computed as block sizes increase. To operationalize the use of BV permutations in lifted inference, we describe 1) an algorithm to compute BV permutations given a block partition of the variables, 2) BV-MCMC, an extension of orbital MCMC that can sample from BV orbits, and 3) a heuristic to suggest good block partitions. Our experiments show that BV-MCMC can mix much faster than vanilla MCMC and orbital MCMC over VV permutations.