Markov Models
An Efficient, Expressive and Local Minima-Free Method for Learning Controlled Dynamical Systems
Hefny, Ahmed (Carnegie Mellon University) | Downey, Carlton (Carnegie Mellon University) | Gordon, Geoffrey (Carnegie Mellon University)
We propose a framework for modeling and estimating the state of controlled dynamical systems, where an agent can affect the system through actions and receives partial observations. Based on this framework, we propose Predictive State Representation with Random Fourier Features (RFF-PSR). A key property of RFF-PSRs is that the state estimate is represented by a conditional distribution of future observations given future actions. RFF-PSRs combine this representation with moment-matching, kernel embedding, and local optimization to achieve a method that enjoys several favorable qualities: it can represent controlled environments that are affected by actions, it has an efficient and theoretically justified learning algorithm, and it uses a non-parametric representation with the expressive power to represent continuous non-linear dynamics. We provide a detailed formulation, a theoretical analysis, and an experimental evaluation that demonstrates the effectiveness of our method.
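As a rough illustration of the random Fourier feature ingredient only (not the RFF-PSR learning algorithm itself), the sketch below approximates a Gaussian kernel with random cosine features. The function names, the bandwidth, and the feature count are assumptions chosen for illustration and do not come from the paper.

```python
import math
import random

def make_rff(dim, n_features, bandwidth, seed=0):
    """Return a map z with dot(z(x), z(y)) ~= exp(-||x - y||^2 / (2 * bandwidth^2))."""
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 1 / bandwidth^2); phases are uniform on [0, 2*pi).
    freqs = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
             for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return feature_map

def gaussian_kernel(x, y, bandwidth):
    """Exact Gaussian (RBF) kernel, used here only as a reference value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Illustrative inputs (assumed, not from the paper).
feature_map = make_rff(dim=2, n_features=5000, bandwidth=1.0)
x, y = [0.2, -0.1], [0.5, 0.3]
approx = dot(feature_map(x), feature_map(y))
exact = gaussian_kernel(x, y, bandwidth=1.0)
```

The inner product of the two feature vectors should land close to the exact kernel value, and the gap shrinks as `n_features` grows. This is what lets kernel-based quantities be handled with finite-dimensional linear algebra.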
Automatic Parameter Tying: A New Approach for Regularized Parameter Learning in Markov Networks
Chou, Li (The University of Texas at Dallas) | Sahoo, Pracheta (The University of Texas at Dallas) | Sarkhel, Somdeb (Adobe Research) | Ruozzi, Nicholas (The University of Texas at Dallas) | Gogate, Vibhav (The University of Texas at Dallas)
Parameter tying is a regularization method in which parameters (weights) of a machine learning model are partitioned into groups by leveraging prior knowledge and all parameters in each group are constrained to take the same value. In this paper, we consider the problem of parameter learning in Markov networks and propose a novel approach called automatic parameter tying (APT) that uses automatic instead of a priori and soft instead of hard parameter tying as a regularization method to alleviate overfitting. The key idea behind APT is to set up the learning problem as the task of finding parameters and groupings of parameters such that the likelihood plus a regularization term is maximized. The regularization term penalizes models where parameter values deviate from their group mean parameter value. We propose and use a block coordinate ascent algorithm to solve the optimization task. We analyze the sample complexity of our new learning algorithm and show that it yields optimal parameters with high probability when the groups are well separated. Experimentally, we show that our method improves upon L2 regularization and suggest several pragmatic techniques for good practical performance.
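The grouping side of this idea can be sketched as follows, assuming a squared-deviation penalty and a nearest-mean assignment rule (both are illustrative choices, not details confirmed by the abstract). The likelihood term and the parameter update are omitted, so this shows only how groups and group means might be alternately re-estimated for fixed parameter values. All function names are hypothetical.

```python
def tying_penalty(params, assignment, means):
    """Sum of squared deviations of each parameter from its group mean."""
    return sum((p - means[g]) ** 2 for p, g in zip(params, assignment))

def regroup(params, means):
    """Assign each parameter to the group whose mean is closest."""
    return [min(range(len(means)), key=lambda g: (p - means[g]) ** 2)
            for p in params]

def update_means(params, assignment, means):
    """Recompute each group mean; an empty group keeps its previous mean."""
    new_means = []
    for g, old in enumerate(means):
        members = [p for p, a in zip(params, assignment) if a == g]
        new_means.append(sum(members) / len(members) if members else old)
    return new_means

# Illustrative parameter values and initial group means (assumed).
params = [0.9, 1.1, 1.0, -2.1, -1.9]
means = [0.0, -1.0]
for _ in range(10):
    assignment = regroup(params, means)
    means = update_means(params, assignment, means)
penalty = tying_penalty(params, assignment, means)
```

In a full learner, a step like this would alternate with a step that updates the parameters to trade off likelihood against the penalty.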
Unsupervised Representation Learning With Long-Term Dynamics for Skeleton Based Action Recognition
Zheng, Nenggan (Zhejiang University) | Wen, Jun (Zhejiang University) | Liu, Risheng (Dalian University of Technology) | Long, Liangqu (Zhejiang University) | Dai, Jianhua (Hunan Normal University) | Gong, Zhefeng (Zhejiang University)
As an important branch of computer vision, action recognition has been widely used in many applications, such as intelligent video surveillance, robot vision, human-computer interaction, game control and so on (Weinland, Ronfard, and Boyer 2011; Yang and Tian 2017). Traditional studies about action recognition mainly focus on videos recorded by 2D cameras. The performances are still unsatisfactory, because it is difficult to achieve viewpoint and scale invariances as 2D videos lose some information of 3D space. Recently, a stream of unsupervised representation learning approaches have been proposed. These methods are formulated with various objectives. Some models enforce the representations to be temporally smooth and learn slowly varying representations (Földiák 2008), while others learn representations through reconstructing past frames or predicting future frames (Srivastava, Mansimov, and Salakhutdinov 2015; Luo et al. 2017). These models receive fixed-length input sequences, and then reconstruct past or predict ...
An Adversarial Hierarchical Hidden Markov Model for Human Pose Modeling and Generation
Zhao, Rui (Rensselaer Polytechnic Institute) | Ji, Qiang (Rensselaer Polytechnic Institute)
We propose a hierarchical extension to the hidden Markov model (HMM) under the Bayesian framework to overcome its limited model capacity. The model parameters are treated as random variables whose distributions are governed by hyperparameters. Therefore, the variation in data can be modeled at both the instance level and the distribution level. We derive a novel learning method for estimating the parameters and hyperparameters of our model based on the adversarial learning framework, which has shown promising results in generating photorealistic images and videos. We demonstrate the benefit of the proposed method on human motion capture data through comparison with both state-of-the-art methods and the same model learned by maximizing likelihood. The first experiment, on reconstruction, shows the model's capability of generalizing to novel testing data. The second experiment, on synthesis, shows the model's capability of generating realistic and diverse data.
Sequence-to-Point Learning With Neural Networks for Non-Intrusive Load Monitoring
Zhang, Chaoyun (University of Edinburgh) | Zhong, Mingjun (University of Lincoln) | Wang, Zongzuo (University of Edinburgh) | Goddard, Nigel (University of Edinburgh) | Sutton, Charles (University of Edinburgh)
Energy disaggregation (a.k.a. non-intrusive load monitoring, NILM), a single-channel blind source separation problem, aims to decompose the mains reading, which records whole-house electricity consumption, into appliance-wise readings. This problem is difficult because it is inherently unidentifiable. Recent approaches have shown that the identifiability problem can be reduced by introducing domain knowledge into the model. Deep neural networks have been shown to be a promising approach for these problems, but sliding windows are necessary to handle the long sequences that arise in signal processing problems, which raises the issue of how to combine predictions from different sliding windows. In this paper, we propose sequence-to-point learning, where the input is a window of the mains and the output is a single point of the target appliance. We use convolutional neural networks to train the model. Interestingly, we systematically show that the convolutional neural networks can inherently learn the signatures of the target appliances, which are automatically added into the model to reduce the identifiability problem. We applied the proposed neural network approaches to real-world household energy data, and show that the methods achieve state-of-the-art performance, improving two standard error measures by 84% and 92%.
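A minimal sketch of how sequence-to-point training pairs might be built is given below. It assumes the single target point is the appliance reading at the midpoint of the mains window, which the abstract does not specify. The function name and the toy readings are hypothetical, and the neural network itself is not shown.

```python
def seq2point_pairs(mains, appliance, window):
    """Pair each length-`window` slice of mains with one appliance reading.

    Assumption: the target is the appliance reading at the window's midpoint.
    """
    if len(mains) != len(appliance):
        raise ValueError("mains and appliance must have the same length")
    if window % 2 == 0:
        raise ValueError("use an odd window so the midpoint is well defined")
    half = window // 2
    pairs = []
    for start in range(len(mains) - window + 1):
        pairs.append((mains[start:start + window], appliance[start + half]))
    return pairs

# Toy readings in watts (made up for illustration).
mains = [100, 120, 900, 950, 130, 110, 105]
kettle = [0, 0, 800, 820, 0, 0, 0]
pairs = seq2point_pairs(mains, kettle, window=3)
```

Because each window yields exactly one prediction for one time step, there are no overlapping output sequences to reconcile.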
Learning Datum-Wise Sampling Frequency for Energy-Efficient Human Activity Recognition
Cheng, Weihao (The University of Melbourne) | Erfani, Sarah (The University of Melbourne) | Zhang, Rui (The University of Melbourne) | Kotagiri, Ramamohanarao (The University of Melbourne)
Continuous Human Activity Recognition (HAR) is an important application of smart mobile/wearable systems for providing dynamic assistance to users. However, real-time HAR requires continuous sampling of data using built-in sensors (e.g., accelerometer), which significantly increases the energy cost and shortens the operating span. Reducing the sampling rate can save energy but lowers recognition accuracy. Therefore, choosing an adaptive sampling frequency that balances accuracy and energy efficiency becomes a critical problem in HAR. In this paper, we formalize the problem as minimizing both classification error and energy cost by dynamically choosing appropriate sampling rates. We propose Datum-Wise Frequency Selection (DWFS) to solve the problem via a continuous-state Markov Decision Process (MDP). A policy function is learned from the MDP, which selects the best frequency for sampling an incoming data entity by exploiting a datum-related state of the system. We propose a method for alternately learning the parameters of an activity classification model and the MDP, which improves both the accuracy and the energy efficiency. We evaluate DWFS on three real-world HAR datasets, and the results show that DWFS statistically outperforms the state of the art on a combined measure of accuracy and energy efficiency.
Splitting an LPMLN Program
Wang, Bin (Southeast University) | Zhang, Zhizheng (Southeast University) | Xu, Hongxiang (Southeast University) | Shen, Jun (Southeast University)
The technique called splitting sets has proven useful in simplifying the investigation of Answer Set Programming (ASP). In this paper, we investigate the splitting set theorem for LPMLN, a new extension of ASP created by combining the ideas of ASP and Markov Logic Networks (MLN). First, we extend the notion of splitting sets to LPMLN programs and present the splitting set theorem for LPMLN. Then, we illustrate the use of the theorem for simplifying several LPMLN inference tasks. After that, we give two parallel approaches for solving LPMLN programs using the theorem. The preliminary experimental results show that these approaches are alternative ways to improve an LPMLN solver.
LTLf/LDLf Non-Markovian Rewards
Brafman, Ronen I. (Ben-Gurion University) | Giacomo, Giuseppe De (Sapienza University of Rome) | Patrizi, Fabio (Sapienza University of Rome)
In Markov Decision Processes (MDPs), the reward obtained in a state is Markovian, i.e., it depends only on the last state and action. This dependency makes it difficult to reward more interesting long-term behaviors, such as always closing a door after it has been opened, or providing coffee only following a request. Extending MDPs to handle non-Markovian reward functions was the subject of two previous lines of work. Both use LTL variants to specify the reward function and then compile the new model back into a Markovian model. Building on recent progress in temporal logics over finite traces, we adopt LDLf for specifying non-Markovian rewards and provide an elegant automata construction for building a Markovian model, which extends that of previous work and offers strong minimality and compositionality guarantees.
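The "coffee only following a request" example can be made concrete with a small hand-written automaton, sketched below. This is an assumption-laden illustration: the automaton is written by hand rather than compiled from an LDLf formula, and the event names, state names, and reward values are hypothetical. The point is only that once the automaton state is tracked alongside the environment state, the reward depends on the current (extended) state and event alone.

```python
# Hypothetical two-state automaton: is there a coffee request not yet served?
IDLE, REQUESTED = "idle", "requested"

def step(automaton_state, event):
    """Return (next_automaton_state, reward) for one observed event."""
    if event == "request":
        return REQUESTED, 0.0
    if event == "serve_coffee":
        # Serving is rewarded only if a request is pending.
        reward = 1.0 if automaton_state == REQUESTED else 0.0
        return IDLE, reward
    # Any other event leaves the automaton state unchanged and gives no reward.
    return automaton_state, 0.0

def total_reward(events):
    """Run the automaton over a finite trace and sum the rewards."""
    state, total = IDLE, 0.0
    for event in events:
        state, reward = step(state, event)
        total += reward
    return total

with_request = total_reward(["request", "move", "serve_coffee"])
without_request = total_reward(["move", "serve_coffee"])
```

The same final event earns a reward in the first trace and none in the second, which a reward defined on the last state and action alone could not express.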
A Low-Cost Ethics Shaping Approach for Designing Reinforcement Learning Agents
Wu, Yueh-Hua (National Taiwan University) | Lin, Shou-De (National Taiwan University)
This paper proposes a low-cost, easily realizable strategy to equip a reinforcement learning (RL) agent with the capability of behaving ethically. Our model allows the designers of RL agents to focus solely on the task to achieve, without having to worry about the implementation of multiple trivial ethical patterns to follow. Based on the assumption that the majority of human behavior, regardless of which goals it is achieving, is ethical, our design integrates human policy with the RL policy to achieve the target objective with less chance of violating the ethical code that human beings normally obey.
Understanding Social Interpersonal Interaction via Synchronization Templates of Facial Events
Li, Rui (Rochester Institute of Technology) | Curhan, Jared (Massachusetts Institute of Technology) | Hoque, Mohammed Ehsan (University of Rochester)
Automatic facial expression analysis in interpersonal communication is challenging, not only because conversation partners' facial expressions mutually influence each other, but also because no correct interpretation of facial expressions is possible without taking social context into account. In this paper, we propose a probabilistic framework to model interactional synchronization between conversation partners based on their facial expressions. Interactional synchronization manifests the temporal dynamics of conversation partners' mutual influence. In particular, the model allows us to discover a set of common and unique facial synchronization templates directly from natural interpersonal interaction without recourse to any predefined labeling schemes. The facial synchronization templates represent periodical facial event coordinations shared by multiple conversation pairs in a specific social context. We test our model on two different types of dyadic conversation: negotiation and job interview. Based on the discovered facial event coordination, we are able to predict the conversation outcomes with higher accuracy than HMMs and GMMs.