Undirected Networks
Memory Fusion Network for Multi-view Sequential Learning
Zadeh, Amir (Carnegie Mellon University) | Liang, Paul Pu (Carnegie Mellon University) | Mazumder, Navonil (Instituto Politécnico Nacional) | Poria, Soujanya (Nanyang Technological University) | Cambria, Erik (Nanyang Technological University) | Morency, Louis-Philippe (Carnegie Mellon University)
Multi-view sequential learning is a fundamental problem in machine learning dealing with multi-view sequences. In a multi-view sequence, there exist two forms of interactions between different views: view-specific interactions and cross-view interactions. In this paper, we present a new neural architecture for multi-view sequential learning called the Memory Fusion Network (MFN) that explicitly accounts for both interactions in a neural architecture and continuously models them through time. The first component of the MFN is called the System of LSTMs, where view-specific interactions are learned in isolation by assigning an LSTM function to each view. The cross-view interactions are then identified using a special attention mechanism called the Delta-memory Attention Network (DMAN) and summarized through time with a Multi-view Gated Memory. Through extensive experimentation, MFN is compared to various proposed approaches for multi-view sequential learning on multiple publicly available benchmark datasets. MFN outperforms all the multi-view approaches. Furthermore, MFN outperforms all current state-of-the-art models, setting new state-of-the-art results for all three multi-view datasets.
Privacy-Preserving Policy Iteration for Decentralized POMDPs
Wu, Feng (University of Science and Technology of China) | Zilberstein, Shlomo (University of Massachusetts Amherst) | Chen, Xiaoping (University of Science and Technology of China)
We propose the first privacy-preserving approach to address the privacy issues that arise in multi-agent planning problems modeled as a Dec-POMDP. Our solution is a distributed message-passing algorithm based on trials, where the agents' policies are optimized using the cross-entropy method. In our algorithm, the agents' private information is protected using a public-key homomorphic cryptosystem. We prove the correctness of our algorithm and analyze its complexity in terms of message passing and encryption/decryption operations. Furthermore, we analyze several privacy aspects of our algorithm and show that it can preserve the agent privacy of non-neighbors, model privacy, and decision privacy. Our experimental results on several common Dec-POMDP benchmark problems confirm the effectiveness of our approach.
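The cross-entropy method mentioned in the abstract above can be illustrated in its generic, single-process form. The following is a minimal sketch only: it omits the message passing and homomorphic encryption entirely, and the function name, the Gaussian sampling distribution, and all default parameters are assumptions made for illustration rather than details taken from the paper.

```python
import random
import statistics


def cross_entropy_maximize(score, dim, iterations=50, population=100,
                           elite_frac=0.2, init_std=5.0, seed=0):
    """Generic cross-entropy method: maximize score(x) over real vectors.

    Hypothetical helper for illustration; not the authors' algorithm.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    n_elite = max(2, int(population * elite_frac))
    for _ in range(iterations):
        # Sample candidate solutions from the current Gaussian.
        samples = [[rng.gauss(mean[i], std[i]) for i in range(dim)]
                   for _ in range(population)]
        # Keep the best-scoring fraction (the "elite" set).
        samples.sort(key=score, reverse=True)
        elite = samples[:n_elite]
        # Refit the sampling distribution to the elite set.
        for i in range(dim):
            column = [x[i] for x in elite]
            mean[i] = statistics.mean(column)
            std[i] = statistics.pstdev(column) + 1e-6  # avoid zero spread
    return mean


if __name__ == "__main__":
    # Toy objective with its maximum at (3, -1).
    best = cross_entropy_maximize(
        lambda x: -((x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2), dim=2)
    print(best)
```

In a policy-search setting, the vector x would parameterize a policy and score would be an estimate of its value obtained from trials.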
POMDP-Based Decision Making for Fast Event Handling in VANETs
Chen, Shuo (Nanyang Technological University) | Irissappane, Athirai A. (University of Washington) | Zhang, Jie (Nanyang Technological University)
Malicious vehicle agents broadcast fake information about traffic events and thereby undermine the benefits of vehicle-to-vehicle communication in vehicular ad-hoc networks (VANETs). Trust management schemes addressing this issue do not focus on effective and fast decision making in reacting to traffic events. We propose a Partially Observable Markov Decision Process (POMDP) based approach to balance the trade-off between information-gathering and exploiting actions, resulting in faster responses. Our model copes with malicious behavior by maintaining it as part of a small state space, and is thus scalable to large VANETs. We also propose an algorithm to learn model parameters in a dynamic behavior setting. Experimental results demonstrate that our model can effectively balance decision quality and response time while still being robust to sophisticated malicious attacks.
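POMDP-based decision making of the kind described above rests on maintaining a belief over hidden states and updating it after each observation. The sketch below shows the standard discrete Bayes belief update for one fixed action; the function name and the list-based encoding are assumptions for illustration, and the abstract does not specify the paper's actual state space or observation model.

```python
def belief_update(belief, transition, obs_likelihood):
    """Standard discrete POMDP belief update for one action and observation.

    belief[s]           : current probability of state s
    transition[s][s2]   : P(s2 | s, action)
    obs_likelihood[s2]  : P(observation | s2, action)
    """
    n = len(belief)
    # Prediction step: push the belief through the transition model.
    predicted = [sum(belief[s] * transition[s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correction step: weight by the observation likelihood, then normalize.
    unnormalized = [obs_likelihood[s2] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


if __name__ == "__main__":
    # Two hypothetical states with a static transition model.
    print(belief_update([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]], [0.8, 0.2]))
```

An information-gathering action sharpens this belief at the cost of time, which is the trade-off the abstract refers to.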
A Poisson Gamma Probabilistic Model for Latent Node-Group Memberships in Dynamic Networks
Yang, Sikun (Technische Universität Darmstadt) | Koeppl, Heinz (Technische Universität Darmstadt)
We present a probabilistic model for learning from dynamic relational data, wherein the observed interactions among networked nodes are modeled via the Bernoulli-Poisson link function, and the underlying network structure is characterized by nonnegative latent node-group memberships, which are assumed to be gamma distributed. The latent memberships evolve according to Markov processes. The optimal number of latent groups can be determined from the data itself. The computational complexity of our method scales with the number of non-zero links, which makes it scalable to large sparse dynamic relational data. We present batch and online Gibbs sampling algorithms to perform model inference. Finally, we demonstrate the model's performance on both synthetic and real-world datasets compared to state-of-the-art methods.
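Under a Bernoulli-Poisson link, a binary link is observed when a latent Poisson count is at least one, so the link probability is 1 - exp(-rate). The sketch below assumes, purely for illustration, that the rate is the inner product of the two nodes' nonnegative membership vectors; the abstract does not state the paper's exact rate parameterization, and the function name is hypothetical.

```python
import math


def link_probability(memberships_i, memberships_j):
    """P(link = 1) under a Bernoulli-Poisson link.

    Assumes (hypothetically) that the Poisson rate is the inner product
    of two nonnegative membership vectors.
    """
    rate = sum(a * b for a, b in zip(memberships_i, memberships_j))
    # 1 - exp(-rate), computed stably for small rates.
    return -math.expm1(-rate)


if __name__ == "__main__":
    print(link_probability([0.5, 1.0], [0.2, 0.3]))
```

Because a zero rate gives probability zero, absent links between nodes with no shared groups contribute nothing to inference, which is consistent with the cost scaling in the number of non-zero links.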
Reinforcement Learning in POMDPs With Memoryless Options and Option-Observation Initiation Sets
Steckelmacher, Denis (Vrije Universiteit Brussels) | Roijers, Diederik M. (Vrije Universiteit Brussels) | Harutyunyan, Anna (Vrije Universiteit Brussels) | Vrancx, Peter (PROWLER.io) | Plisnier, Hélène (Vrije Universiteit Brussels) | Nowé, Ann (Vrije Universiteit Brussels)
Many real-world reinforcement learning problems have a hierarchical nature, and often exhibit some degree of partial observability. While hierarchy and partial observability are usually tackled separately (for instance by combining recurrent neural networks and options), we show that addressing both problems simultaneously is simpler and more efficient in many cases. More specifically, we make the initiation set of options conditional on the previously-executed option, and show that options with such Option-Observation Initiation Sets (OOIs) are at least as expressive as Finite State Controllers (FSCs), a state-of-the-art approach for learning in POMDPs. OOIs are easy to design based on an intuitive description of the task, lead to explainable policies and keep the top-level and option policies memoryless. Our experiments show that OOIs allow agents to learn optimal policies in challenging POMDPs, while being much more sample-efficient than a recurrent neural network over options.
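The core idea above, conditioning which options may start on the option that just finished, can be sketched with a plain lookup table. The option names and the dictionary encoding below are hypothetical and chosen only to illustrate the mechanism; the paper's actual formulation may differ.

```python
def available_options(initiation_sets, previous_option):
    """Return the options allowed to start after previous_option.

    initiation_sets maps each option name to the set of previously
    executed options after which it may be initiated (None = episode start).
    """
    return sorted(name for name, allowed_previous in initiation_sets.items()
                  if previous_option in allowed_previous)


if __name__ == "__main__":
    # Hypothetical task: alternate between two goals, starting with A.
    sets = {
        "go_to_a": {None, "go_to_b"},
        "go_to_b": {"go_to_a"},
    }
    print(available_options(sets, None))
    print(available_options(sets, "go_to_a"))
```

The previously executed option acts as a small piece of memory, so the top-level policy can remain memoryless while still choosing among only the options that make sense at that point in the task.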
Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition
Li, Chaolong (Southeast University) | Cui, Zhen (Nanjing University of Science and Technology) | Zheng, Wenming (Southeast University) | Xu, Chunyan (Nanjing University of Science and Technology) | Yang, Jian (Nanjing University of Science and Technology)
Variations of human body skeletons may be considered as dynamic graphs, which are a generic data representation for numerous real-world applications. In this paper, we propose a spatio-temporal graph convolution (STGC) approach that combines the strengths of local convolutional filtering with the sequence learning ability of autoregressive moving average models. To encode dynamic graphs, the constructed multi-scale local graph convolution filters, consisting of matrices of local receptive fields and signal mappings, are recursively applied to structured graph data in the temporal and spatial domains. The proposed model is generic and principled, as it can be generalized to other dynamic models. We theoretically prove the stability of STGC and provide an upper bound on the signal transformation to be learnt. Further, the proposed recursive model can be stacked into a multi-layer architecture. To evaluate our model, we conduct extensive experiments on four benchmark skeleton-based action datasets, including the large-scale challenging NTU RGB+D. The experimental results demonstrate the effectiveness of our proposed model and its improvement over the state of the art.
An Efficient, Expressive and Local Minima-Free Method for Learning Controlled Dynamical Systems
Hefny, Ahmed (Carnegie Mellon University) | Downey, Carlton (Carnegie Mellon University) | Gordon, Geoffrey (Carnegie Mellon University)
We propose a framework for modeling and estimating the state of controlled dynamical systems, where an agent can affect the system through actions and receives partial observations. Based on this framework, we propose Predictive State Representation with Random Fourier Features (RFF-PSR). A key property of RFF-PSRs is that the state estimate is represented by a conditional distribution of future observations given future actions. RFF-PSRs combine this representation with moment-matching, kernel embedding, and local optimization to achieve a method that enjoys several favorable qualities: it can represent controlled environments that can be affected by actions, it has an efficient and theoretically justified learning algorithm, and it uses a non-parametric representation that has the expressive power to represent continuous non-linear dynamics. We provide a detailed formulation, a theoretical analysis and an experimental evaluation that demonstrates the effectiveness of our method.
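Random Fourier features, the building block named above, replace a kernel evaluation with an inner product of finite random feature vectors. The sketch below shows the standard construction for a Gaussian (RBF) kernel only; it is not the RFF-PSR learning algorithm, and the function name, bandwidth parameter, and feature count are assumptions for illustration.

```python
import math
import random


def make_rff(dim, n_features, bandwidth=1.0, seed=0):
    """Build a random Fourier feature map for the Gaussian kernel
    k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).
    """
    rng = random.Random(seed)
    # Frequencies drawn from the kernel's spectral density, N(0, 1/bandwidth^2).
    weights = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
               for _ in range(n_features)]
    # Random phases drawn uniformly from [0, 2*pi).
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return features


if __name__ == "__main__":
    phi = make_rff(dim=2, n_features=5000)
    x, y = [0.3, -0.2], [-0.1, 0.4]
    approx = sum(a * b for a, b in zip(phi(x), phi(y)))
    exact = math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / 2.0)
    print(approx, exact)
```

The approximation error shrinks as the number of features grows, which is what lets kernel-based representations be handled with ordinary finite-dimensional linear algebra.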
Automatic Parameter Tying: A New Approach for Regularized Parameter Learning in Markov Networks
Chou, Li (The University of Texas at Dallas) | Sahoo, Pracheta (The University of Texas at Dallas) | Sarkhel, Somdeb (Adobe Research) | Ruozzi, Nicholas (The University of Texas at Dallas) | Gogate, Vibhav (The University of Texas at Dallas)
Parameter tying is a regularization method in which parameters (weights) of a machine learning model are partitioned into groups by leveraging prior knowledge and all parameters in each group are constrained to take the same value. In this paper, we consider the problem of parameter learning in Markov networks and propose a novel approach called automatic parameter tying (APT) that uses automatic instead of a priori and soft instead of hard parameter tying as a regularization method to alleviate overfitting. The key idea behind APT is to set up the learning problem as the task of finding parameters and groupings of parameters such that the likelihood plus a regularization term is maximized. The regularization term penalizes models where parameter values deviate from their group mean parameter value. We propose and use a block coordinate ascent algorithm to solve the optimization task. We analyze the sample complexity of our new learning algorithm and show that it yields optimal parameters with high probability when the groups are well separated. Experimentally, we show that our method improves upon L2 regularization and suggest several pragmatic techniques for good practical performance.
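The soft-tying regularizer described above penalizes each parameter's deviation from the mean of its group. The sketch below assumes a squared-deviation penalty and a nearest-mean regrouping step; both are illustrative assumptions, as are all function names, since the abstract does not give the exact penalty form or the block coordinate ascent updates.

```python
def group_means(params, groups, n_groups):
    """Mean parameter value per group (0.0 for an empty group)."""
    sums = [0.0] * n_groups
    counts = [0] * n_groups
    for value, group in zip(params, groups):
        sums[group] += value
        counts[group] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def tying_penalty(params, groups, n_groups, strength=1.0):
    """Soft-tying penalty: squared deviation from the group mean (assumed form)."""
    means = group_means(params, groups, n_groups)
    return strength * sum((value - means[group]) ** 2
                          for value, group in zip(params, groups))


def reassign_groups(params, means):
    """Grouping step: move each parameter to the group with the closest mean."""
    return [min(range(len(means)), key=lambda g: (value - means[g]) ** 2)
            for value in params]


if __name__ == "__main__":
    params = [1.0, 1.2, 5.0, 5.2]
    groups = [0, 0, 1, 1]
    print(tying_penalty(params, groups, n_groups=2))
    print(reassign_groups(params, group_means(params, groups, 2)))
```

Alternating between updating parameters with the grouping fixed and regrouping with the parameters fixed is one natural way to realize a block coordinate scheme over these two blocks of variables.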
Unsupervised Representation Learning With Long-Term Dynamics for Skeleton Based Action Recognition
Zheng, Nenggan (Zhejiang University) | Wen, Jun (Zhejiang University) | Liu, Risheng (Dalian University of Technology) | Long, Liangqu (Zhejiang University) | Dai, Jianhua (Hunan Normal University) | Gong, Zhefeng (Zhejiang University)
As an important branch of computer vision, action recognition has been widely used in many applications, such as intelligent video surveillance, robot vision, human-computer interaction, game control and so on (Weinland, Ronfard, and Boyer 2011; Yang and Tian 2017). Traditional studies about action recognition mainly focus on videos recorded by 2D cameras. The performances are still unsatisfactory, because it is difficult to achieve viewpoint and scale invariances as 2D videos lose some information of 3D space. Recently, a stream of unsupervised representation learning approaches have been proposed. These methods are formulated with various objectives. Some models enforce the representations to be temporally smooth and learn slowly varying representations (Földiák 2008), while others learn representations through reconstructing past frames or predicting future frames (Srivastava, Mansimov, and Salakhudinov 2015; Luo et al. 2017). These models receive fixed-length input sequences, and then reconstruct past or predict
An Adversarial Hierarchical Hidden Markov Model for Human Pose Modeling and Generation
Zhao, Rui (Rensselaer Polytechnic Institute) | Ji, Qiang (Rensselaer Polytechnic Institute)
We propose a hierarchical extension to the hidden Markov model (HMM) under the Bayesian framework to overcome its limited model capacity. The model parameters are treated as random variables whose distributions are governed by hyperparameters. Therefore, the variation in data can be modeled at both the instance level and the distribution level. We derive a novel learning method for estimating the parameters and hyperparameters of our model based on the adversarial learning framework, which has shown promising results in generating photorealistic images and videos. We demonstrate the benefit of the proposed method on human motion capture data through comparison with both state-of-the-art methods and the same model learned by maximizing likelihood. The first experiment on reconstruction shows the model's capability of generalizing to novel testing data. The second experiment on synthesis shows the model's capability of generating realistic and diverse data.