Koeppl, Heinz
Cluster Variational Approximations for Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data
Linzner, Dominik, Koeppl, Heinz
Continuous-time Bayesian networks (CTBNs) constitute a general and powerful framework for modeling continuous-time stochastic processes on networks. This makes them particularly attractive for learning the directed structures among interacting entities. However, if the available data is incomplete, one needs to simulate the prohibitively complex CTBN dynamics. Existing approximation techniques, such as sampling and low-order variational methods, either scale unfavorably in system size, or are unsatisfactory in terms of accuracy. Inspired by recent advances in statistical physics, we present a new approximation scheme based on cluster-variational methods significantly improving upon existing variational approximations. We can analytically marginalize the parameters of the approximate CTBN, as these are of secondary importance for structure learning. This recovers a scalable scheme for direct structure learning from incomplete and noisy time-series data. Our approach outperforms existing methods in terms of scalability.
Collapsed Variational Inference for Nonparametric Bayesian Group Factor Analysis
Yang, Sikun, Koeppl, Heinz
Group factor analysis (GFA) methods have been widely used to infer the common structure and the group-specific signals from multiple related datasets in various fields including systems biology and neuroimaging. To date, most available GFA models require Gibbs sampling or slice sampling to perform inference, which prevents the practical application of GFA to large-scale data. In this paper we present an efficient collapsed variational inference (CVI) algorithm for the nonparametric Bayesian group factor analysis (NGFA) model built upon a hierarchical beta-Bernoulli process. Our CVI algorithm proceeds by marginalizing out the group-specific beta process parameters, and then approximating the true posterior in the collapsed space using mean-field methods. Experimental results on both synthetic and real-world data demonstrate the effectiveness of our CVI algorithm for the NGFA compared with state-of-the-art GFA methods.
Inverse Reinforcement Learning via Nonparametric Subgoal Modeling
Šošić, Adrian (Technische Universität Darmstadt) | Zoubir, Abdelhak M. (Technische Universität Darmstadt) | Koeppl, Heinz (Technische Universität Darmstadt)
Recent advances in the field of inverse reinforcement learning (IRL) have yielded sophisticated frameworks which relax the original modeling assumption that the behavior of an observed agent reflects only a single intention. Instead, the demonstration data is separated into parts to account for the fact that different trajectories may correspond to different intentions, e.g., because they were generated by different domain experts. In this work, we go one step further: using the intuitive concept of subgoals, we build upon the premise that even a single trajectory can be explained more efficiently locally within a certain context than globally, enabling a more compact representation of the observed behavior. Based on this assumption, we build an implicit intentional model of the agent's goals to forecast its behavior in unobserved situations. The result is an integrated Bayesian prediction framework which provides spatially smooth policy estimates that are consistent with the expert's plan and significantly outperform existing IRL solutions. In addition, the framework can be naturally extended to handle scenarios with time-varying expert intentions.
Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling
Šošić, Adrian, Rueckert, Elmar, Peters, Jan, Zoubir, Abdelhak M., Koeppl, Heinz
Recent advances in the field of inverse reinforcement learning (IRL) have yielded sophisticated frameworks which relax the original modeling assumption that the behavior of an observed agent reflects only a single intention. Instead, the demonstration data is typically divided into parts, to account for the fact that different trajectories may correspond to different intentions, e.g., because they were generated by different domain experts. In this work, we go one step further: using the intuitive concept of subgoals, we build upon the premise that even a single trajectory can be explained more efficiently locally within a certain context than globally, enabling a more compact representation of the observed behavior. Based on this assumption, we build an implicit intentional model of the agent's goals to forecast its behavior in unobserved situations. The result is an integrated Bayesian prediction framework which provides smooth policy estimates that are consistent with the expert's plan and significantly outperform existing IRL solutions. Most notably, our framework naturally handles situations where the intentions of the agent change with time and classical IRL algorithms fail. In addition, due to its probabilistic nature, the model can be straightforwardly applied in an active learning setting to guide the demonstration process of the expert.
A Poisson Gamma Probabilistic Model for Latent Node-Group Memberships in Dynamic Networks
Yang, Sikun (Technische Universität Darmstadt) | Koeppl, Heinz (Technische Universität Darmstadt)
We present a probabilistic model for learning from dynamic relational data, wherein the observed interactions among networked nodes are modeled via the Bernoulli-Poisson link function, and the underlying network structure is characterized by nonnegative latent node-group memberships, which are assumed to be gamma distributed. The latent memberships evolve according to Markov processes. The optimal number of latent groups can be determined by the data itself. The computational complexity of our method scales with the number of non-zero links, which makes it scalable to large sparse dynamic relational data. We present batch and online Gibbs sampling algorithms to perform model inference. Finally, we demonstrate the model's performance on both synthetic and real-world datasets compared to state-of-the-art methods.
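As a rough illustration of the link function named in this abstract, the sketch below computes an edge probability from gamma-distributed memberships. It assumes the standard form of the Bernoulli-Poisson link, where an edge is present when a latent Poisson count is at least one, so the edge probability is 1 - exp(-rate). The rate used here (a plain inner product of two membership vectors), the function names, and the gamma hyperparameters are illustrative assumptions, not the paper's exact parameterization.

```python
import math
import random


def link_probability(phi_u, phi_v):
    """Edge probability under a Bernoulli-Poisson link.

    Assumption: the Poisson rate is the inner product of the two
    nonnegative membership vectors. An edge exists when the latent
    Poisson count is >= 1, which happens with probability 1 - exp(-rate).
    """
    rate = sum(a * b for a, b in zip(phi_u, phi_v))
    return 1.0 - math.exp(-rate)


def sample_memberships(num_nodes, num_groups, shape=0.5, scale=1.0, seed=0):
    """Draw gamma-distributed memberships (hypothetical hyperparameters)."""
    rng = random.Random(seed)
    return [
        [rng.gammavariate(shape, scale) for _ in range(num_groups)]
        for _ in range(num_nodes)
    ]


if __name__ == "__main__":
    phi = sample_memberships(num_nodes=3, num_groups=2)
    for u in range(3):
        for v in range(u + 1, 3):
            print(u, v, round(link_probability(phi[u], phi[v]), 3))
```

Because a zero rate gives a zero edge probability, absent links carry no latent count under this link, which is consistent with the abstract's statement that the cost scales with the number of non-zero links.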
A Bayesian Approach to Policy Recognition and State Representation Learning
Šošić, Adrian, Zoubir, Abdelhak M., Koeppl, Heinz
Learning from demonstration (LfD) is the process of building behavioral models of a task from demonstrations provided by an expert. These models can be used e.g. for system control by generalizing the expert demonstrations to previously unencountered situations. Most LfD methods, however, make strong assumptions about the expert behavior, e.g. they assume the existence of a deterministic optimal ground truth policy or require direct monitoring of the expert's controls, which limits their practical use as part of a general system identification framework. In this work, we consider the LfD problem in a more general setting where we allow for arbitrary stochastic expert policies, without reasoning about the optimality of the demonstrations. Following a Bayesian methodology, we model the full posterior distribution of possible expert controllers that explain the provided demonstration data. Moreover, we show that our methodology can be applied in a nonparametric context to infer the complexity of the state representation used by the expert, and to learn task-appropriate partitionings of the system state space.
Inverse Reinforcement Learning in Swarm Systems
Šošić, Adrian, KhudaBukhsh, Wasiur R., Zoubir, Abdelhak M., Koeppl, Heinz
Inverse reinforcement learning (IRL) has become a useful tool for learning behavioral models from demonstration data. However, IRL remains mostly unexplored for multi-agent systems. In this paper, we show how the principle of IRL can be extended to homogeneous large-scale problems, inspired by the collective swarming behavior of natural systems. In particular, we make the following contributions to the field: 1) We introduce the swarMDP framework, a sub-class of decentralized partially observable Markov decision processes endowed with a swarm characterization. 2) Exploiting the inherent homogeneity of this framework, we reduce the resulting multi-agent IRL problem to a single-agent one by proving that the agent-specific value functions in this model coincide. 3) To solve the corresponding control problem, we propose a novel heterogeneous learning scheme that is particularly tailored to the swarm setting. Results on two example systems demonstrate that our framework is able to produce meaningful local reward models from which we can replicate the observed global system dynamics.
Marginalized Continuous Time Bayesian Networks for Network Reconstruction from Incomplete Observations
Studer, Lukas (IBM Research Zurich) | Paulevé, Loic (Université Paris-Saclay) | Zechner, Christoph (ETH Zurich) | Reumann, Matthias (IBM Research Zurich) | Martínez, María Rodríguez (IBM Research Zurich) | Koeppl, Heinz (Technische Universitaet Darmstadt)
Continuous Time Bayesian Networks (CTBNs) provide a powerful means to model complex network dynamics. However, their inference is computationally demanding, especially if one considers incomplete and noisy time-series data. The latter gives rise to a joint state and parameter estimation problem, which can only be solved numerically. Yet, finding the exact parameterization of the CTBN has often only secondary importance in practical scenarios. We therefore focus on the structure learning problem and present a way to analytically marginalize the Markov chain underlying the CTBN model with respect to its parameters. Since the resulting stochastic process is parameter-free, its inference reduces to an optimal filtering problem. We solve the latter using an efficient parallel implementation of a sequential Monte Carlo scheme. Our framework enables CTBN inference to be applied to incomplete noisy time-series data frequently found in molecular biology and other disciplines.