Undirected Networks
Quantifying the relationship between student enrollment patterns and student performance
Boumi, Shahab, Vela, Adan, Chini, Jacquelyn
College students are enrolled at each semester with either part time or full time status. While most of the students keep an overall constant enrollment status during their education period, some of them may frequently change their status between full time and part time from one semester to the next. The goal of this research is to exploit the historic patterns to estimate and categorize students$'$ strategy in three different groups of part time, full time and mixed, investigate the educational features of each group and compare their performance. Enrollment strategy refers to the student$'$s mindset for enrollment plan and in one way can be captured from the student$'$s historic enrollment status. Data is collected from the University of Central Florida from 2008 to 2017 and Hidden Markov Model is applied to identify different types of student strategy. Results show that students with Mixed Enrollment Strategy (MES) have features (ex. time to graduation and graduation and halt enrollment ratio) and performances (ex. cumulative GPA) relatively between students with Full time Enrollment Strategy (FES) and students with Part time Enrollment Strategy (PES).
Adaptive Informative Path Planning with Multimodal Sensing
Choudhury, Shushman, Gruver, Nate, Kochenderfer, Mykel J.
Adaptive Informative Path Planning (AIPP) problems model an agent tasked with obtaining information subject to resource constraints in unknown, partially observable environments. Existing work on AIPP has focused on representing observations about the world as a result of agent movement. We formulate the more general setting where the agent may choose between different sensors at the cost of some energy, in addition to traversing the environment to gather information. We call this problem AIPPMS (MS for Multimodal Sensing). AIPPMS requires reasoning jointly about the effects of sensing and movement in terms of both energy expended and information gained. We frame AIPPMS as a Partially Observable Markov Decision Process (POMDP) and solve it with online planning. Our approach is based on the Partially Observable Monte Carlo Planning framework with modifications to ensure constraint feasibility and a heuristic rollout policy tailored for AIPPMS. We evaluate our method on two domains: a simulated search-and-rescue scenario and a challenging extension to the classic RockSample problem. We find that our approach outperforms a classic AIPP algorithm that is modified for AIPPMS, as well as online planning using a random rollout policy.
Gradient-based Adaptive Markov Chain Monte Carlo
Titsias, Michalis, Dellaportas, Petros
We introduce a gradient-based learning method to automatically adapt Markov chain Monte Carlo (MCMC) proposal distributions to intractable targets. We define a maximum entropy regularised objective function, referred to as generalised speed measure, which can be robustly optimised over the parameters of the proposal distribution by applying stochastic gradient optimisation. An advantage of our method compared to traditional adaptive MCMC methods is that the adaptation occurs even when candidate state values are rejected. This is a highly desirable property of any adaptation strategy because the adaptation starts in early iterations even if the initial proposal distribution is far from optimum. We apply the framework for learning multivariate random walk Metropolis and Metropolis-adjusted Langevin proposals with full covariance matrices, and provide empirical evidence that our method can outperform other MCMC algorithms, including Hamiltonian Monte Carlo schemes.
Maximum likelihood trajectories for continuous-time Markov chains
Continuous-time Markov chains are used to model systems in which transitions between states as well as the time the system spends in each state are random. Many computational problems related to such chains have been solved, including determining state distributions as a function of time, parameter estimation, and control. However, the problem of inferring most likely trajectories, where a trajectory is a sequence of states as well as the amount of time spent in each state, appears unsolved. We study three versions of this problem: (i) an initial value problem, in which an initial state is given and we seek the most likely trajectory until a given final time, (ii) a boundary value problem, in which initial and final states and times are given, and we seek the most likely trajectory connecting them, and (iii) trajectory inference under partial observability, analogous to finding maximum likelihood trajectories for hidden Markov models. We show that maximum likelihood trajectories are not always well-defined, and describe a polynomial time test for well-definedness. When well-definedness holds, we show that each of the three problems can be solved in polynomial time, and we develop efficient dynamic programming algorithms for doing so.
Implicit Mixtures of Restricted Boltzmann Machines
Nair, Vinod, Hinton, Geoffrey E.
We present a mixture model whose components are Restricted Boltzmann Machines (RBMs). This possibility has not been considered before because computing the partition function of an RBM is intractable, which appears to make learning a mixture of RBMs intractable as well. Surprisingly, when formulated as a third-order Boltzmann machine, such a mixture model can be learned tractably using contrastive divergence. The energy function of the model captures three-way interactions among visible units, hidden units, and a single hidden multinomial unit that represents the cluster labels. The distinguishing feature of this model is that, unlike other mixture models, the mixing proportions are not explicitly parameterized.
Sparse Signal Recovery Using Markov Random Fields
Cevher, Volkan, Duarte, Marco F., Hegde, Chinmay, Baraniuk, Richard
Compressive Sensing (CS) combines sampling and compression into a single sub-Nyquist linear measurement process for sparse and compressible signals. In this paper, we extend the theory of CS to include signals that are concisely represented in terms of a graphical model. In particular, we use Markov Random Fields (MRFs) to represent sparse signals whose nonzero coefficients are clustered. Our new model-based reconstruction algorithm, dubbed Lattice Matching Pursuit (LaMP), stably recovers MRF-modeled signals using many fewer measurements and computations than the current state-of-the-art algorithms. Papers published at the Neural Information Processing Systems Conference.
BehaveNet: nonlinear embedding and Bayesian neural decoding of behavioral videos
Batty, Eleanor, Whiteway, Matthew, Saxena, Shreya, Biderman, Dan, Abe, Taiga, Musall, Simon, Gillis, Winthrop, Markowitz, Jeffrey, Churchland, Anne, Cunningham, John P., Datta, Sandeep R., Linderman, Scott, Paninski, Liam
A fundamental goal of systems neuroscience is to understand the relationship between neural activity and behavior. Behavior has traditionally been characterized by low-dimensional, task-related variables such as movement speed or response times. More recently, there has been a growing interest in automated analysis of high-dimensional video data collected during experiments. Here we introduce a probabilistic framework for the analysis of behavioral video and neural activity. This framework provides tools for compression, segmentation, generation, and decoding of behavioral videos.
Integrating Markov processes with structural causal modeling enables counterfactual inference in complex systems
Ness, Robert, Paneri, Kaushal, Vitek, Olga
This manuscript contributes a general and practical framework for casting a Markov process model of a system at equilibrium as a structural causal model, and carrying out counterfactual inference. Markov processes mathematically describe the mechanisms in the system, and predict the system's equilibrium behavior upon intervention, but do not support counterfactual inference. In contrast, structural causal models support counterfactual inference, but do not identify the mechanisms. We define the structural causal models in terms of the parameters and the equilibrium dynamics of the Markov process models, and counterfactual inference flows from these settings. The proposed approach alleviates the identifiability drawback of the structural causal models, in that the counterfactual inference is consistent with the counterfactual trajectories simulated from the Markov process model. We illustrate that, in presence of Markov process model misspecification, counterfactual inference leverages prior data, and therefore estimates the outcome of an intervention more accurately than a direct simulation.
Combining Generative and Discriminative Models for Hybrid Inference
Satorras, Victor Garcia, Akata, Zeynep, Welling, Max
A graphical model is a structured representation of the data generating process. The traditional method to reason over random variables is to perform inference in this graphical model. However, in many cases the generating process is only a poor approximation of the much more complex true data generating process, leading to suboptimal estimation. The subtleties of the generative process are however captured in the data itself and we can learn to infer'', that is, learn a direct mapping from observations to explanatory latent variables. In this work we propose a hybrid model that combines graphical inference with a learned inverse model, which we structure as in a graph neural network, while the iterative algorithm as a whole is formulated as a recurrent neural network.
Learning Multiple Markov Chains via Adaptive Allocation
Talebi, Mohammad Sadegh, Maillard, Odalric-Ambrym
We study the problem of learning the transition matrices of a set of Markov chains from a single stream of observations on each chain. We assume that the Markov chains are ergodic but otherwise unknown. The learner can sample Markov chains sequentially to observe their states. The goal of the learner is to sequentially select various chains to learn transition matrices uniformly well with respect to some loss function. We introduce a notion of loss that naturally extends the squared loss for learning distributions to the case of Markov chains, and further characterize the notion of being \emph{uniformly good} in all problem instances.