Bayesian networks are a popular representation of asymmetric (for example, causal) relationships between random variables. Markov random fields (MRFs) are a complementary model of symmetric relationships used in computer vision, spatial modeling, and social and gene expression networks. A chain graph model under the Lauritzen-Wermuth-Frydenberg interpretation (hereafter a chain graph model) generalizes both Bayesian networks and MRFs, and can represent asymmetric and symmetric relationships together. As in other graphical models, the set of marginals from distributions in a chain graph model induced by the presence of hidden variables forms a complex model. One recent approach to the study of marginal graphical models is to consider a well-behaved supermodel. Such a supermodel of marginals of Bayesian networks, defined only by conditional independences and termed the ordinary Markov model, was studied at length by Evans and Richardson (2014). In this paper, we show that special mixed graphs, which we call segregated graphs, can be associated, via a Markov property, with supermodels of a marginal of chain graphs defined only by conditional independences. Special features of segregated graphs imply the existence of a very natural factorization for these supermodels, and imply that many existing results on the chain graph model and the ordinary Markov model carry over. Our results suggest that segregated graphs define an analogue of the ordinary Markov model for marginals of chain graph models.
Accurate and detailed models of the progression of neurodegenerative diseases such as Alzheimer's disease (AD) are crucially important for reliable early diagnosis and for the determination and deployment of effective treatments. In this paper, we introduce the ALPACA (Alzheimer's disease Probabilistic Cascades) model, a generative model linking latent Alzheimer's progression dynamics to observable biomarker data. In contrast with previous work that models disease progression as a fixed ordering of events, we explicitly model the variability over such orderings among patients, which is more realistic, particularly for highly detailed disease progression models. We describe efficient learning algorithms for ALPACA and discuss promising experimental results on a real cohort of Alzheimer's patients from the Alzheimer's Disease Neuroimaging Initiative.
Leslie Grate and Mark Herbster and Richard Hughey and David Haussler
Baskin Center for Computer Engineering and Computer and Information Sciences, University of California, Santa Cruz, CA 95064
I. Saira Mian and Harry Noller
Sinsheimer Laboratories, University of California, Santa Cruz, CA 95064
Keywords: RNA secondary structure, Gibbs sampler, Expectation Maximization, stochastic context-free grammars, hidden Markov models, tRNA, snRNA, 16S rRNA, linguistic methods
Abstract
A new method of discovering the common secondary structure of a family of homologous RNA sequences using Gibbs sampling and stochastic context-free grammars is proposed. These parameters describe a statistical model of the family. After the Gibbs sampling has produced a crude statistical model for the family, this model is translated into a stochastic context-free grammar, which is then refined by an Expectation Maximization (EM) procedure to produce a more complete model. A prototype implementation of the method is tested on tRNA, pieces of 16S rRNA, and U5 snRNA, with good results.
Introduction
Tools for analyzing RNA are becoming increasingly important as in vitro evolution and selection techniques produce greater numbers of synthesized RNA families to supplement those related by phylogeny. Two principal methods have been established for predicting RNA secondary structure base pairings. The second technique employs thermodynamics to compare the free energy changes predicted for formation of possible secondary structure and relies on finding the structure with the lowest free energy (Tinoco Jr., Uhlenbeck, & Levine 1971; Turner, Sugimoto, & Freier 1988;
*This work was supported in part by NSF grants CDA-9115268 and IRI-9123692, and NIH grant GM17129.
When several related sequences are available that all share a common secondary structure, combinations of different approaches have been used to obtain improved results (Waterman 1989; Le & Zuker 1991; Han & Kim 1993; Chiu & Kolodziejczak 1991; Sankoff 1985; Winker et al. 1990; Lapedes 1992; Klinger & Brutlag 1993; Gutell et al. 1992). Recent efforts have applied Stochastic Context-Free Grammars (SCFGs) to the problems of statistical modeling, multiple alignment, discrimination, and prediction of the secondary structure of RNA families (Sakakibara et al. 1994; 1993; Eddy & Durbin 1994; Searls 1993).
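To make the idea of an SCFG over RNA concrete, the following is a minimal sketch of a toy grammar, not the grammar used in any of the cited work. It assumes a single nonterminal S with three kinds of productions: S -> x S y (emit a base pair), S -> x S (emit an unpaired base on the left), and S -> end. All probabilities, and the names `PAIR`, `LEFT`, `END`, and `derivation_log_prob`, are made up for illustration.

```python
import math

# Hypothetical production probabilities for a one-nonterminal toy grammar.
# They are illustrative values chosen to sum to 1, not estimates from data.
PAIR = {("G", "C"): 0.15, ("C", "G"): 0.15,
        ("A", "U"): 0.10, ("U", "A"): 0.10,
        ("G", "U"): 0.025, ("U", "G"): 0.025}   # S -> x S y
LEFT = {base: 0.10 for base in "ACGU"}          # S -> x S
END = 0.05                                      # S -> (empty)


def derivation_log_prob(seq, structure):
    """Log probability of the derivation that yields `seq` with the given
    dot-bracket `structure` under the toy grammar above.

    Only simple hairpins are derivable: nested pairs on the outside and
    unpaired bases emitted from the left. Anything else raises ValueError.
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    i, j = 0, len(seq) - 1
    logp = 0.0
    while i <= j:
        if i < j and structure[i] == "(" and structure[j] == ")":
            p = PAIR.get((seq[i], seq[j]))
            if p is None:
                raise ValueError(f"pair {seq[i]}-{seq[j]} has no production")
            logp += math.log(p)
            i, j = i + 1, j - 1
        elif structure[i] == ".":
            p = LEFT.get(seq[i])
            if p is None:
                raise ValueError(f"unknown base {seq[i]!r}")
            logp += math.log(p)
            i += 1
        else:
            raise ValueError("structure is not derivable by this toy grammar")
    return logp + math.log(END)


if __name__ == "__main__":
    print(derivation_log_prob("GGAAACC", "((...))"))
```

The pair-emitting production is what lets a context-free grammar capture the long-range dependence between the two ends of a stem, which a hidden Markov model cannot express. A realistic grammar would also need bifurcation and right-emitting productions, which this sketch leaves out.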
We present a new statistical framework called the hidden Markov Dirichlet process (HMDP) to jointly model the genetic recombinations among a possibly infinite number of founders and the coalescence-with-mutation events in the resulting genealogies. The HMDP posits that a haplotype of genetic markers is generated by a sequence of recombination events that select an ancestor for each locus from an unbounded set of founders according to a first-order Markov transition process. Conjoining this process with a mutation model, our method accommodates both between-lineage recombination and within-lineage sequence variations, and leads to a compact and natural interpretation of the population structure and inheritance process underlying haplotype data. We have developed an efficient sampling algorithm for the HMDP based on a two-level nested Pólya urn scheme. On both simulated and real SNP haplotype data, our method performs competitively with or significantly better than extant methods in uncovering the recombination hotspots along chromosomal loci; in addition, it infers the ancestral genetic patterns and offers a highly accurate map of ancestral compositions of modern populations.
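As background for the Pólya urn scheme mentioned above, here is a minimal sketch of a single-level Pólya urn (the standard Blackwell-MacQueen construction for a Dirichlet process). It is not the two-level nested scheme of the HMDP and includes no recombination or mutation model; the function names, the concentration parameter `alpha`, and the use of integer founder indices are assumptions for illustration.

```python
import random


def polya_urn_draw(counts, alpha, rng):
    """Return the index of an existing founder with probability proportional
    to its count, or len(counts) (a new founder) with probability
    proportional to alpha."""
    total = sum(counts) + alpha
    u = rng.random() * total
    cumulative = 0.0
    for k, c in enumerate(counts):
        cumulative += c
        if u < cumulative:
            return k
    return len(counts)


def sample_assignments(n, alpha, seed=0):
    """Sequentially assign n items to founders using the urn scheme."""
    rng = random.Random(seed)
    counts = []        # counts[k] = number of items assigned to founder k
    assignments = []
    for _ in range(n):
        k = polya_urn_draw(counts, alpha, rng)
        if k == len(counts):
            counts.append(0)   # open a new founder
        counts[k] += 1
        assignments.append(k)
    return assignments, counts


if __name__ == "__main__":
    assignments, counts = sample_assignments(20, alpha=1.0)
    print(assignments)
    print(counts)
```

Because a new founder can always be opened with probability proportional to `alpha`, the number of founders is not fixed in advance and grows with the data, which is the sense in which the founder set is unbounded.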
We introduce the Gamma-Exponential Process (GEP), a prior over a large family of continuous-time stochastic processes. A hierarchical version of this prior (HGEP; the Hierarchical GEP) yields a useful model for analyzing complex time series. Models based on HGEPs display many attractive properties: conjugacy, exchangeability and closed-form predictive distribution for the waiting times, and exact Gibbs updates for the time scale parameters. After establishing these properties, we show how posterior inference can be carried out efficiently using Particle MCMC methods. This yields an MCMC algorithm that can resample entire sequences atomically while avoiding the complications of introducing the slice and stick auxiliary variables of the beam sampler. We applied our model to the problem of estimating disease progression in multiple sclerosis, and to RNA evolutionary modeling. In both domains, we found that our model outperformed the standard rate matrix estimation approach.
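The conjugacy and closed-form predictive properties listed above can be illustrated with the textbook gamma-exponential pair. The sketch below shows only that standard result for a single rate parameter; it is not the GEP or HGEP construction itself. The function names and the shape/rate parameterization are assumptions for illustration.

```python
import random


def gamma_posterior(shape, rate, waiting_times):
    """Conjugate update: with prior Gamma(shape, rate) on an exponential
    rate, observing n waiting times with sum T gives
    Gamma(shape + n, rate + T)."""
    return shape + len(waiting_times), rate + sum(waiting_times)


def predictive_density(t, shape, rate):
    """Density of the next waiting time with the exponential rate
    integrated out under Gamma(shape, rate) (a Lomax density)."""
    return shape * rate ** shape / (rate + t) ** (shape + 1)


def predictive_survival(t, shape, rate):
    """Probability that the next waiting time exceeds t."""
    return (rate / (rate + t)) ** shape


def gibbs_sample_rate(shape, rate, waiting_times, rng):
    """Exact draw of the rate from its conditional posterior.
    random.gammavariate takes a scale parameter, hence 1 / rate."""
    post_shape, post_rate = gamma_posterior(shape, rate, waiting_times)
    return rng.gammavariate(post_shape, 1.0 / post_rate)


if __name__ == "__main__":
    times = [0.5, 1.5, 1.0]
    a, b = gamma_posterior(2.0, 1.0, times)
    print(a, b)
    print(predictive_density(0.0, a, b))
    print(gibbs_sample_rate(2.0, 1.0, times, random.Random(0)))
```

Because the posterior stays in the gamma family, the rate can be resampled exactly in a Gibbs sweep, and the predictive distribution of the next waiting time is available without numerical integration.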