There is evidence that the numbers in probabilistic inference don't really matter. This paper considers the idea that we can make a probabilistic model simpler by making fewer distinctions. Unfortunately, the level of a Bayesian network seems too coarse; it is unlikely that a parent will make little difference for all values of the other parents. In this paper we consider an approximation scheme where distinctions can be ignored in some contexts, but not in other contexts. We elaborate on a notion of a parent context that allows a structured context-specific decomposition of a probability distribution and the associated probabilistic inference scheme called probabilistic partial evaluation (Poole 1997). This paper shows a way to simplify a probabilistic model by ignoring distinctions which have similar probabilities, a method to exploit the simpler model, a bound on the resulting errors, and some preliminary empirical results on simple networks.

Zhang, Cheng, IV, Frederick A Matsen

Probability estimation is one of the fundamental tasks in statistics and machine learning. However, standard methods for probability estimation on discrete objects do not handle object structure in a satisfactory manner. In this paper, we derive a general Bayesian network formulation for probability estimation on leaf-labeled trees that enables flexible approximations which can generalize beyond observations. We show that efficient algorithms for learning Bayesian networks can be easily extended to probability estimation on this challenging structured space. Experiments on both synthetic and real data show that our methods greatly outperform the current practice of using the empirical distribution, as well as a previous effort for probability estimation on trees.

Zhang, Cheng, IV, Frederick A Matsen

A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. When used in conjunction with statistical techniques, the graphical model has several advantages for data analysis. One, because the model encodes dependencies among all variables, it readily handles situations where some data entries are missing. Two, a Bayesian network can be used to learn causal relationships, and hence can be used to gain understanding about a problem domain and to predict the consequences of intervention. Three, because the model has both a causal and probabilistic semantics, it is an ideal representation for combining prior knowledge (which often comes in causal form) and data. Four, Bayesian statistical methods in conjunction with Bayesian networks offer an efficient and principled approach for avoiding the overfitting of data. In this paper, we discuss methods for constructing Bayesian networks from prior knowledge and summarize Bayesian statistical methods for using data to improve these models. With regard to the latter task, we describe methods for learning both the parameters and structure of a Bayesian network, including techniques for learning with incomplete data. In addition, we relate Bayesian-network methods for learning to techniques for supervised and unsupervised learning. We illustrate the graphical-modeling approach using a real-world case study.