Genre
Probability Bracket Notation: Markov State Chain Projector, Hidden Markov Models and Dynamic Bayesian Networks
The Weather-Stone Example and the Elvira Software page 21 5. VMM, HMM and FHMM as Dynamic Bayesian Networks page 23 Summary page 26 References page 27 Abstract After a brief discussion of Markov Evolution Formula (MEF) expressed in Probability Bracket Notation (PBN), its close relation with the joint probability distribution (JPD) of Visible Markov Models (VMM) is demonstrated by introducing Markov State Chain Projector (MSCP). The state basis and the observed basis are defined in the Sequential Event Space (SES) of Hidden Markov Models (HMM). The JPD of HMM is derived by using basis transformation in SES. The Viterbi algorithm is revisited and applied to the famous Weather HMM example, whose node graph and inference results are displayed by using software package Elvira. In the end, the formulas of VMM, HMM and some factorial HMM (FHMM) are expressed in PBN as instances of dynamic Bayesian Networks (DBN). Dr. Xing M Wang PBN, Markov Time Evolution & HMM Page 1 of 27 2012-12-16 1. Introduction: PBN and Discrete Markov Chain Inspired by the great success of Dirac notation, we have proposed Probability Bracket Notation (PBN) [1], where we have used PBN to discuss Markov chains (see [2] Chap.11). Based on our main topic of this article, we will concentrate on homogeneous, time-discrete first-order Markov chains with finite discrete states.
Matroidal structure of generalized rough sets based on symmetric and transitive relations
Rough sets are efficient for data pre-process in data mining. Lower and upper approximations are two core concepts of rough sets. This paper studies generalized rough sets based on symmetric and transitive relations from the operator-oriented view by matroidal approaches. We firstly construct a matroidal structure of generalized rough sets based on symmetric and transitive relations, and provide an approach to study the matroid induced by a symmetric and transitive relation. Secondly, this paper establishes a close relationship between matroids and generalized rough sets. Approximation quality and roughness of generalized rough sets can be computed by the circuit of matroid theory. At last, a symmetric and transitive relation can be constructed by a matroid with some special properties.
Theorem Proving in Large Formal Mathematics as an Emerging AI Field
In the recent years, we have linked a large corpus of formal mathematics with automated theorem proving (ATP) tools, and started to develop combined AI/ATP systems working in this setting. In this paper we first relate this project to the earlier large-scale automated developments done by Quaife with McCune's Otter system, and to the discussions of the QED project about formalizing a significant part of mathematics. Then we summarize our adventure so far, argue that the QED dreams were right in anticipating the creation of a very interesting semantic AI field, and discuss its further research directions.
Equivalence of History and Generator Epsilon-Machines
Travers, Nicholas F., Crutchfield, James P.
Epsilon-machines are minimal, unifilar presentations of stationary stochastic processes. They were originally defined in the history machine sense, as hidden Markov models whose states are the equivalence classes of infinite pasts with the same probability distribution over futures. In analyzing synchronization, though, an alternative generator definition was given: unifilar, edge-emitting hidden Markov models with probabilistically distinct states. The key difference is that history epsilon-machines are defined by a process, whereas generator epsilon-machines define a process. We show here that these two definitions are equivalent in the finite-state case.
MAP Complexity Results and Approximation Methods
MAP is the problem of finding a most probable instantiation of a set of nvariables in a Bayesian network, given some evidence. MAP appears to be a significantly harder problem than the related problems of computing the probability of evidence Pr, or MPE a special case of MAP. Because of the complexity of MAP, and the lack of viable algorithms to approximate it,MAP computations are generally avoided by practitioners. This paper investigates the complexity of MAP. We show that MAP is complete for NP. We also provide negative complexity results for elimination based algorithms. It turns out that MAP remains hard even when MPE, and Pr are easy. We show that MAP is NPcomplete when the networks are restricted to polytrees, and even then can not be effectively approximated. Because there is no approximation algorithm with guaranteed results, we investigate best effort approximations. We introduce a generic MAP approximation framework. As one instantiation of it, we implement local search coupled with belief propagation BP to approximate MAP. We show how to extract approximate evidence retraction information from belief propagation which allows us to perform efficient local search. This allows MAP approximation even on networks that are too complex to even exactly solve the easier problems of computing Pr or MPE. Experimental results indicate that using BP and local search provides accurate MAP estimates in many cases.
On the Testable Implications of Causal Models with Hidden Variables
The validity OF a causal model can be tested ONLY IF the model imposes constraints ON the probability distribution that governs the generated data. IN the presence OF unmeasured variables, causal models may impose two types OF constraints : conditional independencies, AS READ through the d - separation criterion, AND functional constraints, FOR which no general criterion IS available.This paper offers a systematic way OF identifying functional constraints AND, thus, facilitates the task OF testing causal models AS well AS inferring such models FROM data.
Finding Optimal Bayesian Networks
Chickering, David Maxwell, Meek, Christopher
In this paper, we derive optimality results for greedy Bayesian-network search algorithms that perform single-edge modifications at each step and use asymptotically consistent scoring criteria. Our results extend those of Meek (1997) and Chickering (2002), who demonstrate that in the limit of large datasets, if the generative distribution is perfect with respect to a DAG defined over the observable variables, such search algorithms will identify this optimal (i.e. We relax their assumption about the generative distribution, and assume only that this distribution satisfies the composition property over the observable variables, which is a more realistic assumption for real domains. Under this assumption, we guarantee that the search algorithms identify an inclusion-optimal model; that is, a model that (1) contains the generative distribution and (2) has no sub-model that contains this distribution. In addition, we show that the composition property is guaranteed to hold whenever the dependence relationships in the generative distribution can be characterized by paths between singleton elements in some generative graphical model (e.g. a DAG, a chain graph, or a Markov network) even when the generative model includes unobserved variables, and even when the observed data is subject to selection bias. Introduction The problem of learning Bayesian networks (a.k.a directed graphical models) from data has received much attention in the UAI community. A simple approach taken by many researchers, particularly those contributing experimental papers, is to apply--in conjunction with a scoring criterion--a greedy single-edge search algorithm to the space of Bayesian-network structures or to the space of equivalence classes of those structures. There are a number of important reasons for the popularity of this approach.
Answer Set Solving in Practice
Gebser, Martin, Kaminski, Roland, Kaufmann, Benjamin, Schaub, Torsten
Answer Set Programming (ASP) is a declarative problem solving approach, initially tailored to modeling problems in the area of Knowledge Representation and Reasoning (KRR). More recently, its attractive combination of a rich yet simple modeling language with high-performance solving capacities has sparked interest in many other areas even beyond KRR. This book presents a practical introduction to ASP, aiming at using ASP languages and systems for solving application problems. Starting from the essential formal foundations, it introduces ASP's solving technology, modeling language and methodology, while illustrating the overall solving process by practical examples.
Deviation optimal learning using greedy Q-aggregation
Dai, Dong, Rigollet, Philippe, Zhang, Tong
Given a finite family of functions, the goal of model selection aggregation is to construct a procedure that mimics the function from this family that is the closest to an unknown regression function. More precisely, we consider a general regression model with fixed design and measure the distance between functions by the mean squared error at the design points. While procedures based on exponential weights are known to solve the problem of model selection aggregation in expectation, they are, surprisingly, sub-optimal in deviation. We propose a new formulation called Q-aggregation that addresses this limitation; namely, its solution leads to sharp oracle inequalities that are optimal in a minimax sense. Moreover, based on the new formulation, we design greedy Q-aggregation procedures that produce sparse aggregation models achieving the optimal rate. The convergence and performance of these greedy procedures are illustrated and compared with other standard methods on simulated examples.
Continuation Methods for Mixing Heterogenous Sources
Corduneanu, Adrian, Jaakkola, Tommi S.
A number of modern learning tasks involve estimation from heterogeneous information sources. This includes classification with labeled and unlabeled data as well as other problems with analogous structure such as competitive (game theoretic) problems. The associated estimation problems can be typically reduced to solving a set of fixed point equations (consistency conditions). We introduce a general method for combining a preferred information source with another in this setting by evolving continuous paths of fixed points at intermediate allocations. We explicitly identify critical points along the unique paths to either increase the stability of estimation or to ensure a significant departure from the initial source. The homotopy continuation approach is guaranteed to terminate at the second source, and involves no combinatorial effort. We illustrate the power of these ideas both in classification tasks with labeled and unlabeled data, as well as in the context of a competitive (min-max) formulation of DNA sequence motif discovery.