 Uncertainty


When you talk about "Information processing" what actually do you have in mind?

arXiv.org Artificial Intelligence

"Information processing" is a not-so-long-ago launched buzzword that is extensively used in many research fields and communities. Despite of its widespread popularity, the real meaning of it is far less acknowledged and understood. Wikipedia [1] and Plato (Stanford Encyclopedia of Philosophy) [2] provide special entries for it, but even in the lightest manner, these entries do not confront the threatening ambiguity and incomprehensibility of this expression. Positing that "Information processing is the change (processing) of information "[1] in any way does not clarify its elusive essence. The reason for that is simple - the key component of the expression ("information") has never been defined and never determined, neither in the times of ancient philosophers nor in these glorious days, when "information era" has become our blossoming reality. It is worth to be mentioned - even today "information" does not have an accepted and a generally agreed definition. Far worse than that - it has always been (and continues to be) a "bone of contention" between many prominent thinkers, scholars and scientists. I do not intend to take part in this controversy. In the paper's Reference section I provide a list of some relevant publications addressing this issue, with only one and a definite purpose in mind - to give the vigilant readers a fair opportunity to verify by themselves how useful and applicable are the concepts of information that these leading thinkers and scholars are developing and advance (L.


Increasing Air Traffic: What is the Problem?

arXiv.org Artificial Intelligence

Nowadays, huge efforts are being made to modernize air traffic management systems so that they can cope with uncertainty, complexity and sub-optimality. One answer is to enhance information sharing between the stakeholders. This paper introduces a framework that bridges the gap between air traffic management and air traffic control on the one hand, and between the ground, approach and en-route centers on the other. An original system is presented that has three essential components: the trajectory models, the optimization process, and the monitoring process. The uncertainty of the trajectory is modeled with a Bayesian Network, where the nodes are associated with two types of random variables: the time of overflight at metering points of the airspace, and the traveling time along the routes linking these points. The resulting Bayesian Network covers the complete airspace, and Monte Carlo simulations are run to estimate the probabilities of sector congestion and delays. On top of this trajectory model, an optimization process minimizes these probabilities by tuning the parameters of the Bayesian trajectory model related to overflight times at metering points. The last component is the monitoring process, which continuously updates the situation of the airspace, modifying the trajectory uncertainties according to the actual positions of aircraft. After each update, a new optimal set of overflight times is computed and can be communicated to the controllers as clearances for the aircraft pilots. The paper presents a formal specification of this global optimization problem, whose underlying rationale was derived with the help of air traffic controllers at Thales Air Systems.
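
The Monte Carlo step described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: overflight and travel times are drawn from Gaussians, each flight follows a simple chain of metering points, and "congestion" is taken to mean that more than `capacity` flights overfly a given point within a time window. The function names are hypothetical.

```python
import random

def sample_overflight_times(departure_mean, departure_sd, leg_means, leg_sd, rng):
    """Sample one trajectory: overflight time at each metering point.

    Assumption: the time at the first point is Gaussian, and each leg adds
    a non-negative Gaussian travel time (a chain-shaped dependency).
    """
    t = rng.gauss(departure_mean, departure_sd)
    times = [t]
    for mean in leg_means:
        t += max(0.0, rng.gauss(mean, leg_sd))
        times.append(t)
    return times

def congestion_probability(flights, point_index, window, capacity,
                           n_samples=2000, seed=0):
    """Estimate P(more than `capacity` flights overfly a point in `window`).

    `flights` is a list of (departure_mean, departure_sd, leg_means, leg_sd)
    tuples; `window` is a (start, end) pair in the same time unit.
    """
    rng = random.Random(seed)
    congested = 0
    for _ in range(n_samples):
        arrivals = [sample_overflight_times(*f, rng)[point_index] for f in flights]
        in_window = sum(window[0] <= a < window[1] for a in arrivals)
        if in_window > capacity:
            congested += 1
    return congested / n_samples

# Three hypothetical flights heading for the same metering point (index 1).
flights = [
    (0.0, 2.0, [30.0], 3.0),
    (5.0, 2.0, [28.0], 3.0),
    (10.0, 2.0, [25.0], 3.0),
]
p = congestion_probability(flights, point_index=1, window=(30.0, 40.0), capacity=2)
print(f"estimated congestion probability: {p:.3f}")
```

An optimizer in the spirit of the abstract would then adjust the mean overflight times to lower this estimate; that part is not sketched here.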


Matroidal structure of generalized rough sets based on symmetric and transitive relations

arXiv.org Artificial Intelligence

Rough sets are efficient for data preprocessing in data mining. Lower and upper approximations are the two core concepts of rough sets. This paper studies generalized rough sets based on symmetric and transitive relations from the operator-oriented view using matroidal approaches. First, we construct a matroidal structure of generalized rough sets based on symmetric and transitive relations, and provide an approach to studying the matroid induced by a symmetric and transitive relation. Second, this paper establishes a close relationship between matroids and generalized rough sets. The approximation quality and roughness of generalized rough sets can be computed using the circuits of matroid theory. Finally, a symmetric and transitive relation can be constructed from a matroid with some special properties.
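
As a small illustration of the lower and upper approximations mentioned in this abstract, the sketch below uses the common successor-neighborhood definition for a binary relation: an element belongs to the lower approximation if all of its successors lie in the target set, and to the upper approximation if at least one does. That choice of definition, the function names, and the example relation are assumptions; the matroidal construction of the paper is not reproduced.

```python
def successors(relation, x):
    """Successor neighborhood of x: all y with (x, y) in the relation."""
    return {b for (a, b) in relation if a == x}

def lower_approximation(universe, relation, target):
    """Elements whose successors all lie inside the target set."""
    return {x for x in universe if successors(relation, x) <= target}

def upper_approximation(universe, relation, target):
    """Elements with at least one successor inside the target set."""
    return {x for x in universe if successors(relation, x) & target}

universe = {1, 2, 3, 4}
# Symmetric and transitive, but not reflexive: element 4 is related to nothing.
relation = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3)}
target = {1, 3}

lower = lower_approximation(universe, relation, target)
upper = upper_approximation(universe, relation, target)
print("lower:", sorted(lower))
print("upper:", sorted(upper))
```

Because the relation is not reflexive, element 4 has an empty neighborhood: it lands in the lower approximation vacuously but not in the upper one, so the lower approximation need not be contained in the upper approximation as it is for equivalence relations.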


MAP Complexity Results and Approximation Methods

arXiv.org Artificial Intelligence

MAP is the problem of finding a most probable instantiation of a set of n variables in a Bayesian network, given some evidence. MAP appears to be a significantly harder problem than the related problems of computing the probability of evidence (Pr), or MPE, a special case of MAP. Because of the complexity of MAP, and the lack of viable algorithms to approximate it, MAP computations are generally avoided by practitioners. This paper investigates the complexity of MAP. We show that MAP is complete for NP^PP. We also provide negative complexity results for elimination-based algorithms. It turns out that MAP remains hard even when MPE and Pr are easy. We show that MAP is NP-complete when the networks are restricted to polytrees, and even then cannot be effectively approximated. Because there is no approximation algorithm with guaranteed results, we investigate best-effort approximations. We introduce a generic MAP approximation framework. As one instantiation of it, we implement local search coupled with belief propagation (BP) to approximate MAP. We show how to extract approximate evidence retraction information from belief propagation, which allows us to perform efficient local search. This allows MAP approximation even on networks that are too complex to solve exactly even for the easier problems of computing Pr or MPE. Experimental results indicate that using BP and local search provides accurate MAP estimates in many cases.
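
To make the distinction between MAP and MPE concrete, here is a brute-force sketch on a toy distribution. It assumes binary variables and an explicit joint table rather than a real Bayesian network, and it enumerates every assignment, so it only illustrates the definitions; it is not the local search with belief propagation described in the abstract. The function name and the numbers are hypothetical.

```python
from itertools import product

def map_assignment(joint, variables, map_vars, evidence):
    """Most probable instantiation of map_vars given evidence.

    Sums the joint over all variables outside map_vars (and consistent
    with the evidence), then maximizes over map_vars. When map_vars holds
    every non-evidence variable, this is the MPE problem.
    """
    scores = {}
    for values in product([0, 1], repeat=len(variables)):
        world = dict(zip(variables, values))
        if any(world[v] != val for v, val in evidence.items()):
            continue
        key = tuple(world[v] for v in map_vars)
        scores[key] = scores.get(key, 0.0) + joint[values]
    best = max(scores, key=scores.get)
    return dict(zip(map_vars, best)), scores[best]

variables = ["A", "B"]
# Toy joint distribution P(A, B), keyed by (A, B) values.
joint = {(0, 0): 0.4, (0, 1): 0.0, (1, 0): 0.3, (1, 1): 0.3}

mpe, mpe_score = map_assignment(joint, variables, ["A", "B"], evidence={})
map_a, map_score = map_assignment(joint, variables, ["A"], evidence={})
print("MPE:", mpe, mpe_score)
print("MAP over A:", map_a, map_score)
```

In this table the single most probable world has A = 0, while the most probable value of A after summing out B is A = 1, so projecting the MPE onto the MAP variables does not in general give the MAP solution.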


On the Testable Implications of Causal Models with Hidden Variables

arXiv.org Artificial Intelligence

The validity of a causal model can be tested only if the model imposes constraints on the probability distribution that governs the generated data. In the presence of unmeasured variables, causal models may impose two types of constraints: conditional independencies, as read through the d-separation criterion, and functional constraints, for which no general criterion is available. This paper offers a systematic way of identifying functional constraints and, thus, facilitates the task of testing causal models as well as inferring such models from data.


Finding Optimal Bayesian Networks

arXiv.org Artificial Intelligence

In this paper, we derive optimality results for greedy Bayesian-network search algorithms that perform single-edge modifications at each step and use asymptotically consistent scoring criteria. Our results extend those of Meek (1997) and Chickering (2002), who demonstrate that in the limit of large datasets, if the generative distribution is perfect with respect to a DAG defined over the observable variables, such search algorithms will identify this optimal (i.e. We relax their assumption about the generative distribution, and assume only that this distribution satisfies the composition property over the observable variables, which is a more realistic assumption for real domains. Under this assumption, we guarantee that the search algorithms identify an inclusion-optimal model; that is, a model that (1) contains the generative distribution and (2) has no sub-model that contains this distribution. In addition, we show that the composition property is guaranteed to hold whenever the dependence relationships in the generative distribution can be characterized by paths between singleton elements in some generative graphical model (e.g. a DAG, a chain graph, or a Markov network) even when the generative model includes unobserved variables, and even when the observed data is subject to selection bias.

Introduction. The problem of learning Bayesian networks (a.k.a. directed graphical models) from data has received much attention in the UAI community. A simple approach taken by many researchers, particularly those contributing experimental papers, is to apply--in conjunction with a scoring criterion--a greedy single-edge search algorithm to the space of Bayesian-network structures or to the space of equivalence classes of those structures. There are a number of important reasons for the popularity of this approach.


Continuation Methods for Mixing Heterogenous Sources

arXiv.org Machine Learning

A number of modern learning tasks involve estimation from heterogeneous information sources. This includes classification with labeled and unlabeled data as well as other problems with analogous structure such as competitive (game theoretic) problems. The associated estimation problems can typically be reduced to solving a set of fixed point equations (consistency conditions). We introduce a general method for combining a preferred information source with another in this setting by evolving continuous paths of fixed points at intermediate allocations. We explicitly identify critical points along the unique paths to either increase the stability of estimation or to ensure a significant departure from the initial source. The homotopy continuation approach is guaranteed to terminate at the second source, and involves no combinatorial effort. We illustrate the power of these ideas both in classification tasks with labeled and unlabeled data and in the context of a competitive (min-max) formulation of DNA sequence motif discovery.
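
The idea of following a path of fixed points from one source to another can be sketched with a naive parameter continuation. This is only an illustration under strong assumptions: the two "sources" are scalar maps chosen for convenience, the allocation `lam` is stepped on a fixed grid, and each fixed point is found by plain iteration warm-started from the previous one. The critical-point detection described in the abstract is not implemented, and all names are hypothetical.

```python
import math

def source_a(x):
    """First source: a linear map whose fixed point is x = 2."""
    return 0.5 * x + 1.0

def source_b(x):
    """Second source: cos, whose fixed point is about 0.739."""
    return math.cos(x)

def fixed_point(mixed_map, x0, iterations=200):
    """Plain fixed-point iteration x <- mixed_map(x) from a warm start."""
    x = x0
    for _ in range(iterations):
        x = mixed_map(x)
    return x

def continuation_path(steps=10, x0=0.0):
    """Track the fixed point of (1 - lam) * source_a + lam * source_b."""
    path = []
    x = x0
    for i in range(steps + 1):
        lam = i / steps
        x = fixed_point(lambda z: (1.0 - lam) * source_a(z) + lam * source_b(z), x)
        path.append((lam, x))
    return path

path = continuation_path()
for lam, x in path:
    print(f"lam={lam:.1f}  fixed point={x:.6f}")
```

Warm-starting each solve from the previous fixed point is what makes this a continuation method rather than a set of independent solves; the path starts at the fixed point of the first source and ends at that of the second.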


Expectation-Propogation for the Generative Aspect Model

arXiv.org Machine Learning

The generative aspect model is an extension of the multinomial model for text that allows word probabilities to vary stochastically across documents. Previous results with aspect models have been promising, but have been hindered by the computational difficulty of carrying out inference and learning. This paper demonstrates that the simple variational methods of Blei et al. (2001) can lead to inaccurate inferences and biased learning for the generative aspect model. We develop an alternative approach that leads to higher accuracy at comparable cost. An extension of Expectation-Propagation is used for inference and then embedded in an EM algorithm for learning. Experimental results are presented for both synthetic and real data sets.


Coordinates: Probabilistic Forecasting of Presence and Availability

arXiv.org Artificial Intelligence

We present methods employed in COORDINATE, a prototype service that supports collaboration and communication by learning predictive models that provide forecasts of users' presence and availability. We describe how data is collected about user activity and proximity from multiple devices, in addition to analysis of the content of users' calendars, the time of day, and day of week. We review applications of presence forecasting embedded in the PRIORITIES application and then present details of the COORDINATE service that was informed by the earlier efforts.


Accelerating Inference: towards a full Language, Compiler and Hardware stack

arXiv.org Machine Learning

We introduce Dimple, a fully open-source API for probabilistic modeling. Dimple allows the user to specify probabilistic models in the form of graphical models, Bayesian networks, or factor graphs, and performs inference (by automatically deriving an inference engine from a variety of algorithms) on the model. Dimple also serves as a compiler for GP5, a hardware accelerator for inference.