Goto

Collaborating Authors

 Genre


On-line Prediction with Kernels and the Complexity Approximation Principle

arXiv.org Machine Learning

The paper describes an application of Aggregating Algorithm to the problem of regression. It generalizes earlier results concerned with plain linear regression to kernel techniques and presents an on-line algorithm which performs nearly as well as any oblivious kernel predictor. The paper contains the derivation of an estimate on the performance of this algorithm. The estimate is then used to derive an application of the Complexity Approximation Principle to kernel methods.


Algebraic Statistics in Model Selection

arXiv.org Machine Learning

We develop the necessary theory in computational algebraic geometry to place Bayesian networks into the realm of algebraic statistics. We present an algebra{statistics dictionary focused on statistical modeling. In particular, we link the notion of effiective dimension of a Bayesian network with the notion of algebraic dimension of a variety. We also obtain the independence and non{independence constraints on the distributions over the observable variables implied by a Bayesian network with hidden variables, via a generating set of an ideal of polynomials associated to the network. These results extend previous work on the subject. Finally, the relevance of these results for model selection is discussed.


The Minimum Information Principle for Discriminative Learning

arXiv.org Machine Learning

Exponential models of distributions are widely used in machine learning for classiffication and modelling. It is well known that they can be interpreted as maximum entropy models under empirical expectation constraints. In this work, we argue that for classiffication tasks, mutual information is a more suitable information theoretic measure to be optimized. We show how the principle of minimum mutual information generalizes that of maximum entropy, and provides a comprehensive framework for building discriminative classiffiers. A game theoretic interpretation of our approach is then given, and several generalization bounds provided. We present iterative algorithms for solving the minimum information problem and its convex dual, and demonstrate their performance on various classiffication tasks. The results show that minimum information classiffiers outperform the corresponding maximum entropy models.


Conceptual Modelling and The Quality of Ontologies: Endurantism Vs. Perdurantism

arXiv.org Artificial Intelligence

Ontologies are key enablers for sharing precise and machine-understandable semantics among different applications and parties. Yet, for ontologies to meet these expectations, their quality must be of a good standard. The quality of an ontology is strongly based on the design method employed. This paper addresses the design problems related to the modelling of ontologies, with specific concentration on the issues related to the quality of the conceptualisations produced. The paper aims to demonstrate the impact of the modelling paradigm adopted on the quality of ontological models and, consequently, the potential impact that such a decision can have in relation to the development of software applications. To this aim, an ontology that is conceptualised based on the Object-Role Modelling (ORM) approach (a representative of endurantism) is re-engineered into a one modelled on the basis of the Object Paradigm (OP) (a representative of perdurantism). Next, the two ontologies are analytically compared using the specified criteria. The conducted comparison highlights that using the OP for ontology conceptualisation can provide more expressive, reusable, objective and temporal ontologies than those conceptualised on the basis of the ORM approach.


Novel Grey Interval Weight Determining and Hybrid Grey Interval Relation Method in Multiple Attribute Decision-Making

arXiv.org Artificial Intelligence

This paper proposes a grey interval relation TOPSIS for the decision making in which all of the attribute weights and attribute values are given by the interval grey numbers. The feature of our method different from other grey relation decision-making is that all of the subjective and objective weights are obtained by interval grey number and that decision-making is performed based on the relative approach degree of grey TOPSIS, the relative approach degree of grey incidence and the relative membership degree of grey incidence using 2-dimensional Euclidean distance. The weighted Borda method is used for combining the results of three methods. An example shows the applicability of the proposed approach.


On Modeling Profiles instead of Values

arXiv.org Artificial Intelligence

We consider the problem of estimating the distribution underlying an observed sample of data. Instead of maximum likelihood, which maximizes the probability of the ob served values, we propose a different estimate, the high-profile distribution, which maximizes the probability of the observed profile the number of symbols appearing any given number of times. We determine the high-profile distribution of several data samples, establish some of its general properties, and show that when the number of distinct symbols observed is small compared to the data size, the high-profile and maximum-likelihood distributions are roughly the same, but when the number of symbols is large, the distributions differ, and high-profile better explains the data.


Robust Probabilistic Inference in Distributed Systems

arXiv.org Artificial Intelligence

Probabilistic inference problems arise naturally in distributed systems such as sensor networks and teams of mobile robots. Inference algorithms that use message passing are a natural fit for distributed systems, but they must be robust to the failure situations that arise in real-world settings, such as unreliable communication and node failures. Unfortunately, the popular sum-product algorithm can yield very poor estimates in these settings because the nodes' beliefs before convergence can be arbitrarily different from the correct posteriors. In this paper, we present a new message passing algorithm for probabilistic inference which provides several crucial guarantees that the standard sum-product algorithm does not. Not only does it converge to the correct posteriors, but it is also guaranteed to yield a principled approximation at any point before convergence. In addition, the computational complexity of the message passing updates depends only upon the model, and is independent of the network topology of the distributed system. We demonstrate the approach with detailed experimental results on a distributed sensor calibration task using data from an actual sensor network deployment.


Evidence-invariant Sensitivity Bounds

arXiv.org Artificial Intelligence

The sensitivities revealed by a sensitivity analysis of a probabilistic network typically depend on the entered evidence. For a real-life network therefore, the analysis is performed a number of times, with different evidence. Although efficient algorithms for sensitivity analysis exist, a complete analysis is often infeasible because of the large range of possible combinations of observations. In this paper we present a method for studying sensitivities that are invariant to the evidence entered. Our method builds upon the idea of establishing bounds between which a parameter can be varied without ever inducing a change in the most likely value of a variable of interest.


A New Characterization of Probabilities in Bayesian Networks

arXiv.org Artificial Intelligence

We characterize probabilities in Bayesian networks in terms of algebraic expressions called quasi-probabilities. These are arrived at by casting Bayesian networks as noisy AND-OR-NOT networks, and viewing the subnetworks that lead to a node as arguments for or against a node. Quasi-probabilities are in a sense the "natural" algebra of Bayesian networks: we can easily compute the marginal quasi-probability of any node recursively, in a compact form; and we can obtain the joint quasi-probability of any set of nodes by multiplying their marginals (using an idempotent product operator). Quasi-probabilities are easily manipulated to improve the efficiency of probabilistic inference. They also turn out to be representable as square-wave pulse trains, and joint and marginal distributions can be computed by multiplication and complementation of pulse trains.


Predictive State Representations: A New Theory for Modeling Dynamical Systems

arXiv.org Artificial Intelligence

Modeling dynamical systems, both for control purposes and to make predictions about their behavior, is ubiquitous in science and engineering. Predictive state representations (PSRs) are a recently introduced class of models for discrete-time dynamical systems. The key idea behind PSRs and the closely related OOMs (Jaeger's observable operator models) is to represent the state of the system as a set of predictions of observable outcomes of experiments one can do in the system. This makes PSRs rather different from history-based models such as nth-order Markov models and hidden-state-based models such as HMMs and POMDPs. We introduce an interesting construct, the systemdynamics matrix, and show how PSRs can be derived simply from it. We also use this construct to show formally that PSRs are more general than both nth-order Markov models and HMMs/POMDPs. Finally, we discuss the main difference between PSRs and OOMs and conclude with directions for future work.