Plotting

 arXiv.org Artificial Intelligence


Sequence Prediction based on Monotone Complexity

arXiv.org Artificial Intelligence

This paper studies sequence prediction based on the monotone Kolmogorov complexity Km=-log m, i.e. based on universal deterministic/one-part MDL. m is extremely close to Solomonoff's prior M, the latter being an excellent predictor in deterministic as well as probabilistic environments, where performance is measured in terms of convergence of posteriors or losses. Despite this closeness to M, it is difficult to assess the prediction quality of m, since little is known about the closeness of their posteriors, which are the important quantities for prediction. We show that for deterministic computable environments, the "posterior" and losses of m converge, but rapid convergence could only be shown on-sequence; the off-sequence behavior is unclear. In probabilistic environments, neither the posterior nor the losses converge, in general.


Redundancy in Logic I: CNF Propositional Formulae

arXiv.org Artificial Intelligence

A knowledge base is redundant if it contains parts that can be inferred from the rest of it. We study the problem of checking whether a CNF formula (a set of clauses) is redundant, that is, it contains clauses that can be derived from the other ones. Any CNF formula can be made irredundant by deleting some of its clauses: what results is an irredundant equivalent subset (I.E.S.) We study the complexity of some related problems: verification, checking existence of a I.E.S. with a given size, checking necessary and possible presence of clauses in I.E.S.'s, and uniqueness. We also consider the problem of redundancy with different definitions of equivalence.


Compilability of Abduction

arXiv.org Artificial Intelligence

Abduction is one of the most important forms of reasoning; it has been successfully applied to several practical problems such as diagnosis. In this paper we investigate whether the computational complexity of abduction can be reduced by an appropriate use of preprocessing. This is motivated by the fact that part of the data of the problem (namely, the set of all possible assumptions and the theory relating assumptions and manifestations) are often known before the rest of the problem. In this paper, we show some complexity results about abduction when compilation is allowed.


Robust Feature Selection by Mutual Information Distributions

arXiv.org Artificial Intelligence

Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. In order to address questions such as the reliability of the empirical value, one must consider sample-to-population inferential approaches. This paper deals with the distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean and an analytical approximation of the variance are reported. Asymptotic approximations of the distribution are proposed. The results are applied to the problem of selecting features for incremental learning and classification of the naive Bayes classifier. A fast, newly defined method is shown to outperform the traditional approach based on empirical mutual information on a number of real data sets. Finally, a theoretical development is reported that allows one to efficiently extend the above methods to incomplete samples in an easy and effective way.


Distribution of Mutual Information

arXiv.org Artificial Intelligence

The mutual information of two random variables i and j with joint probabilities t_ij is commonly used in learning Bayesian nets as well as in many other fields. The chances t_ij are usually estimated by the empirical sampling frequency n_ij/n leading to a point estimate I(n_ij/n) for the mutual information. To answer questions like "is I(n_ij/n) consistent with zero?" or "what is the probability that the true mutual information is much larger than the point estimate?" one has to go beyond the point estimate. In the Bayesian framework one can answer these questions by utilizing a (second order) prior distribution p(t) comprising prior information about t. From the prior p(t) one can compute the posterior p(t|n), from which the distribution p(I|n) of the mutual information can be calculated. We derive reliable and quickly computable approximations for p(I|n). We concentrate on the mean, variance, skewness, and kurtosis, and non-informative priors. For the mean we also give an exact expression. Numerical issues and the range of validity are discussed.


Non-convex cost functionals in boosting algorithms and methods for panel selection

arXiv.org Artificial Intelligence

In this document we propose a new improvement for boosting techniques as proposed in Friedman '99 by the use of non-convex cost functional. The idea is to introduce a correlation term to better deal with forecasting of additive time series. The problem is discussed in a theoretical way to prove the existence of minimizing sequence, and in a numerical way to propose a new "ArgMin" algorithm. The model has been used to perform the touristic presence forecast for the winter season 1999/2000 in Trentino (italian Alps).


Towards a Universal Theory of Artificial Intelligence based on Algorithmic Probability and Sequential Decision Theory

arXiv.org Artificial Intelligence

Decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown distribution. We unify both theories and give strong arguments that the resulting universal AIXI model behaves optimal in any computable environment. The major drawback of the AIXI model is that it is uncomputable. To overcome this problem, we construct a modified algorithm AIXI^tl, which is still superior to any other time t and space l bounded agent. The computation time of AIXI^tl is of the order t x 2^l.


A Theory of Universal Artificial Intelligence based on Algorithmic Complexity

arXiv.org Artificial Intelligence

Decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental prior probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown prior distribution. We combine both ideas and get a parameterless theory of universal Artificial Intelligence. We give strong arguments that the resulting AIXI model is the most intelligent unbiased agent possible. We outline for a number of problem classes, including sequence prediction, strategic games, function minimization, reinforcement and supervised learning, how the AIXI model can formally solve them. The major drawback of the AIXI model is that it is uncomputable. To overcome this problem, we construct a modified algorithm AIXI-tl, which is still effectively more intelligent than any other time t and space l bounded agent. The computation time of AIXI-tl is of the order tx2^l. Other discussed topics are formal definitions of intelligence order relations, the horizon problem and relations of the AIXI theory to other AI approaches.


Representation results for defeasible logic

arXiv.org Artificial Intelligence

Normal forms play an important role in computer science. Examples of areas where normal forms have proved fruitful include logic, where normal forms of formulae are used both for the proof of theoretical results and in automated theorem proving, and relational databases [7], where normal forms have been the driving force in the development of database theory and principles of good data modelling. In computer science, usually normal forms are supported by transformations, operational procedures that transform initial objects (such as programs or logical theories) to their normal form. Such transformations are important for two main reasons: 1. They support the understanding and assimilation of new concepts because they allow one to concentrate on certain forms and key features only. Thus transformations can be useful as theoretical tools.


Minimum Description Length Induction, Bayesianism, and Kolmogorov Complexity

arXiv.org Artificial Intelligence

The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be applied is encapsulated as the Fundamental Inequality, which in broad terms states that the principle is valid when the data are random, relative to every contemplated hypothesis and also these hypotheses are random relative to the (universal) prior. Basically, the ideal principle states that the prior probability associated with the hypothesis should be given by the algorithmic universal probability, and the sum of the log universal probability of the model plus the log of the probability of the data given the model should be minimized. If we restrict the model class to the finite sets then application of the ideal principle turns into Kolmogorov's minimal sufficient statistic. In general we show that data compression is almost always the best strategy, both in hypothesis identification and prediction.