arXiv.org Artificial Intelligence
Separating a Real-Life Nonlinear Image Mixture
When acquiring an image of a paper document, the image printed on the back page sometimes shows through. The mixture of the front- and back-page images thus obtained is markedly nonlinear and constitutes a good real-life test case for nonlinear blind source separation. This paper addresses a difficult version of this problem, corresponding to the use of "onion skin" paper, which results in a relatively strong nonlinearity that becomes close to singular in the lighter regions of the images. The separation is achieved through the MISEP technique, an extension of the well-known INFOMAX method. The separation results are assessed with objective quality measures. They show an improvement over the results obtained with linear separation, but leave room for further improvement.
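MISEP is a nonlinear extension of INFOMAX (typically trained with multilayer perceptrons); purely as an illustration of the underlying principle, the following is a minimal sketch of the linear natural-gradient INFOMAX update, not the authors' nonlinear setup, and the data array, learning rate, and iteration count are illustrative assumptions:

    import numpy as np

    def infomax_ica(mixed, lr=0.01, iters=2000, seed=0):
        # Linear natural-gradient INFOMAX (Bell-Sejnowski / Amari) sketch.
        # mixed: (n_samples, n_sources) array of zero-mean linear mixtures.
        rng = np.random.default_rng(seed)
        n = mixed.shape[1]
        W = np.eye(n) + 0.01 * rng.standard_normal((n, n))   # unmixing matrix
        for _ in range(iters):
            u = mixed @ W.T                                   # candidate sources
            y = 1.0 / (1.0 + np.exp(-u))                      # logistic nonlinearity
            # batch natural-gradient update: dW = (I + (1 - 2y)^T u / N) W
            dW = (np.eye(n) + (1.0 - 2.0 * y).T @ u / len(mixed)) @ W
            W += lr * dW
        return mixed @ W.T                                    # estimated sources

The show-through mixture considered in the paper would additionally require the nonlinear compensation that MISEP learns, which this linear sketch does not attempt.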
Strong Asymptotic Assertions for Discrete MDL in Regression and Classification
We study the properties of the MDL (or maximum penalized complexity) estimator for regression and classification, where the underlying model class is countable. In particular, we show a finite bound on the Hellinger losses under the sole assumption that there is a "true" model contained in the class. This implies almost sure convergence of the predictive distribution to the true one at a fast rate. The result corresponds to Solomonoff's central theorem of universal induction, however with a bound that is exponentially larger.
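In our notation (a hedged restatement of the setting, not the paper's exact formulation), the discrete MDL estimator over a countable class $\mathcal{M}$ and the squared Hellinger distance it is measured against can be written as

    $$\hat{P}_n = \arg\min_{P \in \mathcal{M}} \big( -\log P(x_{1:n}) + K(P) \big), \qquad
      h^2(P, Q) = \sum_a \big( \sqrt{P(a)} - \sqrt{Q(a)} \big)^2,$$

where $K(P)$ is the complexity penalty (codelength) assigned to model $P$; the finite bound mentioned above concerns the accumulated Hellinger losses between the true and the MDL-predictive distributions.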
An In-Depth Look at Information Fusion Rules & the Unification of Fusion Theories
This paper may look like a glossary of fusion rules, and we also introduce new ones, presenting their formulas and examples: Conjunctive, Disjunctive, Exclusive Disjunctive, Mixed Conjunctive-Disjunctive rules, the Conditional rule, Dempster's, Yager's, Smets' TBM rule, Dubois-Prade's, Dezert-Smarandache classical and hybrid rules, Murphy's average rule, the Inagaki-Lefevre-Colot-Vannoorenberghe Unified Combination rules [and, as particular cases: Inagaki's parameterized rule, the Weighting Average Operator, minC (M. Daniel), and the new Proportional Conflict Redistribution rules (Smarandache-Dezert), among which PCR5 is the most exact way of redistributing the conflicting mass to non-empty sets following the path of the conjunctive rule], Zhang's Center Combination rule, Convolutive x-Averaging, the Consensus Operator (Josang), the Cautious Rule (Smets), alpha-junction rules (Smets), etc., and three new T-norm & T-conorm rules adjusted from fuzzy and neutrosophic sets to information fusion (Tchamova-Smarandache). By introducing the degree of union and the degree of inclusion with respect to the cardinality of sets (rather than from the fuzzy-set point of view), in addition to the degree of intersection, many fusion rules can be improved. There are corner cases where each rule may have difficulty working or may not yield the expected result.
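For concreteness, here is a minimal, self-contained sketch of one of the listed rules, Dempster's rule of combination, over a small frame of discernment (the example mass assignments are invented for illustration):

    from itertools import product

    def dempster_combine(m1, m2):
        # Dempster's rule: intersect focal sets, renormalize by 1 - conflict.
        # m1, m2: dicts mapping frozenset focal elements to their mass.
        combined, conflict = {}, 0.0
        for (a, wa), (b, wb) in product(m1.items(), m2.items()):
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb
        if conflict >= 1.0:
            raise ValueError("total conflict: Dempster's rule undefined")
        return {s: w / (1.0 - conflict) for s, w in combined.items()}

    # Illustrative masses on the frame {A, B}
    m1 = {frozenset({"A"}): 0.6, frozenset({"A", "B"}): 0.4}
    m2 = {frozenset({"B"}): 0.7, frozenset({"A", "B"}): 0.3}
    print(dempster_combine(m1, m2))

The PCR5 rule discussed in the abstract differs by redistributing each partial conflicting product back to the two conflicting sets, proportionally to their masses, instead of renormalizing by the total conflict.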
Universal Convergence of Semimeasures on Individual Random Sequences
Hutter, Marcus, Muchnik, Andrej
Solomonoff's central result on induction is that the posterior of a universal semimeasure M converges rapidly and with probability 1 to the true sequence-generating posterior mu, if the latter is computable. Hence, M is eligible as a universal sequence predictor in the case of unknown mu. Despite closely related results and proofs in the literature, the stronger result of convergence for all (Martin-Löf) random sequences remained open. Such a convergence result would be particularly interesting and natural, since randomness can be defined in terms of M itself. We show that there are universal semimeasures M which do not converge for all random sequences, i.e., we give a partial negative answer to the open problem. We also provide a positive answer for some non-universal semimeasures. We define the incomputable measure D as a mixture over all computable measures and the enumerable semimeasure W as a mixture over all enumerable nearly-measures. We show that W converges to D and D to mu on all random sequences. The Hellinger distance, measuring closeness of two distributions, plays a central role.
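As a hedged restatement in our own notation: a Bayes-mixture-type universal semimeasure has the form $M(x) = \sum_i w_i \nu_i(x)$ with $w_i > 0$ and $\sum_i w_i \le 1$, where the $\nu_i$ enumerate the lower-semicomputable semimeasures. The open question is whether the predictive probabilities $M(a \mid x_{1:t})$ approach $\mu(a \mid x_{1:t})$ along every Martin-Löf random sequence, rather than merely with $\mu$-probability 1; the abstract's answer is negative for some universal $M$, and positive for the particular mixtures $W$ and $D$ it constructs.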
On the Convergence Speed of MDL Predictions for Bernoulli Sequences
We consider the Minimum Description Length principle for online sequence prediction. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is bounded, implying convergence with probability one, and (b) it additionally specifies a rate of convergence. Generally, for MDL only exponential loss bounds hold, as opposed to the linear bounds for a Bayes mixture. We show that this is the case even if the model class contains only Bernoulli distributions. We derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. The results apply to many machine learning tasks, including classification and hypothesis testing. We provide arguments that our theorems generalize to countable classes of i.i.d. models.
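As a toy illustration of the setting only (the discretized parameter grid, the penalties, and the data below are our assumptions, not the paper's construction), a two-part-code MDL predictor over a countable Bernoulli class can be sketched as:

    import math

    def mdl_bernoulli_predict(bits, models):
        # models: list of (theta, penalty_in_bits) with 0 < theta < 1.
        # Return the theta minimizing -log2 P_theta(bits) + penalty (two-part MDL).
        ones = sum(bits)
        zeros = len(bits) - ones
        def code_length(theta, pen):
            return -(ones * math.log2(theta) + zeros * math.log2(1.0 - theta)) + pen
        return min(models, key=lambda m: code_length(*m))[0]

    # Illustrative countable class: parameters with complexity penalties in bits
    models = [(0.5, 1.0), (1/3, 2.0), (2/3, 2.0), (0.25, 3.0), (0.75, 3.0)]
    print(mdl_bernoulli_predict([1, 1, 0, 1, 1, 1, 0, 1], models))

The square-loss quantity discussed in the abstract then accumulates, over time, the squared difference between such an MDL prediction and the true Bernoulli parameter.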
On the Complexity of Case-Based Planning
Case-based reasoning [23, 1, 32] is a problem-solving methodology based on using a library of solutions for similar problems, i.e., a library of "cases" with their respective solutions. Roughly speaking, case-based planning consists in storing generated plans and using them for finding new plans [15, 8, 29]. In practice, what is stored is not only a specific problem with a specific solution, but also some additional information that is considered useful for solving new problems, e.g., information about how the plan has been derived [30], why it works [20, 16], when it would not work [17], etc. Different case-based planners differ on how they store cases, which cases they retrieve when the solution of a new problem is needed, how they adapt a solution to a new problem, whether they use one or more cases for building a new plan, and so on.
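As a purely illustrative sketch of the store-and-retrieve idea (the case format and the feature-overlap similarity below are our own simplifications, not those of any of the cited planners):

    def retrieve_case(library, new_problem):
        # Return the stored (problem, plan) pair whose problem description
        # shares the most features with the new problem (a toy similarity).
        def similarity(stored_problem):
            return len(set(stored_problem) & set(new_problem))
        return max(library, key=lambda case: similarity(case[0]))

    # Illustrative library: problems are feature sets, plans are action lists
    library = [
        ({"at(home)", "goal(office)"}, ["walk(home, station)", "train(station, office)"]),
        ({"at(home)", "goal(shop)"}, ["walk(home, shop)"]),
    ]
    problem, plan = retrieve_case(library, {"at(home)", "goal(office)", "raining"})
    print(plan)  # the retrieved plan would then be adapted to the new problem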
Semiclassical Neural Network
We have constructed a simple semiclassical model of a neural network in which neurons have quantum links with one another in a chosen way and affect one another in a fashion analogous to action potentials. We have examined the role of stochasticity introduced by the quantum potential and compared the system with the classical integrate-and-fire model of Hopfield. Average periodicity and short-term retentivity of input memory are noted.
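For reference, the classical side of the comparison, a leaky integrate-and-fire neuron, can be sketched as follows (parameter values, and the Gaussian noise term standing in only schematically for the stochasticity discussed above, are our illustrative assumptions):

    import numpy as np

    def lif_spike_times(current, dt=0.1, tau=10.0, v_rest=-65.0, v_th=-50.0,
                        v_reset=-70.0, noise_std=0.5, seed=0):
        # Leaky integrate-and-fire neuron driven by `current` (one value per step);
        # Gaussian noise is a crude stand-in for the stochastic influence.
        rng = np.random.default_rng(seed)
        v, spikes = v_rest, []
        for step, i_ext in enumerate(current):
            dv = (-(v - v_rest) + i_ext) / tau * dt \
                 + noise_std * np.sqrt(dt) * rng.standard_normal()
            v += dv
            if v >= v_th:              # threshold crossing: emit a spike and reset
                spikes.append(step * dt)
                v = v_reset
        return spikes

    print(lif_spike_times([20.0] * 1000))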
Where Fail-Safe Default Logics Fail
Reiter's original definition of default logic allows for the application of a default that contradicts a previously applied one. We call this condition a failure. The possibility of generating failures has in the past been considered a semantical problem, and variants have been proposed to solve it. We show that it is instead a computational feature that is needed to encode some domains into default logic.
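To illustrate the notion of failure (a textbook-style example in our own notation, not necessarily the paper's): take an empty background theory and the defaults d1 = (true : b / c) and d2 = (c : not-b / not-b). The default d1 is applicable first and adds c; then d2 becomes applicable and adds not-b, contradicting the justification b of the previously applied d1, so the sequence fails; in this theory no successful closed sequence, and hence no extension, exists.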
Distribution of Mutual Information from Complete and Incomplete Data
Hutter, Marcus, Zaffalon, Marco
Mutual information is widely used, in a descriptive way, to measure the stochastic dependence of categorical random variables. In order to address questions such as the reliability of the descriptive value, one must consider sample-to-population inferential approaches. This paper deals with the posterior distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean, and analytical approximations for the variance, skewness and kurtosis, are derived. These approximations have a guaranteed accuracy level of the order O(1/n^3), where n is the sample size. Leading-order approximations for the mean and the variance are derived in the case of incomplete samples. The derived analytical expressions allow the distribution of mutual information to be approximated reliably and quickly. In fact, the derived expressions can be computed with the same order of complexity needed for descriptive mutual information. This makes the distribution of mutual information a concrete alternative to descriptive mutual information in many applications that would benefit from moving to the inductive side. Some of these prospective applications are discussed, and one of them, namely feature selection, is shown to perform significantly better when inductive mutual information is used.
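A rough sketch of the descriptive quantity and of a commonly used leading-order (Miller-Madow-type) correction to its posterior mean; the paper's exact and higher-order expressions are more involved, so the second function below is only an approximation of the idea, not the paper's formula:

    import numpy as np

    def descriptive_mi(counts):
        # Empirical mutual information (in nats) of an r x s contingency table.
        n = counts.sum()
        p = counts / n
        pi, pj = p.sum(axis=1, keepdims=True), p.sum(axis=0, keepdims=True)
        nz = p > 0
        return float((p[nz] * np.log(p[nz] / (pi @ pj)[nz])).sum())

    def mi_posterior_mean_leading_order(counts):
        # Leading-order approximation of the posterior mean of mutual information:
        # empirical MI plus (r-1)(s-1)/(2n), with an r x s table and sample size n.
        r, s = counts.shape
        n = counts.sum()
        return descriptive_mi(counts) + (r - 1) * (s - 1) / (2.0 * n)

    counts = np.array([[30, 10], [5, 55]])
    print(descriptive_mi(counts), mi_posterior_mean_leading_order(counts))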
Ensembles of Protein Molecules as Statistical Analog Computers
A class of analog computers built from large numbers of microscopic probabilistic machines is discussed. It is postulated that such computers are implemented in biological systems as ensembles of protein molecules. The formalism is based on an abstract computational model referred to as a Protein Molecule Machine (PMM). A PMM is a continuous-time first-order Markov system with real input and output vectors, a finite set of discrete states, and input-dependent conditional probability densities of state transitions. The output of a PMM is a function of its input and state. The components of the input vector, called generalized potentials, can be interpreted as membrane potentials and concentrations of neurotransmitters. The components of the output vector, called generalized currents, can represent ion currents and the flows of second messengers. An Ensemble of PMMs (EPMM) is a set of independent identical PMMs with the same input vector, whose output vector is the sum of the output vectors of the individual PMMs. The paper suggests that biological neurons have much more sophisticated computational resources than the presently popular models of artificial neurons.
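A minimal sketch of the ensemble idea, with invented two-state machines, an input-dependent transition rate, and a state-dependent output; all parameters below are illustrative assumptions, not the paper's model:

    import numpy as np

    def simulate_epmm(n_machines, potential, t_end, dt=0.01, seed=0):
        # Ensemble of identical two-state machines (states 0=closed, 1=open).
        # The opening rate grows with the input `potential`; the ensemble output
        # is the number of open machines (a crude analog of a summed ionic current).
        rng = np.random.default_rng(seed)
        states = np.zeros(n_machines, dtype=int)
        k_open, k_close = 0.5 * np.exp(potential), 1.0   # input-dependent rates
        trace = []
        for _ in range(int(t_end / dt)):
            u = rng.random(n_machines)
            opening = (states == 0) & (u < k_open * dt)
            closing = (states == 1) & (u < k_close * dt)
            states = np.where(opening, 1, np.where(closing, 0, states))
            trace.append(states.sum())    # ensemble output = sum of unit outputs
        return np.array(trace)

    print(simulate_epmm(1000, potential=0.5, t_end=5.0)[-1])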