Learning to Make Predictions In Partially Observable Environments Without a Generative Model
When faced with the problem of learning a model of a high-dimensional environment, a common approach is to limit the model to make only a restricted set of predictions, thereby simplifying the learning problem. These partial models may be directly useful for making decisions or may be combined together to form a more complete, structured model. However, in partially observable (non-Markov) environments, standard model-learning methods learn generative models, i.e. models that provide a probability distribution over all possible futures (such as POMDPs). It is not straightforward to restrict such models to make only certain predictions, and doing so does not always simplify the learning problem. In this paper we present prediction profile models: non-generative partial models for partially observable systems that make only a given set of predictions, and are therefore far simpler than generative models in some cases. We formalize the problem of learning a prediction profile model as a transformation of the original model-learning problem, and show empirically that one can learn prediction profile models that make a small set of important predictions even in systems that are too complex for standard generative models.
Mining Biclusters of Similar Values with Triadic Concept Analysis
Kaytoue, Mehdi, Kuznetsov, Sergei O., Macko, Juraj, Meira, Wagner, Napoli, Amedeo
Biclustering numerical data became a popular data-mining task in the early 2000s, especially for analysing gene expression data. A bicluster reflects a strong association between a subset of objects and a subset of attributes in a numerical object/attribute data table. So-called biclusters of similar values can be thought of as maximal sub-tables with close values. Only a few methods address a complete, correct and non-redundant enumeration of such patterns, which is a well-known intractable problem, and no formal framework exists. In this paper, we introduce important links between biclustering and formal concept analysis. More specifically, we show for the first time that Triadic Concept Analysis (TCA) provides a suitable mathematical framework for biclustering. Interestingly, existing TCA algorithms, which usually apply to binary data, can be used (directly or with slight modifications) after a preprocessing step to extract maximal biclusters of similar values.
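As a rough illustration of "maximal sub-tables with close values", the sketch below checks whether a chosen set of rows and columns forms a bicluster of similar values and whether it is maximal. It assumes that "close" means every pair of values in the sub-table differs by at most a tolerance `theta`. The function names, the toy table, and `theta` are hypothetical and not taken from the paper. It does not use Triadic Concept Analysis; it only tests a given candidate.

```python
def is_similar(table, rows, cols, theta):
    """True if all values in the sub-table differ pairwise by at most theta."""
    values = [table[r][c] for r in rows for c in cols]
    return max(values) - min(values) <= theta


def is_maximal(table, rows, cols, theta):
    """True if the sub-table is similar and no single row or column can be added."""
    if not is_similar(table, rows, cols, theta):
        return False
    for r in range(len(table)):
        if r not in rows and is_similar(table, rows | {r}, cols, theta):
            return False
    for c in range(len(table[0])):
        if c not in cols and is_similar(table, rows, cols | {c}, theta):
            return False
    return True


# Toy object/attribute table (hypothetical values).
TABLE = [
    [1, 2, 9],
    [2, 1, 8],
    [7, 9, 1],
]

print(is_maximal(TABLE, {0, 1}, {0, 1}, theta=1))
print(is_maximal(TABLE, {0}, {0}, theta=1))
```

Enumerating every maximal bicluster this way would require testing all row and column subsets, which is the intractability the abstract refers to.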
Most Relevant Explanation in Bayesian Networks
A major inference task in Bayesian networks is explaining why some variables are observed in their particular states using a set of target variables. Existing methods for solving this problem often generate explanations that are either too simple (underspecified) or too complex (overspecified). In this paper, we introduce a method called Most Relevant Explanation (MRE), which finds a partial instantiation of the target variables that maximizes the generalized Bayes factor (GBF) as the best explanation for the given evidence. Our study shows that GBF has several theoretical properties that enable MRE to automatically identify the most relevant target variables in forming its explanation. In particular, the conditional Bayes factor (CBF), defined as the GBF of a new explanation conditioned on an existing explanation, provides a soft measure of the degree of relevance of the variables in the new explanation in explaining the evidence given the existing explanation. As a result, MRE is able to automatically prune less relevant variables from its explanation. We also show that CBF captures well the explaining-away phenomenon that is often represented in Bayesian networks. Moreover, we define two dominance relations between the candidate solutions and use the relations to generalize MRE to find a set of top explanations that is both diverse and representative. Case studies on several benchmark diagnostic Bayesian networks show that MRE is often able to find explanatory hypotheses that are not only precise but also concise.
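The following is a minimal brute-force sketch of the idea, not the authors' algorithm. It assumes the usual definition GBF(x; e) = P(e | x) / P(e | not x), which the abstract does not spell out, and it enumerates every partial instantiation of two binary targets in a tiny hand-made joint distribution. All probabilities, variable names, and function names are hypothetical.

```python
from itertools import combinations, product

VARS = ("A", "B", "E")  # two binary target variables and one evidence variable


def build_joint():
    """Toy joint distribution: A and B are independent causes of E (made-up numbers)."""
    p_a, p_b = 0.1, 0.2
    p_e1 = {(0, 0): 0.01, (0, 1): 0.3, (1, 0): 0.9, (1, 1): 0.95}  # P(E=1 | A, B)
    joint = {}
    for a, b, e in product((0, 1), repeat=3):
        pa = p_a if a else 1 - p_a
        pb = p_b if b else 1 - p_b
        pe = p_e1[(a, b)] if e else 1 - p_e1[(a, b)]
        joint[(a, b, e)] = pa * pb * pe
    return joint


def prob(joint, assignment):
    """Marginal probability of a partial assignment given as {variable: value}."""
    idx = {v: i for i, v in enumerate(VARS)}
    return sum(p for world, p in joint.items()
               if all(world[idx[v]] == val for v, val in assignment.items()))


def gbf(joint, x, evidence):
    """Assumed definition: GBF(x; e) = P(e | x) / P(e | not x).

    Requires 0 < P(x) < 1 and P(e, not x) > 0, which holds in the toy model.
    """
    p_x = prob(joint, x)
    p_e = prob(joint, evidence)
    p_xe = prob(joint, {**x, **evidence})
    return (p_xe / p_x) / ((p_e - p_xe) / (1 - p_x))


def most_relevant_explanation(joint, targets, evidence):
    """Exhaustively score every partial instantiation of the targets by GBF."""
    best_x, best_score = None, float("-inf")
    for k in range(1, len(targets) + 1):
        for subset in combinations(targets, k):
            for values in product((0, 1), repeat=k):
                x = dict(zip(subset, values))
                score = gbf(joint, x, evidence)
                if score > best_score:
                    best_x, best_score = x, score
    return best_x, best_score


JOINT = build_joint()
BEST_X, BEST_SCORE = most_relevant_explanation(JOINT, ("A", "B"), {"E": 1})
print(BEST_X, round(BEST_SCORE, 2))
```

In this toy model the single-variable explanation `A = 1` scores higher than any explanation that also fixes `B`, which mirrors the pruning of less relevant variables described above. Exhaustive enumeration grows exponentially with the number of targets, so it is only practical for very small examples.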
On the stability of bootstrap estimators
Christmann, Andreas, Salibian-Barrera, Matias, Van Aelst, Stefan
It is shown that bootstrap approximations of an estimator which is based on a continuous operator from the set of Borel probability measures defined on a compact metric space into a complete separable metric space are stable in the sense of qualitative robustness. Support vector machines based on shifted loss functions are treated as special cases.
Estimation of scale functions to model heteroscedasticity by support vector machines
Hable, Robert, Christmann, Andreas
A main goal of regression is to derive statistical conclusions on the conditional distribution of the output variable Y given the input values x. Two of the most important characteristics of a single distribution are location and scale. Support vector machines (SVMs) are well established for estimating location functions such as the conditional median or the conditional mean. We investigate the estimation of scale functions by SVMs when the conditional median is also unknown. Estimation of scale functions is important, for example, for estimating volatility in finance. We consider the median absolute deviation (MAD) and the interquantile range (IQR) as measures of scale. Our main result shows the consistency of MAD-type SVMs.
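For readers unfamiliar with the two scale measures, the sketch below computes them for a plain, unconditional sample using only the Python standard library. It is not the SVM-based conditional estimator studied in the paper. The quartile-based range is used here as one common instance of an interquantile range, and the sample values and function names are hypothetical.

```python
import statistics


def mad(values):
    """Median absolute deviation: the median of |v - median(values)|."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def interquartile_range(values):
    """Distance between the third and first quartile cut points."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


SAMPLE = [1, 2, 3, 4, 100]  # one large outlier

print(mad(SAMPLE))
print(interquartile_range(SAMPLE))
print(statistics.stdev(SAMPLE))
```

The MAD depends on the outlier only through its rank, whereas the standard deviation is dominated by it, which is one reason such measures are attractive as robust notions of scale.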
8-Valent Fuzzy Logic for Iris Recognition and Biometry
Popescu-Bodorin, N., Balas, V. E., Motoc, I. M.
This paper shows that maintaining the logical consistency of an iris recognition system is a matter of finding a suitable partitioning of the input space into enrollable and unenrollable pairs by negotiating between user comfort and the safety of the biometric system. In other words, consistent enrollment is mandatory in order to preserve system consistency. A fuzzy 3-valued disambiguated model of iris recognition is proposed and analyzed in terms of completeness, consistency, user comfort and biometric safety. It is also shown here that the fuzzy 3-valued model of iris recognition is hosted by an 8-valued Boolean algebra of modulo 8 integers that represents the computational formalization in which a biometric system (a software agent) can achieve the artificial understanding of iris recognition in a logically consistent manner.
Diverse Consequences of Algorithmic Probability
We reminisce and discuss applications of algorithmic probability to a wide range of problems in artificial intelligence, philosophy and technological society. We propose that Solomonoff has effectively axiomatized the field of artificial intelligence, thereby establishing it as a rigorous scientific discipline. We also relate these ideas to our own work in incremental machine learning and philosophy of complexity.
Centrality-as-Relevance: Support Sets and Similarity as Geometric Proximity
Ribeiro, R., Martins de Matos, D.
In automatic summarization, centrality-as-relevance means that the most important content of an information source, or a collection of information sources, corresponds to the most central passages, considering a representation where such a notion makes sense (graph, spatial, etc.). We assess the main paradigms, and introduce a new centrality-based relevance model for automatic summarization that relies on the use of support sets to better estimate the relevant content. Geometric proximity is used to compute semantic relatedness. Centrality (relevance) is determined by considering the whole input source (and not only local information), and by taking into account the existence of minor topics or lateral subjects in the information sources to be summarized. The method consists of creating, for each passage of the input source, a support set consisting only of the most semantically related passages. Then, the most relevant content is determined by selecting the passages that occur in the largest number of support sets. This model produces extractive summaries that are generic, and language- and domain-independent. Thorough automatic evaluation shows that the method achieves state-of-the-art performance in both written-text and automatically transcribed speech summarization, including when compared to considerably more complex approaches.
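The support-set procedure described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the authors' model: passages are represented as raw term-count vectors, relatedness is cosine similarity, and each support set holds a fixed number `k` of nearest passages. The paper's actual representations, proximity measures, and support-set sizing may differ, and all names and example passages here are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_support(passages, k=2):
    """Rank passages by how many other passages' support sets contain them."""
    vectors = [Counter(p.lower().split()) for p in passages]
    counts = [0] * len(passages)
    for i, vec in enumerate(vectors):
        others = [j for j in range(len(vectors)) if j != i]
        # Support set of passage i: its k most similar other passages.
        others.sort(key=lambda j: cosine(vec, vectors[j]), reverse=True)
        for j in others[:k]:
            counts[j] += 1
    order = sorted(range(len(passages)), key=lambda i: counts[i], reverse=True)
    return order, counts


PASSAGES = [
    "the storm closed the airport on monday",
    "flights were cancelled when the storm closed the airport",
    "the airport reopened after the storm passed",
    "a local bakery won a national award",
]

ORDER, COUNTS = rank_by_support(PASSAGES, k=2)
for i in ORDER:
    print(COUNTS[i], PASSAGES[i])
```

An extractive summary would then take passages from the front of `ORDER` until a length budget is reached.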
Extracting Topological Information from Spatial Constraint Databases
Wu, Shasha (Spring Arbor University) | Revesz, Peter (University of Nebraska - Lincoln)
This paper presents an efficient topology information extraction algorithm that is capable of extracting primary topological relations, such as interior, boundary, and exterior, from a single spatial or spatio-temporal object stored in a linear constraint database. Any non-spatial constraints are preserved, so the input spatio-temporal object’s temporal constraints are not sacrificed by the algorithm. Based on the three primary topological relations, more topological relations between regions, lines, and points can be defined in a constraint database for future spatial analysis.
Reformulating the Dual Graphs of CSPs to Improve the Performance of Relational Neighborhood Inverse Consistency
Woodward, Robert J. (University of Nebraska-Lincoln) | Karakashian, Shant (University of Nebraska-Lincoln) | Choueiry, Berthe Y. (University of Nebraska-Lincoln) | Bessiere, Christian (University of Montpellier)
Freuder and Elfe (1996) introduced Neighborhood Inverse Consistency (NIC) as a new local consistency property for binary Constraint Satisfaction Problems (CSPs). Two advantages of the algorithm for enforcing NIC are that it automatically adapts its filtering power to the local connectivity of the network and has insignificant space overhead. However, studies on binary CSPs have shown that enforcing NIC is not effective on sparse graphs and too costly on dense graphs. In (Woodward et al. 2011), we introduced an algorithm for enforcing Relational Neighborhood Inverse Consistency (RNIC), which is an extension of NIC to non-binary CSPs. In this paper, we discuss how we enhance the propagation effectiveness of our algorithm and reduce its computational cost by reformulating the dual graph of the CSP. For that purpose, we describe two reformulation techniques that modify the topology of the dual graph without affecting the solution set of the problem. We present the two reformulations and their combinations, and discuss their effects on the consistency property enforced by the algorithm. We also describe a selection policy that ties together the various components of our approach in a consistent, adaptive framework. Finally, we show that our automated selection policy outperforms all approaches in a statistically significant manner.