Goto

Collaborating Authors

 Europe


Bayesian Causal Induction

arXiv.org Artificial Intelligence

Discovering causal relationships is a hard task, often hindered by the need for intervention, and often requiring large amounts of data to resolve statistical uncertainty. However, humans quickly arrive at useful causal relationships. One possible reason is that humans extrapolate from past experience to new, unseen situations: that is, they encode beliefs over causal invariances, allowing for sound generalization from the observations they obtain from directly acting in the world. Here we outline a Bayesian model of causal induction where beliefs over competing causal hypotheses are modeled using probability trees. Based on this model, we illustrate why, in the general case, we need interventions plus constraints on our causal hypotheses in order to extract causal information from our experience.


Unfounded Sets and Well-Founded Semantics of Answer Set Programs with Aggregates

Journal of Artificial Intelligence Research

Logic programs with aggregates (LPA) are one of the major linguistic extensions to Logic Programming (LP). In this work, we propose a generalization of the notions of unfounded set and well-founded semantics for programs with monotone and antimonotone aggregates (LPAma programs). In particular, we present a new notion of unfounded set for LPAma programs, which is a sound generalization of the original definition for standard (aggregate-free) LP. On this basis, we define a well-founded operator for LPAma programs, the fixpoint of which is called well-founded model (or well-founded semantics) for LPAma programs. The most important properties of unfounded sets and the well-founded semantics for standard LP are retained by this generalization, notably existence and uniqueness of the well-founded model, together with a strong relationship to the answer set semantics for LPAma programs. We show that one of the D-well-founded semantics, defined by Pelov, Denecker, and Bruynooghe for a broader class of aggregates using approximating operators, coincides with the well-founded model as defined in this work on LPAma programs. We also discuss some complexity issues, most importantly we give a formal proof of tractable computation of the well-founded model for LPA programs. Moreover, we prove that for general LPA programs, which may contain aggregates that are neither monotone nor antimonotone, deciding satisfaction of aggregate expressions with respect to partial interpretations is coNP-complete. As a consequence, a well-founded semantics for general LPA programs that allows for tractable computation is unlikely to exist, which justifies the restriction on LPAma programs. Finally, we present a prototype system extending DLV, which supports the well-founded semantics for LPAma programs, at the time of writing the only implemented system that does so. Experiments with this prototype show significant computational advantages of aggregate constructs over equivalent aggregate-free encodings.


Information, learning and falsification

arXiv.org Machine Learning

There are (at least) three approaches to quantifying information. The first, algorithmic information or Kolmogorov complexity, takes events as strings and, given a universal Turing machine, quantifies the information content of a string as the length of the shortest program producing it. The second, Shannon information, takes events as belonging to ensembles and quantifies the information resulting from observing the given event in terms of the number of alternate events that have been ruled out. The third, statistical learning theory, has introduced measures of capacity that control (in part) the expected risk of classifiers. These capacities quantify the expectations regarding future data that learning algorithms embed into classifiers. This note describes a new method of quantifying information, effective information, that links algorithmic information to Shannon information, and also links both to capacities arising in statistical learning theory. After introducing the measure, we show that it provides a non-universal analog of Kolmogorov complexity. We then apply it to derive basic capacities in statistical learning theory: empirical VC-entropy and empirical Rademacher complexity. A nice byproduct of our approach is an interpretation of the explanatory power of a learning algorithm in terms of the number of hypotheses it falsifies, counted in two different ways for the two capacities. We also discuss how effective information relates to information gain, Shannon and mutual information.


A kernel-based framework for learning graded relations from data

arXiv.org Machine Learning

Driven by a large number of potential applications in areas like bioinformatics, information retrieval and social network analysis, the problem setting of inferring relations between pairs of data objects has recently been investigated quite intensively in the machine learning community. To this end, current approaches typically consider datasets containing crisp relations, so that standard classification methods can be adopted. However, relations between objects like similarities and preferences are often expressed in a graded manner in real-world applications. A general kernel-based framework for learning relations from data is introduced here. It extends existing approaches because both crisp and graded relations are considered, and it unifies existing approaches because different types of graded relations can be modeled, including symmetric and reciprocal relations. This framework establishes important links between recent developments in fuzzy set theory and machine learning. Its usefulness is demonstrated through various experiments on synthetic and real-world data.


Graph based E-Government web service composition

arXiv.org Artificial Intelligence

Nowadays, e-government has emerged as a government policy to improve the quality and efficiency of public administrations. By exploiting the potential of new information and communication technologies, government agencies are providing a wide spectrum of online services. These services are composed of several web services that comply with well defined processes. One of the big challenges is the need to optimize the composition of the elementary web services. In this paper, we present a solution for optimizing the computation effort in web service composition. Our method is based on Graph Theory. We model the semantic relationship between the involved web services through a directed graph. Then, we compute all shortest paths using for the first time, an extended version of the Floyd-Warshall algorithm.


3D Model Retrieval Based on Semantic and Shape Indexes

arXiv.org Artificial Intelligence

The size of 3D models used on the web or stored in databases is becoming increasingly high. Then, an efficient method that allows users to find similar 3D objects for a given 3D model query has become necessary. Keywords and the geometry of a 3D model cannot meet the needs of users' retrieval because they do not include the semantic information. In this paper, a new method has been proposed to 3D models retrieval using semantic concepts combined with shape indexes. To obtain these concepts, we use the machine learning methods to label 3D models by k-means algorithm in measures and shape indexes space. Moreover, semantic concepts have been organized and represented by ontology language OWL and spatial relationships are used to disambiguate among models of similar appearance. The SPARQL query language has been used to question the information displayed in this language and to compute the similarity between two 3D models. We interpret our results using the Princeton Shape Benchmark Database and the results show the performance of the proposed new approach to retrieval 3D models. Keywords: 3D Model, 3D retrieval, measures, shape indexes, semantic, ontology


Additive Covariance Kernels for High-Dimensional Gaussian Process Modeling

arXiv.org Machine Learning

Gaussian process models -also called Kriging models- are often used as mathematical approximations of expensive experiments. However, the number of observation required for building an emulator becomes unrealistic when using classical covariance kernels when the dimension of input increases. In oder to get round the curse of dimensionality, a popular approach is to consider simplified models such as additive models. The ambition of the present work is to give an insight into covariance kernels that are well suited for building additive Kriging models and to describe some properties of the resulting models.


Fast, Linear Time, m-Adic Hierarchical Clustering for Search and Retrieval using the Baire Metric, with linkages to Generalized Ultrametrics, Hashing, Formal Concept Analysis, and Precision of Data Measurement

arXiv.org Machine Learning

In areas such as search, matching, retrieval and general data analysis, massive increase in data requires new methods that can cope well with the explosion in volume and dimensionality of the available data. In this work, the Baire metric, which is furthermore an ultrametric, is used to induce a hierarchy and in turn to support clustering, matching and other operations. Arising directly out of the Baire distance is an ultrametric tree, which also can be seen as a tree that hierarchically clusters data. This presents a number of advantages when storing and retrieving data. When the data source is in numerical form this ultrametric tree can be used as an index structure making matching and search, and thus retrieval, much easier. The clusters can be associated with hash keys, that is to say, the cluster members can be mapped onto "bins" or "buckets".


Pattern-Based Classification: A Unifying Perspective

arXiv.org Artificial Intelligence

The use of patterns in predictive models is a topic that has received a lot of attention in recent years. Pattern mining can help to obtain models for structured domains, such as graphs and sequences, and has been proposed as a means to obtain more accurate and more interpretable models. Despite the large amount of publications devoted to this topic, we believe however that an overview of what has been accomplished in this area is missing. This paper presents our perspective on this evolving area. We identify the principles of pattern mining that are important when mining patterns for models and provide an overview of pattern-based classification methods. We categorize these methods along the following dimensions: (1) whether they post-process a pre-computed set of patterns or iteratively execute pattern mining algorithms; (2) whether they select patterns model-independently or whether the pattern selection is guided by a model. We summarize the results that have been obtained for each of these methods.


Adaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization

Journal of Artificial Intelligence Research

Many problems in artificial intelligence require adaptively making a sequence of decisions with uncertain outcomes under partial observability. Solving such stochastic optimization problems is a fundamental but notoriously difficult challenge. In this paper, we introduce the concept of adaptive submodularity, generalizing submodular set functions to adaptive policies. We prove that if a problem satisfies this property, a simple adaptive greedy algorithm is guaranteed to be competitive with the optimal policy. In addition to providing performance guarantees for both stochastic maximization and coverage, adaptive submodularity can be exploited to drastically speed up the greedy algorithm by using lazy evaluations. We illustrate the usefulness of the concept by giving several examples of adaptive submodular objectives arising in diverse AI applications including management of sensing resources, viral marketing and active learning. Proving adaptive submodularity for these problems allows us to recover existing results in these applications as special cases, improve approximation guarantees and handle natural generalizations.