Adaptive Gaussian Predictive Process Approximation

arXiv.org Machine Learning

We address the issue of knot selection for Gaussian predictive process methodology. Predictive process approximation provides an effective solution to the cubic order computational complexity of Gaussian process models. This approximation crucially depends on a set of points, called knots, at which the original process is retained, while the rest is approximated via a deterministic extrapolation. Knots should be few in number to keep the computational complexity low, but provide a good coverage of the process domain to limit approximation error. We present theoretical calculations to show that coverage must be judged by the canonical metric of the Gaussian process. This necessitates having in place a knot selection algorithm that automatically adapts to the changes in the canonical metric caused by changes in the parameter values controlling the Gaussian process covariance function. We present an algorithm toward this by employing an incomplete Cholesky factorization with pivoting and dynamic stopping. Although these concepts already exist in the literature, our contribution lies in unifying them into a fast algorithm and in using computable error bounds to finesse implementation of the predictive process approximation. The resulting adaptive predictive process offers a substantial automatization of Gaussian process model fitting, especially for Bayesian applications where thousands of values of the covariance parameters are to be explored.
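As a rough illustration of the idea in this abstract, the sketch below selects knots with a pivoted incomplete Cholesky factorization that stops once the largest residual variance falls below a tolerance. It is not the authors' implementation: the names `sq_exp_kernel` and `select_knots`, the squared-exponential covariance, and the `tol` stopping rule are all assumptions made for the example.

```python
import math


def sq_exp_kernel(x, y, length_scale=1.0):
    """Squared-exponential covariance on the real line (an assumed example kernel)."""
    return math.exp(-((x - y) ** 2) / (2.0 * length_scale ** 2))


def select_knots(points, kernel, tol=1e-3, max_knots=None):
    """Pick knots by pivoted incomplete Cholesky with a dynamic stopping rule.

    Returns the indices of the chosen knots and the residual variance left
    at every candidate point after conditioning on those knots.
    """
    n = len(points)
    max_knots = n if max_knots is None else min(max_knots, n)
    resid = [kernel(x, x) for x in points]  # diagonal of the residual covariance
    factors = []                            # Cholesky columns computed so far
    pivots = []                             # indices chosen as knots
    while len(pivots) < max_knots:
        # Pivot on the point that is currently approximated worst.
        p = max(range(n), key=resid.__getitem__)
        if resid[p] <= tol:                 # dynamic stopping: error is small enough
            break
        root = math.sqrt(resid[p])
        col = []
        for i in range(n):
            cov = kernel(points[i], points[p])
            cov -= sum(f[i] * f[p] for f in factors)
            col.append(cov / root)
        factors.append(col)
        pivots.append(p)
        # Clamp at zero to absorb floating-point round-off.
        resid = [max(r - c * c, 0.0) for r, c in zip(resid, col)]
    return pivots, resid


if __name__ == "__main__":
    grid = [i / 49 for i in range(50)]
    for scale in (0.5, 0.05):
        knots, resid = select_knots(grid, lambda x, y: sq_exp_kernel(x, y, scale))
        print(scale, len(knots), max(resid))
```

In this sketch the number of knots is not fixed in advance: rerunning `select_knots` with a different length scale changes how many knots are needed to reach the same tolerance, which is the kind of adaptation to covariance parameters the abstract describes.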


Reasoning in the OWL 2 Full Ontology Language using First-Order Automated Theorem Proving

arXiv.org Artificial Intelligence

OWL 2 has been standardized by the World Wide Web Consortium (W3C) as a family of ontology languages for the Semantic Web. The most expressive of these languages is OWL 2 Full, but to date no reasoner has been implemented for this language. Consistency and entailment checking are known to be undecidable for OWL 2 Full. We have translated a large fragment of the OWL 2 Full semantics into first-order logic, and used automated theorem proving systems to do reasoning based on this theory. The results are promising, and indicate that this approach can be applied in practice for effective OWL reasoning, beyond the capabilities of current Semantic Web reasoners. This is an extended version of a paper with the same title that has been published at CADE 2011, LNAI 6803, pp. 446-460. The extended version provides appendices with additional resources that were used in the reported evaluation.


Efficient Multi-Start Strategies for Local Search Algorithms

Journal of Artificial Intelligence Research

Local search algorithms applied to optimization problems often suffer from getting trapped in a local optimum. The common solution for this deficiency is to restart the algorithm when no progress is observed. Alternatively, one can start multiple instances of a local search algorithm, and allocate computational resources (in particular, processing time) to the instances depending on their behavior. Hence, a multi-start strategy has to decide (dynamically) when to allocate additional resources to a particular instance and when to start new instances. In this paper we propose multi-start strategies motivated by works on multi-armed bandit problems and Lipschitz optimization with an unknown constant. The strategies continuously estimate the potential performance of each algorithm instance by supposing a convergence rate of the local search algorithm up to an unknown constant, and in every phase allocate resources to those instances that could converge to the optimum for a particular range of the constant. Asymptotic bounds are given on the performance of the strategies. In particular, we prove that at most a quadratic increase in the number of times the target function is evaluated is needed to achieve the performance of a local search algorithm started from the attraction region of the optimum. Experiments are provided using SPSA (Simultaneous Perturbation Stochastic Approximation) and k-means as local search algorithms, and the results indicate that the proposed strategies work well in practice, and, in all cases studied, need only logarithmically more evaluations of the target function as opposed to the theoretically suggested quadratic increase.


Policy Invariance under Reward Transformations for General-Sum Stochastic Games

Journal of Artificial Intelligence Research

We extend the potential-based shaping method from Markov decision processes to multi-player general-sum stochastic games. We prove that the Nash equilibria in a stochastic game remain unchanged after potential-based shaping is applied to the environment. The property of policy invariance provides a possible way of speeding convergence when learning to play a stochastic game.
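A minimal sketch of why potential-based shaping leaves preferences between strategies intact, assuming the standard single-agent form of the shaping term, gamma * Phi(s') - Phi(s), applied to one player's rewards. The function names and the toy potential are hypothetical and are not taken from the paper. Along any trajectory the discounted shaping terms telescope, so their sum depends only on the first and last states, not on the actions taken in between.

```python
def shaping_term(potential, state, next_state, gamma):
    """Potential-based shaping reward: gamma * Phi(s') - Phi(s)."""
    return gamma * potential(next_state) - potential(state)


def discounted_shaping_return(potential, states, gamma):
    """Discounted sum of shaping terms along a trajectory of visited states."""
    total = 0.0
    for t in range(len(states) - 1):
        total += gamma ** t * shaping_term(potential, states[t], states[t + 1], gamma)
    return total


if __name__ == "__main__":
    phi = lambda s: float(s * s)   # hypothetical potential for one player
    gamma = 0.9
    path_a = [0, 3, 1, 4]          # two different routes with the
    path_b = [0, 2, 2, 4]          # same start and end states
    print(discounted_shaping_return(phi, path_a, gamma))
    print(discounted_shaping_return(phi, path_b, gamma))
```

Both routes receive the same total shaping bonus, gamma^T * Phi(s_T) - Phi(s_0), up to floating-point round-off, so the shaping changes every strategy's return from a given start state by an amount that does not depend on the route taken.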


The Opposite of Smoothing: A Language Model Approach to Ranking Query-Specific Document Clusters

Journal of Artificial Intelligence Research

Exploiting information induced from (query-specific) clustering of top-retrieved documents has long been proposed as a means for improving precision at the very top ranks of the returned results. We present a novel language model approach to ranking query-specific clusters by the presumed percentage of relevant documents that they contain. While most previous cluster ranking approaches focus on the cluster as a whole, our model also utilizes information induced from documents associated with the cluster. Our model substantially outperforms previous approaches for identifying clusters containing a high relevant-document percentage. Furthermore, using the model to produce document ranking yields precision-at-top-ranks performance that is consistently better than that of the initial ranking upon which clustering is performed. The performance also compares favorably with that of a state-of-the-art pseudo-feedback-based retrieval method.


Technical Note: Towards ROC Curves in Cost Space

arXiv.org Artificial Intelligence

ROC curves and cost curves are two popular ways of visualising classifier performance, finding appropriate thresholds according to the operating condition, and deriving useful aggregated measures such as the area under the ROC curve (AUC) or the area under the optimal cost curve. In this note we present some new findings and connections between ROC space and cost space, by using the expected loss over a range of operating conditions. In particular, we show that ROC curves can be transferred to cost space by means of a very natural way of understanding how thresholds should be chosen, by selecting the threshold such that the proportion of positive predictions equals the operating condition (either in the form of cost proportion or skew). We call these new curves ROC Cost Curves, and we demonstrate that the expected loss as measured by the area under these curves is linearly related to AUC. This opens up a series of new possibilities and clarifies the notion of cost curve and its relation to ROC analysis. In addition, we show that for a classifier that assigns the scores in an evenly-spaced way, these curves are equal to the Brier Curves. As a result, this establishes the first clear connection between AUC and the Brier score.


Information, Utility & Bounded Rationality

arXiv.org Artificial Intelligence

Perfectly rational decision-makers maximize expected utility, but crucially ignore the resource costs incurred when determining optimal actions. Here we propose an axiomatic framework for bounded rational decision-making based on a thermodynamic interpretation of resource costs as information costs. We show that this axiomatic framework enforces a unique conversion law between utility and information, which can be characterized by a variational "free utility" principle akin to thermodynamical free energy. This variational principle constitutes a normative criterion that trades off utility and information costs, the latter measured by the Kullback-Leibler deviation between a distribution representing a desired policy and a reference distribution representing an initial default policy. We show that bounded optimal control solutions can be derived from this variational principle, which leads in general to stochastic policies. Furthermore, we show that risk-sensitive and robust (minimax) control schemes fall out naturally from this framework if the environment is considered as an adversarial opponent. When resource costs are ignored, the maximum expected utility principle is recovered.
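The trade-off described in this abstract can be written out in a standard form. The notation below is an assumption for illustration and may differ from the paper's: $p_0$ is the default policy, $U$ the utility, and $\beta > 0$ a hypothetical parameter that converts information into utility units.

```latex
\begin{align}
  % Utility minus the information cost of departing from the default policy p_0
  F_\beta[p] &= \sum_x p(x)\,U(x) \;-\; \frac{1}{\beta} \sum_x p(x) \log \frac{p(x)}{p_0(x)}, \\
  % The maximizer is a stochastic policy that reweights p_0 by exponentiated utility
  p^*(x) &= \frac{p_0(x)\, e^{\beta U(x)}}{\sum_{x'} p_0(x')\, e^{\beta U(x')}}, \\
  % Value of the functional at its maximizer
  F_\beta[p^*] &= \frac{1}{\beta} \log \sum_x p_0(x)\, e^{\beta U(x)}.
\end{align}
```

Under these assumptions, letting $\beta \to \infty$ makes the information cost negligible and $p^*$ concentrates on the utility-maximizing actions in the support of $p_0$, while $\beta \to 0$ leaves the policy at the default $p_0$.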


Sharp Convergence Rate and Support Consistency of Multiple Kernel Learning with Sparse and Dense Regularization

arXiv.org Machine Learning

We theoretically investigate the convergence rate and support consistency (i.e., correctly identifying the subset of non-zero coefficients in the large sample limit) of multiple kernel learning (MKL). We focus on MKL with block-l1 regularization (inducing sparse kernel combination), block-l2 regularization (inducing uniform kernel combination), and elastic-net regularization (including both block-l1 and block-l2 regularization). For the case where the true kernel combination is sparse, we show a sharper convergence rate of the block-l1 and elastic-net MKL methods than the existing rate for block-l1 MKL. We further show that elastic-net MKL requires a milder condition for being consistent than block-l1 MKL. For the case where the optimal kernel combination is not exactly sparse, we prove that elastic-net MKL can achieve a faster convergence rate than the block-l1 and block-l2 MKL methods by carefully controlling the balance between the block-l1 and block-l2 regularizers. Thus, our theoretical results overall suggest the use of elastic-net regularization in MKL.


Nonparametric Bayesian sparse factor models with application to gene expression modeling

arXiv.org Artificial Intelligence

A nonparametric Bayesian extension of Factor Analysis (FA) is proposed where observed data $\mathbf{Y}$ is modeled as a linear superposition, $\mathbf{G}$, of a potentially infinite number of hidden factors, $\mathbf{X}$. The Indian Buffet Process (IBP) is used as a prior on $\mathbf{G}$ to incorporate sparsity and to allow the number of latent features to be inferred. The model's utility for modeling gene expression data is investigated using randomly generated data sets based on a known sparse connectivity matrix for E. coli, and on three biological data sets of increasing complexity.


On the Undecidability of Fuzzy Description Logics with GCIs with Lukasiewicz t-norm

arXiv.org Artificial Intelligence

Recently there have been some unexpected results concerning Fuzzy Description Logics (FDLs) with General Concept Inclusions (GCIs). They show that, unlike the classical case, the DL ALC with GCIs does not have the finite model property under Lukasiewicz Logic or Product Logic and, specifically, knowledge base satisfiability is an undecidable problem for Product Logic. We complete here the analysis by showing that knowledge base satisfiability is also an undecidable problem for Lukasiewicz Logic.