Efficient Estimation of Multidimensional Regression Model with Multilayer Perceptron

arXiv.org Machine Learning

This work concerns the estimation of multidimensional nonlinear regression models using a multilayer perceptron (MLP). The main problem with such models is that the covariance matrix of the noise must be known to obtain an optimal estimator. However, we show that if we choose the logarithm of the determinant of the empirical error covariance matrix as the cost function, we get an asymptotically optimal estimator.
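The cost function named in this abstract can be illustrated with a minimal NumPy sketch. It assumes the empirical error covariance matrix is the average outer product of the residuals; the function name `logdet_cost` and the array layout (one observation per row) are illustrative choices, not taken from the paper.

```python
import numpy as np

def logdet_cost(y_true, y_pred):
    """Log-determinant of the empirical error covariance matrix.

    Assumption: y_true and y_pred are (n, d) arrays, one observation per row,
    and the empirical covariance is the mean outer product of the residuals.
    """
    residuals = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    n = residuals.shape[0]
    gamma = residuals.T @ residuals / n          # (d, d) empirical error covariance
    sign, logdet = np.linalg.slogdet(gamma)      # numerically stable log|det|
    if sign <= 0:
        raise ValueError("empirical error covariance is not positive definite")
    return logdet
```

Unlike a weighted least-squares criterion, this cost can be evaluated without supplying the noise covariance matrix, which is the point the abstract makes.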


Testing the number of parameters with multidimensional MLP

arXiv.org Machine Learning

This work concerns testing the number of parameters in a one-hidden-layer multilayer perceptron (MLP). For this purpose, we assume that the models are identifiable up to a finite group of transformations on the weights; this is the case, for example, when the number of hidden units is known. In this framework, we show that we get a simple asymptotic distribution if we use the logarithm of the determinant of the empirical error covariance matrix as the cost function.


Classification Constrained Dimensionality Reduction

arXiv.org Machine Learning

Dimensionality reduction is a topic of recent interest. In this paper, we present the classification constrained dimensionality reduction (CCDR) algorithm to account for label information. The algorithm can account for multiple classes as well as the semi-supervised setting. We present out-of-sample expressions for both labeled and unlabeled data. For unlabeled data, we introduce a method of embedding a new point as preprocessing to a classifier. For labeled data, we introduce a method that improves the embedding during the training phase using the out-of-sample extension. We investigate classification performance using the CCDR algorithm on hyper-spectral satellite imagery data. We demonstrate the performance gain for both local and global classifiers and demonstrate a 10% improvement of the $k$-nearest neighbors algorithm performance. We present a connection between intrinsic dimension estimation and the optimal embedding dimension obtained using the CCDR algorithm.


Design and Implementation of Aggregate Functions in the DLV System

arXiv.org Artificial Intelligence

Disjunctive Logic Programming (DLP) is a very expressive formalism: it allows for expressing every property of finite structures that is decidable in the complexity class $\Sigma^P_2$ (= NP^NP). Despite this high expressiveness, there are some simple properties, often arising in real-world applications, which cannot be encoded in a simple and natural manner. Especially properties that require the use of arithmetic operators (like sum, times, or count) on a set or multiset of elements, which satisfy some conditions, cannot be naturally expressed in classic DLP. To overcome this deficiency, we extend DLP by aggregate functions in a conservative way. In particular, we avoid the introduction of constructs with disputed semantics, by requiring aggregates to be stratified. We formally define the semantics of the extended language (called DLP^A), and illustrate how it can be profitably used for representing knowledge. Furthermore, we analyze the computational complexity of DLP^A, showing that the addition of aggregates does not bring a higher cost in that respect. Finally, we provide an implementation of DLP^A in DLV -- a state-of-the-art DLP system -- and report on experiments which confirm the usefulness of the proposed extension also for the efficiency of computation.


Predicting relevant empty spots in social interaction

arXiv.org Artificial Intelligence

Received: August 15, 2007. © 2006 Springer Science+Business Media, Inc. Abstract: An empty spot is an empty, hard-to-fill space found in the records of social interaction; it is a clue to persons in the underlying social network who do not appear in the records. This contribution addresses the problem of predicting relevant empty spots in social interaction. Homogeneous and inhomogeneous networks are studied as models underlying the social interaction. A heuristic predictor function method is presented as a new method to address the problem. A simulation experiment is demonstrated over a homogeneous network. A test data set in the form of market baskets is generated from the simulated communication. The precision of predicting the empty spots is calculated to demonstrate the performance of the presented method.


On the Expressiveness of Levesque's Normal Form

Journal of Artificial Intelligence Research

Levesque proposed a generalization of a database called a proper knowledge base (KB), which is equivalent to a possibly infinite consistent set of ground literals. In contrast to databases, proper KBs do not make the closed-world assumption and hence the entailment problem becomes undecidable. Levesque then proposed a limited but efficient inference method V for proper KBs, which is sound and, when the query is in a certain normal form, also logically complete. He conjectured that for every first-order query there is an equivalent one in normal form. In this note, we show that this conjecture is false. In fact, we show that any class of formulas for which V is complete must be strictly less expressive than full first-order logic. Moreover, in the propositional case it is very unlikely that a formula always has a polynomial-size normal form.


FINE: Fisher Information Non-parametric Embedding

arXiv.org Machine Learning

We consider the problems of clustering, classification, and visualization of high-dimensional data when no straightforward Euclidean representation exists. Typically, these tasks are performed by first reducing the high-dimensional data to some lower-dimensional Euclidean space, and many manifold learning methods have been developed for this task. In many practical problems, however, the assumption of a Euclidean manifold cannot be justified. In these cases, a more appropriate assumption would be that the data lies on a statistical manifold, or a manifold of probability density functions (PDFs). In this paper we propose using the properties of information geometry in order to define similarities between data sets using the Fisher information metric. We will show this metric can be approximated using entirely non-parametric methods, as the parameterization of the manifold is generally unknown. Furthermore, by using multidimensional scaling methods, we are able to embed the corresponding PDFs into a low-dimensional Euclidean space. This not only allows for classification of the data, but also visualization of the manifold. As a whole, we refer to our framework as Fisher Information Non-parametric Embedding (FINE), and illustrate its uses on a variety of practical problems, including bio-medical applications and document classification.
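The two-stage pipeline described in this abstract (a non-parametric dissimilarity between data sets, followed by multidimensional scaling) might be sketched as follows. This is not the authors' procedure: it swaps in histogram density estimates and the Hellinger distance as a simple stand-in for the approximated Fisher information distance, and uses classical (eigendecomposition-based) MDS. The function names, bin count, and value range are all assumptions made for illustration.

```python
import numpy as np

def hellinger_matrix(datasets, bins=20, value_range=(-5.0, 5.0)):
    """Pairwise Hellinger distances between histogram estimates of 1-D data sets.

    Assumption: each data set is a 1-D array with at least one sample inside
    value_range; the histogram is a crude non-parametric density estimate.
    """
    hists = []
    for x in datasets:
        counts, _ = np.histogram(x, bins=bins, range=value_range)
        hists.append(counts / counts.sum())      # normalise to a probability vector
    roots = np.sqrt(np.array(hists))
    diff = roots[:, None, :] - roots[None, :, :]
    return np.sqrt(0.5 * (diff ** 2).sum(axis=-1))

def classical_mds(dist, dim=2):
    """Embed points in `dim` dimensions from a symmetric distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering   # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]             # largest eigenvalues first
    scales = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scales
```

Each row of the returned embedding represents one data set (one estimated PDF), so data sets with similar distributions land close together and can be passed to any Euclidean classifier or plotted directly.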


New Implementation Framework for Saturation-Based Reasoning

arXiv.org Artificial Intelligence

The saturation-based reasoning methods are among the most theoretically developed ones and are used by most of the state-of-the-art first-order logic reasoners. In the last decade there was a sharp increase in performance of such systems, which I attribute to the use of advanced calculi and the intensified research in implementation techniques. However, nowadays we are witnessing a slowdown in performance progress, which may be considered as a sign that the saturation-based technology is reaching its inherent limits. The position I am trying to put forward in this paper is that such scepticism is premature and a sharp improvement in performance may potentially be reached by adopting new architectural principles for saturation. The top-level algorithms and corresponding designs used in the state-of-the-art saturation-based theorem provers have (at least) two inherent drawbacks: the insufficient flexibility of the used inference selection mechanisms and the lack of means for intelligent prioritising of search directions. In this position paper I analyse these drawbacks and present two ideas on how they could be overcome. In particular, I propose a flexible low-cost high-precision mechanism for inference selection, intended to overcome problems associated with the currently used instances of clause selection-based procedures. I also outline a method for intelligent prioritising of search directions, based on probing the search space by exploring generalised search directions. I discuss some technical issues related to implementation of the proposed architectural principles and outline possible solutions.


Perspective alignment in spatial language

arXiv.org Artificial Intelligence

It is well known that perspective alignment plays a major role in the planning and interpretation of spatial language. In order to understand the role of perspective alignment and the cognitive processes involved, we have built precise and complete cognitive models of situated embodied agents that self-organise a communication system for dialoguing about the position and movement of real-world objects in their immediate surroundings. We show in a series of robotic experiments which cognitive mechanisms are necessary and sufficient to achieve successful spatial language and why and how perspective alignment can take place, either implicitly or based on explicit marking.


On the $\ell_1-\ell_q$ Regularized Regression

arXiv.org Machine Learning

In this paper we consider the problem of grouped variable selection in high-dimensional regression using $\ell_1-\ell_q$ regularization ($1\leq q \leq \infty$), which can be viewed as a natural generalization of the $\ell_1-\ell_2$ regularization (the group Lasso). The key condition is that the dimensionality $p_n$ can increase much faster than the sample size $n$, i.e. $p_n \gg n$ (in our case $p_n$ is the number of groups), but the number of relevant groups is small. The main conclusion is that many good properties of $\ell_1$-regularization (Lasso) naturally carry over to the $\ell_1-\ell_q$ cases ($1 \leq q \leq \infty$), even if the number of variables within each group also increases with the sample size. With fixed design, we show that the whole family of estimators is both estimation consistent and variable selection consistent under different conditions. We also show the persistency result with random design under a much weaker condition. These results provide a unified treatment for the whole family of estimators ranging from $q=1$ (Lasso) to $q=\infty$ (iCAP), with $q=2$ (group Lasso) as a special case. When there is no group structure available, all the analysis reduces to the current results of the Lasso estimator ($q=1$).
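As a small illustration of the penalty family discussed here, the sketch below computes an $\ell_1-\ell_q$ penalty as the sum over groups of the $\ell_q$ norm of each group's coefficients. The function name `l1_lq_penalty` and the representation of groups as lists of indices are assumptions for illustration; any per-group weights or tuning parameter used in the paper are omitted.

```python
import numpy as np

def l1_lq_penalty(beta, groups, q):
    """Sum over groups of the l_q norm of the coefficients in each group.

    Assumption: `groups` is a list of index lists partitioning the coefficients,
    and q is a number >= 1 or np.inf.
    """
    beta = np.asarray(beta, dtype=float)
    return sum(np.linalg.norm(beta[idx], ord=q) for idx in groups)

beta = [3.0, 4.0, 0.0, -2.0]
groups = [[0, 1], [2, 3]]
lasso_like = l1_lq_penalty(beta, groups, 1)        # q = 1: plain l_1 norm of beta
group_lasso = l1_lq_penalty(beta, groups, 2)       # q = 2: group Lasso penalty
icap_like = l1_lq_penalty(beta, groups, np.inf)    # q = inf: largest entry per group
```

With $q=1$ the group structure disappears and the penalty equals the ordinary $\ell_1$ norm, which matches the abstract's remark that the analysis reduces to the Lasso case.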