Deceptiveness and Neutrality - the ND family of fitness landscapes

arXiv.org Artificial Intelligence

When a considerable number of mutations have no effect on fitness values, the fitness landscape is said to be neutral. In order to study the interplay between neutrality, which exists in many real-world applications, and the performance of metaheuristics, it is useful to design landscapes whose neutral degree distribution can be tuned precisely. Even though many neutral landscape models have already been designed, none of them is general enough to create landscapes with specific neutral degree distributions. We propose three steps to design such landscapes: first, using an algorithm, we construct a landscape whose distribution roughly fits the target one; then we use a simulated annealing heuristic to bring the two distributions closer; and finally we assign fitness values to each neutral network. Then, using this new family of fitness landscapes, we are able to highlight the interplay between deceptiveness and neutrality.
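
As a rough illustration of the quantity being tuned, the sketch below computes a neutral degree distribution by brute force. It assumes a bit-string search space with one-bit-flip neighbourhoods, and `toy_fitness` is a hypothetical landscape invented for this example; none of this is the construction algorithm of the paper.

```python
from collections import Counter
from itertools import product


def neutral_degree(solution, fitness):
    """Count the one-bit-flip neighbours that share the fitness of `solution`."""
    reference = fitness(solution)
    degree = 0
    for i in range(len(solution)):
        neighbour = solution[:i] + (1 - solution[i],) + solution[i + 1:]
        if fitness(neighbour) == reference:
            degree += 1
    return degree


def neutral_degree_distribution(n_bits, fitness):
    """Map each neutral degree to the number of solutions that have it."""
    return Counter(
        neutral_degree(s, fitness) for s in product((0, 1), repeat=n_bits)
    )


def toy_fitness(solution):
    # Hypothetical landscape: fitness is the AND of the first two bits,
    # so every remaining bit is a neutral mutation.
    return solution[0] * solution[1]


distribution = neutral_degree_distribution(3, toy_fitness)
```

Exhaustive enumeration like this is only feasible for very small bit strings; it is meant to make the definition concrete, not to scale.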


Resource Adaptive Agents in Interactive Theorem Proving

arXiv.org Artificial Intelligence

We introduce a resource adaptive agent mechanism which supports the user in interactive theorem proving. The mechanism uses a two layered architecture of agent societies to suggest appropriate commands together with possible command argument instantiations. Experiments with this approach show that its effectiveness can be further improved by introducing a resource concept. In this paper we provide an abstract view on the overall mechanism, motivate the necessity of an appropriate resource concept and discuss its realization within the agent architecture.


Learning Low-Density Separators

arXiv.org Artificial Intelligence

We define a novel, basic, unsupervised learning problem - learning the lowest density homogeneous hyperplane separator of an unknown probability distribution. This task is relevant to several problems in machine learning, such as semi-supervised learning and clustering stability. We investigate the question of existence of a universally consistent algorithm for this problem. We propose two natural learning paradigms and prove that, on input unlabeled random samples generated by any member of a rich family of distributions, they are guaranteed to converge to the optimal separator for that distribution. We complement this result by showing that no learning algorithm for our task can achieve uniform learning rates (that are independent of the data generating distribution).


Model-Consistent Sparse Estimation through the Bootstrap

arXiv.org Machine Learning

We consider the least-square linear regression problem with regularization by the $\ell^1$-norm, a problem usually referred to as the Lasso. In this paper, we first present a detailed asymptotic analysis of model consistency of the Lasso in low-dimensional settings. For various decays of the regularization parameter, we compute asymptotic equivalents of the probability of correct model selection. For a specific rate decay, we show that the Lasso selects all the variables that should enter the model with probability tending to one exponentially fast, while it selects all other variables with strictly positive probability. We show that this property implies that if we run the Lasso for several bootstrapped replications of a given sample, then intersecting the supports of the Lasso bootstrap estimates leads to consistent model selection. This novel variable selection procedure, referred to as the Bolasso, is extended to high-dimensional settings by a provably consistent two-step procedure.
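
The intersection step described in the abstract can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the Lasso solver is a plain cyclic coordinate descent, and `lasso_cd`, `bolasso_support`, `make_toy_data`, the regularization value, and the number of bootstrap replications are all hypothetical choices made for this example.

```python
import random


def soft_threshold(a, lam):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=50):
    """Minimise (1/2n)||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, since w starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes w[j].
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


def bolasso_support(X, y, lam, n_bootstrap=16, seed=0):
    """Intersect the Lasso supports obtained on bootstrap resamples of (X, y)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    support = set(range(p))
    for _ in range(n_bootstrap):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        w = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        support &= {j for j in range(p) if w[j] != 0.0}
    return support


def make_toy_data(n=200, p=5, seed=0):
    """Hypothetical data: only features 0 and 1 influence the response."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] - 3.0 * row[1] + rng.gauss(0.0, 0.1) for row in X]
    return X, y
```

Calling `bolasso_support(*make_toy_data(), lam=0.3)` returns the set of feature indices that survive in every bootstrap replication. The sketch covers only the low-dimensional intersection idea; the two-step high-dimensional procedure mentioned in the abstract is not shown.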


On finitely recursive programs

arXiv.org Artificial Intelligence

Disjunctive finitary programs are a class of logic programs admitting function symbols and hence infinite domains. They have very good computational properties: for example, ground queries are decidable, whereas in the general case the stable model semantics is highly undecidable. In this paper we prove that a larger class of programs, called finitely recursive programs, preserves most of the good properties of finitary programs under the stable model semantics, namely: (i) finitely recursive programs enjoy a compactness property; (ii) inconsistency checking and skeptical reasoning are semidecidable; (iii) skeptical resolution is complete for normal finitely recursive programs. Moreover, we show how to check inconsistency and answer skeptical queries using finite subsets of the ground program instantiation. We achieve this by extending the splitting sequence theorem by Lifschitz and Turner: we prove that if the input program P is finitely recursive, then the partial stable models determined by any smooth splitting omega-sequence converge to a stable model of P.


State Space Realization Theorems For Data Mining

arXiv.org Machine Learning

In this paper, we consider formal series associated with events, profiles derived from events, and statistical models that make predictions about events. We prove theorems about realizations for these formal series using the language and tools of Hopf algebras.


Sparse Causal Discovery in Multivariate Time Series

arXiv.org Machine Learning

Our goal is to estimate causal interactions in multivariate time series. Using vector autoregressive (VAR) models, these can be defined based on non-vanishing coefficients belonging to respective time-lagged instances. As in most cases a parsimonious causality structure is assumed, a promising approach to causal discovery consists in fitting VAR models with an additional sparsity-promoting regularization. Along this line we here propose that sparsity should be enforced for the subgroups of coefficients that belong to each pair of time series, as the absence of a causal relation requires the coefficients for all time-lags to become jointly zero. Such behavior can be achieved by means of l1-l2-norm regularized regression, for which an efficient active set solver has been proposed recently. Our method is shown to outperform standard methods in recovering simulated causality graphs. The results are on par with a second novel approach which uses multiple statistical testing.
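
One common way to write such a grouped penalty is given below. The notation is an assumption made for illustration and is not taken from the paper: $x(t) \in \mathbb{R}^d$ is the observed series, $P$ the model order, $A^{(\tau)}$ the lag-$\tau$ coefficient matrix, and $\lambda$ the regularization strength. Whether self-interactions ($i = j$) are penalized is also an assumption here.

```latex
% VAR model of order P
x(t) = \sum_{\tau=1}^{P} A^{(\tau)}\, x(t-\tau) + \varepsilon(t)

% l1-l2 (group) regularized least squares: the l2 norm groups all lags of
% one ordered pair (i, j); the l1-type sum over pairs drives whole groups to zero
\min_{A^{(1)},\dots,A^{(P)}}
  \sum_{t=P+1}^{T} \Bigl\| x(t) - \sum_{\tau=1}^{P} A^{(\tau)} x(t-\tau) \Bigr\|_2^2
  + \lambda \sum_{i,j} \Bigl\| \bigl( A^{(1)}_{ij}, \dots, A^{(P)}_{ij} \bigr) \Bigr\|_2
```

Under this formulation, an absent influence of series $j$ on series $i$ corresponds to the whole group $(A^{(1)}_{ij}, \dots, A^{(P)}_{ij})$ being zero, which matches the requirement that the coefficients for all time lags vanish jointly.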


On Introspection, Metacognitive Control and Augmented Data Mining Live Cycles

arXiv.org Artificial Intelligence

We discuss metacognitive modelling as an enhancement to cognitive modelling and computing. Metacognitive control mechanisms should enable AI systems to self-reflect, reason about their actions, and adapt to new situations. In this respect, we propose implementation details of a knowledge taxonomy and an augmented data mining life cycle which supports a live integration of obtained models.


Foundations of a Multi-way Spectral Clustering Framework for Hybrid Linear Modeling

arXiv.org Machine Learning

The problem of Hybrid Linear Modeling (HLM) is to model and segment data using a mixture of affine subspaces. Different strategies have been proposed to solve this problem; however, rigorous analysis justifying their performance is missing. This paper suggests the Theoretical Spectral Curvature Clustering (TSCC) algorithm for solving the HLM problem, and provides careful analysis to justify it. The TSCC algorithm is essentially a combination of Govindu's multi-way spectral clustering framework (CVPR 2005) and Ng et al.'s spectral clustering algorithm (NIPS 2001). The main result of this paper states that if the given data is sampled from a mixture of distributions concentrated around affine subspaces, then with high sampling probability the TSCC algorithm segments the different underlying clusters well. The goodness of clustering depends on the within-cluster errors, the between-clusters interaction, and a tuning parameter applied by TSCC. The proof also provides new insights for the analysis of Ng et al. (NIPS 2001).


Differential Privacy with Compression

arXiv.org Machine Learning

This work studies formal utility and privacy guarantees for a simple multiplicative database transformation, where the data are compressed by a random linear or affine transformation, reducing the number of data records substantially, while preserving the number of original input variables. We provide an analysis framework inspired by a recent concept known as differential privacy (Dwork 06). Our goal is to show that, despite the general difficulty of achieving the differential privacy guarantee, it is possible to publish synthetic data that are useful for a number of common statistical learning applications. This includes high dimensional sparse regression (Zhou et al. 07), principal component analysis (PCA), and other statistical measures (Liu et al. 06) based on the covariance of the initial data.