Adding noise to the input of a model trained with a regularized objective

arXiv.org Artificial Intelligence

Regularization is a well-studied problem in the context of neural networks. It is usually used to improve generalization performance when the number of input samples is relatively small or the samples are heavily contaminated with noise. A parametric model can be regularized in several ways, including early stopping (Morgan and Bourlard, 1990), weight decay, and output smoothing, all of which are used to avoid overfitting during training. From a Bayesian point of view, many regularization techniques correspond to imposing certain prior distributions on model parameters (Krogh and Hertz, 1991). Using Bishop's approximation (Bishop, 1995) of the objective function when a restricted type of noise is added to the input of a parametric function, we derive the higher-order terms of the Taylor expansion and analyze the coefficients of the regularization terms induced by the noisy input. In particular, we study the effect of penalizing the Hessian of the mapping function with respect to the input in terms of generalization performance. We also show how this coefficient can be controlled independently by explicitly penalizing the Jacobian of the mapping function on corrupted inputs.
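
As a minimal numerical sketch of the noise-as-regularizer idea, assume the simplest possible setting, which is not the one analyzed in the paper: a linear map f(x) = w·x, squared loss, and isotropic Gaussian input noise with standard deviation sigma. In that special case the expected noisy loss equals the clean loss plus sigma^2 times the squared norm of the Jacobian (here simply w), and the Hessian and all higher-order terms vanish. All names and values below are illustrative.

```python
import random

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def noisy_loss_mc(w, x, y, sigma, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E[(w.(x + eps) - y)^2], eps ~ N(0, sigma^2 I)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x_noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        total += (dot(w, x_noisy) - y) ** 2
    return total / n_samples

def clean_loss_and_penalty(w, x, y, sigma):
    """Clean squared loss and the Jacobian penalty sigma^2 * ||w||^2."""
    clean = (dot(w, x) - y) ** 2
    penalty = sigma ** 2 * dot(w, w)  # Jacobian of a linear map is w itself
    return clean, penalty

# Illustrative values (assumptions, not taken from the paper).
w, x, y, sigma = [1.0, -2.0], [0.5, 0.25], 0.3, 0.1
clean, penalty = clean_loss_and_penalty(w, x, y, sigma)
mc = noisy_loss_mc(w, x, y, sigma)
print(f"clean + penalty = {clean + penalty:.4f}, Monte Carlo = {mc:.4f}")
```

For a nonlinear mapping the two quantities agree only approximately, and the gap is what the higher-order terms of the Taylor expansion account for.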


Polyethism in a colony of artificial ants

arXiv.org Artificial Intelligence

We explore self-organizing strategies for role assignment in a foraging task carried out by a colony of artificial agents. Our strategies are inspired by various mechanisms of division of labor (polyethism) observed in eusocial insects like ants, termites, or bees. Specifically, we instantiate models of caste polyethism and age or temporal polyethism to evaluate the benefits to foraging in a dynamic environment. Our experiment is directly related to the exploration/exploitation trade-off in machine learning.


Augmenting Tractable Fragments of Abstract Argumentation

arXiv.org Artificial Intelligence

We present a new and compelling approach to the efficient solution of important computational problems that arise in the context of abstract argumentation. Our approach makes known algorithms defined for restricted fragments generally applicable, at a computational cost that scales with the distance from the fragment. Thus, in a certain sense, we gradually augment tractable fragments. Surprisingly, it turns out that some tractable fragments admit such an augmentation and that others do not. More specifically, we show that the problems of credulous and skeptical acceptance are fixed-parameter tractable when parameterized by the distance from the fragment of acyclic argumentation frameworks. Other tractable fragments such as the fragments of symmetrical and bipartite frameworks seem to prohibit an augmentation: the acceptance problems are already intractable for frameworks at distance 1 from the fragments. For our study we use a broad setting and consider several different semantics. For the algorithmic results we utilize recent advances in fixed-parameter tractability.
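
To illustrate why the acyclic fragment is tractable, here is a small sketch that computes the grounded extension of an argumentation framework by repeatedly accepting arguments whose attackers are all rejected. In a finite acyclic framework the grounded extension is the only extension under the standard semantics, so credulous and skeptical acceptance both reduce to membership in it. This is not the paper's fixed-parameter algorithm; the function name and the encoding of attacks as pairs are assumptions made for the example.

```python
def grounded_extension(arguments, attacks):
    """Return the grounded extension of the framework (arguments, attacks).

    `attacks` is a collection of (attacker, target) pairs.
    """
    attackers = {a: set() for a in arguments}
    for src, dst in attacks:
        attackers[dst].add(src)

    accepted, rejected = set(), set()
    changed = True
    while changed:
        changed = False
        for a in arguments:
            if a in accepted or a in rejected:
                continue
            if attackers[a] <= rejected:
                # Every attacker is already defeated (or there is none).
                accepted.add(a)
                changed = True
            elif attackers[a] & accepted:
                # Attacked by an accepted argument.
                rejected.add(a)
                changed = True
    return accepted

# Acyclic chain a -> b -> c -> d: a and c are accepted.
chain = grounded_extension(["a", "b", "c", "d"],
                           [("a", "b"), ("b", "c"), ("c", "d")])
print(sorted(chain))

# A mutual attack e <-> f is cyclic; the grounded extension is empty.
cycle = grounded_extension(["e", "f"], [("e", "f"), ("f", "e")])
print(sorted(cycle))
```

The second call shows what is lost outside the fragment: with cycles, the grounded extension no longer settles every argument, and the different semantics can disagree.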


Bayesian inference for queueing networks and modeling of internet services

arXiv.org Machine Learning

Modern Internet services, such as those at Google, Yahoo!, and Amazon, handle billions of requests per day on clusters of thousands of computers. Because these services operate under strict performance requirements, a statistical understanding of their performance is of great practical interest. Such services are modeled by networks of queues, where each queue models one of the computers in the system. A key challenge is that the data are incomplete, because recording detailed information about every request to a heavily used system can require unacceptable overhead. In this paper we develop a Bayesian perspective on queueing models in which the arrival and departure times that are not observed are treated as latent variables. Underlying this viewpoint is the observation that a queueing model defines a deterministic transformation between the data and a set of independent variables called the service times. With this viewpoint in hand, we sample from the posterior distribution over missing data and model parameters using Markov chain Monte Carlo. We evaluate our framework on data from a benchmark Web application. We also present a simple technique for selection among nested queueing models. We are unaware of any previous work that considers inference in networks of queues in the presence of missing data.
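
The deterministic transformation mentioned above can be sketched for the simplest case, a single-server first-in-first-out queue, which is an assumption made here for illustration rather than the network model of the paper. Each request starts service when it has arrived and the previous request has departed, so service times and departure times determine one another given the arrivals. The function names are hypothetical.

```python
def service_times(arrivals, departures):
    """Recover service times from arrival and departure times (FIFO, one server)."""
    services = []
    prev_departure = float("-inf")
    for arrival, departure in zip(arrivals, departures):
        start = max(arrival, prev_departure)  # wait for the server to be free
        services.append(departure - start)
        prev_departure = departure
    return services

def departure_times(arrivals, services):
    """Inverse map: rebuild departure times from arrivals and service times."""
    departures = []
    prev_departure = float("-inf")
    for arrival, service in zip(arrivals, services):
        prev_departure = max(arrival, prev_departure) + service
        departures.append(prev_departure)
    return departures

arrivals = [0.0, 1.0, 1.5]
departures = [2.0, 2.5, 4.0]
services = service_times(arrivals, departures)
print(services)                              # [2.0, 0.5, 1.5]
print(departure_times(arrivals, services))   # [2.0, 2.5, 4.0]
```

When some arrivals or departures are unobserved, this map is what lets a sampler propose values for the missing times and score them through the implied service times.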


Computing Small Unsatisfiable Cores in Satisfiability Modulo Theories

Journal of Artificial Intelligence Research

The problem of finding small unsatisfiable cores for SAT formulas has recently received a lot of interest, mostly for its applications in formal verification. However, propositional logic is often not expressive enough for representing many interesting verification problems, which can be more naturally addressed in the framework of Satisfiability Modulo Theories (SMT). Surprisingly, the problem of finding unsatisfiable cores in SMT has received very little attention in the literature. In this paper we present a novel approach to this problem, called the Lemma-Lifting approach. The main idea is to combine an SMT solver with an external propositional core extractor. The SMT solver produces the theory lemmas found during the search, dynamically lifting the suitable amount of theory information to the Boolean level. The core extractor is then called on the Boolean abstraction of the original SMT problem and of the theory lemmas. This results in an unsatisfiable core for the original SMT problem, once the remaining theory lemmas are removed. The approach is conceptually interesting and has several advantages in practice. In fact, it is extremely simple to implement and to update, and it can be interfaced with every propositional core extractor in a plug-and-play manner, so as to benefit for free from all unsat-core reduction techniques which have been or will be made available. We have evaluated our algorithm with a very extensive empirical test on SMT-LIB benchmarks, which confirms the validity and potential of this approach.


Slicing: Nonsingular Estimation of High Dimensional Covariance Matrices Using Multiway Kronecker Delta Covariance Structures

arXiv.org Machine Learning

Nonsingular estimation of high dimensional covariance matrices is an important step in many statistical procedures like classification, clustering, variable selection, and feature extraction. After a review of the essential background material, this paper introduces a technique we call slicing for obtaining a nonsingular covariance matrix of high dimensional data. Slicing essentially amounts to assuming that the data has a Kronecker delta covariance structure. Finally, we discuss the implications of the results in this paper and provide an example of classification for high dimensional gene expression data.


Foundations for Uniform Interpolation and Forgetting in Expressive Description Logics

arXiv.org Artificial Intelligence

We study uniform interpolation and forgetting in the description logic ALC. Our main results are model-theoretic characterizations of uniform interpolants and their existence in terms of bisimulations, tight complexity bounds for deciding the existence of uniform interpolants, an approach to computing interpolants when they exist, and tight bounds on their size. We use a mix of model-theoretic and automata-theoretic methods that, as a by-product, also provides characterizations of and decision procedures for conservative extensions.


Kernels for Global Constraints

arXiv.org Artificial Intelligence

Bessiere et al. (AAAI'08) showed that several intractable global constraints can be efficiently propagated when certain natural problem parameters are small. In particular, the complete propagation of a global constraint is fixed-parameter tractable in k - the number of holes in domains - whenever bound consistency can be enforced in polynomial time; this applies to the global constraints AtMost-NValue and Extended Global Cardinality (EGC). In this paper we extend this line of research and introduce the concept of reduction to a problem kernel, a key concept of parameterized complexity, to the field of global constraints. In particular, we show that the consistency problem for AtMost-NValue constraints admits a linear time reduction to an equivalent instance on O(k^2) variables and domain values. This small kernel can be used to speed up the complete propagation of NValue constraints. We contrast this result by showing that the consistency problem for EGC constraints does not admit a reduction to a polynomial problem kernel unless the polynomial hierarchy collapses.
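
To make the parameter k concrete, the sketch below counts holes under the assumption that a hole is an integer strictly between the minimum and maximum of a domain that is missing from it, and checks consistency of a simplified AtMost-NValue constraint by brute force. The simplification (a fixed bound n instead of a variable N) and both function names are assumptions for illustration; the brute-force search is exponential and is not the linear-time kernelization described in the abstract.

```python
from itertools import combinations

def count_holes(domains):
    """Total number of holes over all domains (each a non-empty set of ints)."""
    return sum(max(d) - min(d) + 1 - len(d) for d in domains)

def atmost_nvalue_consistent(domains, n):
    """True if every variable can take a value so that at most n distinct values are used.

    Equivalent to asking for a set of at most n values that intersects every domain.
    """
    values = sorted(set().union(*domains))
    for size in range(1, n + 1):
        for chosen in combinations(values, size):
            picked = set(chosen)
            if all(d & picked for d in domains):
                return True
    return False

domains = [{1, 2}, {2, 3}, {1, 4}, {5}]
print(count_holes(domains))                  # 2 (values 2 and 3 missing from {1, 4})
print(atmost_nvalue_consistent(domains, 3))  # True, e.g. values 2, 1, 5
print(atmost_nvalue_consistent(domains, 2))  # False
```

When every domain is an interval (k = 0), the problem is the easy case handled by bound consistency; the holes measure how far an instance is from that case.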


Foundations for Understanding and Building Conscious Systems using Stable Parallel Looped Dynamics

arXiv.org Artificial Intelligence

The problem of consciousness has faced several challenges for a few reasons: (a) a lack of necessary and sufficient conditions, without which we would not know how close we are to the solution, (b) a lack of a synthesis framework to build conscious systems and (c) a lack of mechanisms explaining the transition between the lower-level chemical dynamics and the higher-level abstractions. In this paper, I address these issues using a new framework. The central result is that a person is 'minimally' conscious if and only if he knows at least one truth. This lets us move away from the vagueness surrounding consciousness and instead focus equivalently on: (i) what truths are and how our brain represents/relates them to each other and (ii) how we attain a feeling of knowing for a truth. For the former problem, since truths are things that do not change, I replace the abstract notion with a dynamical one called fixed sets. These sets are guaranteed to exist for our brain and other stable parallel looped systems. The relationships between everyday events are now built using relationships between fixed sets, until our brain creates a unique dynamical state called the self-sustaining threshold 'membrane' of fixed sets. For the latter problem, I present necessary and sufficient conditions for attaining a feeling of knowing using a definition of continuity applied to abstractions. Combining these results, I now say that a person is minimally conscious if and only if his brain has a self-sustaining dynamical membrane with abstract continuous paths. A synthetic system built to satisfy this equivalent self-sustaining membrane condition appears indistinguishable from human consciousness.


Asymptotic Normality of Support Vector Machine Variants and Other Regularized Kernel Methods

arXiv.org Machine Learning

In nonparametric classification and regression problems, regularized kernel methods, in particular support vector machines, attract much attention in theoretical and in applied statistics. In an abstract sense, regularized kernel methods (simply called SVMs here) can be seen as regularized M-estimators for a parameter in a (typically infinite dimensional) reproducing kernel Hilbert space. For smooth loss functions, it is shown that the difference between the estimator, i.e. the empirical SVM, and the theoretical SVM is asymptotically normal with rate $\sqrt{n}$. That is, the standardized difference converges weakly to a Gaussian process in the reproducing kernel Hilbert space. As is common in real applications, the choice of the regularization parameter may depend on the data. The proof is done by an application of the functional delta-method and by showing that the SVM-functional is suitably Hadamard-differentiable.