Goto

Collaborating Authors

 Country


Dynamic Large Spatial Covariance Matrix Estimation in Application to Semiparametric Model Construction via Variable Clustering: the SCE approach

arXiv.org Machine Learning

To better understand the spatial structure of large panels of economic and financial time series and provide a guideline for constructing semiparametric models, this paper first considers estimating a large spatial covariance matrix of the generalized $m$-dependent and $\beta$-mixing time series (with $J$ variables and $T$ observations) by hard thresholding regularization as long as ${{\log J \, \cx^*(\ct)}}/{T} = \Co(1)$ (the former scheme with some time dependence measure $\cx^*(\ct)$) or $\log J /{T} = \Co(1)$ (the latter scheme with some upper bounded mixing coefficient). We quantify the interplay between the estimators' consistency rate and the time dependence level, discuss an intuitive resampling scheme for threshold selection, and also prove a general cross-validation result justifying this. Given a consistently estimated covariance (correlation) matrix, by utilizing its natural links with graphical models and semiparametrics, after "screening" the (explanatory) variables, we implement a novel forward (and backward) label permutation procedure to cluster the "relevant" variables and construct the corresponding semiparametric model, which is further estimated by the groupwise dimension reduction method with sign constraints. We call this the SCE (screen - cluster - estimate) approach for modeling high dimensional data with complex spatial structure. Finally we apply this method to study the spatial structure of large panels of economic and financial time series and find the proper semiparametric structure for estimating the consumer price index (CPI) to illustrate its superiority over the linear models.


Relative Density-Ratio Estimation for Robust Distribution Comparison

arXiv.org Machine Learning

Divergence estimators based on direct approximation of density-ratios without going through separate approximation of numerator and denominator densities have been successfully applied to machine learning tasks that involve distribution comparison such as outlier detection, transfer learning, and two-sample homogeneity test. However, since density-ratio functions often possess high fluctuation, divergence estimation is still a challenging task in practice. In this paper, we propose to use relative divergences for distribution comparison, which involves approximation of relative density-ratios. Since relative density-ratios are always smoother than corresponding ordinary density-ratios, our proposed method is favorable in terms of the non-parametric convergence speed. Furthermore, we show that the proposed divergence estimator has asymptotic variance independent of the model complexity under a parametric setup, implying that the proposed estimator hardly overfits even with complex models. Through experiments, we demonstrate the usefulness of the proposed approach.


Predictive Active Set Selection Methods for Gaussian Processes

arXiv.org Machine Learning

We propose an active set selection framework for Gaussian process classification for cases when the dataset is large enough to render its inference prohibitive. Our scheme consists of a two step alternating procedure of active set update rules and hyperparameter optimization based upon marginal likelihood maximization. The active set update rules rely on the ability of the predictive distributions of a Gaussian process classifier to estimate the relative contribution of a datapoint when being either included or removed from the model. This means that we can use it to include points with potentially high impact to the classifier decision process while removing those that are less relevant. We introduce two active set rules based on different criteria, the first one prefers a model with interpretable active set parameters whereas the second puts computational complexity first, thus a model with active set parameters that directly control its complexity. We also provide both theoretical and empirical support for our active set selection strategy being a good approximation of a full Gaussian process classifier. Our extensive experiments show that our approach can compete with state-of-the-art classification techniques with reasonable time complexity. Source code publicly available at http://cogsys.imm.dtu.dk/passgp.


Sparse Linear Identifiable Multivariate Modeling

arXiv.org Machine Learning

In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully Bayesian hierarchy for sparse models using slab and spike priors (two-component delta-function and continuous mixtures), non-Gaussian latent factors and a stochastic search over the ordering of the variables. The framework, which we call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and bench-marked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs equally well or better than LiNGAM with comparable computational complexity. We attribute this mainly to the stochastic search strategy used, and to parsimony (sparsity and identifiability), which is an explicit part of the model. We propose two extensions to the basic i.i.d. linear framework: non-linear dependence on observed variables, called SNIM (Sparse Non-linear Identifiable Multivariate modeling) and allowing for correlations between latent variables, called CSLIM (Correlated SLIM), for the temporal and/or spatial data. The source code and scripts are available from http://cogsys.imm.dtu.dk/slim/.


Wrapper Maintenance: A Machine Learning Approach

arXiv.org Artificial Intelligence

The proliferation of online information sources has led to an increased use of wrappers for extracting data from Web sources. While most of the previous research has focused on quick and efficient generation of wrappers, the development of tools for wrapper maintenance has received less attention. This is an important research problem because Web sources often change in ways that prevent the wrappers from extracting data correctly. We present an efficient algorithm that learns structural information about data from positive examples alone. We describe how this information can be used for two wrapper maintenance applications: wrapper verification and reinduction. The wrapper verification system detects when a wrapper is not extracting correct data, usually because the Web source has changed its format. The reinduction algorithm automatically recovers from changes in the Web source by identifying data on Web pages so that a new wrapper may be generated for this source. To validate our approach, we monitored 27 wrappers over a period of a year. The verification algorithm correctly discovered 35 of the 37 wrapper changes, and made 16 mistakes, resulting in precision of 0.73 and recall of 0.95. We validated the reinduction algorithm on ten Web sources. We were able to successfully reinduce the wrappers, obtaining precision and recall values of 0.90 and 0.80 on the data extraction task.


An Architectural Approach to Ensuring Consistency in Hierarchical Execution

arXiv.org Artificial Intelligence

Hierarchical task decomposition is a method used in many agent systems to organize agent knowledge. This work shows how the combination of a hierarchy and persistent assertions of knowledge can lead to difficulty in maintaining logical consistency in asserted knowledge. We explore the problematic consequences of persistent assumptions in the reasoning process and introduce novel potential solutions. Having implemented one of the possible solutions, Dynamic Hierarchical Justification, its effectiveness is demonstrated with an empirical analysis.


SHOP2: An HTN Planning System

arXiv.org Artificial Intelligence

The SHOP2 planning system received one of the awards for distinguished performance in the 2002 International Planning Competition. This paper describes the features of SHOP2 which enabled it to excel in the competition, especially those aspects of SHOP2 that deal with temporal and metric planning domains.


VHPOP: Versatile Heuristic Partial Order Planner

arXiv.org Artificial Intelligence

VHPOP is a partial order causal link (POCL) planner loosely based on UCPOP. It draws from the experience gained in the early to mid 1990's on flaw selection strategies for POCL planning, and combines this with more recent developments in the field of domain independent planning such as distance based heuristics and reachability analysis. We present an adaptation of the additive heuristic for plan space planning, and modify it to account for possible reuse of existing actions in a plan. We also propose a large set of novel flaw selection strategies, and show how these can help us solve more problems than previously possible by POCL planners. VHPOP also supports planning with durative actions by incorporating standard techniques for temporal constraint reasoning. We demonstrate that the same heuristic techniques used to boost the performance of classical POCL planning can be effective in domains with durative actions as well. The result is a versatile heuristic POCL planner competitive with established CSP-based and heuristic state space planners.


Compiling Causal Theories to Successor State Axioms and STRIPS-Like Systems

arXiv.org Artificial Intelligence

We describe a system for specifying the effects of actions. Unlike those commonly used in AI planning, our system uses an action description language that allows one to specify the effects of actions using domain rules, which are state constraints that can entail new action effects from old ones. Declaratively, an action domain in our language corresponds to a nonmonotonic causal theory in the situation calculus. Procedurally, such an action domain is compiled into a set of logical theories, one for each action in the domain, from which fully instantiated successor state-like axioms and STRIPS-like systems are then generated. We expect the system to be a useful tool for knowledge engineers writing action specifications for classical AI planning systems, GOLOG systems, and other systems where formal specifications of actions are needed.


Exploiting Contextual Independence In Probabilistic Inference

arXiv.org Artificial Intelligence

Bayesian belief networks have grown to prominence because they provide compact representations for many problems for which probabilistic inference is appropriate, and there are algorithms to exploit this compactness. The next step is to allow compact representations of the conditional probabilities of a variable given its parents. In this paper we present such a representation that exploits contextual independence in terms of parent contexts; which variables act as parents may depend on the value of other variables. The internal representation is in terms of contextual factors (confactors) that is simply a pair of a context and a table. The algorithm, contextual variable elimination, is based on the standard variable elimination algorithm that eliminates the non-query variables in turn, but when eliminating a variable, the tables that need to be multiplied can depend on the context. This algorithm reduces to standard variable elimination when there is no contextual independence structure to exploit. We show how this can be much more efficient than variable elimination when there is structure to exploit. We explain why this new method can exploit more structure than previous methods for structured belief network inference and an analogous algorithm that uses trees.