TRANSIT Routing on Video Game Maps

AAAI Conferences

TRANSIT is a fast and optimal technique for computing shortest path costs in road networks. It is attractive for its usually modest memory requirements and impressive running times. In this paper we give a first analysis of TRANSIT routing on a set of popular grid-based video-game benchmarks taken from the AI pathfinding literature. We show that in the presence of path symmetries, which are inherent to most grids but not normally to road networks, TRANSIT is strongly and negatively impacted, both in terms of performance and memory requirements. We address this problem by developing a new general symmetry breaking technique which adds small random epsilon-values to edges in the search graph, reducing the size of the TRANSIT network by up to 4 times while preserving optimality. Using our enhancements, TRANSIT achieves up to four orders of magnitude speed improvement vs. A* search and in many cases uses only a small (<= 10MB) or modest (<= 50MB) amount of memory. We also compare TRANSIT with CPDs, a recent and very fast database-driven pathfinding approach. We find the algorithms have complementary strengths but also identify a class of problems for which TRANSIT is up to two orders of magnitude faster than CPDs using a comparable amount of memory.
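
The symmetry problem and the epsilon idea can be illustrated on a tiny example. The following is a minimal sketch, not the authors' implementation: it assumes an obstacle-free 4-connected grid with unit edge costs, and the helper names (grid_edges, count_shortest_paths) and the choice of eps are hypothetical. It counts equal-cost shortest paths between opposite corners before and after adding small random offsets to the edge costs.

```python
import heapq
import random


def grid_edges(width, height, eps=0.0, rng=None):
    """Directed edge costs for an obstacle-free 4-connected grid.

    Each undirected edge costs 1 plus an optional random offset in [0, eps].
    """
    edges = {}
    for x in range(width):
        for y in range(height):
            for dx, dy in ((1, 0), (0, 1)):
                nx, ny = x + dx, y + dy
                if nx < width and ny < height:
                    offset = rng.uniform(0.0, eps) if rng is not None else 0.0
                    cost = 1.0 + offset
                    edges[((x, y), (nx, ny))] = cost
                    edges[((nx, ny), (x, y))] = cost
    return edges


def count_shortest_paths(edges, source, target, tol=1e-12):
    """Dijkstra that also counts how many distinct shortest paths exist."""
    adjacency = {}
    for (u, v), cost in edges.items():
        adjacency.setdefault(u, []).append((v, cost))
    dist = {source: 0.0}
    count = {source: 1}
    settled = set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        for v, cost in adjacency.get(u, []):
            nd = d + cost
            if v not in dist or nd < dist[v] - tol:
                # Strictly better route: restart the count for v.
                dist[v] = nd
                count[v] = count[u]
                heapq.heappush(heap, (nd, v))
            elif abs(nd - dist[v]) <= tol:
                # Equal-cost route: a symmetric alternative.
                count[v] += count[u]
    return dist[target], count[target]


SIZE = 5
# Assumption: eps is chosen so that the offsets along any simple path sum
# to less than 0.5, which cannot change which integer cost is optimal.
EPS = 1.0 / (2 * SIZE * SIZE)
start, goal = (0, 0), (SIZE - 1, SIZE - 1)

plain_cost, plain_count = count_shortest_paths(grid_edges(SIZE, SIZE), start, goal)
pert_cost, pert_count = count_shortest_paths(
    grid_edges(SIZE, SIZE, eps=EPS, rng=random.Random(0)), start, goal
)
print(plain_cost, plain_count)
print(round(pert_cost), pert_count)
```

On the unperturbed grid there are many symmetric optimal paths between the two corners, while the perturbed costs leave (with overwhelming probability) a single one. Rounding the perturbed cost recovers the original optimal cost because the total offset along any path stays below 0.5.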


On the Sample Complexity of Predictive Sparse Coding

arXiv.org Machine Learning

The goal of predictive sparse coding is to learn a representation of examples as sparse linear combinations of elements from a dictionary, such that a learned hypothesis linear in the new representation performs well on a predictive task. Predictive sparse coding algorithms have recently demonstrated impressive performance on a variety of supervised tasks, but their generalization properties have not been studied. We establish the first generalization error bounds for predictive sparse coding, covering two settings: 1) the overcomplete setting, where the number of features k exceeds the original dimensionality d; and 2) the high or infinite-dimensional setting, where only dimension-free bounds are useful. Both learning bounds intimately depend on stability properties of the learned sparse encoder, as measured on the training sample. Consequently, we first present a fundamental stability result for the LASSO, a result characterizing the stability of the sparse codes with respect to perturbations to the dictionary. In the overcomplete setting, we present an estimation error bound that decays as \tilde{O}(\sqrt{d k / m}) with respect to d and k. In the high or infinite-dimensional setting, we show a dimension-free bound that is \tilde{O}(\sqrt{k^2 s / m}) with respect to k and s, where s is an upper bound on the number of non-zeros in the sparse code for any training data point.
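
For orientation, the sparse encoder mentioned above is commonly written as a LASSO problem. The form below is the standard one and is given in assumed notation: the symbols \varphi_D, D (a d-by-k dictionary), z and the regularization weight \lambda > 0 do not appear in the abstract.

```latex
\varphi_D(x) \;=\; \operatorname*{arg\,min}_{z \in \mathbb{R}^{k}}
  \; \tfrac{1}{2}\,\lVert x - D z \rVert_2^2 \;+\; \lambda\,\lVert z \rVert_1
```

Under this reading, the stability result concerns how much \varphi_D(x) can change when D is perturbed.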


Probability Bracket Notation, Multivariable Systems and Static Bayesian Networks

arXiv.org Artificial Intelligence

Probability Bracket Notation (PBN) is applied to systems of multiple random variables for a preliminary study of static Bayesian Networks (BN) and Probabilistic Graphical Models (PGM). The famous Student BN Example is explored to show the local independences and reasoning power of a BN. The software package Elvira is used to graphically display the student BN. Our investigation shows that PBN provides a consistent and convenient alternative for manipulating many expressions related to joint, marginal and conditional probability distributions in static BN.


Inference in Probabilistic Logic Programs with Continuous Random Variables

arXiv.org Artificial Intelligence

Probabilistic Logic Programming (PLP), exemplified by Sato and Kameya's PRISM, Poole's ICL, De Raedt et al.'s ProbLog and Vennekens et al.'s LPAD, is aimed at combining statistical and logical knowledge representation and inference. A key characteristic of PLP frameworks is that they are conservative extensions to non-probabilistic logic programs, which have been widely used for knowledge representation. PLP frameworks extend traditional logic programming semantics to a distribution semantics, where the semantics of a probabilistic logic program is given in terms of a distribution over possible models of the program. However, the inference techniques used in these works rely on enumerating sets of explanations for a query answer. Consequently, these languages permit very limited use of random variables with continuous distributions. In this paper, we present a symbolic inference procedure that uses constraints and represents sets of explanations without enumeration. This permits us to reason over PLPs with Gaussian or Gamma-distributed random variables (in addition to discrete-valued random variables) and linear equality constraints over reals. We develop the inference procedure in the context of PRISM; however, the procedure's core ideas can be easily applied to other PLP languages as well. An interesting aspect of our inference procedure is that PRISM's query evaluation process becomes a special case in the absence of any continuous random variables in the program. The symbolic inference procedure enables us to reason over complex probabilistic models such as Kalman filters and a large subclass of Hybrid Bayesian networks that were hitherto not possible in PLP frameworks. (To appear in Theory and Practice of Logic Programming).


Feature Selection via L1-Penalized Squared-Loss Mutual Information

arXiv.org Machine Learning

Feature selection is a technique to screen out less important features. Many existing supervised feature selection algorithms use redundancy and relevancy as the main criteria to select features. However, feature interaction, potentially a key characteristic in real-world problems, has not received much attention. As an attempt to take feature interaction into account, we propose L1-LSMI, an L1-regularization based algorithm that maximizes a squared-loss variant of mutual information between selected features and outputs. Numerical results show that L1-LSMI performs well in handling redundancy, detecting non-linear dependency, and considering feature interaction.


D-FLAT: Declarative Problem Solving Using Tree Decompositions and Answer-Set Programming

arXiv.org Artificial Intelligence

In this work, we propose Answer-Set Programming (ASP) as a tool for rapid prototyping of dynamic programming algorithms based on tree decompositions. In fact, many such algorithms have been designed, but only a few of them have found their way into implementation. The main obstacle is the lack of easy-to-use systems which (i) take care of building a tree decomposition and (ii) provide an interface for declarative specifications of dynamic programming algorithms. In this paper, we present D-FLAT, a novel tool that relieves the user of having to handle all the technical details concerned with parsing, tree decomposition, the handling of data structures, etc. Instead, it is only the dynamic programming algorithm itself which has to be specified in the ASP language. D-FLAT employs an ASP solver in order to compute the local solutions in the dynamic programming algorithm. In the paper, we give a few examples illustrating the use of D-FLAT and describe the main features of the system. Moreover, we report experiments which show that ASP-based D-FLAT encodings for some problems outperform monolithic ASP encodings on instances of small treewidth. To appear in Theory and Practice of Logic Programming (TPLP).


Relative Expressiveness of Defeasible Logics

arXiv.org Artificial Intelligence

We address the relative expressiveness of defeasible logics in the framework DL. Relative expressiveness is formulated as the ability to simulate the reasoning of one logic within another logic. We show that such simulations must be modular, in the sense that they also work if applied only to part of a theory, in order to achieve a useful notion of relative expressiveness. We present simulations showing that logics in DL with and without the capability of team defeat are equally expressive. We also show that logics that handle ambiguity differently -- ambiguity blocking versus ambiguity propagating -- have distinct expressiveness, with neither able to simulate the other under a different formulation of expressiveness.


Designing various component analysis at will

arXiv.org Machine Learning

This paper provides a generic framework of component analysis (CA) methods introducing a new expression for scatter matrices and Gram matrices, called Generalized Pairwise Expression (GPE). This expression is quite compact but highly powerful: The framework includes not only (1) the standard CA methods but also (2) several regularization techniques, (3) weighted extensions, (4) some clustering methods, and (5) their semi-supervised extensions. This paper also presents quite a simple methodology for designing a desired CA method from the proposed framework: Adopting the known GPEs as templates, and generating a new method by combining these templates appropriately.


Modularity-Based Clustering for Network-Constrained Trajectories

arXiv.org Machine Learning

We present a novel clustering approach for moving object trajectories that are constrained by an underlying road network. The approach builds a similarity graph based on these trajectories, then uses modularity-optimization hierarchical graph clustering to regroup trajectories with similar profiles. Our experimental study shows the superiority of the proposed approach over classic hierarchical clustering and gives a brief insight into the visualization of the clustering results.
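
The quantity that modularity-based clustering optimizes can be sketched briefly. The example below is illustrative only: it computes the standard (Newman) modularity of a given partition of a weighted undirected graph, and the toy graph, its weights, and the function name modularity are assumptions rather than anything specified in the abstract, which does not state how trajectory similarity is measured.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity of a partition of a weighted undirected graph.

    edges: list of (u, v, weight) with each undirected edge listed once.
    community: dict mapping every node to a community label.
    """
    total = sum(w for _, _, w in edges)
    intra = defaultdict(float)   # weight of edges inside each community
    degree = defaultdict(float)  # summed weighted degree of each community
    for u, v, w in edges:
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            intra[community[u]] += w
    return sum(
        intra[c] / total - (degree[c] / (2.0 * total)) ** 2 for c in degree
    )


# Hypothetical similarity graph: two triangles of similar trajectories
# joined by a single edge (all weights set to 1 for simplicity).
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
q_split = modularity(edges, {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"})
q_single = modularity(edges, {n: "all" for n in range(6)})
print(q_split, q_single)
```

Splitting the graph into its two triangles scores higher than placing every node in one cluster, which is the kind of comparison a modularity-optimizing hierarchical method makes when deciding which groups to merge.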


Automatic Relevance Determination in Nonnegative Matrix Factorization with the \beta-Divergence

arXiv.org Machine Learning

This paper addresses the estimation of the latent dimensionality in nonnegative matrix factorization (NMF) with the \beta-divergence. The \beta-divergence is a family of cost functions that includes the squared Euclidean distance, Kullback-Leibler and Itakura-Saito divergences as special cases. Learning the model order is important as it is necessary to strike the right balance between data fidelity and overfitting. We propose a Bayesian model based on automatic relevance determination in which the columns of the dictionary matrix and the rows of the activation matrix are tied together through a common scale parameter in their prior. A family of majorization-minimization algorithms is proposed for maximum a posteriori (MAP) estimation. A subset of scale parameters is driven to a small lower bound in the course of inference, with the effect of pruning the corresponding spurious components. We demonstrate the efficacy and robustness of our algorithms by performing extensive experiments on synthetic data, the swimmer dataset, a music decomposition example and a stock price prediction task.
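
The special cases named above can be checked numerically. The sketch below uses one common scalar parameterization of the \beta-divergence; the function name beta_divergence and the test values are illustrative assumptions, and the paper itself may use a different but equivalent convention.

```python
import math


def beta_divergence(x, y, beta):
    """Scalar beta-divergence d_beta(x | y) for x, y > 0."""
    if beta == 1:
        # Limit case: (generalized) Kullback-Leibler divergence.
        return x * math.log(x / y) - x + y
    if beta == 0:
        # Limit case: Itakura-Saito divergence.
        return x / y - math.log(x / y) - 1.0
    return (
        x ** beta + (beta - 1.0) * y ** beta - beta * x * y ** (beta - 1.0)
    ) / (beta * (beta - 1.0))


x, y = 3.0, 1.0
half_squared_error = 0.5 * (x - y) ** 2
d2 = beta_divergence(x, y, 2)        # half the squared Euclidean distance
d_kl = beta_divergence(x, y, 1)      # Kullback-Leibler
d_is = beta_divergence(x, y, 0)      # Itakura-Saito
d_near_kl = beta_divergence(x, y, 1 + 1e-6)
d_near_is = beta_divergence(x, y, 1e-6)
print(d2, half_squared_error)
print(d_kl, d_near_kl)
print(d_is, d_near_is)
```

With this parameterization, beta = 2 gives half the squared Euclidean distance, and the values just next to beta = 1 and beta = 0 approach the Kullback-Leibler and Itakura-Saito cases, which is why the two limits are defined separately.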