Industry
TRANSIT Routing on Video Game Maps
Antsfeld, Leonid (NICTA and The University of New South Wales) | Harabor, Daniel Damir (NICTA and The Australian National University) | Kilby, Philip (NICTA and The Australian National University) | Walsh, Toby (NICTA and The University of New South Wales)
TRANSIT is a fast and optimal technique for computing shortest path costs in road networks. It is attractive for its usually modest memory requirements and impressive running times. In this paper we give a first analysis of TRANSIT routing on a set of popular grid-based video-game benchmarks taken from the AI pathfinding literature. We show that in the presence of path symmetries, which are inherent to most grids but normally not road networks, TRANSIT is strongly and negatively impacted, both in terms of performance and memory requirements. We address this problem by developing a new general symmetry breaking technique which adds small random epsilon-values to edges in the search graph, reducing the size of the TRANSIT network by up to 4 times while preserving optimality. Using our enhancements TRANSIT achieves up to four orders of magnitude speed improvement vs. A* search and uses in many cases only a small (<=10MB) or modest (<= 50MB) amount of memory. We also compare TRANSIT with CPDs, a recent and very fast database-driven pathfinding approach. We find the algorithms have complementary strengths but also identify a class of problems for which TRANSIT is up to two orders of magnitude faster than CPDs using a comparable amount of memory.
D-FLAT: Declarative Problem Solving Using Tree Decompositions and Answer-Set Programming
Bliem, Bernhard, Morak, Michael, Woltran, Stefan
In this work, we propose Answer-Set Programming (ASP) as a tool for rapid prototyping of dynamic programming algorithms based on tree decompositions. In fact, many such algorithms have been designed, but only a few of them found their way into implementation. The main obstacle is the lack of easy-to-use systems which (i) take care of building a tree decomposition and (ii) provide an interface for declarative specifications of dynamic programming algorithms. In this paper, we present D-FLAT, a novel tool that relieves the user of having to handle all the technical details concerned with parsing, tree decomposition, the handling of data structures, etc. Instead, it is only the dynamic programming algorithm itself which has to be specified in the ASP language. D-FLAT employs an ASP solver in order to compute the local solutions in the dynamic programming algorithm. In the paper, we give a few examples illustrating the use of D-FLAT and describe the main features of the system. Moreover, we report experiments which show that ASPbased D-FLAT encodings for some problems outperform monolithic ASP encodings on instances of small treewidth. To appear in Theory and Practice of Logic Programming (TPLP).
Designing various component analysis at will
Kimura, Akisato, Sugiyama, Masashi, Hitoshi, Sakano, Kameoka, Hirokazu
This paper provides a generic framework of component analysis (CA) methods introducing a new expression for scatter matrices and Gram matrices, called Generalized Pairwise Expression (GPE). This expression is quite compact but highly powerful: The framework includes not only (1) the standard CA methods but also (2) several regularization techniques, (3) weighted extensions, (4) some clustering methods, and (5) their semi-supervised extensions. This paper also presents quite a simple methodology for designing a desired CA method from the proposed framework: Adopting the known GPEs as templates, and generating a new method by combining these templates appropriately.
Modularity-Based Clustering for Network-Constrained Trajectories
Mahrsi, Mohamed Khalil El, Rossi, Fabrice
We present a novel clustering approach for moving object trajectories that are constrained by an underlying road network. The approach builds a similarity graph based on these trajectories then uses modularity-optimization hiearchical graph clustering to regroup trajectories with similar profiles. Our experimental study shows the superiority of the proposed approach over classic hierarchical clustering and gives a brief insight to visualization of the clustering results.
Automatic Relevance Determination in Nonnegative Matrix Factorization with the \beta-Divergence
Tan, Vincent Y. F., Fรฉvotte, Cรฉdric
This paper addresses the estimation of the latent dimensionality in nonnegative matrix factorization (NMF) with the \beta-divergence. The \beta-divergence is a family of cost functions that includes the squared Euclidean distance, Kullback-Leibler and Itakura-Saito divergences as special cases. Learning the model order is important as it is necessary to strike the right balance between data fidelity and overfitting. We propose a Bayesian model based on automatic relevance determination in which the columns of the dictionary matrix and the rows of the activation matrix are tied together through a common scale parameter in their prior. A family of majorization-minimization algorithms is proposed for maximum a posteriori (MAP) estimation. A subset of scale parameters is driven to a small lower bound in the course of inference, with the effect of pruning the corresponding spurious components. We demonstrate the efficacy and robustness of our algorithms by performing extensive experiments on synthetic data, the swimmer dataset, a music decomposition example and a stock price prediction task.
Unfolding Latent Tree Structures using 4th Order Tensors
Ishteva, Mariya, Park, Haesun, Song, Le
Discovering the latent structure from many observed variables is an important yet challenging learning task. Existing approaches for discovering latent structures often require the unknown number of hidden states as an input. In this paper, we propose a quartet based approach which is \emph{agnostic} to this number. The key contribution is a novel rank characterization of the tensor associated with the marginal distribution of a quartet. This characterization allows us to design a \emph{nuclear norm} based test for resolving quartet relations. We then use the quartet test as a subroutine in a divide-and-conquer algorithm for recovering the latent tree structure. Under mild conditions, the algorithm is consistent and its error probability decays exponentially with increasing sample size. We demonstrate that the proposed approach compares favorably to alternatives. In a real world stock dataset, it also discovers meaningful groupings of variables, and produces a model that fits the data better.
Fast Conical Hull Algorithms for Near-separable Non-negative Matrix Factorization
Kumar, Abhishek, Sindhwani, Vikas, Kambadur, Prabhanjan
The separability assumption (Donoho & Stodden, 2003; Arora et al., 2012) turns non-negative matrix factorization (NMF) into a tractable problem. Recently, a new class of provably-correct NMF algorithms have emerged under this assumption. In this paper, we reformulate the separable NMF problem as that of finding the extreme rays of the conical hull of a finite set of vectors. From this geometric perspective, we derive new separable NMF algorithms that are highly scalable and empirically noise robust, and have several other favorable properties in relation to existing methods. A parallel implementation of our algorithm demonstrates high scalability on shared- and distributed-memory machines.
Predicting human preferences using the block structure of complex social networks
Guimera, Roger, Llorente, Alejandro, Moro, Esteban, Sales-Pardo, Marta
With ever-increasing available data, predicting individuals' preferences and helping them locate the most relevant information has become a pressing need. Understanding and predicting preferences is also important from a fundamental point of view, as part of what has been called a "new" computational social science. Here, we propose a novel approach based on stochastic block models, which have been developed by sociologists as plausible models of complex networks of social interactions. Our model is in the spirit of predicting individuals' preferences based on the preferences of others but, rather than fitting a particular model, we rely on a Bayesian approach that samples over the ensemble of all possible models. We show that our approach is considerably more accurate than leading recommender algorithms, with major relative improvements between 38% and 99% over industry-level algorithms. Besides, our approach sheds light on decision-making processes by identifying groups of individuals that have consistently similar preferences, and enabling the analysis of the characteristics of those groups.
Distributed High Dimensional Information Theoretical Image Registration via Random Projections
Szabo, Zoltan, Lorincz, Andras
However, the estimation of these quantities is computationally intensive in high dimensions. On the other hand, consistent estimation from pairwise distances of the sample points is possible, which suits random projection(RP) based low dimensional embeddings. We adapt the RP technique to this task by means of a simple ensemble method. To the best of our knowledge, this is the first distributed, RP based information theoretical image registration approach. The efficiency of the method is demonstrated through numerical examples. Keywords: random projection, information theoretical image registration, high dimensional features, distributed solution 1. Introduction Machine learning methods are notoriously limited by the high dimensional nature of the data. This problem may be alleviated via the random projection (RP) technique, which has been successfully applied, e.g., in the fields of
Graph-Based Approaches to Clustering Network-Constrained Trajectory Data
Mahrsi, Mohamed Khalil El, Rossi, Fabrice
Even though clustering trajectory data attracted considerable attention in the last few years, most of prior work assumed that moving objects can move freely in an euclidean space and did not consider the eventual presence of an underlying road network and its influence on evaluating the similarity between trajectories. In this paper, we present two approaches to clustering network-constrained trajectory data. The first approach discovers clusters of trajectories that traveled along the same parts of the road network. The second approach is segment-oriented and aims to group together road segments based on trajectories that they have in common. Both approaches use a graph model to depict the interactions between observations w.r.t. their similarity and cluster this similarity graph using a community detection algorithm. We also present experimental results obtained on synthetic data to showcase our propositions.