Goto

Collaborating Authors

 Moore, Andrew W.


Dynamic Social Network Analysis using Latent Space Models

Neural Information Processing Systems

This paper explores two aspects of social network modeling. First, we generalize a successful static model of relationships into a dynamic model that accounts for friendships drifting over time. Second, we show how to make it tractable to learn such models from data, even as the number of entities n gets large.


Variable KD-Tree Algorithms for Spatial Pattern Search and Discovery

Neural Information Processing Systems

In this paper we consider the problem of finding sets of points that conform to a given underlying model from within a dense, noisy set of observations. This problem is motivated by the task of efficiently linking faint asteroid detections, but is applicable to a range of spatial queries. We survey current tree-based approaches, showing a tradeoff exists between single tree and multiple tree algorithms. To this end, we present a new type of multiple tree algorithm that uses a variable number of trees to exploit the advantages of both approaches. We empirically show that this algorithm performs well using both simulated and astronomical data.



A Bayesian Spatial Scan Statistic

Neural Information Processing Systems

We propose a new Bayesian method for spatial cluster detection, the "Bayesian spatial scan statistic," and compare this method to the standard (frequentist) scan statistic approach. We demonstrate that the Bayesian statistic has several advantages over the frequentist approach, including increased power to detect clusters and (since randomization testing is unnecessary) much faster runtime. We evaluate the Bayesian and frequentist methodson the task of prospective disease surveillance: detecting spatial clusters of disease cases resulting from emerging disease outbreaks. Wedemonstrate that our Bayesian methods are successful in rapidly detecting outbreaks while keeping number of false positives low.


Dynamic Social Network Analysis using Latent Space Models

Neural Information Processing Systems

This paper explores two aspects of social network modeling. First, we generalize a successful static model of relationships into a dynamic model that accounts for friendships drifting over time. Second, we show how to make it tractable to learn such models from data, even as the number of entities n gets large.


Variable KD-Tree Algorithms for Spatial Pattern Search and Discovery

Neural Information Processing Systems

In this paper we consider the problem of finding sets of points that conform toa given underlying model from within a dense, noisy set of observations. Thisproblem is motivated by the task of efficiently linking faint asteroid detections, but is applicable to a range of spatial queries. We survey current tree-based approaches, showing a tradeoff exists between singletree and multiple tree algorithms. To this end, we present a new type of multiple tree algorithm that uses a variable number of trees to exploit the advantages of both approaches. We empirically show that this algorithm performs well using both simulated and astronomical data.


Fast Information Value for Graphical Models

Neural Information Processing Systems

Calculations that quantify the dependencies between variables are vital to many operations with graphical models, e.g., active learning and sensitivity analysis.Previously, pairwise information gain calculation has involved a cost quadratic in network size. In this work, we show how to perform a similar computation with cost linear in network size. The loss function that allows this is of a form amenable to computation by dynamic programming. The message-passing algorithm that results is described and empirical results demonstrate large speedups without decrease inaccuracy. In the cost-sensitive domains examined, superior accuracy isachieved.


Detecting Significant Multidimensional Spatial Clusters

Neural Information Processing Systems

Each of these problems can be solved using a spatial scan statistic (Kulldorff, 1997), where we compute the maximum of a likelihood ratio statistic over all spatial regions, and find the significance of this region by randomization. However, computing the scan statistic for all spatial regions is generally computationally infeasible, so we introduce a novel fast spatial scan algorithm, generalizing the 2D scan algorithm of (Neill and Moore, 2004) to arbitrary dimensions. Our new multidimensional multiresolution algorithm allows us to find spatial clusters up to 1400x faster than the naive spatial scan, without any loss of accuracy.


Active Learning for Anomaly and Rare-Category Detection

Neural Information Processing Systems

We introduce a novel active-learning scenario in which a user wants to work with a learning algorithm to identify useful anomalies. These are distinguished from the traditional statistical definition of anomalies as outliers or merely ill-modeled points. Our distinction is that the usefulness of anomalies is categorized subjectively by the user. We make two additional assumptions. First, there exist extremely few useful anomalies to be hunted down within a massive dataset.


An Investigation of Practical Approximate Nearest Neighbor Algorithms

Neural Information Processing Systems

This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high dimensional perceptionareas such as computer vision, with dozens of publications in recent years. Much of this enthusiasm is due to a successful new approximate nearest neighbor approach called Locality Sensitive Hashing (LSH).In this paper we ask the question: can earlier spatial data structure approaches to exact nearest neighbor, such as metric trees, be altered to provide approximate answers to proximity queries and if so, how? We introduce a new kind of metric tree that allows overlap: certain datapoints may appear in both the children of a parent. We also introduce newapproximate k-NN search algorithms on this structure. We show why these structures should be able to exploit the same randomprojection-based approximationsthat LSH enjoys, but with a simpler algorithm and perhaps with greater efficiency. We then provide a detailed empirical evaluation on five large, high dimensional datasets which show up to 31-fold accelerations over LSH. This result holds true throughout the spectrum of approximation levels.