Goto

Collaborating Authors

 Industry



Unsupervised Color Decomposition Of Histologically Stained Tissue Samples

Neural Information Processing Systems

Accurate spectral decomposition is essential for the analysis and diagnosis of histologically stained tissue sections. In this paper we present the first automated system for performing this decomposition. Wecompare the performance of our system with ground truth data and report favorable results.


Link Prediction in Relational Data

Neural Information Processing Systems

Many real-world domains are relational in nature, consisting of a set of objects related to each other in complex ways. This paper focuses on predicting the existence and the type of links between entities in such domains. We apply the relational Markov network framework of Taskar et al. to define a joint probabilistic modelover the entire link graph -- entity attributes and links. The application of the RMN algorithm to this task requires the definition of probabilistic patterns over subgraph structures. We apply this method to two new relational datasets, one involving university webpages, and the other a social network. We show that the collective classification approach of RMNs, and the introduction of subgraph patterns over link labels, provide significant improvements in accuracy over flat classification, which attempts to predict each link in isolation.


A Fast Multi-Resolution Method for Detection of Significant Spatial Disease Clusters

Neural Information Processing Systems

Given an N N grid of squares, where each square has a count and an underlying population,our goal is to find the square region with the highest density, and to calculate its significance by randomization. Any density measure D, dependent on the total count and total population of a region, canbe used. For example, if each count represents the number of disease cases occurring in that square, we can use Kulldorff's spatial scan statistic D


Modeling User Rating Profiles For Collaborative Filtering

Neural Information Processing Systems

In this paper we present a generative latent variable model for rating-based collaborative filtering called the User Rating Profile model (URP). The generative process which underlies URP is designed toproduce complete user rating profiles, an assignment of one rating to each item for each user. Our model represents each user as a mixture of user attitudes, and the mixing proportions are distributed according to a Dirichlet random variable. The rating for each item is generated by selecting a user attitude for the item, and then selecting a rating according to the preference pattern associated withthat attitude. URP is related to several models including a multinomial mixture model, the aspect model [7], and LDA [1], but has clear advantages over each.


Parameterized Novelty Detectors for Environmental Sensor Monitoring

Neural Information Processing Systems

As part of an environmental observation and forecasting system, sensors deployed in the Columbia RIver Estuary (CORIE) gather information on physical dynamics and changes in estuary habitat. Ofthese, salinity sensors are particularly susceptible to biofouling, whichgradually degrades sensor response and corrupts critical data. Automatic fault detectors have the capability to identify bio-fouling early and minimize data loss. Complicating the development ofdiscriminatory classifiers is the scarcity of bio-fouling onset examples and the variability of the bio-fouling signature. To solve these problems, we take a novelty detection approach that incorporates a parameterized bio-fouling model. These detectors identify the occurrence of bio-fouling, and its onset time as reliably as human experts. Real-time detectors installed during the summer of2001 produced no false alarms, yet detected all episodes of sensor degradation before the field staff scheduled these sensors for cleaning. From this initial deployment through February 2003, our bio-fouling detectors have essentially doubled the amount of useful data coming from the CORIE sensors.


Markov Models for Automated ECG Interval Analysis

Neural Information Processing Systems

We examine the use of hidden Markov and hidden semi-Markov models forautomatically segmenting an electrocardiogram waveform into its constituent waveform features. An undecimated wavelet transform is used to generate an overcomplete representation of the signal that is more appropriate for subsequent modelling. We show that the state durations implicitin a standard hidden Markov model are ill-suited to those of real ECG features, and we investigate the use of hidden semi-Markov models for improved state duration modelling.


Fast Embedding of Sparse Similarity Graphs

Neural Information Processing Systems

This paper applies fast sparse multidimensional scaling (MDS) to a large graph of music similarity, with 267K vertices that represent artists, albums, andtracks; and 3.22M edges that represent similarity between those entities. Once vertices are assigned locations in a Euclidean space, the locations can be used to browse music and to generate playlists. MDS on very large sparse graphs can be effectively performed by a family of algorithms called Rectangular Dijsktra (RD) MDS algorithms. These RD algorithms operate on a dense rectangular slice of the distance matrix, created by calling Dijsktra a constant number of times. Two RD algorithms are compared: Landmark MDS, which uses the Nyström approximation toperform MDS; and a new algorithm called Fast Sparse Embedding, which uses FastMap. These algorithms compare favorably to Laplacian Eigenmaps, both in terms of speed and embedding quality.


A Model for Learning the Semantics of Pictures

Neural Information Processing Systems

We propose an approach to learning the semantics of images which allows usto automatically annotate an image with keywords and to retrieve images based on text queries. We do this using a formalism that models the generation of annotated images. We assume that every image is divided intoregions, each described by a continuous-valued feature vector. Given a training set of images with annotations, we compute a joint probabilistic modelof image features and words which allow us to predict the probability of generating a word given the image regions. This may be used to automatically annotate and retrieve images given a word as a query. Experiments show that our model significantly outperforms the best of the previously reported results on the tasks of automatic image annotation and retrieval.


Can We Learn to Beat the Best Stock

Neural Information Processing Systems

A novel algorithm for actively trading stocks is presented. While traditional universalalgorithms (and technical trading heuristics) attempt to predict winners or trends, our approach relies on predictable statistical relations between all pairs of stocks in the market. Our empirical results on historical markets provide strong evidence that this type of technical tradingcan "beat the market" and moreover, can beat the best stock in the market. In doing so we utilize a new idea for smoothing critical parameters in the context of expert learning.