Workshop on Intelligent Information Integration (III-99)

AI Magazine

The Workshop on Intelligent Information Integration (III), organized in conjunction with the Sixteenth International Joint Conference on Artificial Intelligence, was held on 31 July 1999 in Stockholm, Sweden. Approximately 40 people participated, and nearly 20 papers were presented. The schedule was packed because the large number of submissions made it difficult to reserve discussion time without rejecting a disproportionately large number of papers. Participants included scientists and practitioners from industry and academia.


Graph Matching for Shape Retrieval

Neural Information Processing Systems

We propose a new in-sample cross-validation-based method (randomized GACV) for choosing smoothing or bandwidth parameters that govern the bias-variance or fit-complexity tradeoff in 'soft' classification. Soft classification refers to a learning procedure which estimates the probability that an example with a given attribute vector is in class 1 vs. class 0. The target for optimizing the tradeoff is the Kullback-Leibler distance between the estimated probability distribution and the 'true' probability distribution, representing knowledge of an infinite population. The method uses a randomized estimate of the trace of a Hessian and mimics cross validation at the cost of a single relearning with perturbed outcome data.
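The "randomized estimate of the trace" mentioned above can be illustrated with a generic randomized trace estimator (often called the Hutchinson estimator). This is a minimal sketch, not the randomized GACV procedure itself: the function name `hutchinson_trace`, the Rademacher probes, and the random test matrix `H` are assumptions made for illustration only.

```python
import numpy as np

def hutchinson_trace(matvec, dim, num_probes=100, seed=0):
    """Estimate trace(A) using only matrix-vector products v -> A @ v."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_probes):
        # Rademacher probe: entries are +1 or -1 with equal probability,
        # so E[z^T A z] = trace(A).
        z = rng.choice([-1.0, 1.0], size=dim)
        total += z @ matvec(z)
    return total / num_probes

# Hypothetical symmetric positive semidefinite matrix standing in for a Hessian.
rng = np.random.default_rng(1)
B = rng.standard_normal((50, 50))
H = B @ B.T / 50

estimate = hutchinson_trace(lambda v: H @ v, dim=50, num_probes=500)
exact = np.trace(H)
print(f"estimate={estimate:.3f} exact={exact:.3f}")
```

The appeal of this kind of estimator is that the matrix never has to be formed explicitly; only its action on a few random vectors is needed.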


Basis Selection for Wavelet Regression

Neural Information Processing Systems

The initial assumption is that the original data samples lie in the finest space $V_0$, which is spanned by the scaling function $\phi \in V_0$ such that the collection $\{\phi(x - t) \mid t \in \mathbb{Z}\}$ is a Riesz basis of $V_0$.


Visualizing Group Structure

Neural Information Processing Systems

Cluster analysis is a fundamental principle in exploratory data analysis, providing the user with a description of the group structure of given data. A key problem in this context is the interpretation and visualization of clustering solutions in high-dimensional or abstract data spaces. In particular, probabilistic descriptions of the group structure, essential to capture inter-cluster relationships, are hardly assessable by simple inspection of the probabilistic assignment variables. We present a novel approach to the visualization of group structure. It is based on a statistical model of the object assignments which have been observed or estimated by a probabilistic clustering procedure. The objects or data points are embedded in a low-dimensional Euclidean space by approximating the observed data statistics with a Gaussian mixture model. The algorithm provides a new approach to the visualization of the inherent structure for a broad variety of data types, e.g.


Basis Selection for Wavelet Regression

Neural Information Processing Systems

A wavelet basis selection procedure is presented for wavelet regression. Both the basis and the threshold are selected using cross-validation. The method can incorporate prior knowledge of the smoothness (or shape of the basis functions) into the basis selection procedure. The method is demonstrated on widely published sampled functions, and its results are contrasted with those of other basis-function-based methods.
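To make the idea of choosing a threshold by cross-validation concrete, here is a simplified sketch. It is not the procedure from the abstract: it fixes a single basis (Haar) instead of selecting one, uses a crude even/odd two-fold split in which each half is compared with its neighbouring held-out samples, and does not rescale the threshold for the halved sample size. All function names and the test signal are hypothetical.

```python
import numpy as np

def haar_forward(x):
    """Orthonormal Haar transform of a length-2^J signal.
    Returns (coarsest approximation, list of detail arrays, finest first)."""
    approx = np.asarray(x, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))
        approx = (even + odd) / np.sqrt(2)
    return approx, details

def haar_inverse(approx, details):
    """Invert haar_forward."""
    approx = np.asarray(approx, dtype=float)
    for d in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / np.sqrt(2)
        out[1::2] = (approx - d) / np.sqrt(2)
        approx = out
    return approx

def denoise(y, threshold):
    """Soft-threshold every detail coefficient, then reconstruct."""
    approx, details = haar_forward(y)
    shrunk = [np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0) for d in details]
    return haar_inverse(approx, shrunk)

def cv_score(y, threshold):
    """Two-fold score: smooth one half, compare with the adjacent held-out half."""
    even, odd = y[0::2], y[1::2]
    score = 0.0
    for fit, held_out in ((even, odd), (odd, even)):
        score += np.mean((denoise(fit, threshold) - held_out) ** 2)
    return score

# Hypothetical noisy step function with 256 samples (a power of two).
rng = np.random.default_rng(0)
signal = np.where(np.arange(256) < 128, 0.0, 4.0)
y = signal + 0.5 * rng.standard_normal(256)

candidates = np.linspace(0.0, 2.0, 21)
best = min(candidates, key=lambda t: cv_score(y, t))
print("selected threshold:", best)
```

Selecting a basis as well would amount to repeating the same scoring loop over several candidate transforms and keeping the basis and threshold pair with the lowest score.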


Familiarity Discrimination of Radar Pulses

Neural Information Processing Systems

H3C 3A7 CANADA; Department of Cognitive and Neural Systems, Boston University, Boston, MA 02215, USA

Abstract. The ARTMAP-FD neural network performs both identification (placing test patterns in classes encountered during training) and familiarity discrimination (judging whether a test pattern belongs to any of the classes encountered during training). The performance of ARTMAP-FD is tested on radar pulse data obtained in the field, and compared to that of the nearest-neighbor-based NEN algorithm and to a k > 1 extension of NEN.

1 Introduction. The recognition process involves both identification and familiarity discrimination. Consider, for example, a neural network designed to identify aircraft based on their radar reflections and trained on sample reflections from ten types of aircraft A...J. After training, the network should correctly classify radar reflections belonging to the familiar classes A...J, but it should also abstain from making a meaningless guess when presented with a radar reflection from an object belonging to a different, unfamiliar class. Familiarity discrimination is also referred to as "novelty detection," a "reject option," and "recognition in partially exposed environments."
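The combination of identification and familiarity discrimination can be illustrated with a generic nearest-neighbor classifier that has a reject option. This is a minimal sketch under stated assumptions and is neither ARTMAP-FD nor the NEN algorithm: the function name `classify_with_reject`, the Euclidean distance, the fixed threshold, and the toy data are all hypothetical.

```python
import numpy as np

def classify_with_reject(train_x, train_y, x, threshold):
    """Return the label of the nearest training pattern, or None when even the
    nearest pattern is farther away than `threshold` (treated as unfamiliar)."""
    dists = np.linalg.norm(train_x - x, axis=1)
    nearest = int(np.argmin(dists))
    if dists[nearest] > threshold:
        return None  # abstain rather than make a meaningless guess
    return train_y[nearest]

# Toy training set with two familiar classes.
train_x = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
train_y = ["A", "A", "B", "B"]

familiar = classify_with_reject(train_x, train_y, np.array([0.05, 0.1]), threshold=1.0)
unfamiliar = classify_with_reject(train_x, train_y, np.array([20.0, -20.0]), threshold=1.0)
print(familiar, unfamiliar)
```

In this sketch the threshold controls the tradeoff between rejecting familiar patterns and accepting unfamiliar ones; in practice it would have to be tuned on data.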


The NASD Regulation Advanced-Detection System (ADS)

AI Magazine

The National Association of Securities Dealers, Inc., regulation advanced-detection system (ADS) monitors trades and quotations in The Nasdaq Stock Market to identify patterns and practices of behavior of potential regulatory interest. ADS has been in operational use at NASD Regulation since the summer of 1997 by several groups of analysts, processing approximately 2 million transactions a day, generating over 10,000 breaks. More important, it has greatly expanded surveillance coverage to new areas of the market and to many new types of behavior of regulatory concern. ADS combines detection and discovery components in a single system that supports multiple regulatory domains and shares the same market data. ADS makes use of a variety of AI techniques, including visualization, pattern recognition, and data mining, in support of the activities of regulatory analysis, alert and pattern detection, and knowledge discovery.