Plotting

 Technology


Lazy Learning Meets the Recursive Least Squares Algorithm

Neural Information Processing Systems

Lazy learning is a memory-based technique that, once a query is received, extracts a prediction interpolating locally the neighboring examples of the query which are considered relevant according to a distance measure. In this paper we propose a data-driven method to select on a query-by-query basis the optimal number of neighbors to be considered for each prediction. As an efficient way to identify and validate local models, the recursive least squares algorithm is introduced in the context of local approximation and lazy learning. Furthermore, beside the winner-takes-all strategy for model selection, a local combination of the most promising models is explored. The method proposed is tested on six different datasets and compared with a state-of-the-art approach.



Optimizing Classifers for Imbalanced Training Sets

Neural Information Processing Systems

Following recent results [9, 8] showing the importance of the fatshattering dimension in explaining the beneficial effect of a large margin on generalization performance, the current paper investigates the implications of these results for the case of imbalanced datasets and develops two approaches to setting the threshold. The approaches are incorporated into ThetaBoost, a boosting algorithm for dealing with unequal loss functions. The performance of ThetaBoost and the two approaches are tested experimentally.


Global Optimisation of Neural Network Models via Sequential Sampling

Neural Information Processing Systems

We propose a novel strategy for training neural networks using sequential sampling-importance resampling algorithms. This global optimisation strategy allows us to learn the probability distribution of the network weights in a sequential framework. It is well suited to applications involving online, nonlinear, non-Gaussian or non-stationary signal processing. 1 INTRODUCTION This paper addresses sequential training of neural networks using powerful sampling techniques. Sequential techniques are important in many applications of neural networks involving real-time signal processing, where data arrival is inherently sequential. Furthermore, one might wish to adopt a sequential training strategy to deal with non-stationarity in signals, so that information from the recent past is lent more credence than information from the distant past. One way to sequentially estimate neural network models is to use a state space formulation and the extended Kalman filter (Singhal and Wu 1988, de Freitas, Niranjan and Gee 1998).


Call-Based Fraud Detection in Mobile Communication Networks Using a Hierarchical Regime-Switching Model

Neural Information Processing Systems

Fraud causes substantial losses to telecommunication carriers. Detection systems which automatically detect illegal use of the network can be used to alleviate the problem. Previous approaches worked on features derived from the call patterns of individual users. In this paper we present a call-based detection system based on a hierarchical regime-switching model. The detection problem is formulated as an inference problem on the regime probabilities.


Graph Matching for Shape Retrieval

Neural Information Processing Systems

We propose a new in-sample cross validation based method (randomized GACV) for choosing smoothing or bandwidth parameters that govern the bias-variance or fit-complexity tradeoff in'soft' classification. Soft classification refers to a learning procedure which estimates the probability that an example with a given attribute vector is in class 1 vs class O. The target for optimizing the the tradeoff is the Kullback-Liebler distance between the estimated probability distribution and the'true' probability distribution, representing knowledge of an infinite population. The method uses a randomized estimate of the trace of a Hessian and mimics cross validation at the cost of a single relearning with perturbed outcome data.


Classification in Non-Metric Spaces

Neural Information Processing Systems

A key question in vision is how to represent our knowledge of previously encountered objects to classify new ones. The answer depends on how we determine the similarity of two objects. Similarity tells us how relevant each previously seen object is in determining the category to which a new object belongs.


Perceiving without Learning: From Spirals to Inside/Outside Relations

Neural Information Processing Systems

As a benchmark task, the spiral problem is well known in neural networks. Unlike previous work that emphasizes learning, we approach the problem from a generic perspective that does not involve learning. We point out that the spiral problem is intrinsically connected to the inside/outside problem. A generic solution to both problems is proposed based on oscillatory correlation using a time delay network. Our simulation results are qualitatively consistent with human performance, and we interpret human limitations in terms of synchrony and time delays, both biologically plausible. As a special case, our network without time delays can always distinguish these figures regardless of shape, position, size, and orientation.


Learning Mixture Hierarchies

Neural Information Processing Systems

The hierarchical representation of data has various applications in domains such as data mining, machine vision, or information retrieval. In this paper we introduce an extension of the Expectation-Maximization (EM) algorithm that learns mixture hierarchies in a computationally efficient manner. Efficiency is achieved by progressing in a bottom-up fashion, i.e. by clustering the mixture components of a given level in the hierarchy to obtain those of the level above. This cl ustering requires onl y knowledge of the mixture parameters, there being no need to resort to intermediate samples. In addition to practical applications, the algorithm allows a new interpretation of EM that makes clear the relationship with nonparametric kernel-based estimation methods, provides explicit control over the tradeoff between the bias and variance of EM estimates, and offers new insights about the behavior of deterministic annealing methods commonly used with EM to escape local minima of the likelihood.


SMEM Algorithm for Mixture Models

Neural Information Processing Systems

We present a split and merge EM (SMEM) algorithm to overcome the local maximum problem in parameter estimation of finite mixture models. In the case of mixture models, non-global maxima often involve having too many components of a mixture model in one part of the space and too few in another, widely separated part of the space. To escape from such configurations we repeatedly perform simultaneous split and merge operations using a new criterion for efficiently selecting the split and merge candidates. We apply the proposed algorithm to the training of Gaussian mixtures and mixtures of factor analyzers using synthetic and real data and show the effectiveness of using the split and merge operations to improve the likelihood of both the training data and of held-out test data. 1 INTRODUCTION Mixture density models, in particular normal mixtures, have been extensively used in the field of statistical pattern recognition [1]. Recently, more sophisticated mixture density models such as mixtures of latent variable models (e.g., probabilistic PCA or factor analysis) have been proposed to approximate the underlying data manifold [2]-[4].