Country
A Formulation for Minimax Probability Machine Regression
Strohmann, Thomas, Grudic, Gregory Z.
We formulate the regression problem as one of maximizing the minimum probability,symbolized by ฮฉ, that future predicted outputs of the regression model will be within some ฮต bound of the true regression function. Our formulation is unique in that we obtain a direct estimate of this lower probability bound ฮฉ. The proposed framework, minimax probability machine regression (MPMR), is based on the recently described minimaxprobability machine classification algorithm [Lanckriet et al.] and uses Mercer Kernels to obtain nonlinear regression models. MPMR is tested on both toy and real world data, verifying the accuracy of the ฮฉ bound, and the efficacy of the regression models.
One-Class LP Classifiers for Dissimilarity Representations
Pekalska, Elzbieta, Tax, David M.J., Duin, Robert
Problems in which abnormal or novel situations should be detected can be approached by describing the domain of the class of typical examples. Theseapplications come from the areas of machine diagnostics, fault detection, illness identification or, in principle, refer to any problem wherelittle knowledge is available outside the typical class. In this paper we explain why proximities are natural representations for domain descriptors and we propose a simple one-class classifier for dissimilarity representations. By the use of linear programming an efficient one-class description can be found, based on a small number of prototype objects. This classifier can be made (1) more robust by transforming the dissimilarities and(2) cheaper to compute by using a reduced representation set. Finally, a comparison to a comparable one-class classifier by Campbell and Bennett is given.
Adaptive Classification by Variational Kalman Filtering
Sykacek, Peter, Roberts, Stephen J.
We propose in this paper a probabilistic approach for adaptive inference of generalized nonlinear classification that combines the computational advantage of a parametric solution with the flexibility of sequential sampling techniques.We regard the parameters of the classifier as latent states in a first order Markov process and propose an algorithm which can be regarded as variational generalization of standard Kalman filtering. Thevariational Kalman filter is based on two novel lower bounds that enable us to use a non-degenerate distribution over the adaptation rate. An extensive empirical evaluation demonstrates that the proposed method is capable of infering competitive classifiers both in stationary and non-stationary environments. Although we focus on classification, the algorithm is easily extended to other generalized nonlinear models.
Parametric Mixture Models for Multi-Labeled Text
We propose probabilistic generative models, called parametric mixture models(PMMs), for multiclass, multi-labeled text categorization problem.Conventionally, the binary classification approach has been employed, in which whether or not text belongs to a category isjudged by the binary classifier for every category. In contrast, our approach can simultaneously detect multiple categories of text using PMMs. We derive efficient learning and prediction algorithms forPMMs. We also empirically show that our method could significantly outperform the conventional binary methods when applied tomulti-labeled text categorization using real World Wide Web pages.
Global Versus Local Methods in Nonlinear Dimensionality Reduction
Silva, Vin D., Tenenbaum, Joshua B.
Recently proposed algorithms for nonlinear dimensionality reduction fall broadly into two categories which have different advantages and disadvantages: global(Isomap [1]), and local (Locally Linear Embedding [2], Laplacian Eigenmaps [3]). We present two variants of Isomap which combine the advantages of the global approach with what have previously beenexclusive advantages of local methods: computational sparsity and the ability to invert conformal maps.
Half-Lives of EigenFlows for Spectral Clustering
Chennubhotla, Chakra, Jepson, Allan D.
Using a Markov chain perspective of spectral clustering we present an algorithm to automatically find the number of stable clusters in a dataset. The Markov chain's behaviour is characterized by the spectral properties of the matrix of transition probabilities, from which we derive eigenflows along with their halflives. An eigenflow describes the flow of probability massdue to the Markov chain, and it is characterized by its eigenvalue, orequivalently, by the halflife of its decay as the Markov chain is iterated. A ideal stable cluster is one with zero eigenflow and infinite half-life.The key insight in this paper is that bottlenecks between weakly coupled clusters can be identified by computing the sensitivity of the eigenflow's halflife to variations in the edge weights. We propose a novel EIGENCUTS algorithm to perform clustering that removes these identified bottlenecks in an iterative fashion.
Intrinsic Dimension Estimation Using Packing Numbers
We propose a new algorithm to estimate the intrinsic dimension of data sets. The method is based on geometric properties of the data and requires neitherparametric assumptions on the data generating model nor input parameters to set. The method is compared to a similar, widelyused algorithmfrom the same family of geometric techniques. Experiments showthat our method is more robust in terms of the data generating distribution and more reliable in the presence of noise.
Self Supervised Boosting
Welling, Max, Zemel, Richard S., Hinton, Geoffrey E.
Boosting algorithms and successful applications thereof abound for classification andregression learning problems, but not for unsupervised learning. We propose a sequential approach to adding features to a random fieldmodel by training them to improve classification performance between the data and an equal-sized sample of "negative examples" generated fromthe model's current estimate of the data density.
Independent Components Analysis through Product Density Estimation
Hastie, Trevor, Tibshirani, Rob
We present a simple direct approach for solving the ICA problem, using density estimation and maximum likelihood. Given a candidate orthogonalframe, we model each of the coordinates using a semi-parametric density estimate based on cubic splines. Since our estimates have two continuous derivatives, we can easily run a second ordersearch for the frame parameters. Our method performs very favorably when compared to state-of-the-art techniques. 1 Introduction Independent component analysis (ICA) is a popular enhancement over principal component analysis (PCA) and factor analysis. IRP which is assumed to arise from a linear mixing of a latent random source vector S E IRP, (1) X AS; the components Sj, j 1, ...,p of S are assumed to be independently distributed.
Boosting Density Estimation
Several authors have suggested viewing boosting as a gradient descent search for a good fit in function space. We apply gradient-based boosting methodology to the unsupervised learning problem of density estimation. We show convergence properties of the algorithm and prove that a strength of weak learnability property appliesto this problem as well. We illustrate the potential of this approach through experiments with boosting Bayesian networks to learn density models.