Domain Generalization for Object Recognition with Multi-task Autoencoders

arXiv.org Artificial Intelligence

The problem of domain generalization is to take knowledge acquired from a number of related domains where training data is available, and to then successfully apply it to previously unseen domains. We propose a new feature learning algorithm, Multi-Task Autoencoder (MTAE), that provides good generalization performance for cross-domain object recognition. Our algorithm extends the standard denoising autoencoder framework by substituting artificially induced corruption with naturally occurring inter-domain variability in the appearance of objects. Instead of reconstructing images from noisy versions, MTAE learns to transform the original image into analogs in multiple related domains. It thereby learns features that are robust to variations across domains. The learnt features are then used as inputs to a classifier. We evaluated the performance of the algorithm on benchmark image recognition datasets, where the task is to learn features from multiple datasets and to then predict the image label from unseen datasets. We found that (denoising) MTAE outperforms alternative autoencoder-based models as well as the current state-of-the-art algorithms for domain generalization.
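
To make the construction concrete, here is a minimal numpy sketch of the MTAE idea: a shared encoder paired with one decoder per domain, trained so that a single image reconstructs its analogs in every domain. The layer sizes, sigmoid activation, and plain SGD step are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal MTAE-style sketch, assuming paired images across domains
# (x_src and its analog in each domain d). Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_domains = 256, 64, 3

W_enc = rng.normal(0, 0.01, (n_hid, n_in))             # shared encoder
W_dec = rng.normal(0, 0.01, (n_domains, n_in, n_hid))  # one decoder per domain

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mtae_step(x_src, x_targets, lr=0.1):
    """One SGD step: encode x_src once, decode into every domain's analog."""
    global W_enc
    h = sigmoid(W_enc @ x_src)                         # shared representation
    for d in range(n_domains):
        err = W_dec[d] @ h - x_targets[d]              # squared-error residual
        g_h = (W_dec[d].T @ err) * h * (1 - h)         # backprop through encoder
        W_dec[d] -= lr * np.outer(err, h)
        W_enc -= lr * np.outer(g_h, x_src)

# toy usage: random vectors standing in for cross-domain image analogs
x = rng.random(n_in)
analogs = [rng.random(n_in) for _ in range(n_domains)]
mtae_step(x, analogs)
```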


Model Guided Sampling Optimization for Low-dimensional Problems

arXiv.org Machine Learning

Optimizing very expensive black-box functions requires making full use of the information gathered during the optimization process. Model Guided Sampling Optimization (MGSO) forms a more robust alternative to Jones' Gaussian-process-based EGO algorithm. Whereas EGO maximizes the expected improvement, MGSO samples the probability of improvement, which is shown to help avoid getting trapped in local minima. Further, MGSO can reach close-to-optimum solutions faster than standard optimization algorithms on low-dimensional or smooth problems.
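
A minimal sketch of the MGSO idea, assuming a scikit-learn GP surrogate: fit the GP, then sample the next evaluation point with probability proportional to its probability of improvement, rather than maximizing expected improvement as EGO does. The candidate-pool rejection scheme and the 1-D test function are illustrative choices.

```python
# Illustrative MGSO-style loop: sample candidates in proportion to their
# probability of improvement (PoI) under a GP surrogate.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(1)
f = lambda x: np.sin(3 * x) + 0.5 * x          # toy stand-in for the expensive black box

X = rng.uniform(-2, 2, (5, 1))                 # initial design
y = f(X).ravel()

for it in range(20):
    gp = GaussianProcessRegressor().fit(X, y)
    cand = rng.uniform(-2, 2, (500, 1))        # candidate pool
    mu, sd = gp.predict(cand, return_std=True)
    poi = norm.cdf((y.min() - mu) / np.maximum(sd, 1e-12))
    # sample one candidate with probability proportional to its PoI
    p = poi / poi.sum() if poi.sum() > 0 else np.full(len(cand), 1 / len(cand))
    x_next = cand[rng.choice(len(cand), p=p)]
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next)[0])

print("best value found:", y.min())
```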


Calibration of One-Class SVM for MV set estimation

arXiv.org Machine Learning

A general approach to anomaly detection or novelty detection consists in estimating high density regions or Minimum Volume (MV) sets. The One-Class Support Vector Machine (OCSVM) is a state-of-the-art algorithm for estimating such regions from high dimensional data, yet it suffers from practical limitations. When applied to a limited number of samples it can lead to poor performance even when picking the best hyperparameters. Moreover, the solution of the OCSVM is very sensitive to the selection of hyperparameters, which makes it hard to optimize in an unsupervised setting. We present a new approach to estimate MV sets using the OCSVM with a different choice of the parameter controlling the proportion of outliers. The solution function of the OCSVM is learnt on a training set and the desired probability mass is obtained by adjusting the offset on a test set to prevent overfitting. Models learnt on different train/test splits are then aggregated to reduce the variance induced by such random splits. Our approach makes it possible to tune the hyperparameters automatically and to obtain nested set estimates. Experimental results show that our approach outperforms the standard OCSVM formulation while suffering less from the curse of dimensionality than kernel density estimates. Results on real data sets are also presented.
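
The offset calibration can be sketched as follows, assuming scikit-learn's OneClassSVM: learn the decision function on a train split, set the offset so that the desired mass of a held-out split scores above it, and aggregate the resulting models over several random splits. The toy data, nu value, and majority-vote aggregation below are illustrative assumptions.

```python
# Illustrative offset calibration for MV set estimation with OneClassSVM.
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 5))                  # toy data standing in for real samples
alpha = 0.95                                    # desired probability mass of the MV set

models, offsets = [], []
for seed in range(10):                          # aggregate over random splits
    X_tr, X_te = train_test_split(X, test_size=0.5, random_state=seed)
    m = OneClassSVM(kernel="rbf", gamma="scale", nu=0.5).fit(X_tr)
    scores = m.decision_function(X_te)          # calibrate offset on held-out data
    offsets.append(np.quantile(scores, 1 - alpha))
    models.append(m)

def in_mv_set(x):
    """Keep a point if, on average over splits, it scores above the offset."""
    votes = [m.decision_function(x.reshape(1, -1))[0] >= t
             for m, t in zip(models, offsets)]
    return np.mean(votes) >= 0.5

print(in_mv_set(np.zeros(5)))
```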


Dictionary Learning for Blind One Bit Compressed Sensing

arXiv.org Machine Learning

This letter proposes a dictionary learning algorithm for blind one bit compressed sensing. In the blind one bit compressed sensing framework, the original signal to be reconstructed from one bit linear random measurements is sparse in an unknown domain. In this context, the product of the measurement matrix $\mathbf{A}$ and the sparse domain matrix $\mathbf{\Phi}$, i.e. $\mathbf{D}=\mathbf{A}\mathbf{\Phi}$, should be learned. Hence, we use dictionary learning to train this matrix. Towards that end, an appropriate continuous convex cost function is suggested for one bit compressed sensing, and a simple steepest-descent method is exploited to learn the rows of the matrix $\mathbf{D}$. Experimental results show the effectiveness of the proposed algorithm against the case of no dictionary learning, especially as the number of training signals and the number of sign measurements increase.
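
A hedged sketch of the dictionary update: one common continuous convex surrogate for the sign constraint $\mathbf{y}=\mathrm{sign}(\mathbf{D}\mathbf{s})$ is a one-sided squared penalty, minimized over the rows of $\mathbf{D}$ by steepest descent. The specific cost, step size, and row normalization below are illustrative assumptions, not the paper's exact formulation, and the alternating sparse-coding step is omitted.

```python
# Illustrative steepest descent on a one-sided squared penalty that is zero
# whenever sign(D @ S) agrees with the one bit measurements Y.
import numpy as np

rng = np.random.default_rng(3)
m, n, T = 32, 64, 100                        # measurements, atoms, training signals

D_true = rng.normal(size=(m, n))             # ground truth, only used to simulate Y
S = rng.normal(size=(n, T)) * (rng.random((n, T)) < 0.1)   # sparse codes (assumed known here)
Y = np.sign(D_true @ S)                      # one bit linear measurements

D = rng.normal(size=(m, n)) / np.sqrt(n)     # initial dictionary estimate
lr = 1e-3
for it in range(200):
    R = np.maximum(0.0, -Y * (D @ S))        # active only where a sign is violated
    G = (-Y * R) @ S.T                       # gradient of 0.5 * sum(R**2) w.r.t. D
    D -= lr * G
    # keep row norms fixed to rule out the trivial all-zero solution
    D /= np.maximum(np.linalg.norm(D, axis=1, keepdims=True), 1e-12)

print("sign agreement after learning:", np.mean(np.sign(D @ S) == Y))
```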


Stabilized Nearest Neighbor Classifier and Its Statistical Properties

arXiv.org Machine Learning

The stability of a statistical analysis is an important indicator of reproducibility, which is a main principle of the scientific method. It entails that similar statistical conclusions can be reached based on independent samples from the same underlying population. In this paper, we introduce a general measure of classification instability (CIS) to quantify the sampling variability of the prediction made by a classification method. Interestingly, the asymptotic CIS of any weighted nearest neighbor classifier turns out to be proportional to the Euclidean norm of its weight vector. Based on this concise form, we propose a stabilized nearest neighbor (SNN) classifier, which distinguishes itself from other nearest neighbor classifiers by taking stability into consideration. In theory, we prove that SNN attains the minimax optimal convergence rate in risk, and a sharp convergence rate in CIS. The latter rate result is established for general plug-in classifiers under a low-noise condition. Extensive simulated and real examples demonstrate that SNN achieves a considerable improvement in CIS over existing nearest neighbor classifiers, with comparable classification accuracy. We implement the algorithm in a publicly available R package, snn.
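
The CIS definition suggests a direct empirical estimate: train the same procedure on two independent samples from the same population and measure how often the two fitted classifiers disagree on fresh points. The data generator and the choice of plain kNN below are illustrative assumptions.

```python
# Illustrative empirical estimate of classification instability (CIS):
# the disagreement rate between classifiers fit on independent samples.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(4)

def sample(n):
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)  # noisy labels
    return X, y

def cis_estimate(k, n=200, n_eval=2000):
    X1, y1 = sample(n)                      # two independent training samples
    X2, y2 = sample(n)
    c1 = KNeighborsClassifier(n_neighbors=k).fit(X1, y1)
    c2 = KNeighborsClassifier(n_neighbors=k).fit(X2, y2)
    X_eval = rng.normal(size=(n_eval, 2))   # fresh points from the population
    return np.mean(c1.predict(X_eval) != c2.predict(X_eval))

for k in (1, 5, 25):
    print(f"k={k:2d}  estimated CIS={cis_estimate(k):.3f}")
```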


Parameter estimation in softmax decision-making models with linear objective functions

arXiv.org Machine Learning

With an eye towards human-centered automation, we contribute to the development of a systematic means to infer features of human decision-making from behavioral data. Motivated by the common use of softmax selection in models of human decision-making, we study the maximum likelihood parameter estimation problem for softmax decision-making models with linear objective functions. We present conditions under which the likelihood function is convex. These allow us to provide sufficient conditions for convergence of the resulting maximum likelihood estimator and to construct its asymptotic distribution. In the case of models with nonlinear objective functions, we show how the estimator can be applied by linearizing about a nominal parameter value. We apply the estimator to fit the stochastic UCL (Upper Credible Limit) model of human decision-making to human subject data. We show statistically significant differences in behavior across related, but distinct, tasks.
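
A minimal sketch of the estimation problem: under a softmax choice model with linear objective $u_i = \theta^\top \phi_i$, the negative log-likelihood below is convex in $\theta$, so a generic optimizer recovers the parameters. The feature dimensions and synthetic data generator are illustrative assumptions.

```python
# Illustrative maximum likelihood fit of a softmax choice model with a
# linear objective: P(choose i) = exp(theta . phi_i) / sum_j exp(theta . phi_j).
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

rng = np.random.default_rng(5)
d, n_options, n_trials = 3, 5, 500
theta_true = np.array([1.0, -2.0, 0.5])

Phi = rng.normal(size=(n_trials, n_options, d))        # option features per trial
U = Phi @ theta_true                                   # linear objective values
P = np.exp(U - logsumexp(U, axis=1, keepdims=True))    # softmax choice probabilities
choices = np.array([rng.choice(n_options, p=p) for p in P])

def neg_log_lik(theta):
    u = Phi @ theta                                    # shape (n_trials, n_options)
    return np.sum(logsumexp(u, axis=1) - u[np.arange(n_trials), choices])

theta_hat = minimize(neg_log_lik, np.zeros(d), method="BFGS").x
print("true:", theta_true)
print("mle :", np.round(theta_hat, 2))
```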


Regularized Kernel Recursive Least Squares Algorithm

arXiv.org Machine Learning

In most adaptive signal processing applications, system linearity is assumed and adaptive linear filters are thus used. The traditional class of supervised adaptive filters relies on error-correction learning for its adaptive capability. The kernel method is a powerful nonparametric modeling tool for pattern analysis and statistical signal processing: through a nonlinear mapping, kernel methods transform the data into a set of points in a Reproducing Kernel Hilbert Space. Kernel recursive least squares (KRLS) achieves high accuracy and a fast convergence rate in stationary scenarios. However, this good performance comes at the cost of high computational complexity. Sparsification in kernel methods is known to reduce computational complexity and memory consumption.
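
For reference, here is a sketch of a regularized, non-sparsified KRLS recursion: the inverse of the regularized kernel matrix is grown one sample at a time via the block matrix inversion lemma. The RBF width, ridge term, and toy data are illustrative assumptions.

```python
# Illustrative regularized KRLS: maintain Q = (K + lam*I)^{-1} and the
# coefficient vector alpha = Q @ y, updating both per new sample.
import numpy as np

rng = np.random.default_rng(6)
lam, gamma = 0.1, 1.0                              # ridge term, RBF kernel width

def kernel(a, B):
    return np.exp(-gamma * np.sum((B - a) ** 2, axis=1))

X = rng.uniform(-3, 3, (200, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=200)

dict_X = X[:1]                                     # initialize with the first sample
Q = np.array([[1.0 / (kernel(X[0], X[:1])[0] + lam)]])
alpha = Q @ y[:1]

for t in range(1, len(X)):
    k = kernel(X[t], dict_X)                       # kernels to stored samples
    z = Q @ k
    r = lam + kernel(X[t], X[t:t + 1])[0] - k @ z  # Schur complement
    e = y[t] - k @ alpha                           # a priori prediction error
    Q = np.block([[Q * r + np.outer(z, z), -z[:, None]],
                  [-z[None, :], np.ones((1, 1))]]) / r
    alpha = np.append(alpha - z * e / r, e / r)
    dict_X = np.vstack([dict_X, X[t]])

print("f(0.5) estimate:", kernel(np.array([0.5]), dict_X) @ alpha,
      " true:", np.sin(0.5))
```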


Fast and Flexible ADMM Algorithms for Trend Filtering

arXiv.org Machine Learning

This paper presents a fast and robust algorithm for trend filtering, a recently developed nonparametric regression tool. It has been shown that, for estimating functions whose derivatives are of bounded variation, trend filtering achieves the minimax optimal error rate, while other popular methods like smoothing splines and kernels do not. Standing in the way of a more widespread practical adoption, however, is a lack of scalable and numerically stable algorithms for fitting trend filtering estimates. This paper presents a highly efficient, specialized ADMM routine for trend filtering. Our algorithm is competitive with the specialized interior point methods that are currently in use, and yet is far more numerically robust. Furthermore, the proposed ADMM implementation is very simple, and importantly, it is flexible enough to extend to many interesting related problems, such as sparse trend filtering and isotonic trend filtering. Software for our method is freely available, in both the C and R languages.
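
The ADMM splitting can be sketched in a few lines: with z = D beta for the second-difference operator D, the iteration alternates a quadratic solve, a soft-thresholding step, and a dual update. The dense solve and fixed rho below are illustrative simplifications of the specialized banded routine the paper develops.

```python
# Illustrative ADMM for linear trend filtering:
#   minimize 0.5 * ||y - beta||^2 + lam * ||D @ beta||_1
import numpy as np

rng = np.random.default_rng(7)
n, lam, rho = 100, 5.0, 1.0
y = np.cumsum(rng.normal(size=n)) + np.linspace(0, 5, n)   # noisy trend signal

D = np.diff(np.eye(n), n=2, axis=0)            # second-difference operator, (n-2) x n
soft = lambda a, t: np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

beta, z, u = y.copy(), D @ y, np.zeros(n - 2)
M_inv = np.linalg.inv(np.eye(n) + rho * D.T @ D)   # fixed system, invert once

for it in range(200):
    beta = M_inv @ (y + rho * D.T @ (z - u))       # quadratic (smooth) step
    z = soft(D @ beta + u, lam / rho)              # l1 proximal step on z = D beta
    u += D @ beta - z                              # scaled dual update

print("l1 norm of second differences:", np.linalg.norm(D @ beta, 1))
```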


A Review of Nonnegative Matrix Factorization Methods for Clustering

arXiv.org Machine Learning

Nonnegative Matrix Factorization (NMF) was first introduced as a low-rank matrix approximation technique and has since enjoyed a wide range of applications. Although NMF does not seem related to the clustering problem at first, it has been shown that the two are closely linked. In this report, we provide a gentle introduction to clustering and NMF before reviewing the theoretical relationship between them. We then explore several NMF variants, namely Sparse NMF, Projective NMF, Nonnegative Spectral Clustering and Cluster-NMF, along with their clustering interpretations.
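
The clustering connection can be made concrete with the basic factorization: approximate a nonnegative data matrix X by WH using the Lee-Seung multiplicative updates, then read each sample's cluster label off the dominant entry of its column of H. The matrix sizes and iteration count below are illustrative assumptions.

```python
# Illustrative NMF clustering: factor X ~ W @ H, assign clusters by argmax of H.
import numpy as np

rng = np.random.default_rng(8)
n_features, n_samples, k = 50, 300, 3
X = np.abs(rng.normal(size=(n_features, n_samples)))   # toy nonnegative data

W = np.abs(rng.normal(size=(n_features, k)))
H = np.abs(rng.normal(size=(k, n_samples)))
eps = 1e-12

for it in range(200):                                  # Lee-Seung multiplicative updates
    H *= (W.T @ X) / (W.T @ W @ H + eps)               # nonnegativity is preserved
    W *= (X @ H.T) / (W @ H @ H.T + eps)

labels = np.argmax(H, axis=0)                          # cluster assignments
print("reconstruction error:", np.linalg.norm(X - W @ H))
print("cluster sizes:", np.bincount(labels, minlength=k))
```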


Partitioning Large Scale Deep Belief Networks Using Dropout

arXiv.org Machine Learning

Deep learning methods have shown great promise in many practical applications, ranging from speech recognition and visual object recognition to text processing. However, most current deep learning methods suffer from scalability problems for large-scale applications, forcing researchers or users to focus on small-scale problems with fewer parameters. In this paper, we consider a well-known machine learning model, deep belief networks (DBNs), which have yielded impressive classification performance on a large number of benchmark machine learning tasks. To scale up DBNs, we propose an approach that uses computing clusters in a distributed environment to train large models, while the dense matrix computations within a single machine are sped up using graphics processors (GPUs). When training a DBN, each machine randomly drops out a portion of neurons in each hidden layer for each training case, making the remaining neurons learn to detect features that are generally helpful for producing the correct answer. Within our approach, we have developed four methods to combine outcomes from each machine to form a unified model. Our preliminary experiment on the MNIST handwritten digit database demonstrates that our approach outperforms the state-of-the-art test error rate.
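
A rough sketch of the partitioning idea, under loudly stated assumptions: each machine trains its own copy of the network with machine-local dropout masks, and the copies are combined at the end, here by simple output averaging, which is one plausible choice and not necessarily among the paper's four combination methods. A tiny one-hidden-layer network stands in for a full DBN.

```python
# Illustrative sketch: per-machine dropout during training, averaged outputs
# at inference. The training updates themselves are omitted.
import numpy as np

rng = np.random.default_rng(9)
n_in, n_hid, n_out, n_machines, p_drop = 20, 50, 2, 4, 0.5

# each machine holds its own weights (a stand-in for a partition of a DBN)
machines = [(rng.normal(0, 0.1, (n_hid, n_in)),
             rng.normal(0, 0.1, (n_out, n_hid))) for _ in range(n_machines)]

def forward(W1, W2, x, train=True):
    h = np.maximum(W1 @ x, 0.0)                  # hidden layer activations
    if train:
        keep = rng.random(n_hid) >= p_drop       # machine-local dropout mask
        h = h * keep / (1.0 - p_drop)            # inverted-dropout scaling
    return W2 @ h

x = rng.normal(size=n_in)
# at inference, the machines' outputs are combined into a unified model
outputs = [forward(W1, W2, x, train=False) for W1, W2 in machines]
print("combined output:", np.mean(outputs, axis=0))
```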