Not enough data to create a plot.
Try a different view from the menu above.
Country
Multiple testing, uncertainty and realistic pictures
Langovoy, Mikhail A., Wittich, Olaf
We study statistical detection of grayscale objects in noisy images. The object of interest is of unknown shape and has an unknown intensity, that can be varying over the object and can be negative. No boundary shape constraints are imposed on the object, only a weak bulk condition for the object's interior is required. We propose an algorithm that can be used to detect grayscale objects of unknown shapes in the presence of nonparametric noise of unknown level. Our algorithm is based on a nonparametric multiple testing procedure. We establish the limit of applicability of our method via an explicit, closed-form, non-asymptotic and nonparametric consistency bound. This bound is valid for a wide class of nonparametric noise distributions. We achieve this by proving an uncertainty principle for percolation on finite lattices.
Improved RIP Analysis of Orthogonal Matching Pursuit
Orthogonal Matching Pursuit (OMP) has long been considered a powerful heuristic for attacking compressive sensing problems; however, its theoretical development is, unfortunately, somewhat lacking. This paper presents an improved Restricted Isometry Property (RIP) based performance guarantee for T-sparse signal reconstruction that asymptotically approaches the conjectured lower bound given in Davenport et al. We also further extend the state-of-the-art by deriving reconstruction error bounds for the case of general non-sparse signals subjected to measurement noise. We then generalize our results to the case of K-fold Orthogonal Matching Pursuit (KOMP). We finish by presenting an empirical analysis suggesting that OMP and KOMP outperform other compressive sensing algorithms in average case scenarios. This turns out to be quite surprising since RIP analysis (i.e. worst case scenario) suggests that these matching pursuits should perform roughly T^0.5 times worse than convex optimization, CoSAMP, and Iterative Thresholding.
Active Clustering: Robust and Efficient Hierarchical Clustering using Adaptively Selected Similarities
Eriksson, Brian, Dasarathy, Gautam, Singh, Aarti, Nowak, Robert
Hierarchical clustering based on pairwise similarities is a common tool used in a broad range of scientific applications. However, in many problems it may be expensive to obtain or compute similarities between the items to be clustered. This paper investigates the hierarchical clustering of N items based on a small subset of pairwise similarities, significantly less than the complete set of N(N-1)/2 similarities. First, we show that if the intracluster similarities exceed intercluster similarities, then it is possible to correctly determine the hierarchical clustering from as few as 3N log N similarities. We demonstrate this order of magnitude savings in the number of pairwise similarities necessitates sequentially selecting which similarities to obtain in an adaptive fashion, rather than picking them at random. We then propose an active clustering method that is robust to a limited fraction of anomalous similarities, and show how even in the presence of these noisy similarity values we can resolve the hierarchical clustering using only O(N log^2 N) pairwise similarities.
Computationally Efficient Modulation Level Classification Based on Probability Distribution Distance Functions
Urriza, Paulo, Rebeiz, Eric, Pawełczak, Przemysław, Čabrić, Danijela
We present a novel modulation level classification (MLC) method based on probability distribution distance functions. The proposed method uses modified Kuiper and Kolmogorov-Smirnov distances to achieve low computational complexity and outperforms the state of the art methods based on cumulants and goodness-of-fit tests. We derive the theoretical performance of the proposed MLC method and verify it via simulations. The best classification accuracy, under AWGN with SNR mismatch and phase jitter, is achieved with the proposed MLC method using Kuiper distances.
Inferring Disease and Gene Set Associations with Rank Coherence in Networks
Hwang, TaeHyun, Zhang, Wei, Xie, Maoqiang, Kuang, Rui
A computational challenge to validate the candidate disease genes identified in a high-throughput genomic study is to elucidate the associations between the set of candidate genes and disease phenotypes. The conventional gene set enrichment analysis often fails to reveal associations between disease phenotypes and the gene sets with a short list of poorly annotated genes, because the existing annotations of disease causative genes are incomplete. We propose a network-based computational approach called rcNet to discover the associations between gene sets and disease phenotypes. Assuming coherent associations between the genes ranked by their relevance to the query gene set, and the disease phenotypes ranked by their relevance to the hidden target disease phenotypes of the query gene set, we formulate a learning framework maximizing the rank coherence with respect to the known disease phenotype-gene associations. An efficient algorithm coupling ridge regression with label propagation, and two variants are introduced to find the optimal solution of the framework. We evaluated the rcNet algorithms and existing baseline methods with both leave-one-out cross-validation and a task of predicting recently discovered disease-gene associations in OMIM. The experiments demonstrated that the rcNet algorithms achieved the best overall rankings compared to the baselines. To further validate the reproducibility of the performance, we applied the algorithms to identify the target diseases of novel candidate disease genes obtained from recent studies of GWAS, DNA copy number variation analysis, and gene expression profiling. The algorithms ranked the target disease of the candidate genes at the top of the rank list in many cases across all the three case studies. The rcNet algorithms are available as a webtool for disease and gene set association analysis at http://compbio.cs.umn.edu/dgsa_rcNet.
Evolved preambles for MAX-SAT heuristics
Rigo, Luis O. Jr, Barbosa, Valmir C.
MAX-SAT heuristics normally operate from random initial truth assignments to the variables. We consider the use of what we call preambles, which are sequences of variables with corresponding single-variable assignment actions intended to be used to determine a more suitable initial truth assignment for a given problem instance and a given heuristic. For a number of well established MAX-SAT heuristics and benchmark instances, we demonstrate that preambles can be evolved by a genetic algorithm such that the heuristics are outperformed in a significant fraction of the cases.
Differentially Private Empirical Risk Minimization
Chaudhuri, Kamalika, Monteleoni, Claire, Sarwate, Anand D.
Privacy-preserving machine learning algorithms are crucial for the increasingly common setting in which personal data, such as medical or financial records, are analyzed. We provide general techniques to produce privacy-preserving approximations of classifiers learned via (regularized) empirical risk minimization (ERM). These algorithms are private under the $\epsilon$-differential privacy definition due to Dwork et al. (2006). First we apply the output perturbation ideas of Dwork et al. (2006), to ERM classification. Then we propose a new method, objective perturbation, for privacy-preserving machine learning algorithm design. This method entails perturbing the objective function before optimizing over classifiers. If the loss and regularizer satisfy certain convexity and differentiability criteria, we prove theoretical results showing that our algorithms preserve privacy, and provide generalization bounds for linear and nonlinear kernels. We further present a privacy-preserving technique for tuning the parameters in general machine learning algorithms, thereby providing end-to-end privacy guarantees for the training process. We apply these results to produce privacy-preserving analogues of regularized logistic regression and support vector machines. We obtain encouraging results from evaluating their performance on real demographic and benchmark data sets. Our results show that both theoretically and empirically, objective perturbation is superior to the previous state-of-the-art, output perturbation, in managing the inherent tradeoff between privacy and learning performance.
Online EM Algorithm for Hidden Markov Models
Online (also called "recursive" or "adaptive") estimation of fixed model parameters in hidden Markov models is a topic of much interest in times series modelling. In this work, we propose an online parameter estimation algorithm that combines two key ideas. The first one, which is deeply rooted in the Expectation-Maximization (EM) methodology consists in reparameterizing the problem using complete-data sufficient statistics. The second ingredient consists in exploiting a purely recursive form of smoothing in HMMs based on an auxiliary recursion. Although the proposed online EM algorithm resembles a classical stochastic approximation (or Robbins-Monro) algorithm, it is sufficiently different to resist conventional analysis of convergence. We thus provide limited results which identify the potential limiting points of the recursion as well as the large-sample behavior of the quantities involved in the algorithm. The performance of the proposed algorithm is numerically evaluated through simulations in the case of a noisily observed Markov chain. In this case, the algorithm reaches estimation results that are comparable to that of the maximum likelihood estimator for large sample sizes.
Decision Theory with Prospect Interference and Entanglement
We present a novel variant of decision making based on the mathematical theory of separable Hilbert spaces. This mathematical structure captures the effect of superposition of composite prospects, including many incorporated intentions, which allows us to describe a variety of interesting fallacies and anomalies that have been reported to particularize the decision making of real human beings. The theory characterizes entangled decision making, non-commutativity of subsequent decisions, and intention interference. We demonstrate how the violation of the Savage's sure-thing principle, known as the disjunction effect, can be explained quantitatively as a result of the interference of intentions, when making decisions under uncertainty. The disjunction effects, observed in experiments, are accurately predicted using a theorem on interference alternation that we derive, which connects aversion-to-uncertainty to the appearance of negative interference terms suppressing the probability of actions. The conjunction fallacy is also explained by the presence of the interference terms. A series of experiments are analysed and shown to be in excellent agreement with a priori evaluation of interference effects. The conjunction fallacy is also shown to be a sufficient condition for the disjunction effect and novel experiments testing the combined interplay between the two effects are suggested.
Feature Selection via Sparse Approximation for Face Recognition
Liang, Yixiong, Wang, Lei, Xiang, Yao, Zou, Beiji
Inspired by biological vision systems, the over-complete local features with huge cardinality are increasingly used for face recognition during the last decades. Accordingly, feature selection has become more and more important and plays a critical role for face data description and recognition. In this paper, we propose a trainable feature selection algorithm based on the regularized frame for face recognition. By enforcing a sparsity penalty term on the minimum squared error (MSE) criterion, we cast the feature selection problem into a combinatorial sparse approximation problem, which can be solved by greedy methods or convex relaxation methods. Moreover, based on the same frame, we propose a sparse Ho-Kashyap (HK) procedure to obtain simultaneously the optimal sparse solution and the corresponding margin vector of the MSE criterion. The proposed methods are used for selecting the most informative Gabor features of face images for recognition and the experimental results on benchmark face databases demonstrate the effectiveness of the proposed methods.