 Uncertainty


Dictionary Learning Strategies for Compressed Fiber Sensing Using a Probabilistic Sparse Model

arXiv.org Machine Learning

We present a sparse estimation and dictionary learning framework for compressed fiber sensing based on a probabilistic hierarchical sparse model. To handle severe dictionary coherence, selective shrinkage is achieved using a Weibull prior, which can be related to non-convex optimization with $p$-norm constraints for $0 < p < 1$. In addition, we leverage the specific dictionary structure to promote collective shrinkage based on a local similarity model. This is incorporated in the form of a kernel function in the joint prior density of the sparse coefficients, thereby establishing a Markov random field relation. Approximate inference is accomplished using a hybrid technique that combines Hamiltonian Monte Carlo and Gibbs sampling. To estimate the dictionary parameter, we pursue two strategies, relying on either a deterministic or a probabilistic model for the dictionary parameter. In the first strategy, the parameter is estimated based on alternating estimation. In the second strategy, it is jointly estimated along with the sparse coefficients. The performance is evaluated in comparison to an existing method in various scenarios using simulations and experimental data.


Robust training on approximated minimal-entropy set

arXiv.org Machine Learning

Large margin classifiers, such as the support vector machine (SVM) [1] and the maximum entropy discrimination (MED) classifier [2], have enjoyed great popularity in the signal processing and machine learning communities due to their broad applicability, robust performance, and the availability of fast software implementations. When the training data is representative of the test data, the performance of MED/SVM has theoretical guarantees that have been validated in practice [1], [3], [4]. Moreover, since the decision boundary of the MED/SVM is solely defined by a few support vectors, the algorithm can tolerate random feature distortions and perturbations. However, in many real applications, anomalous measurements are inherent to the data set due to strong environmental noise or possible sensor failures. Such anomalies arise in industrial process monitoring, video surveillance, tactical multimodal sensing, robust spectrum sensing [5], [6], and, more generally, any application that involves unattended sensors in difficult environments (Figure 1).


Robust Bayesian Compressed sensing

arXiv.org Machine Learning

We consider the problem of robust compressed sensing whose objective is to recover a high-dimensional sparse signal from compressed measurements corrupted by outliers. A new sparse Bayesian learning method is developed for robust compressed sensing. The basic idea of the proposed method is to identify and remove the outliers from sparse signal recovery. To automatically identify the outliers, we employ a set of binary indicator hyperparameters to indicate which observations are outliers. These indicator hyperparameters are treated as random variables and assigned a beta process prior such that their values are confined to be binary. In addition, a Gaussian-inverse Gamma prior is imposed on the sparse signal to promote sparsity. Based on this hierarchical prior model, we develop a variational Bayesian method to estimate the indicator hyperparameters as well as the sparse signal. Simulation results show that the proposed method achieves a substantial performance improvement over existing robust compressed sensing techniques.


Hybrid clustering-classification neural network in the medical diagnostics of reactive arthritis

arXiv.org Machine Learning

Self-organizing maps (SOM) and learning vector quantization (LVQ) neural networks have seen extensive use for solving different problems in the data mining domain (clustering, classification, fault detection, compression of information, etc.). This type of neural network was proposed by T. Kohonen [1, 2] and represents, in fact, a single-layer feedforward architecture, which provides an operator for mapping the input space into the output space. Operation-wise, SOM and LVQ are quite similar: each neuron is fed an input signal (sample) and produces an output, which is used during the competition stage to determine the winning neuron, usually the one with the maximum output signal value. The vector of synaptic weights for the winning neuron is the one closest to the input sample in terms of the chosen metric (the Euclidean metric in most cases). Next is the neuron adjustment phase.
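
The competition and adjustment stages described above can be sketched as follows. This is a minimal illustration, not the method of the paper: the function names `winning_neuron` and `adjust_winner`, the learning rate, and the winner-take-all update rule are assumptions chosen for the example. A full SOM would typically also adjust the winner's neighbours.

```python
import math


def winning_neuron(weights, sample):
    """Competition stage: index of the neuron whose weight vector is
    closest to the sample in the Euclidean metric."""
    distances = [math.dist(w, sample) for w in weights]
    return min(range(len(weights)), key=distances.__getitem__)


def adjust_winner(weights, sample, winner, learning_rate=0.5):
    """Adjustment stage (assumed winner-take-all rule): move the winner's
    weight vector a fraction of the way toward the sample."""
    weights[winner] = [
        w_i + learning_rate * (x_i - w_i)
        for w_i, x_i in zip(weights[winner], sample)
    ]
    return weights


# Three neurons with two-dimensional synaptic weight vectors (made-up values).
weights = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
sample = [0.9, 0.8]

winner = winning_neuron(weights, sample)
weights = adjust_winner(weights, sample, winner)
```

Here the second neuron wins because its weight vector lies nearest to the sample, and only that vector is moved.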


Generalized Interval-valued OWA Operators with Interval Weights Derived from Interval-valued Overlap Functions

arXiv.org Artificial Intelligence

In this work we extend the notion of an overlap function to the interval-valued setting, and we discuss a method that uses interval-valued overlap functions to construct OWA operators with interval-valued weights. Some properties of interval-valued overlap functions and the derived interval-valued OWA operators are analysed. We focus especially on the homogeneity and migrativity properties.

Keywords: interval-valued fuzzy sets; interval-valued overlap functions; interval-valued overlap OWA operators; interval weighted vector; migrativity; homogeneity

1 Introduction

Interval-valued fuzzy sets [62] have been successfully applied in many different problems. Just to mention some of the most recent ones, interval-valued fuzzy sets have been used in decision making (see, e.g., the works by Khalil and Hassan [36] and Cheng et al.). They have also been the origin of rich theoretical studies, as, for instance, the works by Bedregal et al. [3, 7], Dimuro et al. [28], Reiser et al. [48] and the recent works by Zywica et al. [64] and Takác [55]. From the point of view of applications, interval-valued fuzzy sets are a suitable tool to represent uncertain or incomplete information. In particular, the length of the interval-valued membership degree of a given element can be understood as a measure of the expert's lack of certainty in providing an exact membership value for that element [44].


Change-point Detection Methods for Body-Worn Video

arXiv.org Machine Learning

Body-worn video (BWV) cameras are increasingly utilized by police departments to provide a record of police-public interactions. However, large-scale BWV deployment produces terabytes of data per week, necessitating the development of effective computational methods to identify salient changes in video. In work carried out at the 2016 RIPS program at IPAM, UCLA, we present a novel two-stage framework for video change-point detection. First, we employ state-of-the-art machine learning methods including convolutional neural networks and support vector machines for scene classification. We then develop and compare change-point detection algorithms utilizing mean squared-error minimization, forecasting methods, hidden Markov models, and maximum likelihood estimation to identify noteworthy changes. We test our framework on detection of vehicle exits and entrances in a BWV data set provided by the Los Angeles Police Department and achieve over 90% recall and nearly 70% precision -- demonstrating robustness to rapid scene changes, extreme luminance differences, and frequent camera occlusions.
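
As a rough illustration of change-point detection by mean squared-error minimization, the sketch below finds a single change point in a sequence of per-frame scores. It is not the authors' algorithm: the function name `best_change_point`, the single-change-point assumption, and the example scores (imagined outputs of a scene classifier) are all hypothetical.

```python
def best_change_point(signal):
    """Return the split index k (1 <= k < len(signal)) that minimizes the
    total squared error when each segment is summarized by its own mean."""

    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    return min(
        range(1, len(signal)),
        key=lambda k: sse(signal[:k]) + sse(signal[k:]),
    )


# Hypothetical per-frame classifier scores: low, then high.
scores = [0.1, 0.0, 0.2, 0.1, 0.9, 1.0, 0.8, 0.9]
k = best_change_point(scores)
```

The returned index marks the first frame of the second segment. This brute-force search recomputes each segment error from scratch, which is fine for short sequences; longer videos would call for running sums.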


DOLDA - a regularized supervised topic model for high-dimensional multi-class regression

arXiv.org Machine Learning

During the last decades more and more textual data have become available, creating a growing need to statistically analyze large amounts of textual data. The hugely popular Latent Dirichlet Allocation (LDA) model introduced by Blei et al. (2003) is a generative probability model where each document is summarized by a set of latent semantic themes, often called topics; formally, a topic is a probability distribution over the vocabulary. An estimated LDA model is therefore a compressed latent representation of the documents. LDA is a mixed membership model where each document is a mixture of topics and each word (token) in a document belongs to a single topic. The basic LDA model is unsupervised, i.e. the topics are learned solely from the words in the documents without access to document labels. In many situations there is also other information we would like to incorporate in modeling a corpus of documents. A common example is when we have labeled documents, such as ratings of movies together with a movie description, illness category in medical journals, or the location of the identified bug together with bug reports. In these situations, one can use a so-called supervised topic model to find the semantic structure in the documents that is related to the class of interest. One of the first approaches to supervised topic models was proposed by McAuliffe and Blei (2008).
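
The mixed-membership structure of basic, unsupervised LDA described above can be sketched by sampling one document from its generative process. This is an illustrative sketch only: the function name `sample_document`, the two toy topics, and the symmetric Dirichlet parameter `alpha` are assumptions, and no inference or supervision is performed.

```python
import random


def sample_document(topics, alpha, length, rng):
    """Sample one document: draw topic proportions theta ~ Dirichlet(alpha),
    then for each token draw a single topic and a word from that topic."""
    # A Dirichlet draw obtained by normalizing independent gamma draws.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in topics]
    total = sum(gammas)
    theta = [g / total for g in gammas]

    tokens = []
    for _ in range(length):
        z = rng.choices(range(len(topics)), weights=theta)[0]
        vocab, probs = zip(*topics[z].items())
        word = rng.choices(vocab, weights=probs)[0]
        tokens.append((word, z))
    return theta, tokens


# Each topic is a probability distribution over a (toy) vocabulary.
topics = [
    {"film": 0.5, "actor": 0.3, "plot": 0.2},
    {"bug": 0.6, "crash": 0.3, "patch": 0.1},
]
rng = random.Random(0)
theta, tokens = sample_document(topics, alpha=0.5, length=20, rng=rng)
```

Each token carries exactly one topic assignment, while the document as a whole is described by the mixture `theta`.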


The Generalized Reparameterization Gradient

arXiv.org Machine Learning

The reparameterization gradient has become a widely used method to obtain Monte Carlo gradients to optimize the variational objective. However, this technique does not easily apply to commonly used distributions such as beta or gamma without further approximations, and most practical applications of the reparameterization gradient fit Gaussian distributions. In this paper, we introduce the generalized reparameterization gradient, a method that extends the reparameterization gradient to a wider class of variational distributions. Generalized reparameterizations use invertible transformations of the latent variables which lead to transformed distributions that weakly depend on the variational parameters. This results in new Monte Carlo gradients that combine reparameterization gradients and score function gradients. We demonstrate our approach on variational inference for two complex probabilistic models. The generalized reparameterization is effective: even a single sample from the variational distribution is enough to obtain a low-variance gradient.
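
For context, the two ingredients that the abstract says are combined, reparameterization gradients and score function gradients, can be compared on a toy Gaussian problem. The sketch below is the standard Gaussian case, not the generalized method of the paper: the objective f(z) = z^2, the function name `gradient_samples`, and the parameter values are assumptions. Both estimators target the gradient of E[z^2] with respect to mu, which equals 2 * mu.

```python
import random
import statistics


def gradient_samples(mu, sigma, n, seed=0):
    """Per-sample gradient estimates of d/dmu E[z^2] for z ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    reparam, score = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        z = mu + sigma * eps  # reparameterization: z is a function of mu
        # Reparameterization estimator: differentiate f(mu + sigma * eps).
        reparam.append(2.0 * z)
        # Score function estimator: f(z) * d/dmu log N(z; mu, sigma^2).
        score.append(z ** 2 * (z - mu) / sigma ** 2)
    return reparam, score


reparam, score = gradient_samples(mu=1.5, sigma=1.0, n=10000)
reparam_estimate = statistics.fmean(reparam)
score_estimate = statistics.fmean(score)
reparam_variance = statistics.variance(reparam)
score_variance = statistics.variance(score)
```

Both averages approach 2 * mu = 3, but the per-sample variance of the reparameterization estimator is much lower in this example, which is the usual motivation for preferring it when it applies.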



Deep Amortized Inference for Probabilistic Programs

arXiv.org Machine Learning

Probabilistic programming languages (PPLs) are a powerful modeling tool, able to represent any computable probability distribution. Unfortunately, probabilistic program inference is often intractable, and existing PPLs mostly rely on expensive, approximate sampling-based methods. To alleviate this problem, one could try to learn from past inferences, so that future inferences run faster. This strategy is known as amortized inference; it has recently been applied to Bayesian networks and deep generative models. This paper proposes a system for amortized inference in PPLs. In our system, amortization comes in the form of a parameterized guide program. Guide programs have similar structure to the original program, but can have richer data flow, including neural network components. These networks can be optimized so that the guide approximately samples from the posterior distribution defined by the original program. We present a flexible interface for defining guide programs and a stochastic gradient-based scheme for optimizing guide parameters, as well as some preliminary results on automatically deriving guide programs. We explore in detail the common machine learning pattern in which a 'local' model is specified by 'global' random values and used to generate independent observed data points; this gives rise to amortized local inference supporting global model learning.