Generalization Bounds for Metric and Similarity Learning

arXiv.org Machine Learning

Recently, metric learning and similarity learning have attracted considerable interest, and many models and optimization algorithms have been proposed. However, there is relatively little work on the generalization analysis of such methods. In this paper, we derive novel generalization bounds for metric and similarity learning. In particular, we first show that the generalization analysis reduces to the estimation of the Rademacher average over "sums-of-i.i.d." sample blocks related to the specific matrix norm. Then, we derive generalization bounds for metric/similarity learning with different matrix-norm regularizers by estimating their specific Rademacher complexities. Our analysis indicates that sparse metric/similarity learning with $L^1$-norm regularization can lead to significantly better bounds than Frobenius-norm regularization. Our novel generalization analysis develops and refines the techniques of U-statistics and Rademacher complexity analysis.
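
To make the setting concrete, here is a sketch in our own notation (the paper's exact formulation may differ). Metric learning estimates a matrix $M \succeq 0$ from labeled pairs $(x_i, x_j, y_{ij})$, $y_{ij} = \pm 1$, by minimizing a regularized pairwise empirical risk

$$\min_{M \succeq 0}\ \frac{2}{n(n-1)} \sum_{i<j} \ell\big(y_{ij}\,[1 - d_M(x_i, x_j)]\big) + \lambda \lVert M \rVert, \qquad d_M(x, x') = (x - x')^\top M (x - x').$$

Because every sample appears in many pairs, this empirical risk is a U-statistic rather than an i.i.d. average; the standard device is to rewrite it as an average over $\lfloor n/2 \rfloor$ i.i.d. sample blocks, after which the uniform deviation is controlled (via contraction through the Lipschitz loss) by a Rademacher average of the form

$$\mathbb{E}_\sigma \sup_{\lVert M \rVert \le R}\ \frac{1}{\lfloor n/2 \rfloor} \sum_{k=1}^{\lfloor n/2 \rfloor} \sigma_k \big\langle M,\ (x_k - x_{k+\lfloor n/2\rfloor})(x_k - x_{k+\lfloor n/2\rfloor})^\top \big\rangle \;=\; \frac{R}{\lfloor n/2 \rfloor}\, \mathbb{E}_\sigma \Big\lVert \sum_{k=1}^{\lfloor n/2 \rfloor} \sigma_k (x_k - x_{k+\lfloor n/2\rfloor})(x_k - x_{k+\lfloor n/2\rfloor})^\top \Big\rVert_*,$$

where $\lVert \cdot \rVert_*$ is the dual of the regularizing norm. This is where the choice of regularizer enters: the dual of the $L^1$ norm is the entrywise max norm, whose Rademacher average grows only logarithmically with the dimension, while the self-dual Frobenius norm yields no such improvement.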


Deep Predictive Coding Networks

arXiv.org Machine Learning

The quality of data representation in deep learning methods is directly related to the prior model imposed on the representations; however, the commonly used fixed priors are not capable of adjusting to the context in the data. To address this issue, we propose deep predictive coding networks, a hierarchical generative model that empirically alters priors on the latent representations in a dynamic and context-sensitive manner. This model captures the temporal dependencies in time-varying signals and uses top-down information to modulate the representation in lower layers. The centerpiece of our model is a novel procedure for inferring sparse states of a dynamic model; this procedure is used for feature extraction. We also extend this feature extraction block with a pooling function that captures locally invariant representations. When applied to natural video data, we show that our method is able to learn high-level visual features. We also demonstrate the role of the top-down connections by showing the robustness of the proposed model to structured noise.
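
A minimal sketch of the kind of sparse state inference the abstract describes, assuming a linear dictionary D, linear state dynamics A, and an ISTA-style solver; these are our illustrative choices, not the authors' architecture:

# Hedged sketch (not the authors' code): ISTA-style inference of a sparse
# state z_t for one layer of a dynamic sparse-coding model,
#   x_t ~ D z_t,   z_t ~ A z_{t-1}  (temporal/top-down prediction),  z_t sparse.
import numpy as np

def soft_threshold(v, thr):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

def infer_sparse_state(x, D, A, z_prev, lam=0.1, gamma=0.5, n_iter=200):
    """Minimize ||x - D z||^2 + gamma ||z - A z_prev||^2 + lam ||z||_1 by ISTA."""
    z = np.zeros(D.shape[1])
    z_pred = A @ z_prev                           # prediction from the dynamic model
    L = 2.0 * (np.linalg.norm(D, 2) ** 2 + gamma) # Lipschitz constant of the smooth part
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ z - x) + 2.0 * gamma * (z - z_pred)
        z = soft_threshold(z - grad / L, lam / L)
    return z

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128)) / 8.0          # hypothetical dictionary
A = np.eye(128) * 0.9                             # hypothetical state dynamics
z_prev = np.zeros(128)
x = D @ (rng.standard_normal(128) * (rng.random(128) < 0.05))
z = infer_sparse_state(x, D, A, z_prev)
print("nonzeros:", np.count_nonzero(np.abs(z) > 1e-6))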


Variational Semi-blind Sparse Deconvolution with Orthogonal Kernel Bases and its Application to MRFM

arXiv.org Machine Learning

We present a variational Bayesian method for joint image reconstruction and point spread function (PSF) estimation when the PSF of the imaging device is only partially known. To solve this semi-blind deconvolution problem, prior distributions are specified for the PSF and the 3D image. Joint image reconstruction and PSF estimation is then performed within a Bayesian framework, using a variational algorithm to estimate the posterior distribution. The image prior distribution imposes an explicit atomic measure that corresponds to image sparsity. Importantly, the proposed Bayesian deconvolution algorithm does not require hand tuning. Simulation results clearly demonstrate that the semi-blind deconvolution algorithm compares favorably with a previous Markov chain Monte Carlo (MCMC) version of myopic sparse reconstruction. It significantly outperforms mismatched non-blind algorithms that rely on the assumption of perfect knowledge of the PSF. The algorithm is illustrated on real data from magnetic resonance force microscopy (MRFM).
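
The following is a deliberately simplified 1-D sketch of the semi-blind idea, not the paper's variational algorithm: the PSF is parameterized as a nominal kernel plus a perturbation in an orthogonal basis, and we alternate a sparse image step with a least-squares PSF-coefficient step (all names and parameters here are ours):

# Hedged sketch (ours): alternating estimation of a sparse 1-D image x and
# PSF coefficients c, with  y = h * x + noise  and  h = h0 + B c,
# where B has orthonormal columns (the "orthogonal kernel basis").
import numpy as np

def convmtx(h, n):
    """Full convolution matrix so that C @ x == np.convolve(h, x)."""
    C = np.zeros((len(h) + n - 1, n))
    for j in range(n):
        C[j:j + len(h), j] = h
    return C

def semi_blind(y, h0, B, n, lam=0.05, n_outer=20):
    c = np.zeros(B.shape[1])                      # PSF perturbation coefficients
    x = np.zeros(n)
    for _ in range(n_outer):
        h = h0 + B @ c
        H = convmtx(h, n)
        L = 2.0 * np.linalg.norm(H, 2) ** 2 + 1e-12
        for _ in range(50):                       # x-step: ISTA on ||y - Hx||^2 + lam||x||_1
            g = 2.0 * H.T @ (H @ x - y)
            z = x - g / L
            x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
        X = convmtx(x, len(h0))                   # convolution is symmetric in h and x
        r = y - X @ h0                            # c-step: y - h0*x ~ (X B) c, least squares
        c, *_ = np.linalg.lstsq(X @ B, r, rcond=None)
    return x, h0 + B @ c

n = 50
x_true = np.zeros(n); x_true[[10, 30]] = 1.0      # sparse ("atomic") image
h0 = np.exp(-0.5 * np.arange(-3, 4) ** 2)         # nominal Gaussian PSF
B = np.linalg.qr(np.random.default_rng(1).standard_normal((7, 3)))[0]
y = np.convolve(h0, x_true) + 0.01 * np.random.default_rng(2).standard_normal(n + 6)
x_hat, h_hat = semi_blind(y, h0, B, n)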


Another Look at Quantum Neural Computing

arXiv.org Artificial Intelligence

The term quantum neural computing indicates a unity in the functioning of the brain. It assumes that the neural structures perform classical processing and that the virtual particles associated with the dynamical states of the structures define the underlying quantum state. We revisit the concept and also summarize new arguments related to the learning modes of the brain in response to sensory input, which may be aggregated into three types: associative, reorganizational, and quantum. The associative and reorganizational types are quite apparent based on experimental findings; it is much harder to establish that the brain as an entity exhibits quantum properties. We argue that the reorganizational behavior of the brain may be viewed as inner adjustment corresponding to its quantum behavior at the system level. Not only neural structures but also their higher abstractions may be seen as whole entities. We consider the dualities associated with the behavior of the brain and how these dualities are bridged.


A Computational Scheme for Reasoning in Dynamic Probabilistic Networks

arXiv.org Artificial Intelligence

A computational scheme for reasoning about dynamic systems using (causal) probabilistic networks is presented. The scheme is based on the framework of Lauritzen and Spiegelhalter (1988), and may be viewed as a generalization of the inference methods of classical time-series analysis in the sense that it allows description of non-linear, multivariate dynamic systems with complex conditional independence structures. Further, the scheme provides a method for efficient backward smoothing and opens possibilities for efficient, approximate forecasting. The scheme has been implemented on top of the HUGIN shell.
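
For intuition, here is the simplest special case of filtering and backward smoothing in a dynamic probabilistic network: a single discrete hidden chain (a hidden Markov model). The paper's scheme generalizes this to multivariate networks with complex conditional-independence structure; the code below is our illustration, not the HUGIN implementation:

# Hedged illustration (ours): forward filtering and backward smoothing
# for a discrete hidden Markov chain.
import numpy as np

def forward_backward(T, E, prior, obs):
    """T[i, j] = P(s'=j | s=i), E[s, o] = P(o | s), obs = observation indices."""
    n, S = len(obs), len(prior)
    alpha = np.zeros((n, S))                 # filtered: P(s_t | o_1..o_t)
    alpha[0] = prior * E[:, obs[0]]
    alpha[0] /= alpha[0].sum()
    for t in range(1, n):
        alpha[t] = (alpha[t - 1] @ T) * E[:, obs[t]]
        alpha[t] /= alpha[t].sum()
    gamma = np.zeros_like(alpha)             # smoothed: P(s_t | o_1..o_n)
    gamma[-1] = alpha[-1]
    for t in range(n - 2, -1, -1):           # backward smoothing pass
        pred = alpha[t] @ T                  # one-step prediction P(s_{t+1} | o_1..o_t)
        gamma[t] = alpha[t] * (T @ (gamma[t + 1] / pred))
    return alpha, gamma

T = np.array([[0.9, 0.1], [0.2, 0.8]])
E = np.array([[0.8, 0.2], [0.3, 0.7]])
alpha, gamma = forward_backward(T, E, np.array([0.5, 0.5]), [0, 0, 1, 1])
print(gamma)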


Structural Controllability and Observability in Influence Diagrams

arXiv.org Artificial Intelligence

An influence diagram is a graphical representation of a belief network with uncertainty. This article studies the structural properties of the probabilistic model in an influence diagram. In particular, structural controllability theorems and structural observability theorems are developed, and algorithms are formulated. Controllability and observability are fundamental concepts in dynamic systems (Luenberger 1979). Controllability corresponds to the ability to control a system, while observability concerns the inferability of its variables. Both properties can be determined from the ranks of the system matrices. Structural controllability and observability, on the other hand, analyze these properties from the structure of a system alone, without specific knowledge of the values of its elements (Lin 1974; Shields and Pearson 1976). The structural analysis explores the connection between the structure of a model and the functional dependence among its elements. It is useful for comprehending problems and formulating solutions, by challenging the underlying intuitions and detecting inconsistencies in a model. This type of qualitative reasoning can sometimes provide insight even when there is insufficient numerical information in a model.
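
For reference, these are the classical numeric rank tests that the structural versions relax; the structural theory asks whether these ranks are full for almost every choice of the nonzero entries, given only the zero/nonzero pattern. A minimal sketch (ours, not the article's algorithms):

# For x' = A x + B u, controllability holds iff [B, AB, ..., A^{n-1}B]
# has full rank n; observability is the dual test on (A^T, C^T).
import numpy as np

def controllable(A, B):
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks)) == n

def observable(A, C):
    return controllable(A.T, C.T)             # duality

A = np.array([[0.0, 1.0], [0.0, 0.0]])        # hypothetical double integrator
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
print(controllable(A, B), observable(A, C))   # True True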


Semantics for Probabilistic Inference

arXiv.org Artificial Intelligence

A number of writers (Joseph Halpern and Fahiem Bacchus among them) have offered semantics for formal languages in which inferences concerning probabilities can be made. Our concern is different. This paper provides a formalization of nonmonotonic inferences in which the conclusion is supported only to a certain degree. Such inferences are clearly 'invalid', since they must allow the falsity of a conclusion even when the premises are true. Nevertheless, such inferences can be characterized both syntactically and semantically. The 'premises' of probabilistic arguments are sets of statements (as in a database or knowledge base), and the conclusions are categorical statements in the language. We provide standards both for this form of inference, for which high probability is required, and for an inference in which the conclusion is qualified by an intermediate interval of support.


Lattice-Based Graded Logic: A Multimodal Approach

arXiv.org Artificial Intelligence

Experts do not always feel very comfortable when they have to give precise numerical estimates of certainty degrees. In this paper we present a qualitative approach which allows for attaching partially ordered symbolic grades to logical formulas. Uncertain information is expressed by means of parameterized modal operators. We propose a semantics for this multimodal logic and give a sound and complete axiomatization. We study the links with related approaches and suggest how this framework might be used to manage both uncertain and incomplete knowledge.
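
A toy illustration of the underlying idea (ours, not the paper's formal system): certainty grades are drawn from a finite lattice rather than from [0, 1], so some grades are incomparable, and combining them uses the lattice meet rather than arithmetic. The lattice and grade names below are hypothetical:

# A 4-element lattice: bot < {plausible, probable} < top, with 'plausible'
# and 'probable' incomparable (a genuinely partial order).
ELEMS = ["bot", "plausible", "probable", "top"]
LEQ = {("bot", g) for g in ELEMS}             # pairs (x, y) meaning x <= y
LEQ |= {(g, "top") for g in ELEMS}
LEQ |= {(g, g) for g in ELEMS}

def meet(a, b):
    """Greatest lower bound of grades a and b."""
    lower = [x for x in ELEMS if (x, a) in LEQ and (x, b) in LEQ]
    # The GLB dominates every common lower bound, so in this lattice it is
    # the lower bound with the most elements below it.
    return max(lower, key=lambda x: sum((y, x) in LEQ for y in ELEMS))

print(meet("plausible", "probable"))   # -> bot: incomparable grades
print(meet("probable", "top"))         # -> probable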


EigenGP: Sparse Gaussian process models with data-dependent eigenfunctions

arXiv.org Machine Learning

Gaussian processes (GPs) provide a nonparametric representation of functions. However, classical GP inference suffers from high computational cost, and it is difficult to design nonstationary GP priors in practice. In this paper, we propose a sparse Gaussian process model, EigenGP, based on the Karhunen-Loeve (KL) expansion of a GP prior. We use the Nystrom approximation to obtain data-dependent eigenfunctions and select these eigenfunctions by evidence maximization. This selection reduces the number of eigenfunctions in our model and provides a nonstationary covariance function. To handle nonlinear likelihoods, we develop an efficient expectation propagation (EP) inference algorithm and couple it with expectation maximization for eigenfunction selection. Because the eigenfunctions of a Gaussian kernel are associated with clusters of samples - both labeled and unlabeled - selecting relevant eigenfunctions enables EigenGP to conduct semi-supervised learning. Our experimental results demonstrate improved predictive performance of EigenGP over alternative state-of-the-art sparse GP and semi-supervised learning methods for regression, classification, and semi-supervised classification.
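
Here is one common Nystrom construction of data-dependent eigenfunctions, sketched under our own assumptions (an RBF kernel, random inducing points, a simple ridge fit in the resulting basis); it is not necessarily the paper's exact estimator:

# Hedged sketch (ours): Nystrom-approximated eigenfunctions of an RBF kernel,
# used as a finite basis for sparse GP-style regression.
import numpy as np

def rbf(X, Z, ell=1.0):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def nystrom_eigenfunctions(X, Z, ell=1.0):
    """Phi[i, k] ~ k-th approximate eigenfunction at X[i]; Phi Phi^T ~ Kxz Kzz^-1 Kzx."""
    Kzz = rbf(Z, Z, ell)
    Kxz = rbf(X, Z, ell)
    lam, U = np.linalg.eigh(Kzz + 1e-10 * np.eye(len(Z)))
    keep = lam > 1e-8                          # drop numerically null directions
    return Kxz @ U[:, keep] / np.sqrt(lam[keep]), lam[keep]

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
Z = rng.uniform(-3, 3, size=(10, 1))           # hypothetical inducing points
Phi, lam = nystrom_eigenfunctions(X, Z)
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
w = np.linalg.solve(Phi.T @ Phi + 0.01 * np.eye(Phi.shape[1]), Phi.T @ y)
print("train RMSE:", np.sqrt(np.mean((Phi @ w - y) ** 2)))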


aHUGIN: A System Creating Adaptive Causal Probabilistic Networks

arXiv.org Artificial Intelligence

The paper describes aHUGIN, a tool for creating adaptive systems. aHUGIN is an extension of the HUGIN shell, and is based on the methods reported by Spiegelhalter and Lauritzen (1990a). The adaptive systems resulting from aHUGIN are able to adjust the conditional probabilities in the model. A short analysis of the adaptation task is given, and the features of aHUGIN are described. Finally, a session with experiments is reported and the results are discussed.
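
The core of Spiegelhalter-Lauritzen style adaptation, in its simplest form, is sequential Dirichlet updating of conditional probability tables from observed cases. A minimal sketch under our own assumptions (one binary child, one binary parent, fully observed cases; the real method also handles incomplete evidence):

# Hedged sketch (ours): Dirichlet-count adaptation of a conditional
# probability table P(child | parent) from a stream of cases.
import numpy as np

counts = np.ones((2, 2))          # Dirichlet(1, 1) prior per parent state

def update(parent_state, child_state):
    counts[parent_state, child_state] += 1.0   # absorb one observed case

def cpt():
    return counts / counts.sum(axis=1, keepdims=True)

for p, c in [(0, 0), (0, 0), (0, 1), (1, 1)]:  # hypothetical case stream
    update(p, c)
print(cpt())   # adapted P(child | parent)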