AITopics

We establish a principled framework for adaptive transform coding. Transform coders are often constructed by concatenating an ad hoc choice of transform with suboptimal bit allocation and quantizer design. Instead, we start from a probabilistic latent variable model in the form of a mixture of constrained Gaussian mixtures. From this model we derive a transform coding algorithm, which is a constrained version of the generalized Lloyd algorithm for vector quantizer design. A byproduct of our derivation is the introduction of a new transform basis, which unlike other transforms (PCA, DCT, etc.) is explicitly optimized for coding. Image compression experiments show that adaptive transform coders designed with our algorithm improve compressed image signal-to-noise ratio by up to 3 dB compared to global transform coding and by 0.5 to 2 dB compared to other adaptive transform coders. 1 Introduction Compression algorithms for image and video signals often use transform coding as a low-complexity alternative to vector quantization (VQ).

algorithm, coder, transform coder, (15 more...)

Country:

North America > United States > Oregon > Washington County > Beaverton (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)

Kjems, Ulrik, Hansen, Lars Kai, Strother, Stephen C.

Generalizable Singular Value Decomposition for Ill-posed Datasets

Because the training examples in an ill-posed data set do not fully span the signal space, the observed training set variances in each basis vector will be too high compared to the average variance of the test set projections onto the same basis vectors. On the basis of this understanding we introduce the Generalizable Singular Value Decomposition (GenSVD) as a means to reduce this bias by re-estimation of the singular values obtained in a conventional Singular Value Decomposition, allowing for a generalization performance increase of a subsequent statistical model. We demonstrate that the algorithm successfully corrects bias in a data set from a functional PET activation study of the human brain. 1 Ill-posed Data Sets An ill-posed data set has more dimensions in each example than there are examples. Such data sets occur in many fields of research, typically in connection with image measurements. The associated statistical problem is that of extracting structure from the observed high-dimensional vectors in the presence of noise. The statistical analysis can be done either supervised (i.e.

ill-posed data, projection, variance, (12 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
Europe > Germany (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine (0.96)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)

Smola, Alex J., Bartlett, Peter L.

Sparse Greedy Gaussian Process Regression

We present a simple sparse greedy technique to approximate the maximum a posteriori estimate of Gaussian Processes with much improved scaling behaviour in the sample size m.

approximation, log posterior, matrix, (12 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Lee, Daniel D., Seung, H. Sebastian

Algorithms for Non-negative Matrix Factorization

Non-negative matrix factorization (NMF) has previously been shown to be a useful decomposition for multivariate data. Two different multiplicative algorithms for NMF are analyzed. They differ only slightly in the multiplicative factor used in the update rules. One algorithm can be shown to minimize the conventional least squares error while the other minimizes the generalized Kullback-Leibler divergence. The monotonic convergence of both algorithms can be proven using an auxiliary function analogous to that used for proving convergence of the Expectation Maximization algorithm. The algorithms can also be interpreted as diagonally rescaled gradient descent, where the rescaling factor is optimally chosen to ensure convergence.
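As a rough illustration of the least squares variant, the sketch below uses the widely quoted multiplicative updates H <- H * (W^T V) / (W^T W H) and W <- W * (V H^T) / (W H H^T), applied element-wise. The function name `nmf_least_squares`, the random initialisation, the iteration count, and the small constant `eps` that guards against division by zero are assumptions made for this example and are not taken from the abstract.

```python
import random


def matmul(A, B):
    """Plain-Python product of two list-of-lists matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    """Transpose a list-of-lists matrix."""
    return [list(col) for col in zip(*A)]


def squared_error(V, W, H):
    """Sum of squared differences between V and the product W H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))


def nmf_least_squares(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m).

    Returns W, H and the squared error recorded before the first update and
    after every iteration.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start (an assumption of this sketch).
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [squared_error(V, W, H)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
        errors.append(squared_error(V, W, H))
    return W, H, errors
```

Calling `nmf_least_squares(V, rank=2)` on a small non-negative matrix and inspecting the returned `errors` list is one way to observe the non-increasing error that the abstract describes. Because every update only multiplies by non-negative ratios, W and H stay non-negative without any explicit projection step.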

algorithm, matrix factorization, update rule, (16 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.30)

Graepel, Thore, Herbrich, Ralf

The Kernel Gibbs Sampler

We present an algorithm that samples the hypothesis space of kernel classifiers. A uniform prior over normalised weight vectors and a likelihood based on a model of label noise lead to a piecewise constant posterior that can be sampled by the kernel Gibbs sampler (KGS). The KGS is a Markov Chain Monte Carlo method that chooses a random direction in parameter space and samples from the resulting piecewise constant density along the line chosen. The KGS can be used as an analytical tool for the exploration of Bayesian transduction, Bayes point machines, active learning, and evidence-based model selection on small data sets that are contaminated with label noise. For a simple toy example we demonstrate experimentally how a Bayes point machine based on the KGS outperforms an SVM that is incapable of taking into account label noise. 1 Introduction Two great ideas have dominated recent developments in machine learning: the application of kernel methods and the popularisation of Bayesian inference. Focusing on the task of classification, various connections between the two areas exist: kernels have long been a part of Bayesian inference in the disguise of covariance functions that characterise priors over functions [9].
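The one-dimensional step mentioned in the abstract, drawing from a piecewise constant density along a line, can be sketched in isolation. The code below is not the KGS itself: it assumes the breakpoints along the line and the unnormalised density height on each segment have already been computed, and the function name `sample_piecewise_constant` and its parameters are hypothetical.

```python
import random


def sample_piecewise_constant(breakpoints, heights, rng=random):
    """Draw one sample from an unnormalised piecewise constant density.

    breakpoints: increasing positions t_0 < t_1 < ... < t_K along the line.
    heights: K non-negative density values, one per segment [t_k, t_{k+1}].
    rng: the random module or a random.Random instance.
    """
    if len(breakpoints) != len(heights) + 1:
        raise ValueError("need exactly one height per segment")
    widths = [b - a for a, b in zip(breakpoints, breakpoints[1:])]
    # Probability mass of a segment is height times width.
    masses = [h * w for h, w in zip(heights, widths)]
    if any(m < 0 for m in masses) or sum(masses) <= 0:
        raise ValueError("masses must be non-negative with a positive total")
    # Pick a segment in proportion to its mass, then a uniform point inside it.
    k = rng.choices(range(len(masses)), weights=masses)[0]
    return rng.uniform(breakpoints[k], breakpoints[k + 1])
```

For instance, `sample_piecewise_constant([0.0, 1.0, 2.0, 4.0], [1.0, 0.0, 3.0])` draws from a density that is zero on the middle segment. Splitting the draw into a discrete segment choice and a uniform draw within the segment is exact for piecewise constant densities, so no rejection step is needed.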

classifier, label noise, posterior distribution, (10 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Berlin (0.05)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Csató, Lehel, Opper, Manfred

Sparse Representation for Gaussian Process Models

We develop an approach for a sparse representation for Gaussian Process (GP) models in order to overcome the limitations of GPs caused by large data sets. The method is based on a combination of a Bayesian online algorithm together with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the model. Experimental results on toy examples and large real-world datasets indicate the efficiency of the approach.

approximation, gaussian process, vector, (15 more...)

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Modeling & Simulation (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Tanaka, Toshiyuki

Analysis of Bit Error Probability of Direct-Sequence CDMA Multiuser Demodulators

We analyze the bit error probability of multiuser demodulators for the direct-sequence binary phase-shift-keying (DS/BPSK) CDMA channel with additive Gaussian noise. The problem of multiuser demodulation is cast into the finite-temperature decoding problem, and replica analysis is applied to evaluate the performance of the resulting MPM (Marginal Posterior Mode) demodulators, which include the optimal demodulator and the MAP demodulator as special cases. An approximate implementation of demodulators is proposed using the analog-valued Hopfield model as a naive mean-field approximation to the MPM demodulators, and its performance is also evaluated by the replica analysis. Results of the performance evaluation show the effectiveness of the optimal demodulator and the mean-field demodulator compared with the conventional one, especially in the cases of small information bit rate and low noise level. 1 Introduction The CDMA (Code-Division-Multiple-Access) technique [1] is important as a fundamental technology of digital communications systems, such as cellular phones. The important applications include realization of spread-spectrum multipoint-to-point communications systems, in which multiple users share the same communication channel.

demodulator, map demodulator, sequence, (13 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Telecommunications (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.47)
Information Technology > Communications > Networks (0.35)

Kappen, Hilbert J., Wiegerinck, Wim

Second Order Approximations for Probability Models

In this paper, we derive a second order mean field theory for directed graphical probability models. By using an information theoretic argument it is shown how this can be done in the absence of a partition function. This method is a direct generalisation of the well-known TAP approximation for Boltzmann Machines. In a numerical example, it is shown that the method greatly improves the first order mean field approximation. For a restricted class of graphical models, so-called single overlap graphs, the second order method has comparable complexity to the first order method. For sigmoid belief networks, the method is shown to be particularly fast and effective.

approximation, graphical model, sigmoid belief network, (14 more...)

Country:

Europe > Netherlands > Gelderland > Nijmegen (0.05)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

Vreeswijk, Carl van

Whence Sparseness?

It has been shown that the receptive fields of simple cells in V1 can be explained by assuming optimal encoding, provided that an extra constraint of sparseness is added. This finding suggests that there is a reason, independent of optimal representation, for sparseness. However, this work used an ad hoc model for the noise. Here I show that, if a biologically more plausible noise model, describing neurons as Poisson processes, is used, sparseness does not have to be added as a constraint. Thus I conclude that sparseness is not a feature that evolution has striven for, but is simply the result of the evolutionary pressure towards an optimal representation. 1 Introduction Recently there has been a resurgence of interest in using optimal coding strategies to 'explain' the response properties of neurons in the primary sensory areas [1].

constraint, mutual information, neuron, (14 more...)

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)

Natschläger, Thomas, Maass, Wolfgang, Sontag, Eduardo D., Zador, Anthony M.

Processing of Time Series by Neural Circuits with Biologically Realistic Synaptic Dynamics

Experimental data show that biological synapses behave quite differently from the symbolic synapses in common artificial neural network models. Biological synapses are dynamic, i.e., their "weight" changes on a short time scale by several hundred percent depending on the past input to the synapse. In this article we explore the consequences that these synaptic dynamics entail for the computational power of feedforward neural networks. We show that gradient descent suffices to approximate a given (quadratic) filter by a rather small neural system with dynamic synapses. We also compare our network model to artificial neural networks designed for time series processing. Our numerical results are complemented by theoretical analysis which shows that even with just a single hidden layer such networks can approximate a surprisingly large class of nonlinear filters: all filters that can be characterized by Volterra series. This result is robust with regard to various changes in the model for synaptic dynamics.

dynamic network, network model, synapse, (15 more...)

Country:

Europe > Austria > Styria > Graz (0.05)
North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)