Uncertainty
Modelling Seasonality and Trends in Daily Rainfall Data
This paper presents a new approach to the problem of modelling daily rainfall using neural networks. We first model the conditional distributions of rainfall amounts, in such a way that the model itself determines the order of the process, and the time-dependent shape and scale of the conditional distributions. After integrating over particular weather patterns, we are able to extract seasonal variations and long-term trends. 1 Introduction Analysis of rainfall data is important for many agricultural, ecological and engineering activities. Design of irrigation and drainage systems, for instance, needs to take account not only of mean expected rainfall, but also of rainfall volatility. Estimates of crop yields also depend on the distribution of rainfall during the growing season, as well as on the overall amount.
Experiences with Bayesian Learning in a Real World Application
Sykacek, Peter, Dorffner, Georg, Rappelsberger, Peter, Zeitlhofer, Josef
This paper reports about an application of Bayes' inferred neural network classifiers in the field of automatic sleep staging. The reason for using Bayesian learning for this task is twofold. First, Bayesian inference is known to embody regularization automatically. Second, a side effect of Bayesian learning leads to larger variance of network outputs in regions without training data. This results in well known moderation effects, which can be used to detect outliers. In a 5 fold cross-validation experiment the full Bayesian solution found with R. Neals hybrid Monte Carlo algorithm, was not better than a single maximum a-posteriori (MAP) solution found with D.J. MacKay's evidence approximation. In a second experiment we studied the properties of both solutions in rejecting classification of movement artefacts.
Recovering Perspective Pose with a Dual Step EM Algorithm
Cross, Andrew D. J., Hancock, Edwin R.
This paper describes a new approach to extracting 3D perspective structure from 2D point-sets. The novel feature is to unify the tasks of estimating transformation geometry and identifying pointcorrespondence matches. Unification is realised by constructing a mixture model over the bipartite graph representing the correspondence match and by effecting optimisation using the EM algorithm. According to our EM framework the probabilities of structural correspondence gate contributions to the expected likelihood function used to estimate maximum likelihood perspective pose parameters. This provides a means of rejecting structural outliers.
Blind Separation of Radio Signals in Fading Channels
We apply information maximization / maximum likelihood blind source separation [2, 6) to complex valued signals mixed with complex valued nonstationary matrices. This case arises in radio communications with baseband signals. We incorporate known source signal distributions in the adaptation, thus making the algorithms less "blind". This results in drastic reduction of the amount of data needed for successful convergence. Adaptation to rapidly changing signal mixing conditions, such as to fading in mobile communications, becomes now feasible as demonstrated by simulations. 1 Introduction In SDMA (spatial division multiple access) the purpose is to separate radio signals of interfering users (either intentional or accidental) from each others on the basis of the spatial characteristics of the signals using smart antennas, array processing, and beamforming [5, 8).
Bayesian Robustification for Audio Visual Fusion
Movellan, Javier R., Mineiro, Paul
Department of Cognitive Science Department of Cognitive Science University of California, San Diego University of California, San Diego La Jolla, CA 92092-0515 La Jolla, CA 92092-0515 Abstract We discuss the problem of catastrophic fusion in multimodal recognition systems. This problem arises in systems that need to fuse different channels in non-stationary environments. Practice shows that when recognition modules within each modality are tested in contexts inconsistent with their assumptions, their influence on the fused product tends to increase, with catastrophic results. We explore a principled solution to this problem based upon Bayesian ideas of competitive models and inference robustification: each sensory channel is provided with simple white-noise context models, and the perceptual hypothesis and context are jointly estimated. Consequently, context deviations are interpreted as changes in white noise contamination strength, automatically adjusting the influence of the module.
Stacked Density Estimation
Smyth, Padhraic, Wolpert, David
One frequently estimates density functions for which there is little prior knowledge on the shape of the density and for which one wants a flexible and robust estimator (allowing multimodality if it exists). In this context, the methods of choice tend to be finite mixture models and kernel density estimation methods. For mixture modeling, mixtures of Gaussian components are frequently assumed and model choice reduces to the problem of choosing the number k of Gaussian components in the model (Titterington, Smith and Makov, 1986). For kernel density estimation, kernel shapes are typically chosen from a selection of simple unimodal densities such as Gaussian, triangular, or Cauchy densities, and kernel bandwidths are selected in a data-driven manner (Silverman 1986; Scott 1994). As argued by Draper (1996), model uncertainty can contribute significantly to pre- - Also with the Jet Propulsion Laboratory 525-3660, California Institute of Technology, Pasadena, CA 91109 Stacked Density Estimation 669 dictive error in estimation. While usually considered in the context of supervised learning, model uncertainty is also important in unsupervised learning applications such as density estimation. Even when the model class under consideration contains the true density, if we are only given a finite data set, then there is always a chance of selecting the wrong model. Moreover, even if the correct model is selected, there will typically be estimation error in the parameters of that model.
An Incremental Nearest Neighbor Algorithm with Queries
We consider the general problem of learning multi-category classification from labeled examples. We present experimental results for a nearest neighbor algorithm which actively selects samples from different pattern classes according to a querying rule instead of the a priori class probabilities. The amount of improvement of this query-based approach over the passive batch approach depends on the complexity of the Bayes rule. The principle on which this algorithm is based is general enough to be used in any learning algorithm which permits a model-selection criterion and for which the error rate of the classifier is calculable in terms of the complexity of the model. 1 INTRODUCTION We consider the general problem of learning multi-category classification from labeled examples. In many practical learning settings the time or sample size available for training are limited. This may have adverse effects on the accuracy of the resulting classifier. For instance, in learning to recognize handwritten characters typical time limitation confines the training sample size to be of the order of a few hundred examples. It is important to make learning more efficient by obtaining only training data which contains significant information about the separability of the pattern classes thereby letting the learning algorithm participate actively in the sampling process. Querying for the class labels of specificly selected examples in the input space may lead to significant improvements in the generalization error (cf.
Learning Nonlinear Overcomplete Representations for Efficient Coding
Lewicki, Michael S., Sejnowski, Terrence J.
We derive a learning algorithm for inferring an overcomplete basis by viewing it as probabilistic model of the observed data. Overcomplete bases allow for better approximation of the underlying statistical density. Using a Laplacian prior on the basis coefficients removes redundancy and leads to representations that are sparse and are a nonlinear function of the data. This can be viewed as a generalization of the technique of independent component analysis and provides a method for blind source separation of fewer mixtures than sources. We demonstrate the utility of overcomplete representations on natural speech and show that compared to the traditional Fourier basis the inferred representations potentially have much greater coding efficiency.