Smoothed Online Optimization for Regression and Control
We consider Online Convex Optimization (OCO) in the setting where the costs are $m$-strongly convex and the online learner pays a switching cost for changing decisions between rounds. We show that the recently proposed Online Balanced Descent (OBD) algorithm is constant competitive in this setting, with competitive ratio $3 + O(1/m)$, irrespective of the ambient dimension. Additionally, we show that when the sequence of cost functions is $\epsilon$-smooth, OBD has near-optimal dynamic regret and maintains strong per-round accuracy. We demonstrate the generality of our approach by showing that the OBD framework can be used to construct competitive algorithms for a variety of online problems across learning and control, including online variants of ridge regression, logistic regression, maximum likelihood estimation, and LQR control.
BCMA-ES: A Bayesian approach to CMA-ES
Benhamou, Eric, Saltiel, David, Verel, Sebastien, Teytaud, Fabien
This paper introduces a novel, theoretically sound approach for the celebrated CMA-ES algorithm. Assuming the parameters of the multivariate normal distribution for the minimum follow a conjugate prior distribution, we derive their optimal update at each iteration step. Not only does this Bayesian framework provide a justification for the update of the CMA-ES algorithm, but it also gives two new versions of CMA-ES, assuming either normal-Wishart or normal-Inverse Wishart priors, depending on whether we parametrize ... In a nutshell, the (µ/λ) CMA-ES is an iterative black-box optimization algorithm that, in each of its iterations, samples λ candidate solutions from a multivariate normal distribution, evaluates these solutions (sequentially or in parallel), retains µ candidates, and adjusts the sampling distribution used for the next iteration to give higher probability to good samples. Each iteration can be seen individually as taking an initial guess or prior for the multivariate parameters, namely the mean and the covariance, and, after making an experiment by evaluating these sample points with the fit function, updating the initial parameters accordingly.
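The sample, evaluate, retain, and update loop described above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions: it uses a diagonal covariance refitted from the retained candidates by plain sample moments, so it is neither the full CMA-ES update (no evolution paths or step-size control) nor the Bayesian update proposed in the paper. The names `simple_es` and `sphere` and all parameter values are hypothetical.

```python
import random

def sphere(x):
    """Toy objective (an assumption for this sketch): squared distance from the origin."""
    return sum(v * v for v in x)

def simple_es(f, mean, sigma, lam=20, mu=5, iters=60, seed=0):
    """Simplified (mu/lambda) evolution strategy with a diagonal covariance.

    Each iteration samples lam candidates, keeps the mu best, and refits the
    mean and per-coordinate standard deviation to those mu survivors.
    """
    rng = random.Random(seed)
    dim = len(mean)
    mean = list(mean)
    std = [sigma] * dim
    for _ in range(iters):
        # Sample lam candidates from N(mean, diag(std^2)).
        pop = [[rng.gauss(mean[d], std[d]) for d in range(dim)] for _ in range(lam)]
        # Evaluate and retain the mu best candidates (minimization).
        elite = sorted(pop, key=f)[:mu]
        # Refit the sampling distribution to the retained candidates.
        new_mean = [sum(x[d] for x in elite) / mu for d in range(dim)]
        # Spread is measured around the previous mean, so it also reflects the shift.
        std = [
            max(1e-12, (sum((x[d] - mean[d]) ** 2 for x in elite) / mu) ** 0.5)
            for d in range(dim)
        ]
        mean = new_mean
    return mean

best = simple_es(sphere, mean=[3.0, -2.0], sigma=1.0)
```

In the Bayesian reading given in the abstract, the refit step would be replaced by a conjugate posterior update of the mean and covariance rather than by raw sample moments.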
Correlated Parameters to Accurately Measure Uncertainty in Deep Neural Networks
Posch, Konstantin, Pilz, Jürgen
In this article, a novel approach for training deep neural networks using Bayesian techniques is presented. The Bayesian methodology allows for an easy evaluation of model uncertainty and is additionally robust to overfitting. These are commonly the two main problems with which classical, i.e. non-Bayesian, architectures struggle. The proposed approach applies variational inference in order to approximate the intractable posterior distribution. In particular, the variational distribution is defined as a product of multiple multivariate normal distributions with tridiagonal covariance matrices. Each single normal distribution belongs either to the weights or to the biases corresponding to one network layer. The layer-wise a posteriori variances are defined based on the corresponding expectation values, and the correlations are assumed to be identical. Therefore, only a few additional parameters need to be optimized compared to non-Bayesian settings. The novel approach is successfully evaluated on the basis of the popular benchmark datasets MNIST and CIFAR-10.
Experiments on Open-Set Speaker Identification with Discriminatively Trained Neural Networks
Imoscopi, Stefano, Grancharov, Volodya, Sverrisson, Sigurdur, Karlsson, Erlendur, Pobloth, Harald
This paper presents a study on discriminative artificial neural network classifiers in the context of open-set speaker identification. Both 2-class and multi-class architectures are tested against the conventional Gaussian mixture model based classifier on enrolled speaker sets of different sizes. The performance evaluation shows that the multi-class neural network system has superior performance for large population sizes.
A Gaussian process latent force model for joint input-state estimation in linear structural systems
Nayek, Rajdip, Chakraborty, Souvik, Narasimhan, Sriram
The problem of combined state and input estimation of linear structural systems based on measured responses and a priori knowledge of the structural model is considered. A novel methodology using Gaussian process latent force models is proposed to tackle the problem in a stochastic setting. Gaussian process latent force models (GPLFMs) are hybrid models that combine differential equations representing a physical system with data-driven non-parametric Gaussian process models. In this work, the unknown input forces acting on a structure are modelled as Gaussian processes with some chosen covariance functions, which are combined with the mechanistic differential equation representing the structure to construct a GPLFM. The GPLFM is then conveniently formulated as an augmented stochastic state-space model with additional states representing the latent force components, and the joint input and state inference of the resulting model is implemented using a Kalman filter. The augmented state-space model of the GPLFM is shown to be a generalization of the class of input-augmented state-space models, is proven observable, and is more robust than conventional augmented formulations in terms of numerical stability. The hyperparameters governing the covariance functions are estimated using maximum likelihood optimization based on the observed data, thus overcoming the need for manual tuning of the hyperparameters by trial and error. To assess the performance of the proposed GPLFM method, several cases of state and input estimation are demonstrated using numerical simulations on a 10-dof shear building and a 76-storey ASCE benchmark office tower. The results indicate the superior performance of the proposed approach over conventional Kalman-filter-based approaches.
Machine Learning, Big Data, And Smart Buildings: A Comprehensive Survey
Qolomany, Basheer, Al-Fuqaha, Ala, Gupta, Ajay, Benhaddou, Driss, Alwajidi, Safaa, Qadir, Junaid, Fong, Alvis C.
Future buildings will offer new convenience, comfort, and efficiency possibilities to their residents. The way people live will change as technology becomes involved in people's lives and information processing is fully integrated into their daily living activities and objects. The future expectation of smart buildings includes making the residents' experience as easy and comfortable as possible. The massive streaming data generated and captured by smart building appliances and devices contains valuable information that needs to be mined to facilitate timely actions and better decision making. Machine learning and big data analytics will undoubtedly play a critical role in enabling the delivery of such smart services. In this paper, we survey the area of smart buildings with a special focus on the role of techniques from machine learning and big data analytics. This survey also reviews the current trends and challenges faced in the development of smart building services.
Learning Personalized Thermal Preferences via Bayesian Active Learning with Unimodality Constraints
Awalgaonkar, Nimish, Bilionis, Ilias, Liu, Xiaoqi, Karava, Panagiota, Tzempelikos, Athanasios
Thermal preferences vary from person to person and may change over time. The main objective of this paper is to sequentially pose intelligent queries to occupants in order to optimally learn the indoor air temperature values which maximize their satisfaction. Our central hypothesis is that an occupant's preference relation over indoor air temperature can be described using a scalar function of these temperatures, which we call the "occupant's thermal utility function". Information about an occupant's preference over these temperatures is available to us through their response to thermal preference queries: "prefer warmer," "prefer cooler" and "satisfied", which we interpret as statements about the derivative of their utility function, i.e. the utility function is "increasing", "decreasing" and "constant" respectively. We model this hidden utility function using a Gaussian process prior with a built-in unimodality constraint, i.e., the utility function has a unique maximum, and we train this model using Bayesian inference. This permits an expected-improvement-based selection of the next preference query to pose to the occupant, which takes into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling from areas which are likely to offer an improvement over the current best observation). We use this framework to sequentially design experiments and illustrate its benefits by showing that it requires drastically fewer observations to learn the maximally preferred temperature values as compared to other methods. This framework is an important step towards the development of intelligent HVAC systems which would be able to respond to occupants' personalized thermal comfort needs. In order to encourage the use of our preference elicitation (PE) framework and ensure reproducibility of results, we publish an implementation of our work named GPPrefElicit as an open-source package in Python.
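To make the exploration and exploitation trade-off concrete, the following sketch computes the standard closed-form expected improvement for a Gaussian predictive distribution (maximization). It is an illustration only, not the GPPrefElicit implementation; the function name `expected_improvement` and its arguments (predictive mean `mu`, predictive standard deviation `sigma`, current best value `best`) are assumptions made for this example.

```python
import math

def expected_improvement(mu, sigma, best):
    """Closed-form expected improvement over `best` for a N(mu, sigma^2) prediction."""
    if sigma <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    # First term rewards a high predicted mean (exploitation),
    # second term rewards high uncertainty (exploration).
    return (mu - best) * cdf + sigma * pdf

# Same predicted mean, different uncertainty: the more uncertain point scores higher.
ei_certain = expected_improvement(mu=0.0, sigma=1.0, best=0.0)
ei_uncertain = expected_improvement(mu=0.0, sigma=2.0, best=0.0)
```

A query-selection loop would evaluate this score over candidate temperatures and pose the next query at the maximizer.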
Robust Optimisation Monte Carlo
Ikonomov, Borislav, Gutmann, Michael U.
This paper is on Bayesian inference for parametric statistical models that are implicitly defined by a stochastic simulator which specifies how data is generated. While exact sampling is possible, evaluating the likelihood function is typically prohibitively expensive. Approximate Bayesian Computation (ABC) is a framework to perform approximate inference in such situations. While basic ABC algorithms are widely applicable, they are notoriously slow and much research has focused on increasing their efficiency. Optimisation Monte Carlo (OMC) has recently been proposed as an efficient and embarrassingly parallel method that leverages optimisation to accelerate the inference. In this paper, we demonstrate a previously unrecognised important failure mode of OMC: It generates strongly overconfident approximations by collapsing regions of similar or near-constant posterior density into a single point. We propose an efficient, robust generalisation of OMC that corrects this. It makes fewer assumptions, retains the main benefits of OMC, and can be performed either as part of OMC or entirely as post-processing. We demonstrate the effectiveness of the proposed Robust OMC on toy examples and tasks in inverse-graphics where we perform Bayesian inference with a complex image renderer.
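For readers unfamiliar with ABC, the basic rejection variant mentioned above can be sketched as follows. This is the plain ABC baseline, not OMC or the proposed Robust OMC. The simulator (draws from a unit-variance Gaussian with unknown mean), the uniform prior on [-5, 5], the sample-mean summary statistic, and the tolerance `eps` are all assumptions chosen for illustration.

```python
import random

def simulator(theta, n, rng):
    """Hypothetical stochastic simulator: n draws from N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def summary(data):
    """Summary statistic used to compare datasets (here, the sample mean)."""
    return sum(data) / len(data)

def abc_rejection(observed, n_samples, eps, seed=0):
    """Basic ABC rejection sampling; the likelihood is never evaluated."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    while len(accepted) < n_samples:
        theta = rng.uniform(-5.0, 5.0)                    # draw from the prior
        simulated = simulator(theta, len(observed), rng)  # run the simulator
        if abs(summary(simulated) - s_obs) < eps:         # keep theta if data match
            accepted.append(theta)
    return accepted

observed = simulator(2.0, 50, random.Random(1))
posterior = abc_rejection(observed, n_samples=200, eps=0.2)
posterior_mean = sum(posterior) / len(posterior)
```

Most proposals are rejected, which illustrates why the basic algorithm is slow and why optimisation-based variants such as OMC were proposed.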
Types of classification algorithms in Machine Learning
In machine learning and statistics, classification is a supervised learning approach in which the computer program learns from the data input given to it and then uses this learning to classify new observations. The data set may simply be bi-class (like identifying whether the person is male or female or whether the mail is spam or non-spam) or it may be multi-class. Some examples of classification problems are: speech recognition, handwriting recognition, biometric identification, document classification, etc. Naive Bayes is a classification technique based on Bayes' Theorem with an assumption of independence among predictors. In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.
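The independence assumption can be made concrete with a small sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. The toy documents, the labels, and the function names `train_naive_bayes` and `predict` are hypothetical and serve only as an illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in zip(docs, labels):
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def predict(model, words):
    """Return the class with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        # log P(class) + sum of log P(word | class): words are treated as independent.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            if w in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

docs = [
    ["win", "money", "now"],
    ["cheap", "money", "offer"],
    ["meeting", "at", "noon"],
    ["lunch", "meeting", "tomorrow"],
]
labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(docs, labels)
```

Because each word contributes its own additive log term, the score of a message is just the sum of per-word evidence, which is exactly the "unrelated features" assumption.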
Hierarchical Stochastic Block Model for Community Detection in Multiplex Networks
Paez, Marina S., Amini, Arash A., Lin, Lizhen
Multiplex networks have become increasingly prevalent in many fields, and have emerged as a powerful tool for modeling the complexity of real networks. There is a critical need for developing inference models for multiplex networks that can take into account potential dependencies across different layers, particularly when the aim is community detection. We add to a limited literature by proposing a novel and efficient Bayesian model for community detection in multiplex networks. A key feature of our approach is the ability to model varying communities at different network layers. In contrast, many existing models assume the same communities for all layers. Moreover, our model automatically picks up the necessary number of communities at each layer (as validated by real data examples). This is appealing, since deciding the number of communities is a challenging aspect of community detection, especially in the multiplex setting if one allows the communities to change across layers. Borrowing ideas from hierarchical Bayesian modeling, we use a hierarchical Dirichlet prior to model community labels across layers, allowing dependency in their structure. Given the community labels, a stochastic block model (SBM) is assumed for each layer. We develop an efficient slice sampler for sampling the posterior distribution of the community labels as well as the link probabilities between communities. In doing so, we address some unique challenges posed by coupling the complex likelihood of the SBM with the hierarchical nature of the prior on the labels. An extensive empirical validation is performed on simulated and real data, demonstrating the superior performance of the model over single-layer alternatives, as well as the ability to uncover interesting structures in real networks.
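As a small illustration of the per-layer likelihood, the sketch below samples one undirected network layer from a stochastic block model given community labels and a matrix of link probabilities between communities. It covers only the generative SBM step, not the hierarchical Dirichlet prior or the slice sampler; the function name `sample_sbm` and the example labels and probabilities are assumptions.

```python
import random

def sample_sbm(labels, block_probs, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from a stochastic block model.

    labels[i] is the community of node i; block_probs[a][b] is the probability
    of a link between a node in community a and a node in community b.
    """
    rng = random.Random(seed)
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # each unordered pair once, no self-loops
            p = block_probs[labels[i]][labels[j]]
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj

# Two layers of a multiplex network whose community labels differ across layers.
layer_labels = [[0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 2, 2]]
layer_probs = [
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]],
]
layers = [
    sample_sbm(lab, probs, seed=k)
    for k, (lab, probs) in enumerate(zip(layer_labels, layer_probs))
]
```

Inference runs in the opposite direction: given observed layers such as these, the model recovers the labels and link probabilities.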