Energy
Optimizing Energy Costs in a Zinc and Lead Mine
Kinsella, Alan (Boliden Tara Mines Ltd.) | Smeaton, Alan F. (Insight Centre for Data Analytics) | Hurley, Barry (Insight Centre for Data Analytics) | O' (Insight Centre for Data Analytics) | Sullivan, Barry (Insight Centre for Data Analytics) | Simonis, Helmut
Boliden Tara Mines Ltd. consumed 184.7 GWh of electricity in 2014, equating to over 1% of the national demand of Ireland or approximately 35,000 homes. Irelandโs industrial electricity prices, at an average of 13 c/KWh in 2014, are amongst the most expensive in Europe. Cost effective electricity procurement is ever more pressing for businesses to remain competitive. In parallel, the proliferation of intelligent devices has led to the industrial Internet of Things paradigm becoming mainstream. As more and more devices become equipped with network connectivity, smart metering is fast becoming a means of giving energy users access to a rich array of consumption data. These modern sensor networks have facilitated the development of applications to process, analyse, and react to continuous data streams in real-time. Subsequently, future procurement and consumption decisions can be informed by a highly detailed evaluation of energy usage. With these considerations in mind, this paper uses variable energy prices from Irelandโs Single Electricity Market, along with smart meter sensor data, to simulate the scheduling of an industrial-sized underground pump station in Tara Mines. The objective is to reduce the overall energy costs whilst still functioning within the systemโs operational constraints. An evaluation using real-world electricity prices and detailed sensor data for 2014 demonstrates significant savings of up to 10.72% over the year compared to the existing control systems.
A Tractable Fully Bayesian Method for the Stochastic Block Model
Hayashi, Kohei, Konishi, Takuya, Kawamoto, Tatsuro
The stochastic block model (SBM) is a generative model revealing macroscopic structures in graphs. Bayesian methods are used for (i) cluster assignment inference and (ii) model selection for the number of clusters. In this paper, we study the behavior of Bayesian inference in the SBM in the large sample limit. Combining variational approximation and Laplace's method, a consistent criterion of the fully marginalized log-likelihood is established. Based on that, we derive a tractable algorithm that solves tasks (i) and (ii) concurrently, obviating the need for an outer loop to check all model candidates. Our empirical and theoretical results demonstrate that our method is scalable in computation, accurate in approximation, and concise in model selection.
Boolean Matrix Factorization and Noisy Completion via Message Passing
Ravanbakhsh, Siamak, Poczos, Barnabas, Greiner, Russell
Boolean matrix factorization and Boolean matrix completion from noisy observations are desirable unsupervised data-analysis methods due to their interpretability, but hard to perform due to their NP-hardness. We treat these problems as maximum a posteriori inference problems in a graphical model and present a message passing approach that scales linearly with the number of observations and factors. Our empirical study demonstrates that message passing is able to recover low-rank Boolean matrices, in the boundaries of theoretically possible recovery and compares favorably with state-of-the-art in real-world applications, such collaborative filtering with large-scale Boolean data.
Principal Polynomial Analysis
Laparra, Valero, Jimรฉnez, Sandra, Tuia, Devis, Camps-Valls, Gustau, Malo, Jesรบs
This paper presents a new framework for manifold learning based on a sequence of principal polynomials that capture the possibly nonlinear nature of the data. The proposed Principal Polynomial Analysis (PPA) generalizes PCA by modeling the directions of maximal variance by means of curves, instead of straight lines. Contrarily to previous approaches, PPA reduces to performing simple univariate regressions, which makes it computationally feasible and robust. Moreover, PPA shows a number of interesting analytical properties. First, PPA is a volume-preserving map, which in turn guarantees the existence of the inverse. Second, such an inverse can be obtained in closed form. Invertibility is an important advantage over other learning methods, because it permits to understand the identified features in the input domain where the data has physical meaning. Moreover, it allows to evaluate the performance of dimensionality reduction in sensible (input-domain) units. Volume preservation also allows an easy computation of information theoretic quantities, such as the reduction in multi-information after the transform. Third, the analytical nature of PPA leads to a clear geometrical interpretation of the manifold: it allows the computation of Frenet-Serret frames (local features) and of generalized curvatures at any point of the space. And fourth, the analytical Jacobian allows the computation of the metric induced by the data, thus generalizing the Mahalanobis distance. These properties are demonstrated theoretically and illustrated experimentally. The performance of PPA is evaluated in dimensionality and redundancy reduction, in both synthetic and real datasets from the UCI repository.
Dimensionality Reduction via Regression in Hyperspectral Imagery
Laparra, Valero, Malo, Jesus, Camps-Valls, Gustau
This paper introduces a new unsupervised method for dimensionality reduction via regression (DRR). The algorithm belongs to the family of invertible transforms that generalize Principal Component Analysis (PCA) by using curvilinear instead of linear features. DRR identifies the nonlinear features through multivariate regression to ensure the reduction in redundancy between he PCA coefficients, the reduction of the variance of the scores, and the reduction in the reconstruction error. More importantly, unlike other nonlinear dimensionality reduction methods, the invertibility, volume-preservation, and straightforward out-of-sample extension, makes DRR interpretable and easy to apply. The properties of DRR enable learning a more broader class of data manifolds than the recently proposed Non-linear Principal Components Analysis (NLPCA) and Principal Polynomial Analysis (PPA). We illustrate the performance of the representation in reducing the dimensionality of remote sensing data. In particular, we tackle two common problems: processing very high dimensional spectral information such as in hyperspectral image sounding data, and dealing with spatial-spectral image patches of multispectral images. Both settings pose collinearity and ill-determination problems. Evaluation of the expressive power of the features is assessed in terms of truncation error, estimating atmospheric variables, and surface land cover classification error. Results show that DRR outperforms linear PCA and recently proposed invertible extensions based on neural networks (NLPCA) and univariate regressions (PPA).
Utilisation of Metadata Fields and Query Expansion in Cross-Lingual Search of User-Generated Internet Video
Khwileh, Ahmad, Ganguly, Debasis, J. F. Jones, Gareth
Recent years have seen significant efforts in the area of Cross Language Information Retrieval (CLIR) for text retrieval. This work initially focused on formally published content, but more recently research has begun to concentrate on CLIR for informal social media content. However, despite the current expansion in online multimedia archives, there has been little work on CLIR for this content. While there has been some limited work on Cross-Language Video Retrieval (CLVR) for professional videos, such as documentaries or TV news broadcasts, there has to date, been no significant investigation of CLVR for the rapidly growing archives of informal user generated (UGC) content. Key differences between such UGC and professionally produced content are the nature and structure of the textual UGC metadata associated with it, as well as the form and quality of the content itself. In this setting, retrieval effectiveness may not only suffer from translation errors common to all CLIR tasks, but also recognition errors associated with the automatic speech recognition (ASR) systems used to transcribe the spoken content of the video and with the informality and inconsistency of the associated user-created metadata for each video. This work proposes and evaluates techniques to improve CLIR effectiveness of such noisy UGC content. Our experimental investigation shows that different sources of evidence, e.g. the content from different fields of the structured metadata, significantly affect CLIR effectiveness. Results from our experiments also show that each metadata field has a varying robustness to query expansion (QE) and hence can have a negative impact on the CLIR effectiveness. Our work proposes a novel adaptive QE technique that predicts the most reliable source for expansion and shows how this technique can be effective for improving the CLIR effectiveness for UGC content.
Online Optimization in Dynamic Environments
Hall, Eric C., Willett, Rebecca M.
High-velocity streams of high-dimensional data pose significant "big data" analysis challenges across a range of applications and settings. Online learning and online convex programming play a significant role in the rapid recovery of important or anomalous information from these large datastreams. While recent advances in online learning have led to novel and rapidly converging algorithms, these methods are unable to adapt to nonstationary environments arising in real-world problems. This paper describes a dynamic mirror descent framework which addresses this challenge, yielding low theoretical regret bounds and accurate, adaptive, and computationally efficient algorithms which are applicable to broad classes of problems. The methods are capable of learning and adapting to an underlying and possibly time-varying dynamical model. Empirical results in the context of dynamic texture analysis, solar flare detection, sequential compressed sensing of a dynamic scene, traffic surveillance,tracking self-exciting point processes and network behavior in the Enron email corpus support the core theoretical findings.
Invariant backpropagation: how to train a transformation-invariant neural network
Demyanov, Sergey, Bailey, James, Kotagiri, Ramamohanarao, Leckie, Christopher
In many classification problems a classifier should be robust to small variations in the input vector. This is a desired property not only for particular transformations, such as translation and rotation in image classification problems, but also for all others for which the change is small enough to retain the object perceptually indistinguishable. We propose two extensions of the backpropagation algorithm that train a neural network to be robust to variations in the feature vector. While the first of them enforces robustness of the loss function to all variations, the second method trains the predictions to be robust to a particular variation which changes the loss function the most. The second methods demonstrates better results, but is slightly slower. We analytically compare the proposed algorithm with two the most similar approaches (Tangent BP and Adversarial Training), and propose their fast versions. In the experimental part we perform comparison of all algorithms in terms of classification accuracy and robustness to noise on MNIST and CIFAR-10 datasets. Additionally we analyze how the performance of the proposed algorithm depends on the dataset size and data augmentation.
Infomax strategies for an optimal balance between exploration and exploitation
Reddy, Gautam, Celani, Antonio, Vergassola, Massimo
Proper balance between exploitation and exploration is what makes good decisions, which achieve high rewards like payoff or evolutionary fitness. The Infomax principle postulates that maximization of information directs the function of diverse systems, from living systems to artificial neural networks. While specific applications are successful, the validity of information as a proxy for reward remains unclear. Here, we consider the multi-armed bandit decision problem, which features arms (slot-machines) of unknown probabilities of success and a player trying to maximize cumulative payoff by choosing the sequence of arms to play. We show that an Infomax strategy (Info-p) which optimally gathers information on the highest mean reward among the arms saturates known optimal bounds and compares favorably to existing policies. The highest mean reward considered by Info-p is not the quantity actually needed for the choice of the arm to play, yet it allows for optimal tradeoffs between exploration and exploitation.
Bayesian Optimization for Likelihood-Free Inference of Simulator-Based Statistical Models
Gutmann, Michael U., Corander, Jukka
Our paper deals with inferring simulator-based statistical models given some observed data. A simulator-based model is a parametrized mechanism which specifies how data are generated. It is thus also referred to as generative model. We assume that only a finite number of parameters are of interest and allow the generative process to be very general; it may be a noisy nonlinear dynamical system with an unrestricted number of hidden variables. This weak assumption is useful for devising realistic models but it renders statistical inference very difficult. The main challenge is the intractability of the likelihood function. Several likelihood-free inference methods have been proposed which share the basic idea of identifying the parameters by finding values for which the discrepancy between simulated and observed data is small. A major obstacle to using these methods is their computational cost. The cost is largely due to the need to repeatedly simulate data sets and the lack of knowledge about how the parameters affect the discrepancy. We propose a strategy which combines probabilistic modeling of the discrepancy with optimization to facilitate likelihood-free inference. The strategy is implemented using Bayesian optimization and is shown to accelerate the inference through a reduction in the number of required simulations by several orders of magnitude.