Energy
Online Matrix Completion and Online Robust PCA
This work studies two interrelated problems: online robust PCA (RPCA) and online low-rank matrix completion (MC). In recent work by Candès et al., RPCA has been defined as the problem of separating a low-rank matrix (true data), $L:=[\ell_1, \ell_2, \dots, \ell_{t}, \dots, \ell_{t_{\max}}]$, and a sparse matrix (outliers), $S:=[x_1, x_2, \dots, x_{t}, \dots, x_{t_{\max}}]$, from their sum, $M:=L+S$. Our work uses this definition of RPCA. An important application where both these problems occur is video analytics, where the goal is to separate sparse foregrounds (e.g., moving objects) from slowly changing backgrounds. While there has been a large amount of recent work on both developing and analyzing batch RPCA and batch MC algorithms, the online problem is largely open. In this work, we develop a practical modification of our recently proposed algorithm to solve both the online RPCA and online MC problems. The main contribution of this work is that we obtain correctness results for the proposed algorithms under mild assumptions. The assumptions that we need are: (a) a good estimate of the initial subspace is available (easy to obtain using a short sequence of background-only frames in video surveillance); (b) the $\ell_t$'s obey a `slow subspace change' assumption; (c) the basis vectors for the subspace from which $\ell_t$ is generated are dense (non-sparse); (d) the support of $x_t$ changes by at least a certain amount at least every so often; and (e) algorithm parameters are appropriately set.
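The data model $M:=L+S$ can be made concrete with a toy instance. The sketch below is only an illustration of the definition, not the algorithm from the abstract: it assumes a rank-1 low-rank part spanned by a fixed, dense all-ones vector, and the function name, sizes and value ranges are all hypothetical.

```python
import random


def make_rpca_instance(n=6, t_max=8, outliers_per_column=1, seed=0):
    """Return (L, S, M) as lists of columns, with M = L + S entrywise."""
    rng = random.Random(seed)
    # Assumption: rank-1 subspace spanned by a dense (non-sparse) vector u.
    u = [1.0] * n
    L, S, M = [], [], []
    for _ in range(t_max):
        a_t = rng.uniform(1.0, 2.0)
        ell_t = [a_t * u_i for u_i in u]  # column of the low-rank part
        x_t = [0.0] * n                   # column of the sparse outlier part
        # A fresh random support per column, so the support changes over time.
        for i in rng.sample(range(n), outliers_per_column):
            x_t[i] = rng.uniform(5.0, 10.0)
        L.append(ell_t)
        S.append(x_t)
        M.append([l + x for l, x in zip(ell_t, x_t)])
    return L, S, M


L, S, M = make_rpca_instance()
```

An online method would receive the columns of `M` one at a time and try to recover the matching columns of `L` and `S`; here the subspace never changes, which is the simplest case of the `slow subspace change' assumption.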
GEFCOM 2014 - Probabilistic Electricity Price Forecasting
Barta, Gergo, Borbely, Gyula, Nagy, Gabor, Kazi, Sandor, Henk, Tamas
Energy price forecasting is a relevant yet hard task in the field of multi-step time series forecasting. In this paper we compare a well-known and established method, ARMA with exogenous variables, with a relatively new technique, Gradient Boosting Regression. The method was tested on data from the Global Energy Forecasting Competition 2014 with a year-long rolling window forecast. The results from the experiment reveal that a multi-model approach performs significantly better in terms of error metrics. Gradient Boosting can deal with seasonality and auto-correlation out of the box and achieves a lower normalized mean absolute error on real-world data.
Detectability thresholds and optimal algorithms for community structure in dynamic networks
Ghasemian, Amir, Zhang, Pan, Clauset, Aaron, Moore, Cristopher, Peel, Leto
We study the fundamental limits on learning latent community structure in dynamic networks. Specifically, we study dynamic stochastic block models where nodes change their community membership over time, but where edges are generated independently at each time step. In this setting (which is a special case of several existing models), we are able to derive the detectability threshold exactly, as a function of the rate of change and the strength of the communities. Below this threshold, we claim that no algorithm can identify the communities better than chance. We then give two algorithms that are optimal in the sense that they succeed all the way down to this limit. The first uses belief propagation (BP), which gives asymptotically optimal accuracy, and the second is a fast spectral clustering algorithm, based on linearizing the BP equations. We verify our analytic and algorithmic results via numerical simulation, and close with a brief discussion of extensions and open questions.
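The generative setting described above can be sketched as a small sampler. This is a minimal illustration under stated assumptions, not the authors' code: the persistence probability `eta`, the edge probabilities `p_in` and `p_out`, and the rule that a changing node redraws its label uniformly at random are all hypothetical parameter choices, as is the function name.

```python
import random


def sample_dynamic_sbm(n, k, steps, eta, p_in, p_out, seed=0):
    """Return a list of (labels, edges) snapshots, one per time step."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in range(n)]
    snapshots = []
    for _ in range(steps):
        # Edges are drawn independently at each time step, given the labels.
        edges = set()
        for i in range(n):
            for j in range(i + 1, n):
                p = p_in if labels[i] == labels[j] else p_out
                if rng.random() < p:
                    edges.add((i, j))
        snapshots.append((list(labels), edges))
        # Each node keeps its label with probability eta, otherwise
        # it redraws a label uniformly at random (assumption).
        labels = [g if rng.random() < eta else rng.randrange(k) for g in labels]
    return snapshots


snapshots = sample_dynamic_sbm(n=20, k=2, steps=5, eta=0.9, p_in=0.5, p_out=0.05)
```

In this sketch the rate of change is controlled by `eta` and the strength of the communities by the gap between `p_in` and `p_out`, the two quantities the detectability threshold is said to depend on.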
From Pixels to Torques: Policy Learning with Deep Dynamical Models
Wahlström, Niklas, Schön, Thomas B., Deisenroth, Marc Peter
Data-efficient learning in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. In this paper, we consider one instance of this challenge, the pixels to torques problem, where an agent must learn a closed-loop control policy from pixel information only. We introduce a data-efficient, model-based reinforcement learning algorithm that learns such a closed-loop policy directly from pixel information. The key ingredient is a deep dynamical model that uses deep auto-encoders to learn a low-dimensional embedding of images jointly with a predictive model in this low-dimensional feature space. Joint learning ensures that not only static but also dynamic properties of the data are accounted for. This is crucial for long-term predictions, which lie at the core of the adaptive model predictive control strategy that we use for closed-loop control. Compared to state-of-the-art reinforcement learning methods for continuous states and actions, our approach learns quickly, scales to high-dimensional state spaces and is an important step toward fully autonomous learning from pixels to torques.
Automated Linear Function Submission-based Double Auction as Bottom-up Real-Time Pricing in a Regional Prosumers' Electricity Network
Taniguchi, Tadahiro, Kawasaki, Koki, Fukui, Yoshiro, Takata, Tomohiro, Yano, Shiro
A linear function submission-based double-auction (LFS-DA) mechanism for a regional electricity network is proposed in this paper. Each agent in the network is equipped with a battery and a generator. Each agent simultaneously becomes a producer and consumer of electricity, i.e., a prosumer, and trades electricity in the regional market at a variable price. In the LFS-DA, each agent uses linear demand and supply functions when submitting bids and asks to an auctioneer in the regional market. The LFS-DA can achieve an exact balance between electricity demand and supply for each time slot throughout the learning phase and was shown capable of solving the primal problem of maximizing the social welfare of the network without any central price setter, e.g., a utility or a large electricity company, in contrast with conventional real-time pricing (RTP). This paper presents a clarification of the relationship between the RTP algorithm derived on the basis of a dual decomposition framework and LFS-DA. Specifically, we proved that the changes in the price profile of the LFS-DA mechanism are equal to those achieved by the RTP mechanism derived from the dual decomposition framework except for a constant factor.
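To illustrate how submitted linear functions can give an exact balance in a single time slot, the sketch below clears a market of linear net-demand curves. The parameterization $d_i(p) = a_i - b_i p$ with $b_i > 0$, and the function names, are assumptions made for this example; the abstract does not specify the form of the bids, and the learning phase is not modeled.

```python
def clearing_price(agents):
    """agents: list of (a_i, b_i), net demand d_i(p) = a_i - b_i * p.

    Returns the price at which the net demands sum to zero.
    """
    total_a = sum(a for a, _ in agents)
    total_b = sum(b for _, b in agents)
    if total_b <= 0:
        raise ValueError("slopes b_i must sum to a positive value")
    return total_a / total_b


def trades(agents, price):
    """Net quantity per agent: positive means buying, negative means selling."""
    return [a - b * price for a, b in agents]


# Hypothetical two-agent market: one net buyer, one net seller.
agents = [(10.0, 2.0), (-4.0, 1.0)]
price = clearing_price(agents)      # 2.0
quantities = trades(agents, price)  # [6.0, -6.0]
```

Because the aggregate of linear curves is itself linear, the auctioneer can solve for the balancing price in closed form, so supply and demand match exactly in the slot.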
NEWS | Freshhh 2015 Winners: Focus, Critical Thinking and Hard Work Pays Off
The winning team of MOL Group's Freshhh 2015 competition truly learned the value of hard work. In the competition, students from across the globe compete in technology and business strategy simulations related to the oil and gas industry. The first-place team, "Just ask Siri," was made up of three students from the University of Economics, Prague and Czech Technical University in Prague, Czech Republic. MOL Group experts take part in the game development, so they make sure tasks are aligned with real-life situations, said Zdravka Demeter Bubalo, HR vice president of MOL Group. The top teams are invited to compete in the Live Final and present case studies connected to recent industry trends and issues. This strategy allows MOL Group to find the best global talent to join the company.
Understanding Random Forests: From Theory to Practice
Data analysis and machine learning have become an integral part of the modern scientific methodology, offering automated procedures for the prediction of a phenomenon based on past observations, unraveling underlying patterns in data and providing insights about the problem. Yet, care should be taken not to use machine learning as a black-box tool, but rather to consider it as a methodology, with a rational thought process that is entirely dependent on the problem under study. In particular, the use of algorithms should ideally require a reasonable understanding of their mechanisms, properties and limitations, in order to better apprehend and interpret their results. Accordingly, the goal of this thesis is to provide an in-depth analysis of random forests, consistently calling into question each and every part of the algorithm, in order to shed new light on its learning capabilities, inner workings and interpretability. The first part of this work studies the induction of decision trees and the construction of ensembles of randomized trees, motivating their design and purpose whenever possible. Our contributions follow with an original complexity analysis of random forests, showing their good computational performance and scalability, along with an in-depth discussion of their implementation details, as contributed within Scikit-Learn. In the second part of this work, we analyse and discuss the interpretability of random forests in the eyes of variable importance measures. The core of our contributions rests in the theoretical characterization of the Mean Decrease of Impurity variable importance measure, from which we prove and derive some of its properties in the case of multiway totally randomized trees and in asymptotic conditions. In consequence of this work, our analysis demonstrates that variable importances [...].
Automatic Inference for Inverting Software Simulators via Probabilistic Programming
Saeedi, Ardavan, Firoiu, Vlad, Mansinghka, Vikash
Models of complex systems are often formalized as sequential software simulators: computationally intensive programs that iteratively build up probable system configurations given parameters and initial conditions. These simulators enable modelers to capture effects that are difficult to characterize analytically or summarize statistically. However, in many real-world applications, these simulations need to be inverted to match the observed data. This typically requires the custom design, derivation and implementation of sophisticated inversion algorithms. Here we give a framework for inverting a broad class of complex software simulators via probabilistic programming and automatic inference, using under 20 lines of probabilistic code. Our approach is based on a formulation of inversion as approximate inference in a simple sequential probabilistic model. We implement four inference strategies, including Metropolis-Hastings, a sequentialized Metropolis-Hastings scheme, and a particle Markov chain Monte Carlo scheme, requiring 4 or fewer lines of probabilistic code each. We demonstrate our framework by applying it to invert a real geological software simulator from the oil and gas industry.
NEWS | MOL Group Announces Freshhh 2015 Winners
MOL Group announced yesterday the winners of the Freshhh 2015 competition, which sees students from all over the world compete in technology and business strategy simulations related to the oil and gas industry. 'Just Ask Siri', consisting of three students from the Prague University of Economics and the Czech Technical University, was awarded first place, with Hungary's 'Oil's Creed' and Slovenia's 'Decore' teams placing in second and third respectively. All three teams will now be given the opportunity to join MOL Group's graduate recruitment and development program. MOL Group HR Vice President Zdravka Demeter Bubalo commented in a company statement: "We congratulate the top three teams for winning the Freshhh competition 2015. I would like to thank all participants for their endless efforts during the competition. It is incredible to see how young students work with such difficult real-life cases and always find new solutions. The outstanding results from the participants and number of applications are showing us once more that we are heading in the right direction in order to attract top talents of the oil and gas industry."
Weight Uncertainty in Neural Networks
Blundell, Charles, Cornebise, Julien, Kavukcuoglu, Koray, Wierstra, Daan
We introduce a new, efficient, principled and backpropagation-compatible algorithm for learning a probability distribution on the weights of a neural network, called Bayes by Backprop. It regularises the weights by minimising a compression cost, known as the variational free energy or the expected lower bound on the marginal likelihood. We show that this principled kind of regularisation yields comparable performance to dropout on MNIST classification. We then demonstrate how the learnt uncertainty in the weights can be used to improve generalisation in non-linear regression problems, and how this weight uncertainty can be used to drive the exploration-exploitation trade-off in reinforcement learning.
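The idea of learning a distribution over each weight can be illustrated for a single scalar weight with a Gaussian posterior. The sketch below is a simplified illustration, not the paper's implementation: it assumes the standard deviation is parameterized as $\sigma = \log(1 + e^{\rho})$, it substitutes a plain zero-mean Gaussian prior for whatever prior the authors use, and all function names are hypothetical. It shows only the complexity part of the cost, not the data likelihood or the gradient step.

```python
import math
import random


def softplus(rho):
    """sigma = log(1 + exp(rho)), which is always positive."""
    return math.log1p(math.exp(rho))


def sample_weight(mu, rho, rng):
    """Draw w = mu + sigma * eps with eps ~ N(0, 1)."""
    eps = rng.gauss(0.0, 1.0)
    return mu + softplus(rho) * eps


def log_gaussian(x, mean, std):
    """Log density of N(mean, std^2) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def complexity_sample(mu, rho, prior_std, rng):
    """One-sample estimate of log q(w | mu, rho) - log p(w).

    Assumption: the prior p is N(0, prior_std^2).
    """
    w = sample_weight(mu, rho, rng)
    return log_gaussian(w, mu, softplus(rho)) - log_gaussian(w, 0.0, prior_std)


rng = random.Random(0)
estimate = complexity_sample(mu=0.3, rho=-1.0, prior_std=1.0, rng=rng)
```

Writing the sample as a deterministic function of `mu`, `rho` and independent noise is what lets ordinary backpropagation reach the distribution parameters; averaging `complexity_sample` over many draws estimates the divergence between posterior and prior that acts as the regulariser.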