Distributed electricity producers, such as small wind farms and solar installations, pose several technical and economic challenges in Smart Grid design. One approach to addressing these challenges is through Broker Agents who buy electricity from distributed producers, and also sell electricity to consumers, via a Tariff Market--a new market mechanism where Broker Agents publish concurrent bid and ask prices. We investigate the learning of pricing strategies for an autonomous Broker Agent to profitably participate in a Tariff Market. We employ Markov Decision Processes (MDPs) and reinforcement learning. An important concern with this method is that even simple representations of the problem domain result in very large numbers of states in the MDP formulation because market prices can take nearly arbitrary real values. In this paper, we present the use of derived state space features, computed using statistics on Tariff Market prices and Broker Agent customer portfolios, to obtain a scalable state representation. We also contribute a set of pricing tactics that form building blocks in the learned Broker Agent strategy. We further present a Tariff Market simulation model based on real-world data and anticipated market dynamics. We use this model to obtain experimental results that show the learned strategy performing vastly better than a random strategy and significantly better than two other non-learning strategies.
Sustainable resource management in many domains presents large continuous stochastic optimization problems, which can often be modeled as Markov decision processes (MDPs). To solve such large MDPs, we identify and leverage linearity in state and action sets that is common in resource management. In particular, we introduce linear dynamic programs (LDPs) that generalize resource management problems and partially observable MDPs (POMDPs). We show that the LDP framework makes it possible to adapt point-based methods--the state of the art in solving POMDPs--to solving LDPs. The experimental results demonstrate the efficiency of this approach in managing the water level of a river reservoir. Finally, we discuss the relationship with dual dynamic programming, a method used to optimize hydroelectric systems.
Flower pollination algorithm is a recent metaheuristic algorithm for solving nonlinear global optimization problems. The algorithm has also been extended to solve multiobjective optimization with promising results. In this work, we analyze this algorithm mathematically and prove its convergence properties by using Markov chain theory. By constructing the appropriate transition probability for a population of flower pollen and using the homogeneity property, it can be shown that the constructed stochastic sequences can converge to the optimal set. Under the two proper conditions for convergence, it is proved that the simplified flower pollination algorithm can indeed satisfy these convergence conditions and thus the global convergence of this algorithm can be guaranteed. Numerical experiments are used to demonstrate that the flower pollination algorithm can converge quickly in practice and can thus achieve global optimality efficiently.
External factors are hard to model using a Markovian state in several real-world planning domains. Although planning can be difficult in such domains, it may be possible to exploit long-term dependencies between states of the environment during planning. We introduce weighted state scenarios to model long-term sequences of states, and we use a model based on a Partially Observable Markov Decision Process to reason about scenarios during planning. Experiments show that our model outperforms other methods for decision making in two real-world domains.
Accurate short-term wind forecasts (STWFs), with time horizons from 0.5 to 6 hours, are essential for efficient integration of wind power to the electrical power grid. Physical models based on numerical weather predictions are currently not competitive, and research on machine learning approaches is ongoing. Two major challenges confronting these efforts are missing observations and weather-regime induced dependency shifts among wind variables at geographically distributed sites. In this paper we introduce approaches that address both of these challenges. We describe a new regime-aware approach to STWF that use auto-regressive hidden Markov models (AR-HMM), a subclass of conditional linear Gaussian (CLG) models. Although AR-HMMs are a natural representation for weather regimes, as with CLG models in general, exact inference is NP-hard when observations are missing (Lerner and Parr, 2001). Because of this high cost, we introduce a simple approximate inference method for AR-HMMs, which we believe has applications to other sequential and temporal problem domains that involve continuous variables. In an empirical evaluation on publicly available wind data from two geographically distinct regions, our approach makes significantly more accurate predictions than baseline models, and uncovers meteorologically relevant regimes.