Maximov, Yury
Cascading Blackout Severity Prediction with Statistically-Augmented Graph Neural Networks
Gorka, Joe, Hsu, Tim, Li, Wenting, Maximov, Yury, Roald, Line
Higher variability in grid conditions, resulting from growing renewable penetration and increased incidence of extreme weather events, has increased the difficulty of screening for scenarios that may lead to catastrophic cascading failures. Traditional power-flow-based tools for assessing cascading blackout risk are too slow to properly explore the space of possible failures and load/generation patterns. We add to the growing literature of faster graph-neural-network (GNN)-based techniques, developing two novel techniques for the estimation of blackout magnitude from initial grid conditions. First, we propose several methods for employing an initial classification step to filter out safe "non-blackout" scenarios prior to magnitude estimation. Second, using insights from the statistical properties of cascading blackouts, we propose a method for facilitating non-local message passing in our GNN models. We validate these two approaches on a large simulated dataset, and show the potential of both to increase blackout size estimation performance.
Long-term drought prediction using deep neural networks based on geospatial weather data
Grabar, Vsevolod, Marusov, Alexander, Maximov, Yury, Sotiriadi, Nazar, Bulkin, Alexander, Zaytsev, Alexey
The importance of monitoring and predicting droughts is underscored by their frequent occurrence in diverse geographical landscapes (Ghozat et al., 2023). Moreover, the likelihood of droughts is expected to increase in the context of global climate change (Xiujia et al., 2022). Their accurate forecasting, however, is a complex problem due to the inherent difficulty in predicting the onset, duration, and cessation of drought events (Mishra and Desai, 2005). This complexity necessitates the development of sophisticated forecasting models that can effectively navigate these challenges. To frame our problem, it is essential to define the prediction target and establish a suitable time horizon for forecasting (Zhang et al., 2019). Given our focus on long-term decision-making, we aim to generate forecasts that extend 12 months into the future. Selecting an appropriate target for drought prediction is more challenging due to its dependence on multiple climatic factors, including temperature and precipitation. Among the various drought severity indices, the Standardized Precipitation Index (SPI) (McKee et al., 1993) and the Palmer Drought Severity Index (PDSI) (Alley, 1984) stand out as fundamental measures.
Climate Change Impact on Agricultural Land Suitability: An Interpretable Machine Learning-Based Eurasia Case Study
Shevchenko, Valeriy, Taniushkina, Daria, Lukashevich, Aleksander, Bulkin, Aleksandr, Grinis, Roland, Kovalev, Kirill, Narozhnaia, Veronika, Sotiriadi, Nazar, Krenke, Alexander, Maximov, Yury
The United Nations has identified improving food security and reducing hunger as essential components of its sustainable development goals. As of 2021, approximately 828 million people worldwide are experiencing hunger and malnutrition, with numerous fatalities reported. Climate change significantly impacts agricultural land suitability, potentially leading to severe food shortages and subsequent social and political conflicts. To address this pressing issue, we have developed a machine learning-based approach to predict the risk of substantial land suitability degradation and changes in irrigation patterns. Our study focuses on Central Eurasia, a region burdened with economic and social challenges. This study represents a pioneering effort in utilizing machine learning methods to assess the impact of climate change on agricultural land suitability under various carbon emissions scenarios. Through comprehensive feature importance analysis, we unveil specific climate and terrain characteristics that exert influence on land suitability. Our approach achieves remarkable accuracy, offering policymakers invaluable insights to facilitate informed decisions aimed at averting a humanitarian crisis, including strategies such as the provision of additional water and fertilizers. This research underscores the tremendous potential of machine learning in addressing global challenges, with a particular emphasis on mitigating hunger and malnutrition.
Self-Training: A Survey
Amini, Massih-Reza, Feofanov, Vasilii, Pauletto, Loic, Hadjadj, Lies, Devijver, Emilie, Maximov, Yury
Semi-supervised algorithms aim to learn prediction functions from a small set of labeled observations and a large set of unlabeled observations. Because this framework is relevant in many applications, these algorithms have received a lot of interest in both academia and industry. Among the existing techniques, self-training methods have undoubtedly attracted greater attention in recent years. These models are designed to find the decision boundary in low-density regions without making additional assumptions about the data distribution, and use the unsigned output score of a learned classifier, or its margin, as an indicator of confidence. The working principle of self-training algorithms is to learn a classifier iteratively by assigning pseudo-labels to the set of unlabeled training samples with a margin greater than a certain threshold. The pseudo-labeled examples are then used to enrich the labeled training data and to train a new classifier in conjunction with the labeled training set. In this paper, we present self-training methods for binary and multi-class classification, as well as their variants and two related approaches, namely consistency-based approaches and transductive learning. We examine the impact of significant self-training features on various methods, using different general and image classification benchmarks, and we discuss our ideas for future research in self-training. To the best of our knowledge, this is the first thorough and complete survey on this subject.
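The iterative pseudo-labeling loop described in this abstract can be sketched as follows. This is only a minimal illustration, not the survey's method: the 1-D nearest-centroid classifier, the function names, and the fixed threshold are all assumptions chosen to keep the example self-contained.

```python
def fit_centroids(points, labels):
    """Toy classifier (an assumption): the mean of each class for 1-D points."""
    means = {}
    for c in (0, 1):
        members = [x for x, y in zip(points, labels) if y == c]
        means[c] = sum(members) / len(members)
    return means


def score(means, x):
    """Signed score: positive favours class 1, negative favours class 0."""
    return abs(x - means[0]) - abs(x - means[1])


def self_train(labeled_x, labeled_y, unlabeled_x, threshold, max_rounds=10):
    """Repeatedly pseudo-label unlabeled points whose margin exceeds threshold."""
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        means = fit_centroids(xs, ys)
        # The unsigned score (margin) is used as the confidence indicator.
        confident = [x for x in pool if abs(score(means, x)) > threshold]
        if not confident:
            break
        for x in confident:
            xs.append(x)
            ys.append(1 if score(means, x) > 0 else 0)
        pool = [x for x in pool if abs(score(means, x)) <= threshold]
    # Retrain on the labeled data enriched with the pseudo-labeled examples.
    return fit_centroids(xs, ys), pool


means, remaining = self_train([0.0, 10.0], [0, 1], [1.0, 9.0, 5.0], threshold=2.0)
```

In this toy run the points 1.0 and 9.0 are pseudo-labeled in the first round, while 5.0 sits on the decision boundary with zero margin and stays unlabeled.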
GP CC-OPF: Gaussian Process based optimization tool for Chance-Constrained Optimal Power Flow
Mitrovic, Mile, Kundacina, Ognjen, Lukashevich, Aleksandr, Vorobev, Petr, Terzija, Vladimir, Maximov, Yury, Deka, Deepjyoti
As an optimization tool, optimal power flow (OPF) is typically used to solve the economic dispatch (ED) problem by finding the lowest-cost output of the controllable generators that meets the load and physical constraints of the grid. However, the OPF is a complex non-linear problem with many constraints that can be hard to solve. In addition, the rapid integration of renewable energy resources (RES) with intermittent outputs propagates uncertainty through the grid and thus leads to a higher degree of complexity in power grid operations. To take into account the impacts of uncertainty within the OPF, researchers have recently proposed several stochastic approaches such as robust optimization [1], probabilistic OPF [2], and Chance-Constrained (CC) OPF [3, 4]. Robust optimization often leads to conservative solutions, while probabilistic OPF is difficult to implement in practice. The CC-OPF requires the constraints to hold with a prescribed probability, that is, up to a given acceptable violation probability, and in that way balances operating costs and security in the power grid.
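For orientation, a generic chance-constrained program of the kind the abstract describes can be written as below. The symbols are assumptions introduced here for illustration and are not taken from the paper: $x$ is the controllable generation, $\xi$ the uncertain renewable input, $c$ the operating cost, $g_i$ the $i$-th grid constraint, and $\epsilon_i$ its acceptable violation probability.

$$\min_{x} \; c(x) \quad \text{s.t.} \quad \mathbb{P}_{\xi}\bigl[\, g_i(x, \xi) \le 0 \,\bigr] \ge 1 - \epsilon_i, \qquad i = 1, \dots, m$$

Setting every $\epsilon_i$ to zero recovers a robust formulation that must hold for all realizations of $\xi$, which is one way to see why robust solutions tend to be more conservative.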
Data-Driven Stochastic AC-OPF using Gaussian Processes
Mitrovic, Mile, Lukashevich, Aleksandr, Vorobev, Petr, Terzija, Vladimir, Budenny, Semen, Maximov, Yury, Deka, Deepjyoti
In recent years, electricity generation has been responsible for more than a quarter of the greenhouse gas emissions in the US. Integrating a significant amount of renewables into a power grid is probably the most accessible way to reduce carbon emissions from power grids and slow down climate change. Unfortunately, the most accessible renewable power sources, such as wind and solar, are highly fluctuating and thus bring a lot of uncertainty to power grid operations and challenge existing optimization and control policies. The chance-constrained alternating current (AC) optimal power flow (OPF) framework finds the minimum cost generation dispatch maintaining the power grid operations within security limits with a prescribed probability. Unfortunately, the AC-OPF problem's chance-constrained extension is non-convex, computationally challenging, and requires knowledge of system parameters and additional assumptions on the behavior of renewable distribution. Known linear and convex approximations to the above problems, though tractable, are too conservative for operational practice and do not consider uncertainty in system parameters. This paper presents an alternative data-driven approach based on Gaussian process (GP) regression to close this gap. The GP approach learns a simple yet non-convex data-driven approximation to the AC power flow equations that can incorporate uncertainty inputs. The latter is then used to determine the solution of CC-OPF efficiently, by accounting for both input and parameter uncertainty. The practical efficiency of the proposed approach using different approximations for GP-uncertainty propagation is illustrated over numerous IEEE test cases.
Tractable Minor-free Generalization of Planar Zero-field Ising Models
Likhosherstov, Valerii, Maximov, Yury, Chertkov, Michael
We present a new family of zero-field Ising models over $N$ binary variables/spins obtained by consecutive "gluing" of planar and $O(1)$-sized components and subsets of at most three vertices into a tree. The polynomial-time algorithm of the dynamic programming type for solving exact inference (computing partition function) and exact sampling (generating i.i.d. samples) consists of a sequential application of an efficient (for planar) or brute-force (for $O(1)$-sized) inference and sampling to the components as a black box. To illustrate the utility of the new family of tractable graphical models, we first build a polynomial algorithm for inference and sampling of zero-field Ising models over $K_{3,3}$-minor-free topologies and over $K_{5}$-minor-free topologies -- both are extensions of the planar zero-field Ising models -- which are neither genus- nor treewidth-bounded. Second, we demonstrate empirically an improvement in the approximation quality of the NP-hard problem of inference over the square-grid Ising model in a node-dependent non-zero "magnetic" field.
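The brute-force inference step for an $O(1)$-sized component can be sketched as a direct sum over all spin configurations. This is only an illustration under an assumed convention, $Z = \sum_{s \in \{-1,+1\}^n} \exp\bigl(\sum_{(a,b)} J_{ab} s_a s_b\bigr)$; the function name and the dictionary encoding of couplings are hypothetical, and the paper's sign and temperature conventions may differ.

```python
import itertools
import math


def partition_function(n, couplings):
    """Brute-force Z of a zero-field Ising model on n spins.

    couplings maps an edge (a, b) to its pairwise interaction J_ab.
    The cost is 2**n terms, so this is only viable for constant-size components.
    """
    z = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        # Zero field: only pairwise terms contribute to the exponent.
        exponent = sum(j * spins[a] * spins[b] for (a, b), j in couplings.items())
        z += math.exp(exponent)
    return z


# Two spins joined by one edge: Z = 2*exp(J) + 2*exp(-J) = 4*cosh(J).
z_edge = partition_function(2, {(0, 1): 0.7})

# A three-spin chain (a tree) factorizes: Z = 2 * (2*cosh(J1)) * (2*cosh(J2)).
z_chain = partition_function(3, {(0, 1): 0.3, (1, 2): -0.5})
```

The two closed forms in the comments give a quick sanity check on the enumeration.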
Sequential Learning over Implicit Feedback for Robust Large-Scale Recommender Systems
Burashnikova, Alexandra, Maximov, Yury, Amini, Massih-Reza
In this paper, we propose a robust sequential learning strategy for training large-scale Recommender Systems (RS) over implicit feedback, mainly in the form of clicks. Our approach relies on the minimization of a pairwise ranking loss over blocks of consecutive items constituted by a sequence of non-clicked items followed by a clicked one for each user. Parameter updates are discarded if, for a given user, the number of sequential blocks is below or above some given thresholds estimated over the distribution of the number of blocks in the training set. This guards against both an abnormal number of clicks on some targeted items, mainly due to bots, and users with very few interactions. Both scenarios affect the decision of RS and imply a shift over the distribution of items that are shown to the users. We provide a theoretical analysis showing that in the case where the ranking loss is convex, the deviation between the loss with respect to the sequence of weights found by the proposed algorithm and its minimum is bounded. Furthermore, experimental results on five large-scale collections demonstrate the efficiency of the proposed algorithm with respect to the state-of-the-art approaches, both regarding different ranking measures and computation time.
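The block construction and the threshold filter described in this abstract can be sketched roughly as follows. All names are hypothetical, the logistic surrogate is just one common convex pairwise ranking loss and is not necessarily the one used in the paper, and the handling of clicks with no preceding non-clicked items (skipped here) is an assumption.

```python
import math


def split_into_blocks(interactions):
    """Split one user's (item, clicked) sequence into blocks.

    Each block is (non_clicked_items, clicked_item). Clicks with no preceding
    non-clicked items and trailing non-clicked items are dropped (assumption).
    """
    blocks, negatives = [], []
    for item, clicked in interactions:
        if clicked:
            if negatives:
                blocks.append((negatives, item))
                negatives = []
        else:
            negatives.append(item)
    return blocks


def keeps_updates(num_blocks, lower, upper):
    """Updates for a user are kept only if the block count lies within thresholds."""
    return lower <= num_blocks <= upper


def block_loss(scores, block):
    """Logistic pairwise ranking loss of the clicked item against each non-clicked one."""
    negatives, positive = block
    return sum(math.log1p(math.exp(-(scores[positive] - scores[n]))) for n in negatives)


history = [("a", False), ("b", False), ("c", True), ("d", True),
           ("e", False), ("f", True), ("g", False)]
blocks = split_into_blocks(history)
```

For this history the result is two blocks, `(["a", "b"], "c")` and `(["e"], "f")`; a user whose block count falls outside `[lower, upper]` would contribute no parameter updates.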
Learning a Generator Model from Terminal Bus Data
Stulov, Nikolay, Sobajic, Dejan J, Maximov, Yury, Deka, Deepjyoti, Chertkov, Michael
In this work we investigate approaches to reconstruct generator models from measurements available at the generator terminal bus using machine learning (ML) techniques. The goal is to develop an emulator which is trained online and is capable of fast predictive computations. The training is illustrated on synthetic data generated based on an available open-source dynamical generator model. Two ML techniques were developed and tested: (a) a standard vector auto-regressive (VAR) model; and (b) a novel customized long short-term memory (LSTM) deep learning model. Tradeoffs in reconstruction ability between the computationally light but linear AR model and the powerful but computationally demanding LSTM model are established and analyzed.
Inference and Sampling of $K_{33}$-free Ising Models
Likhosherstov, Valerii, Maximov, Yury, Chertkov, Michael
We call an Ising model tractable when it is possible to compute its partition function value (statistical inference) in polynomial time. The tractability also implies an ability to sample configurations of this model in polynomial time. The notion of tractability extends the basic case of planar zero-field Ising models. Our starting point is to describe algorithms that compute the partition function and sample efficiently in the basic case. To derive the algorithms, we use an equivalent linear transition to perfect matching counting and sampling on an expanded dual graph. Then, we extend our tractable inference and sampling algorithms to models whose triconnected components are either planar or graphs of $O(1)$ size. In particular, this yields polynomial-time inference and sampling algorithms for $K_{33}$ (minor) free topologies of zero-field Ising models, a generalization of planar graphs with a potentially unbounded genus.