Energy
AutoOED: Automated Optimal Experiment Design Platform
Tian, Yunsheng, Luković, Mina Konaković, Erps, Timothy, Foshey, Michael, Matusik, Wojciech
We present AutoOED, an Optimal Experiment Design platform powered by automated machine learning to accelerate the discovery of optimal solutions. The platform solves multi-objective optimization problems in a time- and data-efficient manner by automatically guiding the design of experiments to be evaluated. To automate the optimization process, we implement several multi-objective Bayesian optimization algorithms with state-of-the-art performance. AutoOED is open-source and written in Python. The codebase is modular, which makes it easy to extend and tailor, and it serves as a testbed for machine learning researchers to develop and evaluate their own multi-objective Bayesian optimization algorithms. An intuitive graphical user interface (GUI) is provided to visualize and guide the experiments for users with little or no experience with coding, machine learning, or optimization. Furthermore, a distributed system is integrated to enable parallelized experimental evaluations by independent workers in remote locations. The platform is available at https://autooed.org.
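As a minimal illustration of the multi-objective setting described in this abstract, the sketch below filters a set of already-evaluated designs down to its Pareto front. It assumes every objective is minimized; the function names and the sample data are hypothetical and are not taken from the AutoOED codebase.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (quadratic-time scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective values of five evaluated designs (lower is better).
evaluated = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(evaluated)
print(front)  # [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
```

The designs (3.0, 4.0) and (2.5, 3.5) are dropped because (2.0, 3.0) is better in both objectives. A multi-objective Bayesian optimizer would use a front like this to decide which experiment to propose next.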
Distilling Wikipedia mathematical knowledge into neural network models
Kim, Joanne T., Larma, Mikel Landajuela, Petersen, Brenden K.
Machine learning applications to symbolic mathematics are becoming increasingly popular, yet the field lacks a centralized source of real-world symbolic expressions to be used as training data. In contrast, the field of natural language processing leverages resources like Wikipedia that provide enormous amounts of real-world textual data. Adopting the philosophy of "mathematics as language," we bridge this gap by introducing a pipeline for distilling mathematical expressions embedded in Wikipedia into symbolic encodings to be used in downstream machine learning tasks. We demonstrate that a mathematical language model trained on this "corpus" of expressions can be used as a prior to improve the performance of neural-guided search for the task of symbolic regression. "The basis of all human culture is language, and mathematics is a special kind of linguistic activity."
StylePTB: A Compositional Benchmark for Fine-grained Controllable Text Style Transfer
Lyu, Yiwei, Liang, Paul Pu, Pham, Hai, Hovy, Eduard, Póczos, Barnabás, Salakhutdinov, Ruslan, Morency, Louis-Philippe
Text style transfer aims to controllably generate text with targeted stylistic changes while keeping the core meaning of the source sentence constant. Many existing style transfer benchmarks focus primarily on individual high-level semantic changes (e.g. positive to negative), which enable controllability at a high level but do not offer fine-grained control over sentence structure, emphasis, and content. In this paper, we introduce a large-scale benchmark, StylePTB, with (1) paired sentences undergoing 21 fine-grained stylistic changes spanning atomic lexical, syntactic, semantic, and thematic transfers of text, as well as (2) compositions of multiple transfers which allow modeling of fine-grained stylistic changes as building blocks for more complex, high-level transfers. By benchmarking existing methods on StylePTB, we find that they struggle to model fine-grained changes and have an even more difficult time composing multiple styles. As a result, StylePTB brings novel challenges that we hope will encourage future research in controllable text style transfer, compositional models, and learning disentangled representations. Solving these challenges would present important steps towards controllable text generation.
Revisiting Bayesian Autoencoders with MCMC
Chandra, Rohitash, Jain, Mahir, Maharana, Manavendra, Krivitsky, Pavel N.
Autoencoders are a family of unsupervised learning methods that use neural network architectures and learning algorithms to learn a lower-dimensional representation (encoding) of the data, which can then be used to reconstruct a representation close to the original input. They thus facilitate dimensionality reduction for prediction and classification [1, 2], and have been successfully applied to image classification [3, 4], face recognition [5, 6], geoscience and remote sensing [7], speech-based [...] Bayes' theorem is used as foundation for inference in Bayesian neural networks, and Markov chain Monte Carlo (MCMC) sampling methods [25] are used for constructing the posterior distribution. Variational inference [26] is another way to approximate the posterior distribution, which approximates an intractable posterior distribution by a tractable one. This makes it particularly suited to large data sets and models, and so it has been popular for autoencoders and neural networks [13, 27].
Bi-level Off-policy Reinforcement Learning for Volt/VAR Control Involving Continuous and Discrete Devices
In Volt/VAR control (VVC) of active distribution networks (ADNs), both slow timescale discrete devices (STDDs) and fast timescale continuous devices (FTCDs) are involved. STDDs such as on-load tap changers (OLTCs) and FTCDs such as distributed generators should be coordinated in time sequence. Such VVC is formulated as a two-timescale optimization problem to jointly optimize FTCDs and STDDs in ADNs. Traditional optimization methods rely heavily on accurate models of the system, but are sometimes impractical because of the unaffordable modelling effort they require. In this paper, a novel bi-level off-policy reinforcement learning (RL) algorithm is proposed to solve this problem in a model-free manner. A bi-level Markov decision process (BMDP) is defined to describe the two-timescale VVC problem, and separate agents are set up for the slow and fast timescale sub-problems. For the fast timescale sub-problem, we adopt soft actor-critic, an off-policy RL method with high sample efficiency. For the slow one, we develop an off-policy multi-discrete soft actor-critic (MDSAC) algorithm to address the curse of dimensionality with various STDDs. To mitigate the non-stationarity in the two agents' learning processes, we propose a multi-timescale off-policy correction (MTOPC) method based on importance sampling. Comprehensive numerical studies demonstrate that the proposed method achieves stable and satisfactory optimization of both STDDs and FTCDs without any model information, and that it outperforms existing two-timescale VVC methods.
Boltzmann Tuning of Generative Models
Berger, Victor, Sebag, Michele
The paper focuses on the a posteriori tuning of a generative model in order to favor the generation of good instances in the sense of some external differentiable criterion. The proposed approach, called Boltzmann Tuning of Generative Models (BTGM), applies to a wide range of applications. It covers conditional generative modelling as a particular case, and offers an affordable alternative to rejection sampling. The contribution of the paper is twofold. Firstly, the objective is formalized and tackled as a well-posed optimization problem; a practical methodology is proposed to choose among the candidate criteria representing the same goal, the one best suited to efficiently learn a tuned generative model. Secondly, the merits of the approach are demonstrated on a real-world application, in the context of robust design for energy policies, showing the ability of BTGM to sample the extreme regions of the considered criteria.
Uncover Residential Energy Consumption Patterns Using Socioeconomic and Smart Meter Data
Tang, Wenjun, Wang, Hao, Lee, Xian-Long, Yang, Hong-Tzer
This paper models residential consumers' energy-consumption behavior by load patterns and distributions and reveals the relationship between consumers' load patterns and socioeconomic features by machine learning. We analyze the real-world smart meter data and extract load patterns using K-Medoids clustering, which is robust to outliers. We develop an analytical framework with feature selection and deep learning models to estimate the relationship between load patterns and socioeconomic features. Specifically, we use an entropy-based feature selection method to identify the critical socioeconomic characteristics that affect load patterns and benefit our method's interpretability. We further develop a customized deep neural network model to characterize the relationship between consumers' load patterns and selected socioeconomic features. Numerical studies validate our proposed framework using Pecan Street smart meter data and survey. We demonstrate that our framework can capture the relationship between load patterns and socioeconomic information and outperform benchmarks such as regression and single DNN models.
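To make the clustering step of this abstract concrete, the following is a minimal K-Medoids sketch. It uses the simple alternating (assign, then update) variant with an L1 distance and a deterministic initialization; the abstract does not say which K-Medoids variant, distance, or initialization the authors used, so all of these are assumptions, and the toy load profiles are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, ...]


def l1(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan distance between two load profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(points: List[Point], medoids: List[int]) -> List[int]:
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda j: l1(p, points[medoids[j]]))
            for p in points]


def k_medoids(points: List[Point], k: int, max_iter: int = 100):
    """Alternating K-Medoids. Returns (medoid indices, cluster labels)."""
    medoids = list(range(k))  # assumption: start from the first k points
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:  # keep the old medoid if a cluster empties out
                new_medoids.append(medoids[j])
                continue
            # The new medoid is the member with the smallest total distance
            # to the other members; it is always an actual data point.
            new_medoids.append(min(
                members,
                key=lambda i: sum(l1(points[i], points[m]) for m in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


# Hypothetical two-slot load profiles; the last one is an extreme outlier.
profiles = [(10, 12), (9, 10), (12, 12), (50, 52), (51, 49), (48, 50), (500, 500)]
medoids, labels = k_medoids(profiles, k=2)
print([profiles[i] for i in medoids], labels)
```

On this toy data the high-load cluster keeps (50, 52) as its medoid even though it contains the outlier (500, 500), whereas the cluster mean would be pulled to roughly (162, 163). This is the robustness to outliers that the abstract refers to.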
Fast Design Space Exploration of Nonlinear Systems: Part I
Narain, Sanjai, Mak, Emily, Chee, Dana, Englot, Brendan, Pochiraju, Kishore, Jha, Niraj K., Narayan, Karthik
System design tools are often only available as blackboxes with complex nonlinear relationships between inputs and outputs. Blackboxes typically run in the forward direction: for a given design as input they compute an output representing system behavior. Most cannot be run in reverse to produce an input from requirements on output. Thus, finding a design satisfying a requirement is often a trial-and-error process without assurance of optimality. Finding designs concurrently satisfying multiple requirements is harder because designs satisfying individual requirements may conflict with each other. Compounding the difficulty, blackbox evaluations can be expensive and sometimes fail to produce an output due to non-convergence of underlying numerical algorithms. This paper presents CNMA (Constrained optimization with Neural networks, MILP solvers and Active Learning), a new optimization method for blackboxes. It is conservative in the number of blackbox evaluations. Any designs it finds are guaranteed to satisfy all requirements. It is resilient to the failure of blackboxes to compute outputs. It tries to sample only the part of the design space relevant to solving the design problem, leveraging the power of neural networks, MILPs, and a new learning-from-failure feedback loop. The paper also presents parallel CNMA, which improves the efficiency and quality of solutions over the sequential version and tries to steer it away from local optima. CNMA's performance is evaluated for seven nonlinear design problems of 8 (2 problems), 10, 15, 36 and 60 real-valued dimensions and one with 186 binary dimensions. It is shown that CNMA improves on the performance of stable, off-the-shelf implementations of Bayesian Optimization, Nelder-Mead, and Random Search by 1%-87% for a given fixed time and function evaluation budget. Note that these implementations did not always return solutions.
SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds with 1000x Fewer Labels
Hu, Qingyong, Yang, Bo, Fang, Guangchi, Guo, Yulan, Leonardis, Ales, Trigoni, Niki, Markham, Andrew
We study the problem of labelling effort for semantic segmentation of large-scale 3D point clouds. Existing works usually rely on densely annotated point-level semantic labels to provide supervision for network training. However, in real-world scenarios that contain billions of points, it is impractical and extremely costly to manually annotate every single point. In this paper, we first investigate whether dense 3D labels are truly required for learning meaningful semantic representations. Interestingly, we find that the segmentation performance of existing works only drops slightly given as few as 1% of the annotations. However, beyond this point (e.g. 1 per thousand and below) existing techniques fail catastrophically. To this end, we propose a new weak supervision method to implicitly augment the total amount of available supervision signals, by leveraging the semantic similarity between neighboring points. Extensive experiments demonstrate that the proposed Semantic Query Network (SQN) achieves state-of-the-art performance on six large-scale open datasets under weak supervision schemes, while requiring only 1000x fewer labeled points for training. The code is available at https://github.com/QingyongHu/SQN.
What Makes an Effective Scalarising Function for Multi-Objective Bayesian Optimisation?
Stock-Williams, Clym, Chugh, Tinkle, Rahat, Alma, Yu, Wei
Performing multi-objective Bayesian optimisation by scalarising the objectives avoids the computation of expensive multi-dimensional integral-based acquisition functions, instead allowing one-dimensional standard acquisition functions, such as Expected Improvement, to be applied. Here, two infill criteria based on hypervolume improvement, one recently introduced and one novel, are compared with the multi-surrogate Expected Hypervolume Improvement. The reasons for the disparities in these methods' effectiveness in maximising the hypervolume of the acquired Pareto Front are investigated. In addition, the effect of the surrogate model mean function on exploration and exploitation is examined: careful choice of data normalisation is shown to be preferable to the exploration parameter commonly used with the Expected Improvement acquisition function. Finally, the effectiveness of all the methodological improvements defined here is demonstrated on a real-world problem: the optimisation of a wind turbine blade aerofoil for both aerodynamic performance and structural stiffness. With effective scalarisation, Bayesian optimisation finds a large number of new aerofoil shapes that strongly dominate standard designs.
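For readers unfamiliar with the Expected Improvement acquisition function and the exploration parameter this abstract mentions, here is a minimal sketch of the standard closed form under a Gaussian surrogate prediction. It assumes a scalarised objective that is minimised; the function name and the parameter name `xi` are illustrative choices, not taken from the paper.

```python
import math


def expected_improvement(mu: float, sigma: float, best: float, xi: float = 0.0) -> float:
    """Closed-form Expected Improvement for minimisation.

    mu, sigma: surrogate posterior mean and standard deviation at a candidate.
    best:      lowest objective value observed so far.
    xi:        exploration parameter; larger values demand a bigger improvement.
    """
    improvement = best - mu - xi
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return improvement * cdf + sigma * pdf


# Same predicted mean, different uncertainty: the more uncertain candidate
# receives the larger expected improvement.
print(expected_improvement(mu=1.0, sigma=0.1, best=1.0))
print(expected_improvement(mu=1.0, sigma=1.0, best=1.0))
```

Because the formula depends on the scale of `best - mu` relative to `sigma`, how the objective data are normalised changes the balance between exploration and exploitation, which is the effect the abstract examines.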