AITopics

2104.05959

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.16)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.47)
Energy (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Kim, Joanne T., Larma, Mikel Landajuela, Petersen, Brenden K.

Distilling Wikipedia mathematical knowledge into neural network models

arXiv.org Artificial Intelligence Apr-13-2021

Machine learning applications to symbolic mathematics are becoming increasingly popular, yet there is no centralized source of real-world symbolic expressions to be used as training data. In contrast, the field of natural language processing leverages resources like Wikipedia that provide enormous amounts of real-world textual data. Adopting the philosophy of "mathematics as language," we bridge this gap by introducing a pipeline for distilling mathematical expressions embedded in Wikipedia into symbolic encodings to be used in downstream machine learning tasks. We demonstrate that a mathematical language model trained on this "corpus" of expressions can be used as a prior to improve the performance of neural-guided search for the task of symbolic regression. "The basis of all human culture is language, and mathematics is a special kind of linguistic activity."

expression, mathematical expression, symbolic regression, (13 more...)

2104.0593

Country:

North America > United States > California > Alameda County > Livermore (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (0.50)

Industry:

Energy (0.48)
Government (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

StylePTB: A Compositional Benchmark for Fine-grained Controllable Text Style Transfer

Lyu, Yiwei, Liang, Paul Pu, Pham, Hai, Hovy, Eduard, Póczos, Barnabás, Salakhutdinov, Ruslan, Morency, Louis-Philippe

Text style transfer aims to controllably generate text with targeted stylistic changes while keeping the core meaning of the source sentence constant. Many existing style transfer benchmarks focus primarily on individual high-level semantic changes (e.g. positive to negative), which enable controllability at a high level but do not offer fine-grained control over sentence structure, emphasis, and content. In this paper, we introduce a large-scale benchmark, StylePTB, with (1) paired sentences undergoing 21 fine-grained stylistic changes spanning atomic lexical, syntactic, semantic, and thematic transfers of text, as well as (2) compositions of multiple transfers which allow modeling of fine-grained stylistic changes as building blocks for more complex, high-level transfers. By benchmarking existing methods on StylePTB, we find that they struggle to model fine-grained changes and have an even more difficult time composing multiple styles. As a result, StylePTB brings novel challenges that we hope will encourage future research in controllable text style transfer, compositional models, and learning disentangled representations. Solving these challenges would present important steps towards controllable text generation.

health & medicine, style transfer, text processing, (20 more...)

2104.05196

Country: North America > United States (0.67)

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.93)
Energy > Oil & Gas (0.67)
Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Chandra, Rohitash, Jain, Mahir, Maharana, Manavendra, Krivitsky, Pavel N.

Revisiting Bayesian Autoencoders with MCMC

Autoencoders are a family of unsupervised learning methods that use neural network architectures and learning algorithms to learn a lower-dimensional representation (encoding) of the data, which can then be used to reconstruct a representation close to the original input. They thus facilitate dimensionality reduction for prediction and classification [1, 2], and have been successfully applied to image classification [3, 4], face recognition [5, 6], geoscience and remote sensing [7], speech-based ...

Bayes' theorem is used as foundation for inference in Bayesian neural networks, and Markov chain Monte Carlo (MCMC) sampling methods [25] are used for constructing the posterior distribution. Variational inference [26] is another way to approximate the posterior distribution, which approximates an intractable posterior distribution by a tractable one. This makes it particularly suited to large data sets and models, and so it has been popular for autoencoders and neural networks [13, 27].
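To make the idea of constructing a posterior with MCMC concrete, here is a minimal sketch of random-walk Metropolis sampling. It is not the method of the paper: the model (a Gaussian mean with a standard normal prior and unit noise), the toy data, the step size, and all function names are illustrative assumptions.

```python
import math
import random

def log_posterior(mu, data, prior_std=1.0, noise_std=1.0):
    """Unnormalised log posterior: Gaussian prior on mu plus Gaussian likelihood."""
    log_prior = -0.5 * (mu / prior_std) ** 2
    log_lik = sum(-0.5 * ((x - mu) / noise_std) ** 2 for x in data)
    return log_prior + log_lik

def metropolis(data, n_samples=20000, step=0.5, seed=0):
    """Random-walk Metropolis: propose mu' = mu + N(0, step^2) and accept
    with probability min(1, p(mu' | data) / p(mu | data))."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_samples):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Acceptance probability computed from the log ratio, capped at 1.
        accept_prob = math.exp(min(0.0, candidate - current))
        if rng.random() < accept_prob:
            mu, current = proposal, candidate
        samples.append(mu)
    return samples

data = [1.2, 0.8, 1.5, 0.9, 1.1]   # hypothetical observations
samples = metropolis(data)
burned = samples[2000:]            # discard burn-in
posterior_mean = sum(burned) / len(burned)
# For this conjugate toy model the exact posterior mean is sum(data) / (n + 1).
analytic_mean = sum(data) / (len(data) + 1)
```

Because the toy model is conjugate, the sampled mean can be compared against the closed-form value; for a Bayesian neural network no such closed form exists, which is why sampling or variational approximations are used.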

autoencoder, dataset, neural network, (14 more...)

2104.05915

Country:

Oceania > Australia > New South Wales > Kensington (0.04)
Asia > Middle East > Republic of Türkiye > Ankara Province > Ankara (0.04)
Asia > Japan (0.04)
Asia > India (0.04)

Genre:

Overview (0.67)
Research Report (0.64)

Industry: Energy (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)

Liu, Haotian, Wu, Wenchuan

Bi-level Off-policy Reinforcement Learning for Volt/VAR Control Involving Continuous and Discrete Devices

In Volt/Var control (VVC) of active distribution networks (ADNs), both slow timescale discrete devices (STDDs) and fast timescale continuous devices (FTCDs) are involved. The STDDs such as on-load tap changers (OLTC) and FTCDs such as distributed generators should be coordinated in time sequence. Such VVC is formulated as a two-timescale optimization problem to jointly optimize FTCDs and STDDs in ADNs. Traditional optimization methods rely heavily on accurate models of the system, but they are sometimes impractical because of the prohibitive modelling effort. In this paper, a novel bi-level off-policy reinforcement learning (RL) algorithm is proposed to solve this problem in a model-free manner. A Bi-level Markov decision process (BMDP) is defined to describe the two-timescale VVC problem and separate agents are set up for the slow and fast timescale sub-problems. For the fast timescale sub-problem, we adopt soft actor-critic, an off-policy RL method with high sample efficiency. For the slow one, we develop an off-policy multi-discrete soft actor-critic (MDSAC) algorithm to address the curse of dimensionality with various STDDs. To mitigate the non-stationarity in the two agents' learning processes, we propose a multi-timescale off-policy correction (MTOPC) method based on importance sampling. Comprehensive numerical studies not only demonstrate that the proposed method can achieve stable and satisfactory optimization of both STDDs and FTCDs without any model information, but also support that the proposed method outperforms existing two-timescale VVC methods.
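The abstract does not spell out MTOPC itself, so the sketch below only illustrates the generic importance sampling idea it builds on: rewards logged under one (behaviour) policy are reweighted by the probability ratio to estimate the expected reward under another (target) policy. The action names, probabilities, and rewards are hypothetical.

```python
def importance_sampling_estimate(logged, target, behaviour):
    """Estimate the expected reward under `target` from (action, reward) pairs
    collected under `behaviour`, weighting each sample by target/behaviour."""
    total = 0.0
    for action, reward in logged:
        weight = target[action] / behaviour[action]
        total += weight * reward
    return total / len(logged)

# Hypothetical two-action example for a discrete device.
behaviour = {"tap_up": 0.5, "tap_down": 0.5}   # policy that generated the data
target = {"tap_up": 0.9, "tap_down": 0.1}      # policy we want to evaluate
logged = [("tap_up", 1.0), ("tap_down", 0.0), ("tap_up", 1.0), ("tap_down", 0.0)]

estimate = importance_sampling_estimate(logged, target, behaviour)
```

In this toy log each action appears equally often, so the reweighted average equals the target policy's expected reward, 0.9 * 1.0 + 0.1 * 0.0 = 0.9.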

algorithm, reinforcement learning, timescale, (14 more...)

2104.05902

Country:

Europe (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.40)

Industry: Energy > Power Industry (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Berger, Victor, Sebag, Michele

Boltzmann Tuning of Generative Models

The paper focuses on the a posteriori tuning of a generative model in order to favor the generation of good instances in the sense of some external differentiable criterion. The proposed approach, called Boltzmann Tuning of Generative Models (BTGM), applies to a wide range of applications. It covers conditional generative modelling as a particular case, and offers an affordable alternative to rejection sampling. The contribution of the paper is twofold. Firstly, the objective is formalized and tackled as a well-posed optimization problem; a practical methodology is proposed to choose among the candidate criteria representing the same goal, the one best suited to efficiently learn a tuned generative model. Secondly, the merits of the approach are demonstrated on a real-world application, in the context of robust design for energy policies, showing the ability of BTGM to sample the extreme regions of the considered criteria.

arxiv, generative model, machine learning, (14 more...)

2104.05252

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(5 more...)

Genre:

Instructional Material > Course Syllabus & Notes (0.68)
Research Report (0.64)

Industry: Energy > Power Industry (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

arXiv.org Artificial Intelligence Apr-11-2021

Uncover Residential Energy Consumption Patterns Using Socioeconomic and Smart Meter Data

Tang, Wenjun, Wang, Hao, Lee, Xian-Long, Yang, Hong-Tzer

This paper models residential consumers' energy-consumption behavior by load patterns and distributions and reveals the relationship between consumers' load patterns and socioeconomic features by machine learning. We analyze the real-world smart meter data and extract load patterns using K-Medoids clustering, which is robust to outliers. We develop an analytical framework with feature selection and deep learning models to estimate the relationship between load patterns and socioeconomic features. Specifically, we use an entropy-based feature selection method to identify the critical socioeconomic characteristics that affect load patterns and benefit our method's interpretability. We further develop a customized deep neural network model to characterize the relationship between consumers' load patterns and selected socioeconomic features. Numerical studies validate our proposed framework using Pecan Street smart meter data and survey. We demonstrate that our framework can capture the relationship between load patterns and socioeconomic information and outperform benchmarks such as regression and single DNN models.
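As a rough illustration of K-Medoids clustering of load patterns, the sketch below alternates between assigning profiles to their nearest medoid and re-selecting each medoid as the cluster member with the smallest total distance to the others. It is not the authors' implementation: the four-value daily profiles, the Euclidean distance, and the choice of the first k points as initial medoids are all assumptions made for the example.

```python
def distance(a, b):
    """Euclidean distance between two equal-length load profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def k_medoids(points, k, n_iter=50):
    """Alternating K-Medoids. Medoids are indices into `points`;
    the first k points serve as a simple deterministic initialisation."""
    medoids = list(range(k))
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: distance(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid minimises total distance within its cluster.
        new_medoids = []
        for m, members in clusters.items():
            if not members:            # keep the old medoid if a cluster is empty
                new_medoids.append(m)
                continue
            best = min(members,
                       key=lambda c: sum(distance(points[c], points[j]) for j in members))
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    labels = [min(range(k), key=lambda c: distance(p, points[medoids[c]])) for p in points]
    return medoids, labels

# Hypothetical load profiles (kW at four times of day).
profiles = [
    [0.9, 0.3, 0.2, 0.4], [1.0, 0.4, 0.2, 0.3], [1.1, 0.3, 0.3, 0.4],  # morning peak
    [0.2, 0.3, 0.4, 1.0], [0.3, 0.2, 0.5, 1.1], [0.2, 0.4, 0.4, 0.9],  # evening peak
    [0.3, 0.3, 0.5, 1.8],                                              # evening outlier
]
medoids, labels = k_medoids(profiles, k=2)
```

Because each medoid must be an actual data point chosen by total distance, the outlying evening profile is grouped with the other evening profiles but is not selected as their representative.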

load pattern, load profile, socioeconomic factor, (14 more...)

2104.05154

Country:

Europe (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States (0.04)

Genre: Research Report (1.00)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Apr-11-2021

Fast Design Space Exploration of Nonlinear Systems: Part I

Narain, Sanjai, Mak, Emily, Chee, Dana, Englot, Brendan, Pochiraju, Kishore, Jha, Niraj K., Narayan, Karthik

System design tools are often only available as blackboxes with complex nonlinear relationships between inputs and outputs. Blackboxes typically run in the forward direction: for a given design as input they compute an output representing system behavior. Most cannot be run in reverse to produce an input from requirements on output. Thus, finding a design satisfying a requirement is often a trial-and-error process without assurance of optimality. Finding designs concurrently satisfying multiple requirements is harder because designs satisfying individual requirements may conflict with each other. Compounding the difficulty, blackbox evaluations can be expensive and sometimes fail to produce an output due to non-convergence of underlying numerical algorithms. This paper presents CNMA (Constrained optimization with Neural networks, MILP solvers and Active Learning), a new optimization method for blackboxes. It is conservative in the number of blackbox evaluations. Any designs it finds are guaranteed to satisfy all requirements. It is resilient to the failure of blackboxes to compute outputs. It tries to sample only the part of the design space relevant to solving the design problem, leveraging the power of neural networks, MILPs, and a new learning-from-failure feedback loop. The paper also presents parallel CNMA that improves the efficiency and quality of solutions over the sequential version, and tries to steer it away from local optima. CNMA's performance is evaluated for seven nonlinear design problems of 8 (2 problems), 10, 15, 36 and 60 real-valued dimensions and one with 186 binary dimensions. It is shown that CNMA improves the performance of stable, off-the-shelf implementations of Bayesian Optimization, Nelder-Mead, and Random Search by 1%-87% for a given fixed time and function evaluation budget. Note that these implementations did not always return solutions.

cnma, constraint, neural network, (17 more...)

2104.01747

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry:

Energy (0.47)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence Apr-10-2021

SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds with 1000x Fewer Labels

Hu, Qingyong, Yang, Bo, Fang, Guangchi, Guo, Yulan, Leonardis, Ales, Trigoni, Niki, Markham, Andrew

We study the problem of labelling effort for semantic segmentation of large-scale 3D point clouds. Existing works usually rely on densely annotated point-level semantic labels to provide supervision for network training. However, in real-world scenarios that contain billions of points, it is impractical and extremely costly to manually annotate every single point. In this paper, we first investigate whether dense 3D labels are truly required for learning meaningful semantic representations. Interestingly, we find that the segmentation performance of existing works only drops slightly given as few as 1% of the annotations. However, beyond this point (e.g. 1 per thousand and below) existing techniques fail catastrophically. To this end, we propose a new weak supervision method to implicitly augment the total amount of available supervision signals, by leveraging the semantic similarity between neighboring points. Extensive experiments demonstrate that the proposed Semantic Query Network (SQN) achieves state-of-the-art performance on six large-scale open datasets under weak supervision schemes, while requiring 1000x fewer labeled points for training. The code is available at https://github.com/QingyongHu/SQN.

deep learning, neural network, segmentation, (23 more...)

2104.04891

Country: Europe > United Kingdom (0.14)

Genre: Research Report (0.64)

Industry:

Energy > Oil & Gas (0.67)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Machine Learning Apr-10-2021

What Makes an Effective Scalarising Function for Multi-Objective Bayesian Optimisation?

Stock-Williams, Clym, Chugh, Tinkle, Rahat, Alma, Yu, Wei

Performing multi-objective Bayesian optimisation by scalarising the objectives avoids the computation of expensive multi-dimensional integral-based acquisition functions, instead allowing one-dimensional standard acquisition functions, such as Expected Improvement, to be applied. Here, two infill criteria based on hypervolume improvement (one recently introduced and one novel) are compared with the multi-surrogate Expected Hypervolume Improvement. The reasons for the disparities in these methods' effectiveness in maximising the hypervolume of the acquired Pareto Front are investigated. In addition, the effect of the surrogate model mean function on exploration and exploitation is examined: careful choice of data normalisation is shown to be preferable to the exploration parameter commonly used with the Expected Improvement acquisition function. Finally, the effectiveness of all the methodological improvements defined here is demonstrated on a real-world problem: the optimisation of a wind turbine blade aerofoil for both aerodynamic performance and structural stiffness. With effective scalarisation, Bayesian optimisation finds a large number of new aerofoil shapes that strongly dominate standard designs.
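For readers unfamiliar with the Expected Improvement acquisition function and its exploration parameter, the following is a minimal sketch of the standard closed form for a minimisation problem with a Gaussian surrogate prediction. The function and parameter names (`expected_improvement`, `xi`) are illustrative, and the candidate values are made up; this is not the paper's scalarisation or hypervolume criterion.

```python
import math

def expected_improvement(mu, sigma, best, xi=0.0):
    """Expected Improvement for minimisation, given a Gaussian surrogate
    prediction N(mu, sigma^2) and the best (lowest) value observed so far.
    `xi` is the exploration parameter; xi = 0 gives plain EI."""
    improvement = best - mu - xi
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    return improvement * cdf + sigma * pdf

# Hypothetical candidates: (predicted mean, predicted standard deviation).
candidates = [(0.2, 0.05), (0.5, 0.6), (0.1, 0.0)]
best_so_far = 0.3
scores = [expected_improvement(mu, sigma, best_so_far) for mu, sigma in candidates]
next_index = max(range(len(candidates)), key=lambda i: scores[i])
```

Since the formula depends on the scale of `mu`, `sigma`, and `best`, the way the objective data are normalised changes the balance between the exploitation term (`improvement * cdf`) and the exploration term (`sigma * pdf`), which is the lever the abstract contrasts with tuning `xi`.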

optimisation, renewable energy, upstream oil & gas, (18 more...)

2104.0479

Country:

Europe > Netherlands (0.14)
North America > Canada (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.50)

Industry:

Energy > Renewable > Wind (0.36)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)