Energy
Mean-field inference methods for neural networks
Machine learning algorithms relying on deep neural networks recently allowed a great leap forward in artificial intelligence. Despite the popularity of their applications, the efficiency of these algorithms remains largely unexplained from a theoretical point of view. The mathematical description of learning problems involves very large collections of interacting random variables, difficult to handle analytically as well as numerically. This complexity is precisely the object of study of statistical physics. Its mission, originally pointed towards natural systems, is to understand how macroscopic behaviors arise from microscopic laws. Mean-field methods are one type of approximation strategy developed in this view. We review a selection of classical mean-field methods and recent progress relevant for inference in neural networks. In particular, we recall the principles behind the derivations of high-temperature expansions, the replica method and message passing algorithms, highlighting their equivalences and complementarities. We also provide references for past and current directions of research on neural networks relying on mean-field methods.
Computationally efficient versions of conformal predictive distributions
Vovk, Vladimir, Petej, Ivan, Nouretdinov, Ilia, Manokhin, Valery, Gammerman, Alex
Conformal predictive systems are a recent modification of conformal predictors that output, in regression problems, probability distributions for labels of test observations rather than set predictions. The extra information provided by conformal predictive systems may be useful, e.g., in decision making problems. Conformal predictive systems inherit the relative computational inefficiency of conformal predictors. In this paper we discuss two computationally efficient versions of conformal predictive systems, which we call split conformal predictive systems and cross-conformal predictive systems. The main advantage of split conformal predictive systems is their guaranteed validity, whereas for cross-conformal predictive systems validity only holds empirically and in the absence of excessive randomization. The main advantage of cross-conformal predictive systems is their greater predictive efficiency.
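As a rough illustration of the split variant described above, the following Python sketch evaluates a predictive distribution function from calibration residuals. It assumes the simplest conformity score (the signed residual `y - y_hat`) and a fixed tie-breaking value `tau` in place of a uniform random draw; the function name `split_cps` and the toy numbers are hypothetical and not taken from the paper.

```python
def split_cps(cal_residuals, y_hat, y, tau=0.5):
    """Value at y of a split conformal predictive distribution (sketch).

    cal_residuals: signed residuals y_i - y_hat_i on a held-out calibration set
    y_hat:         point prediction for the test object
    y:             candidate label at which the distribution is evaluated
    tau:           tie-breaking number in [0, 1] (fixed here for determinism)
    """
    m = len(cal_residuals)
    score = y - y_hat  # conformity score of the candidate label
    below = sum(1 for r in cal_residuals if r < score)
    ties = sum(1 for r in cal_residuals if r == score)
    # The extra tau accounts for the test observation itself.
    return (below + tau * ties + tau) / (m + 1)


# Toy calibration residuals and a point prediction of 10.0 (made-up values).
cal = [-1.0, 0.0, 1.0]
low = split_cps(cal, y_hat=10.0, y=8.0)
mid = split_cps(cal, y_hat=10.0, y=10.5)
high = split_cps(cal, y_hat=10.0, y=12.0)
print(low, mid, high)
```

The returned value is non-decreasing in `y`, so it can be read as an estimated probability that the test label is at most `y`.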
Seasonally-Adjusted Auto-Regression of Vector Time Series
We present a simple algorithm to forecast vector time series that is robust against missing data in both training and inference. It models annual, weekly, and daily seasonal baselines, and a Gaussian process for the seasonally-adjusted residuals. We develop a custom truncated eigendecomposition to fit a low-rank plus block-diagonal Gaussian kernel. Inference is performed with the Schur complement, using Tikhonov regularization to prevent overfitting, and the Woodbury formula to invert sub-matrices of the kernel efficiently. Inference requires an amount of memory and computation linear in the dimension of the time series, and so the model can scale to very large datasets. We also propose a simple "greedy" grid search for automatic hyper-parameter tuning. The paper is accompanied by tsar (i.e., time series auto-regressor), a Python library that implements the algorithm.
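To illustrate why the Woodbury formula keeps the cost linear in the dimension, here is a minimal NumPy sketch. It is not the tsar implementation: it assumes a plain diagonal matrix in place of the block-diagonal part, and the name `woodbury_solve` is hypothetical. It uses the identity (D + U Uᵀ)⁻¹ = D⁻¹ − D⁻¹U (I + UᵀD⁻¹U)⁻¹ UᵀD⁻¹, so only a small k-by-k system is ever solved.

```python
import numpy as np


def woodbury_solve(d, U, b):
    """Solve (diag(d) + U @ U.T) x = b without forming the n-by-n matrix.

    d: (n,) positive diagonal entries
    U: (n, k) low-rank factor, with k much smaller than n
    b: (n,) right-hand side
    """
    Dinv_b = b / d                     # D^{-1} b, O(n)
    Dinv_U = U / d[:, None]            # D^{-1} U, O(n k)
    k = U.shape[1]
    small = np.eye(k) + U.T @ Dinv_U   # k-by-k capacitance matrix
    correction = Dinv_U @ np.linalg.solve(small, U.T @ Dinv_b)
    return Dinv_b - correction


# Compare against a dense solve on random test data.
rng = np.random.default_rng(0)
n, k = 200, 3
d = rng.uniform(1.0, 2.0, size=n)
U = rng.standard_normal((n, k))
b = rng.standard_normal(n)

x = woodbury_solve(d, U, b)
dense = np.linalg.solve(np.diag(d) + U @ U.T, b)
print(np.allclose(x, dense))
```

The dense solve costs on the order of n³ operations, whereas the Woodbury route costs on the order of n·k², which is linear in n for fixed rank k.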
Big Tech Is Making A Massive Bet On AI … Here's How Investors Can, Too
Artificial intelligence is becoming the future of everything. Yet, only a few large companies have the talent and the technology to perfect it. That's the gist of a New York Times story published late last week. Rising costs for AI research are locking out university researchers and garage entrepreneurs, two of the traditional -- and historically best -- founts of innovation. In the past, software engineers used code to build platforms and new business models.
PAID POST by IBM -- A.I.: Your Next Career Move
Australia's Woodside Energy had decades of knowledge instilled in its retiring workforce of oil pipeline engineers and wanted to scale that wealth to a much more expansive workforce. When Shelley Kalms, Woodside's Chief Digital Officer, describes her mission to transform her company into a "true learning organization," she's speaking about a "team effort in unlocking the collective intelligence of our organization -- both past and present." By using IBM Watson, Woodside created the ability to have expansive knowledge at their fingertips. Today's incoming employees in the field have access to as much know-how as a person who has worked on the oil rigs for decades. Today, Woodside has 18 successful Watson-enabled A.I. projects in place, ranging from health to safety -- all of which present compelling evidence of the potential of A.I. to optimize jobs and increase productivity.
How Machine Learning Could Impact the Future of Renewable Energy
But as renewable energy technologies like wind farms are implemented at larger scales than ever, local officials are running into their limitations. The energy production of wind farms is hard to predict, and this makes energy grid design difficult. Experts hope that machine learning can be applied to renewable energy to solve this problem. If it works, this new tech may make energy officials more enthusiastic about implementing renewables. One downside of renewables is how hard it can be to predict the energy they produce. Wind speeds can vary widely from hour to hour and from day to day.
Frequentist Regret Bounds for Randomized Least-Squares Value Iteration
Zanette, Andrea, Brandfonbrener, David, Pirotta, Matteo, Lazaric, Alessandro
A key challenge in reinforcement learning (RL) is how to balance exploration and exploitation in order to efficiently learn to make good sequences of decisions in a way that is both computationally tractable and statistically efficient. In the tabular case, the exploration-exploitation problem is well-understood for a number of settings (e.g., finite-horizon, average reward, infinite horizon with discount), exploration objectives (e.g., regret minimization and probably approximately correct), and for different algorithmic approaches, where optimism-under-uncertainty [JOA10, FPLO18] and Thompson sampling (TS) [OBPVR16, Rus19] are the most popular principles. For instance, in the finite-horizon setting, [AOM17] and [ZB19] recently derived minimax optimal and structure adaptive regret bounds for optimistic exploration algorithms. TS-based algorithms have mainly been analyzed in tabular MDPs in terms of Bayesian regret [OBPVR16, OR17, OGNJ17], which assumes that the MDP is sampled from a known prior distribution. These bounds do not hold against a fixed MDP and algorithms with small Bayesian regret may still suffer high regret in some hard-to-learn MDPs within the chosen prior. In the tabular setting, frequentist (or worst-case) regret analysis has been developed for TS-based algorithms both in the average reward [GM15, AJ17] and finite-horizon case [Rus19]. Despite the fact that TS-based approaches have slightly worse regret bounds compared to optimism-based algorithms, their empirical performance is often superior [CL11, OR17]. Unfortunately, the performance of tabular exploration methods rapidly degrades with the number of states and actions, thus making them unfeasible in large or continuous MDPs.
Machine Learning for high speed channel optimization
He, Jiayi, Kumar, Aravind Sampath, Chada, Arun, Mutnury, Bhyrav, Drewniak, James
Design of printed circuit board (PCB) stack-up requires the consideration of characteristic impedance, insertion loss and crosstalk. As there are many parameters in a PCB stack-up design, the optimization of these parameters needs to be efficient and accurate. A less optimal stack-up would lead to expensive PCB material choices in high speed designs. In this paper, an efficient global optimization method using parallel and intelligent Bayesian optimization is proposed for the stripline design. In high speed system design, optimizing the PCB stack-up plays an increasingly important role in the design stage.
Aerodynamic Data Fusion Towards the Digital Twin Paradigm
Renganathan, S. Ashwin, Harada, Kohei, Mavris, Dimitri N.
We consider the fusion of two aerodynamic data sets originating from differing fidelity physical or computer experiments. We specifically address the fusion of: 1) noisy and incomplete fields from wind tunnel measurements and 2) deterministic but biased fields from numerical simulations. These two data sources are fused in order to estimate the \emph{true} field that best matches measured quantities that serve as the ground truth. For example, two sources of pressure fields about an aircraft are fused based on measured forces and moments from a wind-tunnel experiment. A fundamental challenge in this problem is that the true field is unknown and cannot be estimated with 100\% certainty. We employ a Bayesian framework to infer the true fields conditioned on measured quantities of interest; essentially we perform a \emph{statistical correction} to the data. The fused data may then be used to construct more accurate surrogate models suitable for early stages of aerospace design. We also introduce an extension of the Proper Orthogonal Decomposition with constraints to solve the same problem. Both methods are demonstrated on fusing the pressure distributions for flow past the RAE2822 airfoil and the Common Research Model wing at transonic conditions. Comparison of both methods reveals that the Bayesian method is more robust when data is scarce while also being capable of accounting for uncertainties in the data. Furthermore, given adequate data, the POD based and Bayesian approaches lead to \emph{similar} results.
Learning Deep Bayesian Latent Variable Regression Models that Generalize: When Non-identifiability is a Problem
Yacoby, Yaniv, Pan, Weiwei, Doshi-Velez, Finale
Bayesian Neural Networks with Latent Variables (BNN+LV's) provide uncertainties in prediction estimates by explicitly modeling model uncertainty (via priors on network weights) and environmental stochasticity (via a latent input noise variable). In this work, we first show that BNN+LV suffers from a serious form of non-identifiability: explanatory power can be transferred between model parameters and input noise while fitting the data equally well. We demonstrate that, as a result, traditional inference methods may yield parameters that reconstruct observed data well but generalize poorly. Next, we develop a novel inference procedure that explicitly mitigates the effects of likelihood non-identifiability during training and yields high quality predictions as well as uncertainty estimates. We demonstrate that our inference method improves upon benchmark methods across a range of synthetic and real datasets.