 out-of-sample test


Robust Data-Driven Dynamic Programming

Neural Information Processing Systems

In stochastic optimal control the distribution of the exogenous noise is typically unknown and must be inferred from limited data before dynamic programming (DP)-based solution schemes can be applied. If the conditional expectations in the DP recursions are estimated via kernel regression, however, the historical sample paths enter the solution procedure directly as they determine the evaluation points of the cost-to-go functions. The resulting data-driven DP scheme is asymptotically consistent and admits efficient computational solution when combined with parametric value function approximations. If training data is sparse, however, the estimated cost-to-go functions display a high variability and an optimistic bias, while the corresponding control policies perform poorly in out-of-sample tests. To mitigate these small sample effects, we propose a robust data-driven DP scheme, which replaces the expectations in the DP recursions with worst-case expectations over a set of distributions close to the best estimate.
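The two ingredients named in the abstract, a kernel-regression estimate of a conditional expectation and its worst-case counterpart, can be illustrated with a minimal sketch. The abstract does not specify the kernel or the distance that defines "close to the best estimate", so the Gaussian kernel, the total-variation ball, and all function names below are assumptions made for illustration, not the authors' scheme.

```python
import math

def kernel_weights(x_query, x_samples, bandwidth):
    """Nadaraya-Watson weights with a Gaussian kernel (assumed kernel choice)."""
    raw = []
    for x in x_samples:
        dist2 = sum((a - b) ** 2 for a, b in zip(x, x_query))
        raw.append(math.exp(-0.5 * dist2 / bandwidth ** 2))
    total = sum(raw)
    return [r / total for r in raw]

def nominal_expectation(values, weights):
    """Kernel-regression estimate: weighted average of sampled cost-to-go values."""
    return sum(w * v for w, v in zip(weights, values))

def worst_case_expectation(values, weights, eps):
    """Largest expected cost over distributions within total-variation
    distance eps of `weights` (the distance measure is an assumption)."""
    q = list(weights)
    order = sorted(range(len(values)), key=lambda i: values[i])
    top = order[-1]                      # index of the highest cost
    budget = min(eps, 1.0 - q[top])      # mass that can be shifted to it
    remaining = budget
    for i in order[:-1]:                 # drain the cheapest outcomes first
        take = min(q[i], remaining)
        q[i] -= take
        remaining -= take
        if remaining <= 0.0:
            break
    q[top] += budget - remaining
    return nominal_expectation(values, q)
```

With `eps = 0` the worst-case value coincides with the nominal estimate; increasing `eps` moves probability mass from the cheapest sampled outcomes to the most expensive one, which counteracts the optimistic bias at the price of a more conservative cost-to-go estimate.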


A Higher Purpose: Measuring Electricity Access Using High-Resolution Daytime Satellite Imagery

arXiv.org Artificial Intelligence

Governments and international organizations the world over are investing towards the goal of achieving universal energy access for improving socio-economic development. However, in developing settings, monitoring electrification efforts is typically inaccurate, infrequent, and expensive. In this work, we develop and present techniques for high-resolution monitoring of electrification progress at scale. Specifically, our three unique contributions are: (i) identifying areas with(out) electricity access, (ii) quantifying the extent of electrification in electrified areas (percentage/number of electrified structures), and (iii) differentiating between customer types in electrified regions (estimating the percentage/number of residential/non-residential electrified structures). We combine high-resolution 50 cm daytime satellite images with Convolutional Neural Networks (CNNs) to train a series of classification and regression models. We evaluate our models using unique ground truth datasets on building locations, building types (residential/non-residential), and building electrification status. Our classification models show a 92% accuracy in identifying electrified regions, 85% accuracy in estimating the percentage of (low/high) electrified buildings within the region, and 69% accuracy in differentiating between (low/high) percentages of electrified residential buildings. Our regressions show $R^2$ scores of 78% and 80% in estimating the number of electrified buildings and the number of residential electrified buildings in images, respectively. We also demonstrate the generalizability of our models in never-before-seen regions to assess their potential for consistent and high-resolution measurements of electrification in emerging economies, and conclude by highlighting opportunities for improvement.


Quo Vadis: Hybrid Machine Learning Meta-Model based on Contextual and Behavioral Malware Representations

arXiv.org Artificial Intelligence

We propose a hybrid machine learning architecture that simultaneously employs multiple deep learning models analyzing contextual and behavioral characteristics of Windows portable executables, producing a final prediction based on a decision from the meta-model. The detection heuristic in contemporary machine learning Windows malware classifiers is typically based on the static properties of the sample since dynamic analysis through virtualization is challenging for vast quantities of samples. To surpass this limitation, we employ a Windows kernel emulation that allows the acquisition of behavioral patterns across large corpora with minimal temporal and computational costs. We partner with a security vendor for a collection of more than 100k in-the-wild samples that resemble the contemporary threat landscape, containing raw PE files and filepaths of applications at the moment of execution. The acquired dataset is at least tenfold larger than those reported in related works on behavioral malware analysis. Files in the training dataset are labeled by a professional threat intelligence team, utilizing manual and automated reverse engineering tools. We estimate the hybrid classifier's operational utility by collecting an out-of-sample test set three months after the acquisition of the training set. We report an improved detection rate, above the capabilities of the current state-of-the-art model, especially under low false-positive requirements. Additionally, we uncover the meta-model's ability to identify malicious activity in validation and test sets even if none of the individual models express enough confidence to mark the sample as malevolent. We conclude that the meta-model can learn patterns typical of malicious samples from representation combinations produced by different analysis techniques. We publicly release pre-trained models and an anonymized dataset of emulation reports.


Trading Signals In VIX Futures

arXiv.org Machine Learning

We propose a new approach for trading VIX futures. We assume that the term structure of VIX futures follows a Markov model. Our trading strategy selects a position in VIX futures by maximizing the expected utility for a day-ahead horizon given the current shape and level of the term structure. Computationally, we model the functional dependence between the VIX futures curve, the VIX futures positions, and the expected utility as a deep neural network with five hidden layers. Out-of-sample backtests of the VIX futures trading strategy suggest that this approach gives rise to reasonable portfolio performance, and to positions in which the investor will be either long or short VIX futures contracts depending on the market environment.


Response to Comment on "Predicting reaction performance in C-N cross-coupling using machine learning"

Science

We demonstrate that the chemical-feature model described in our original paper is distinguishable from the nongeneralizable models introduced by Chuang and Keiser. Furthermore, the chemical-feature model significantly outperforms these models in out-of-sample predictions, justifying the use of chemical featurization from which machine learning models can extract meaningful patterns in the dataset, as originally described. In Ahneman et al. (1), we showed that a random forest (RF) algorithm built using computationally derived chemical descriptors for the components of a Pd-catalyzed C–N cross-coupling reaction (aryl halide, ligand, base, and potentially inhibitory isoxazole additive) could identify predictive and meaningful relationships in a multidimensional chemical dataset comprising 4608 reactions. Chuang and Keiser (2) built alternative models using random barcode features ("straw" models), wherein the chemical descriptors are replaced with random numbers selected from a standard normal distribution. One-hot encoded features, wherein each reagent acts as a categorical descriptor and is marked as absent or present, were also evaluated.
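The two alternative featurizations described above can be sketched as follows. This is only an illustration of the idea: the function names, the reagent labels in the usage note, and the number of random features per reagent are hypothetical and do not come from either paper.

```python
import random

def one_hot_features(labels):
    """One-hot encoding: one column per distinct reagent, 1.0 if present, else 0.0."""
    levels = sorted(set(labels))
    column = {level: i for i, level in enumerate(levels)}
    rows = []
    for label in labels:
        row = [0.0] * len(levels)
        row[column[label]] = 1.0
        rows.append(row)
    return rows

def random_barcode_features(labels, n_features, seed=0):
    """'Straw' features: each distinct reagent gets a fixed vector of
    standard-normal random numbers in place of chemical descriptors."""
    rng = random.Random(seed)
    barcode = {
        level: [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for level in sorted(set(labels))
    }
    return [list(barcode[label]) for label in labels]
```

For example, `one_hot_features(["ligand_A", "ligand_B", "ligand_A"])` yields identical rows for the first and third reactions, and `random_barcode_features` does the same with random vectors. Both encodings identify which reagent is present without carrying any chemical information, which is why out-of-sample predictions on reagents absent from training are the relevant comparison.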


Validation Methods For Trading Strategy Development

@machinelearnbot

About the author: Michael Harris is a trader and best-selling author. Seventeen years ago he developed the first commercial software for identifying parameter-less patterns in price action. In the last seven years he has worked on the development of DLPAL, a software program that can be used to identify short-term anomalies in market data for use with fixed and machine learning models.