Multi-Horizon Time Series Forecasting of non-parametric CDFs with Deep Lattice Networks
Erdmann, Niklas, Bentsen, Lars, Stenbro, Roy, Riise, Heine Nygard, Warakagoda, Narada Dilp, Engelstad, Paal E.
Probabilistic forecasting not only adds information to a prediction of the future; it also addresses weaknesses of point prediction. Sudden changes in a time series can still be captured by a cumulative distribution function (CDF), while a point prediction is likely to miss them entirely. The modeling of CDFs within forecasts has historically been limited to parametric approaches, but recent advances mean this no longer has to be the case. We aim to advance the fields of probabilistic forecasting and monotonic networks by connecting them, and propose an approach that permits the forecasting of implicit, complete, and nonparametric CDFs. For this purpose, we propose an adaptation of deep lattice networks (DLNs) for monotonically constrained simultaneous/implicit quantile regression in time series forecasting. Quantile regression usually produces quantile crossovers, which must be prevented to obtain a legitimate CDF. By leveraging long short-term memory (LSTM) units as the embedding layer, and spreading quantile inputs to all sub-lattices of a DLN with an extended output size, we can produce a multi-horizon forecast of an implicit CDF: the monotonic constraints of DLNs prevent quantile crossovers. We compare and evaluate our approach against the relevant state of the art on a highly relevant application of time series forecasting: day-ahead, hourly forecasts of solar irradiance observations. Our experiments show that the adapted DLN performs as well as or better than an unconstrained approach. A further comparison of the adapted DLN against a scalable monotonic neural network shows that our approach performs better. With this adaptation of DLNs, we intend to create more interest and crossover investigation between monotonic neural networks and probabilistic forecasting.
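The mechanism the abstract describes is simultaneous (implicit) quantile regression: the quantile level τ is fed to the model as an extra input and the pinball loss is minimized over randomly sampled τ, so a single network represents the entire conditional CDF. Below is a minimal PyTorch sketch of that training loop under assumed data and layer sizes; it uses a plain MLP rather than the paper's LSTM-plus-DLN architecture, so unlike the DLN it does not enforce monotonicity in τ.

```python
import torch
import torch.nn as nn

# Minimal sketch of simultaneous quantile regression (assumed setup, not the
# paper's LSTM + deep-lattice architecture): the quantile level tau is
# appended to the input, and the pinball loss is minimized over random tau.
class ImplicitQuantileNet(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, tau):
        # Concatenate tau so one model represents the whole conditional CDF.
        return self.net(torch.cat([x, tau], dim=-1))

def pinball_loss(pred, target, tau):
    # Asymmetric quantile ("pinball") loss.
    diff = target - pred
    return torch.mean(torch.maximum(tau * diff, (tau - 1) * diff))

# Toy data: y = x plus heteroscedastic noise.
x = torch.rand(1024, 1)
y = x + 0.3 * x * torch.randn(1024, 1)

model = ImplicitQuantileNet(n_features=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(500):
    tau = torch.rand(1024, 1)            # fresh quantile levels each step
    loss = pinball_loss(model(x, tau), y, tau)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Without a monotonic constraint in τ, quantiles predicted by a sketch like this can still cross; the paper's DLN adaptation exists precisely to rule that out.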
Symbolic Quantile Regression for the Interpretable Prediction of Conditional Quantiles
Hoekstra, Cas Oude, Hengst, Floris den
Symbolic Regression (SR) is a well-established framework for generating interpretable, white-box predictive models. Although SR has been successfully applied to create interpretable estimates of the average outcome, it is currently not well understood how it can be used to estimate the relationship between variables at other points in the distribution of the target variable. Such estimates, e.g. of the median or of an extreme value, provide a fuller picture of how predictive variables affect the outcome and are necessary in high-stakes, safety-critical application domains. This study introduces Symbolic Quantile Regression (SQR), an approach to predicting conditional quantiles with SR. In an extensive evaluation, we find that SQR outperforms transparent models and performs comparably to a strong black-box baseline without compromising transparency. We also show how SQR can be used to explain differences in the target distribution by comparing models that predict extreme and central outcomes in an airline fuel usage case study. We conclude that SQR is suitable for predicting conditional quantiles and for understanding feature influences at varying quantiles.
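In principle, SQR only requires swapping the fitness function of a symbolic-regression search from squared error to the pinball loss at the target quantile. A minimal sketch of that idea using the gplearn library (an assumption; the paper's own implementation, operator set, and search settings may differ):

```python
import numpy as np
from gplearn.genetic import SymbolicRegressor
from gplearn.fitness import make_fitness

TAU = 0.9  # target quantile level

def pinball(y, y_pred, w):
    # Pinball (quantile) loss: asymmetric penalty steered by TAU.
    diff = y - y_pred
    return np.average(np.maximum(TAU * diff, (TAU - 1) * diff), weights=w)

pinball_fitness = make_fitness(function=pinball, greater_is_better=False)

# Toy data with heteroscedastic noise, so quantiles differ in shape.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 2))
y = X[:, 0] ** 2 + rng.normal(scale=0.2 + 0.2 * np.abs(X[:, 1]), size=500)

est = SymbolicRegressor(metric=pinball_fitness, population_size=500,
                        generations=10, random_state=0)
est.fit(X, y)
print(est._program)  # the evolved symbolic expression for the 0.9 quantile
```

Fitting separate programs for, say, TAU = 0.1, 0.5, and 0.9 and comparing the resulting expressions is the kind of quantile-level explanation the abstract describes for the airline fuel case study.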
Reviews: Single-Model Uncertainties for Deep Learning
This work presents ways to obtain estimates of the aleatoric and epistemic uncertainties of deep neural networks. The aleatoric uncertainty is estimated by learning the quantiles of the target variable via Simultaneous Quantile Regression (SQR), which minimizes the pinball loss with a target quantile sampled at random in every training iteration. The epistemic uncertainty is implicitly estimated by Orthonormal Certificates (OCs): functions trained to map in-distribution examples to zero and out-of-distribution examples to non-zero values. The authors also provide tail bounds for the OCs in the case of Gaussian input data, which gives some intuition about their behaviour. Simplicity is a benefit of these estimators, and the authors demonstrate their performance on regression and classification tasks.
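As the review describes them, OCs are maps trained on top of a frozen feature extractor to output zero on in-distribution data, with an orthonormality penalty keeping the certificates diverse; large certificate outputs then flag out-of-distribution inputs. A minimal PyTorch sketch under those assumptions (the feature extractor, certificate count, and penalty weight are placeholders, and linear certificates are one simple choice):

```python
import torch
import torch.nn as nn

# Sketch of Orthonormal Certificates (OCs): k linear certificates C on top of
# frozen features phi(x), trained so that C(phi(x)) is near zero on
# in-distribution data, with a penalty pushing C towards orthonormal rows.
feat_dim, n_certs, lam = 32, 16, 1.0
phi = nn.Sequential(nn.Linear(8, feat_dim), nn.ReLU())   # stand-in extractor
for p in phi.parameters():
    p.requires_grad_(False)                              # features are frozen

C = nn.Linear(feat_dim, n_certs, bias=False)
opt = torch.optim.Adam(C.parameters(), lr=1e-2)
x_train = torch.randn(2048, 8)                           # in-distribution data

for step in range(1000):
    z = C(phi(x_train))
    ortho = C.weight @ C.weight.t() - torch.eye(n_certs)
    loss = z.pow(2).mean() + lam * ortho.pow(2).mean()   # fit + orthonormality
    opt.zero_grad()
    loss.backward()
    opt.step()

def epistemic_score(x):
    # Larger certificate outputs suggest out-of-distribution input.
    return C(phi(x)).pow(2).mean(dim=-1)
```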
Using Sequential Statistical Tests to Improve the Performance of Random Search in Hyperparameter Tuning
Hyperparameter tuning is one of the most time-consuming parts of machine learning: the performance of a large number of different hyperparameter settings has to be evaluated to find the best one. Although modern optimization algorithms exist that minimize the number of evaluations needed, evaluating a single setting is still expensive: using a resampling technique, the machine learning method has to be fitted a fixed number of $K$ times on different training data sets. The mean value over the $K$ fits then serves as the estimator of the setting's performance. Many hyperparameter settings could be discarded after fewer than $K$ resampling iterations, because they are already clearly inferior to high-performing settings. In practice, however, the resampling is often performed to the very end, wasting a lot of computational effort. We propose a sequential testing procedure to minimize the number of resampling iterations needed to detect inferior parameter settings. To this end, we first analyze the distribution of resampling errors and find that a log-normal distribution is a promising fit. We then build a sequential testing procedure under this distributional assumption and use it within a random search algorithm. We compare a standard random search with our enhanced sequential random search in realistic data situations. The sequential random search finds comparably good hyperparameter settings, while the computational time needed to find them is roughly halved.
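The paper builds a proper sequential test under the log-normal assumption; as a simplified stand-in, the sketch below stops resampling a candidate early when a one-sided t-test on the log errors already shows it to be significantly worse than the incumbent. The fold-evaluation function, the significance level, and the stopping rule are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from scipy import stats

K, ALPHA = 20, 0.05  # max resampling iterations, significance level

def eval_fold(setting, fold, rng):
    # Placeholder: return the resampling error of `setting` on one fold.
    return np.exp(rng.normal(loc=setting, scale=0.3))

def sequential_errors(setting, incumbent_log_errs, rng):
    """Resample up to K times; stop early if the setting is clearly worse.

    Errors are compared on the log scale, reflecting the log-normal
    assumption on resampling errors.
    """
    log_errs = []
    for fold in range(K):
        log_errs.append(np.log(eval_fold(setting, fold, rng)))
        if incumbent_log_errs is not None and len(log_errs) >= 3:
            # One-sided Welch test: is the candidate's mean log error larger?
            t, p = stats.ttest_ind(log_errs, incumbent_log_errs,
                                   equal_var=False, alternative='greater')
            if p < ALPHA:
                return None  # discard: significantly worse than incumbent
    return log_errs

rng = np.random.default_rng(0)
best_setting, best_logs = None, None
for _ in range(50):                      # plain random search over settings
    setting = rng.uniform(-1, 1)
    logs = sequential_errors(setting, best_logs, rng)
    if logs is not None and (best_logs is None or
                             np.mean(logs) < np.mean(best_logs)):
        best_setting, best_logs = setting, logs
print(f"best setting: {best_setting:.3f}")
```

The saving comes from the `return None` branch: clearly inferior settings consume only a few of their $K$ resampling iterations before being discarded.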