V'yugin, Vladimir
Online Algorithm for Aggregating Experts' Predictions with Unbounded Quadratic Loss
Korotin, Alexander, V'yugin, Vladimir, Burnaev, Evgeny
We consider the problem of online aggregation of experts' predictions under the quadratic loss function. At the beginning of each round $t = 1,2,\dots,T$, experts $n = 1,\dots,N$ provide predictions $\gamma_t^n$, which the player combines into a single prediction; once the outcome is revealed, the player and experts $n = 1,\dots,N$ suffer the quadratic losses $h_t$ and $h_t^n$, respectively. The goal of the player is to minimize the regret, i.e. the difference between the total loss of the player and the total loss of the best expert. Online regression, in which the experts' predictions $\gamma_t^n$ are produced by regression models applied to an observed signal, is a popular special case of the considered problem. The problem of online prediction with experts' advice is studied in game theory [1] and machine learning [3]. Existing aggregating algorithms provide strategies which guarantee a constant upper bound on the regret, but they assume that the losses are bounded by a known constant $B$ and require knowing $B$ beforehand. In this paper, we propose an algorithm for aggregating experts' predictions which does not require prior knowledge of an upper bound on the losses.
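For reference, the sketch below implements the classical exponentially weighted average forecaster for the quadratic loss, i.e. the bounded-loss baseline that the paper improves upon (it is not the proposed algorithm). The loss bound B used to tune the learning rate, the data shapes, and the tuning rule itself are illustrative assumptions.

```python
import numpy as np

def weighted_average_forecaster(expert_preds, outcomes, B):
    """Sketch of the classical bounded-loss baseline: exponentially weighted
    average of experts' predictions under the quadratic loss.
    `B` is an assumed upper bound on the quadratic losses; the paper's goal
    is to avoid needing it in advance."""
    T, N = expert_preds.shape
    eta = 1.0 / (2.0 * B)            # illustrative tuning via the loss bound B
    cum_losses = np.zeros(N)         # cumulative quadratic losses of experts
    player_loss = 0.0
    for t in range(T):
        w = np.exp(-eta * (cum_losses - cum_losses.min()))
        w /= w.sum()
        gamma = w @ expert_preds[t]                  # aggregated prediction
        player_loss += (gamma - outcomes[t]) ** 2
        cum_losses += (expert_preds[t] - outcomes[t]) ** 2
    return player_loss, cum_losses.min()
```

With bounded losses and a suitably small learning rate this forecaster enjoys a regret bound of order (ln N)/eta, which is constant in T but depends on knowing B.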
Prediction of Locally Stationary Data Using Expert Advice
V'yugin, Vladimir, Trunov, Vladimir
Predicting data coming from a "black box" is one of the main tasks of machine learning. No stochastic assumptions about the data source are made; the data arrive online as a time series consisting of pairs of the form ("signal", "response"). The data source can be an analog, deterministic (algorithmic), or stochastic process, and we rely only on simple structural assumptions about it. In this paper, an approach is proposed in which training is performed on small subsamples of the main sample, and the forecasts of the constructed predictive models are combined into one common forecast using known aggregation methods. The general scheme of the online learning process is as follows: learning occurs at discrete time steps t = 1,2,.... At each step t, a local predictive model (an expert predictive strategy) is built from a subsample of the data observed in the past and is used to produce a response to the current signal.
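A minimal sketch of this scheme, under illustrative assumptions: each "local expert" is a linear model refitted on its own sliding window of past (signal, response) pairs, and the forecasts are aggregated with simple exponential weights. The window lengths and the learning rate are not taken from the paper.

```python
import numpy as np

def local_experts_forecast(signals, responses, windows=(20, 50, 100), eta=0.1):
    """Sketch: each expert is a linear model refitted online on a recent
    window of (signal, response) pairs; forecasts are combined with
    exponential weights. Window sizes and eta are illustrative choices."""
    T = len(signals)
    cum_loss = np.zeros(len(windows))
    predictions = np.zeros(T)
    for t in range(T):
        expert_preds = []
        for win in windows:
            lo = max(0, t - win)
            if t - lo < 2:                      # not enough data: naive fallback
                expert_preds.append(responses[t - 1] if t > 0 else 0.0)
                continue
            X = np.column_stack([signals[lo:t], np.ones(t - lo)])
            coef, *_ = np.linalg.lstsq(X, responses[lo:t], rcond=None)
            expert_preds.append(coef[0] * signals[t] + coef[1])
        expert_preds = np.array(expert_preds)
        w = np.exp(-eta * (cum_loss - cum_loss.min()))
        w /= w.sum()
        predictions[t] = w @ expert_preds        # aggregated forecast
        cum_loss += (expert_preds - responses[t]) ** 2
    return predictions
```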
Online Aggregation of Probability Forecasts with Confidence
V'yugin, Vladimir, Trunov, Vladimir
The paper presents numerical experiments and some theoretical developments in prediction with expert advice (PEA). One experiment deals with predicting electricity consumption depending on temperature and uses real data. As the pattern of dependence can change with the season and the time of day, the domain naturally admits a PEA formulation with experts having different ``areas of expertise''. We consider the case where several competing methods produce online predictions in the form of probability distribution functions. The dissimilarity between a probability forecast and an outcome is measured by a loss function (scoring rule). A popular example of a scoring rule for continuous outcomes is the Continuous Ranked Probability Score (CRPS). In this paper the problem of combining probabilistic forecasts is considered in the PEA framework. We show that CRPS is a mixable loss function, and hence a time-independent upper bound on the regret of the Vovk aggregating algorithm with CRPS as the loss function can be obtained. We also incorporate a ``smooth'' version of the method of specialized experts in this scheme, which allows us to combine the probabilistic predictions of specialized experts with overlapping domains of competence.
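For concreteness, the following sketch evaluates CRPS for a forecast represented by a finite sample (an empirical distribution), using the standard energy form CRPS(F, y) = E|X - y| - 0.5 E|X - X'|. The sample-based representation and the sample sizes are assumptions made only for illustration.

```python
import numpy as np

def crps_from_sample(sample, y):
    """CRPS of an empirical forecast distribution (given by `sample`)
    against the realized outcome `y`, via the energy form
    CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|."""
    sample = np.asarray(sample, dtype=float)
    term1 = np.mean(np.abs(sample - y))
    term2 = 0.5 * np.mean(np.abs(sample[:, None] - sample[None, :]))
    return term1 - term2

# a forecast centred at the realized outcome scores better (lower CRPS)
print(crps_from_sample(np.random.normal(0.0, 1.0, 1000), 0.0))
print(crps_from_sample(np.random.normal(3.0, 1.0, 1000), 0.0))
```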
Integral Mixability: a Tool for Efficient Online Aggregation of Functional and Probabilistic Forecasts
Korotin, Alexander, V'yugin, Vladimir, Burnaev, Evgeny
In this paper we extend the setting of online prediction with expert advice to function-valued forecasts. At each step of the online game several experts predict a function, and the learner has to efficiently aggregate these functional forecasts into a single forecast. We adapt basic mixable loss functions to compare functional predictions and prove that these "integral" extensions are also mixable. We call this phenomenon integral mixability. As an application, we consider various loss functions for the prediction of probability distributions and show that they are mixable by using our main result. The considered loss functions include the Continuous Ranked Probability Score (CRPS), Optimal Transport costs (OT), and the Beta-2 and Kullback-Leibler (KL) divergences.
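The idea can be illustrated on a finite grid: experts predict functions (here CDF values on a grid), the aggregated function is produced pointwise with the standard substitution function of the aggregating algorithm for the square loss on [0, 1], and the weights are updated with the integral (CRPS-style) loss. The grid, the expert CDFs, and the reuse of the base-loss learning rate for the integral loss are illustrative assumptions; the exact mixability constants are the subject of the paper.

```python
import numpy as np

ETA = 2.0  # mixability constant of the base square loss on [0, 1]

def aggregate_cdfs(expert_cdfs, weights):
    """Pointwise substitution step: at every grid point x the experts'
    values F_n(x) in [0, 1] are aggregated with the square-loss
    aggregating algorithm.  expert_cdfs has shape (N, grid_size)."""
    def g(omega):  # generalized prediction at each grid point
        mix = np.einsum('n,ng->g', weights,
                        np.exp(-ETA * (expert_cdfs - omega) ** 2))
        return -np.log(mix) / ETA
    return 0.5 + (g(0.0) - g(1.0)) / 2.0

def integral_loss(cdf, outcome, grid):
    """CRPS-style integral of the pointwise square loss over the grid."""
    dx = grid[1] - grid[0]
    indicator = (grid >= outcome).astype(float)
    return np.sum((cdf - indicator) ** 2) * dx

# one round of the online game with illustrative expert CDFs
grid = np.linspace(0.0, 1.0, 201)
experts = np.stack([np.clip(grid * a, 0.0, 1.0) for a in (0.5, 1.0, 2.0)])
w = np.full(3, 1.0 / 3.0)
prediction = aggregate_cdfs(experts, w)     # learner's function-valued forecast
outcome = 0.4
losses = np.array([integral_loss(f, outcome, grid) for f in experts])
w = w * np.exp(-ETA * losses)   # weight update by the integral loss
w /= w.sum()                    # (reusing the base-loss eta is illustrative)
```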
Adaptive Hedging under Delayed Feedback
Korotin, Alexander, V'yugin, Vladimir, Burnaev, Evgeny
The article is devoted to investigating the application of hedging strategies to online expert weight allocation under delayed feedback. As the main result, we develop the General Hedging algorithm $\mathcal{G}$ based on the exponential reweighting of experts' losses. We build an artificial probabilistic framework and use it to prove adversarial loss bounds for the algorithm $\mathcal{G}$ in the delayed feedback setting. The designed algorithm $\mathcal{G}$ can be applied to both countable and continuous sets of experts. We also show how algorithm $\mathcal{G}$ extends the classical Hedge (Multiplicative Weights) and adaptive Fixed Share algorithms to delayed feedback and derive their regret bounds for the delayed setting using our main result.
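A minimal sketch of exponential reweighting under a fixed feedback delay (this is not the exact algorithm $\mathcal{G}$; the delay model, learning rate, and data layout are illustrative assumptions): the weights used at round t rely only on losses that have already been revealed.

```python
import numpy as np

def delayed_hedge(expert_losses, delay, eta=0.1):
    """Sketch of Hedge-style exponential reweighting with delayed feedback:
    the loss vector of round t becomes available only at round t + delay.
    `delay` and `eta` are illustrative choices."""
    T, N = expert_losses.shape
    revealed = np.zeros(N)               # sum of losses revealed so far
    total = 0.0
    for t in range(T):
        if t - delay >= 0:               # losses of round t - delay arrive now
            revealed += expert_losses[t - delay]
        w = np.exp(-eta * (revealed - revealed.min()))
        w /= w.sum()
        total += w @ expert_losses[t]    # learner suffers the mixture loss
    return total
```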
Online Learning with Continuous Ranked Probability Score
V'yugin, Vladimir, Trunov, Vladimir
Probabilistic forecasts in the form of probability distributions over future events have become popular in several fields of statistical science. The dissimilarity between a probability forecast and an outcome is measured by a loss function (scoring rule). A popular example of a scoring rule for continuous outcomes is the Continuous Ranked Probability Score (CRPS). We consider the case where several competing methods produce online predictions in the form of probability distribution functions. In this paper, the problem of combining probabilistic forecasts is considered in the prediction with expert advice framework. We show that CRPS is a mixable loss function, and hence a time-independent upper bound on the regret of Vovk's aggregating algorithm with CRPS as the loss function can be obtained. We present the results of numerical experiments illustrating the proposed methods.
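As a simplified illustration (a convex mixture of the expert CDFs rather than the full substitution step of the aggregating algorithm), the sketch below aggregates predictive CDFs given on a grid and updates the weights by exponential reweighting of CRPS losses. The grid, the learning rate, and the data layout are assumptions.

```python
import numpy as np

def aggregate_crps_online(expert_cdfs, outcomes, grid, eta=1.0):
    """Sketch: online aggregation of probabilistic forecasts given as CDF
    values on a grid.  The learner predicts a weighted mixture of the expert
    CDFs (a simplification of the substitution step) and updates the weights
    by exponential reweighting of CRPS.  expert_cdfs has shape (T, N, grid)."""
    T, N, _ = expert_cdfs.shape
    dx = grid[1] - grid[0]
    cum = np.zeros(N)                          # cumulative CRPS of each expert
    learner = 0.0
    for t in range(T):
        w = np.exp(-eta * (cum - cum.min()))
        w /= w.sum()
        mixture = np.einsum('n,ng->g', w, expert_cdfs[t])
        step = (grid >= outcomes[t]).astype(float)
        learner += np.sum((mixture - step) ** 2) * dx       # CRPS of the mixture
        cum += np.sum((expert_cdfs[t] - step) ** 2, axis=1) * dx
    return learner, cum.min()       # learner's total CRPS vs. the best expert
```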
Online Aggregation of Unbounded Losses Using Shifting Experts with Confidence
V'yugin, Vladimir, Trunov, Vladimir
We develop the setting of sequential prediction based on shifting experts and on a "smooth" version of the method of specialized experts. To aggregate the experts' predictions, we use the AdaHedge algorithm, a version of the Hedge algorithm with an adaptive learning rate, and extend it with the Fixed Share meta-algorithm. This combines the advantages of both algorithms: (1) we use the shifting regret, which is a finer performance measure of the algorithm; (2) the regret bounds remain valid for signed, unbounded expert losses. Also, (3) we incorporate in this scheme a "smooth" version of the method of specialized experts, which allows us to make more flexible and accurate predictions. All results are obtained in the adversarial setting: no assumptions are made about the nature of the data source. We present results of numerical experiments for short-term forecasting of electricity consumption based on real data.
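A minimal sketch, assuming a simplified combination of the two components (not the authors' exact scheme): Hedge with an AdaHedge-style adaptive learning rate, followed by a Fixed Share step that mixes a fraction alpha of the weight back to uniform. The switching rate alpha and the loss model are illustrative.

```python
import numpy as np

def adahedge_fixed_share(expert_losses, alpha=0.01):
    """Sketch: Hedge with the AdaHedge adaptive learning rate plus a
    Fixed Share mixing step.  `alpha` is an illustrative switching rate."""
    T, N = expert_losses.shape
    w = np.full(N, 1.0 / N)      # current weights over experts
    Delta = 0.0                  # cumulative mixability gap
    total = 0.0
    for t in range(T):
        ell = expert_losses[t]
        total += w @ ell                              # learner's loss this round
        eta = np.inf if Delta == 0 else np.log(N) / Delta
        if np.isinf(eta):                             # degenerate first rounds
            mix = ell.min()
            v = (ell == ell.min()).astype(float) * w
            v /= v.sum()
        else:
            shifted = np.exp(-eta * (ell - ell.min()))
            mix = ell.min() - np.log(w @ shifted) / eta   # mix loss
            v = w * shifted
            v /= v.sum()
        Delta += max(w @ ell - mix, 0.0)              # AdaHedge gap update
        w = (1.0 - alpha) * v + alpha / N             # Fixed Share mixing
    return total
```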
Aggregating Strategies for Long-term Forecasting
Korotin, Alexander, V'yugin, Vladimir, Burnaev, Evgeny
The article is devoted to investigating the application of aggregating algorithms to the problem of long-term forecasting. We examine the classic aggregating algorithms based on exponential reweighting. For Vovk's general aggregating algorithm we provide a generalization for long-term forecasting. For the special basic case of Vovk's algorithm we provide two modifications for long-term forecasting. The first one is theoretically close to an optimal algorithm and is based on replication of independent copies. It provides a time-independent regret bound with respect to the best expert in the pool. The second one is not optimal but is more practical and has an $O(\sqrt{T})$ regret bound, where $T$ is the length of the game.
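The replication idea behind the first modification can be sketched as follows: if the outcome of a forecast issued at round t is revealed only D steps later, run D independent copies of a base aggregating algorithm and let copy t mod D handle round t, so that every copy has complete feedback for all of its own past rounds. The base algorithm (plain exponential weights), the horizon D, and the quadratic loss below are illustrative assumptions.

```python
import numpy as np

class HedgeCopy:
    """One copy of a simple exponential-weights aggregator (illustrative base)."""
    def __init__(self, n_experts, eta=0.1):
        self.cum = np.zeros(n_experts)
        self.eta = eta

    def weights(self):
        w = np.exp(-self.eta * (self.cum - self.cum.min()))
        return w / w.sum()

    def update(self, losses):
        self.cum += losses

def long_term_by_replication(expert_preds, outcomes, horizon_D, eta=0.1):
    """Sketch of long-term forecasting via replication: the forecast issued
    at round t is scored only D rounds later, and is produced by copy t % D.
    expert_preds has shape (T, N)."""
    T, N = expert_preds.shape
    copies = [HedgeCopy(N, eta) for _ in range(horizon_D)]
    forecasts = np.zeros(T)
    for t in range(T):
        if t - horizon_D >= 0:          # outcome of round t - D is revealed now
            past = t - horizon_D        # same copy as the one forecasting at t
            copies[past % horizon_D].update(
                (expert_preds[past] - outcomes[past]) ** 2)
        forecasts[t] = copies[t % horizon_D].weights() @ expert_preds[t]
    return forecasts
```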
Long-Term Sequential Prediction Using Expert Advice
Korotin, Alexander, V'yugin, Vladimir, Burnaev, Evgeny
For the prediction with experts' advice setting, we consider methods for constructing forecasting algorithms that suffer loss not much larger than that of any expert in the pool. In contrast to the standard approach, we investigate the case of long-term forecasting of time series. In this setting, each expert issues a forecast for a time point ahead (or for a time interval), and then the master algorithm combines these forecasts into one aggregated forecast (a sequence of forecasts). We introduce two new approaches to aggregating experts' long-term and interval predictions, both based on Vovk's aggregating algorithm. The first approach applies the Mixing Past Posteriors method to long-term prediction. The second approach is used for interval forecasting and considers overlapping experts. Upper bounds on the regret of these algorithms in the adversarial case are obtained. We also present the results of numerical experiments on long-term prediction of time series.
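A minimal sketch of interval aggregation (not the exact algorithms from the paper): each expert issues a vector forecast for the next interval of length H, the master combines these vectors with exponential weights, and the weights are updated with the loss accumulated over each completed interval. H, the learning rate, and the quadratic loss are illustrative choices.

```python
import numpy as np

def aggregate_interval_forecasts(expert_interval_preds, outcomes, eta=0.05):
    """Sketch: experts predict whole intervals (blocks) of length H and the
    master aggregates them with exponential weights; weights are updated with
    the quadratic loss accumulated over each completed interval.
    expert_interval_preds has shape (num_blocks, N, H)."""
    num_blocks, N, H = expert_interval_preds.shape
    cum = np.zeros(N)
    master = []
    for b in range(num_blocks):
        w = np.exp(-eta * (cum - cum.min()))
        w /= w.sum()
        # one aggregated forecast for the whole upcoming interval
        master.append(np.einsum('n,nh->h', w, expert_interval_preds[b]))
        block_outcomes = outcomes[b * H:(b + 1) * H]
        cum += np.sum((expert_interval_preds[b] - block_outcomes) ** 2, axis=1)
    return np.concatenate(master)
```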
On empirical meaning of randomness with respect to a real parameter
V'yugin, Vladimir
We study the empirical meaning of randomness with respect to a family of probability distributions $P_\theta$, where $\theta$ is a real parameter, using algorithmic randomness theory. In the case when an effectively strongly consistent estimate exists for a computable probability distribution $P_\theta$, we show that Levin's a priori semicomputable semimeasure of the set of all $P_\theta$-random sequences is positive if and only if the parameter $\theta$ is a computable real number. Different methods for generating ``meaningful'' $P_\theta$-random sequences with noncomputable $\theta$ are discussed.