We provide a brief tutorial on the use of concentration inequalities as they apply to system identification of state-space parameters of linear time invariant systems, with a focus on the fully observed setting. We draw upon tools from the theories of large-deviations and self-normalized martingales, and provide both data-dependent and independent bounds on the learning rate. I. INTRODUCTION A key feature in modern reinforcement learning is the ability to provide high-probability guarantees on the finite-data/time behavior of an algorithm acting on a system. The enabling technical tools used in providing such guarantees are concentration of measure results, which should be interpreted as quantitative versions of the strong law of large numbers. This paper provides a brief introduction to such tools, as motivated by the identification of linear-time-invariant (LTI) systems.
Follow the steps below to understand the algorithm - Create duplicate copies of all independent variables. When the number of independent variables in the original data is less than 5, create at least 5 copies using existing variables. Shuffle the values of added duplicate copies to remove their correlations with the target variable. It is called shadow features or permuted copies. Combine the original ones with shuffled copies Run a random forest classifier on the combined dataset and performs a variable importance measure (the default is Mean Decrease Accuracy) to evaluate the importance of each variable where higher means more important.
Aggregate factors (that is, those based on aggregate functions such as SUM, AVERAGE, AND etc.) in probabilistic relational models can compactly represent dependencies among a large number of relational random variables. However, propositional inference on a factor aggregating n k-valued random variables into an r-valued result random variable is O(r k 2n).
We propose to measure statistical dependence between two random variables by the mutual information dimension (MID), and present a scalable parameter-free estimation method for this task. Supported by sound dimension theory, our method gives an effective solution to the problem of detecting interesting relationships of variables in massive data, which is nowadays a heavily studied topic in many scientific disciplines. Different from classical Pearson's correlation coefficient, MID is zero if and only if two random variables are statistically independent and is translation and scaling invariant. We experimentally show superior performance of MID in detecting various types of relationships in the presence of noise data. Moreover, we illustrate that MID can be effectively used for feature selection in regression.
This paper is about the estimation of the maximum expected value of an infinite set of random variables.This estimation problem is relevant in many fields, like the Reinforcement Learning (RL) one.In RL it is well known that, in some stochastic environments, a bias in the estimation error can increase step-by-step the approximation error leading to large overestimates of the true action values. Recently, some approaches have been proposed to reduce such bias in order to get better action-value estimates, but are limited to finite problems.In this paper, we leverage on the recently proposed weighted estimator and on Gaussian process regression to derive a new method that is able to natively handle infinitely many random variables.We show how these techniques can be used to face both continuous state and continuous actions RL problems.To evaluate the effectiveness of the proposed approach we perform empirical comparisons with related approaches.