unknown variance
Outlier-Insensitive Kalman Filtering: Theory and Applications
Truzman, Shunit, Revach, Guy, Shlezinger, Nir, Klein, Itzik
State estimation of dynamical systems from noisy observations is a fundamental task in many applications. It is commonly addressed using the linear Kalman filter (KF), whose performance can significantly degrade in the presence of outliers in the observations, due to the sensitivity of its convex quadratic objective function. To mitigate such behavior, outlier detection algorithms can be applied. In this work, we propose a parameter-free algorithm which mitigates the harmful effect of outliers while requiring only a short iterative process of the standard update step of the KF. To that end, we model each potential outlier as a normal process with unknown variance and apply online estimation through either expectation maximization or alternating maximization algorithms. Simulations and field experiment evaluations demonstrate competitive performance of our method, showcasing its robustness to outliers in filtering scenarios compared to alternative algorithms.
Fused sparsity and robust estimation for linear models with unknown variance
In this paper, we develop a novel approach to the problem of learning sparse representations in the context of fused sparsity and unknown noise level. We propose an algorithm, termed Scaled Fused Dantzig Selector (SFDS), that accomplishes the aforementioned learning task by means of a second-order cone program. A special emphasize is put on the particular instance of fused sparsity corresponding to the learning in presence of outliers. We establish finite sample risk bounds and carry out an experimental evaluation on both synthetic and real data.
Dealing with Unknown Variances in Best-Arm Identification
Jourdan, Marc, Degenne, Rรฉmy, Kaufmann, Emilie
The problem of identifying the best arm among a collection of items having Gaussian rewards distribution is well understood when the variances are known. Despite its practical relevance for many applications, few works studied it for unknown variances. In this paper we introduce and analyze two approaches to deal with unknown variances, either by plugging in the empirical variance or by adapting the transportation costs. In order to calibrate our two stopping rules, we derive new time-uniform concentration inequalities, which are of independent interest. Then, we illustrate the theoretical and empirical performances of our two sampling rule wrappers on Track-and-Stop and on a Top Two algorithm. Moreover, by quantifying the impact on the sample complexity of not knowing the variances, we reveal that it is rather small.
Fused sparsity and robust estimation for linear models with unknown variance
In this paper, we develop a novel approach to the problem of learning sparse representations in the context of fused sparsity and unknown noise level. We propose an algorithm, termed Scaled Fused Dantzig Selector (SFDS), that accomplishes the aforementioned learning task by means of a second-order cone program. A special emphasize is put on the particular instance of fused sparsity corresponding to the learning in presence of outliers. We establish finite sample risk bounds and carry out an experimental evaluation on both synthetic and real data. Papers published at the Neural Information Processing Systems Conference.
A Simple Way to Deal with Cherry-picking
Komiyama, Junpei, Maehara, Takanori
Statistical hypothesis testing serves as statistical evidence for scientific innovation. However, if the reported results are intentionally biased, hypothesis testing no longer controls the rate of false discovery. In particular, we study such selection bias in machine learning models where the reporter is motivated to promote an algorithmic innovation. When the number of possible configurations (e.g., datasets) is large, we show that the reporter can falsely report an innovation even if there is no improvement at all. We propose a `post-reporting' solution to this issue where the bias of the reported results is verified by another set of results. The theoretical findings are supported by experimental results with synthetic and real-world datasets.
Streaming kernel regression with provably adaptive mean, variance, and regularization
Durand, Audrey, Maillard, Odalric-Ambrym, Pineau, Joelle
We consider the problem of streaming kernel regression, when the observations arrive sequentially and the goal is to recover the underlying mean function, assumed to belong to an RKHS. The variance of the noise is not assumed to be known. In this context, we tackle the problem of tuning the regularization parameter adaptively at each time step, while maintaining tight confidence bounds estimates on the value of the mean function at each point. To this end, we first generalize existing results for finite-dimensional linear regression with fixed regularization and known variance to the kernel setup with a regularization parameter allowed to be a measurable function of past observations. Then, using appropriate self-normalized inequalities we build upper and lower bound estimates for the variance, leading to Bersntein-like concentration bounds. The later is used in order to define the adaptive regularization. The bounds resulting from our technique are valid uniformly over all observation points and all time steps, and are compared against the literature with numerical experiments. Finally, the potential of these tools is illustrated by an application to kernelized bandits, where we revisit the Kernel UCB and Kernel Thompson Sampling procedures, and show the benefits of the novel adaptive kernel tuning strategy.
Linear Hypothesis Testing in Dense High-Dimensional Linear Models
We propose a methodology for testing linear hypothesis in high-dimensional linear models. The proposed test does not impose any restriction on the size of the model, i.e. model sparsity or the loading vector representing the hypothesis. Providing asymptotically valid methods for testing general linear functions of the regression parameters in high-dimensions is extremely challenging -- especially without making restrictive or unverifiable assumptions on the number of non-zero elements. We propose to test the moment conditions related to the newly designed restructured regression, where the inputs are transformed and augmented features. These new features incorporate the structure of the null hypothesis directly. The test statistics are constructed in such a way that lack of sparsity in the original model parameter does not present a problem for the theoretical justification of our procedures. We establish asymptotically exact control on Type I error without imposing any sparsity assumptions on model parameter or the vector representing the linear hypothesis. Our method is also shown to achieve certain optimality in detecting deviations from the null hypothesis. We demonstrate the favorable finite-sample performance of the proposed methods, via a number of numerical and a real data example.