Uniform Risk Bounds for Learning with Dependent Data Sequences
arXiv.org Artificial Intelligence
Statistical learning theory offers probabilistic guarantees on the accuracy of models learned from data. Most of these results assume that the data are a realization of a sample of independent and identically distributed (i.i.d.) random variables, which allows one to build the theory upon standard concentration arguments. However, this assumption is often unrealistic, as dependent data are ubiquitous in real-world applications such as signal processing, speech recognition, biological sequence annotation (Baldi and Brunak, 2001), dynamical system identification (Ljung, 1987), or even handwritten character recognition, where the images collected for training come from a string of letters forming a meaningful text. This paper extends several classical results, such as risk bounds based on the Vapnik-Chervonenkis (VC) dimension or the Rademacher complexity, to sequences of dependent data. In particular, we focus on uniform risk bounds, which are better suited to nonconvex loss functions that are difficult to minimize in practice. As a motivating application, we also consider the consequences of these results in the framework of scenario-based optimization for solving uncertain optimization problems. Here, robust solutions are typically those that satisfy an infinite number of constraints, one for each value of the uncertain parameter of the problem.
Mar-21-2023