Adversarial Laws of Large Numbers and Optimal Regret in Online Classification

Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, Eylon Yogev

arXiv.org Machine Learning 

One of the most fundamental tasks in statistics is to bound the sample size sufficient to soundly represent the population, and probabilistic tools are used to derive such guarantees under a variety of assumptions. Virtually all of these guarantees rest on classical probabilistic models which assume that the target population is fixed in advance and does not depend on the sample collected throughout the process. This assumption, that the setting is offline (also called oblivious or static), is not always realistic. In this work we explore an abstract framework which removes this assumption, and we prove that natural and efficient sampling processes produce samples that soundly represent the target population.

Situations where the sampling process explicitly or implicitly affects the target population are abundant in modern data analysis. Consider, for instance, navigation apps that optimize traffic by routing drivers to less congested routes: such apps collect statistics from drivers to estimate the traffic load on each route, and use these estimates to guide their users along faster routes.
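The "natural and efficient sampling processes" referred to here include classical schemes such as reservoir sampling, run against a stream chosen adaptively rather than fixed in advance. The sketch below is a minimal Python illustration of that setting, not the paper's construction: the helper names, the toy `contrarian` adversary, and the parameter values are our assumptions, and this adversary reacts only to the stream prefix, not to the sampler's internal randomness. An adversarial law of large numbers asserts that, with high probability, statistics of the sample (such as the frequency of 1s below) stay close to those of the full stream even under such adaptivity.

```python
import random

def reservoir_sample(stream, k, seed=0):
    """Vitter's Algorithm R: keep a uniform sample of size k from a
    stream whose length is not known in advance."""
    rng = random.Random(seed)
    sample = []
    for t, x in enumerate(stream):
        if len(sample) < k:
            sample.append(x)          # fill the reservoir first
        else:
            j = rng.randrange(t + 1)  # item t survives with prob. k/(t+1)
            if j < k:
                sample[j] = x
    return sample

def adaptive_stream(n, next_item):
    """Yield n items, each chosen by `next_item` after seeing the
    prefix emitted so far -- a (restricted) adaptive adversary."""
    prefix = []
    for _ in range(n):
        x = next_item(prefix)
        prefix.append(x)
        yield x

# Toy adversary (our assumption, not from the paper): emits whichever
# bit pushes the running frequency of 1s back toward 1/2.
def contrarian(prefix):
    return 0 if 2 * sum(prefix) > len(prefix) else 1

n, k = 100_000, 1_000
stream = list(adaptive_stream(n, contrarian))
sample = reservoir_sample(stream, k)

print(f"frequency of 1s in stream: {sum(stream) / n:.4f}")
print(f"frequency of 1s in sample: {sum(sample) / k:.4f}")
```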
