near-optimal density estimation
Near-Optimal Density Estimation in Near-Linear Time Using Variable-Width Histograms
Let p be an unknown and arbitrary probability distribution over [0,1) . We consider the problem of \emph{density estimation}, in which a learning algorithm is given i.i.d. The main contribution of this paper is a highly efficient density estimation algorithm for learning using a variable-width histogram, i.e., a hypothesis distribution with a piecewise constant probability density function. In more detail, for any k and \eps, we give an algorithm that makes \tilde{O}(k/\eps 2) draws from p, runs in \tilde{O}(k/\eps 2) time, and outputs a hypothesis distribution h that is piecewise constant with O(k \log 2(1/\eps)) pieces. With high probability the hypothesis h satisfies \dtv(p,h) \leq C \cdot \opt_k(p) \eps, where \dtv denotes the total variation distance (statistical distance), C is a universal constant, and \opt_k(p) is the smallest total variation distance between p and any k -piecewise constant distribution.
Near-Optimal Density Estimation in Near-Linear Time Using Variable-Width Histograms
Chan, Siu On, Diakonikolas, Ilias, Servedio, Rocco A., Sun, Xiaorui
Let $p$ be an unknown and arbitrary probability distribution over $[0,1)$. We consider the problem of \emph{density estimation}, in which a learning algorithm is given i.i.d. The main contribution of this paper is a highly efficient density estimation algorithm for learning using a variable-width histogram, i.e., a hypothesis distribution with a piecewise constant probability density function. In more detail, for any $k$ and $\eps$, we give an algorithm that makes $\tilde{O}(k/\eps 2)$ draws from $p$, runs in $\tilde{O}(k/\eps 2)$ time, and outputs a hypothesis distribution $h$ that is piecewise constant with $O(k \log 2(1/\eps))$ pieces. With high probability the hypothesis $h$ satisfies $\dtv(p,h) \leq C \cdot \opt_k(p) \eps$, where $\dtv$ denotes the total variation distance (statistical distance), $C$ is a universal constant, and $\opt_k(p)$ is the smallest total variation distance between $p$ and any $k$-piecewise constant distribution.