Near-Optimal Coresets of Kernel Density Estimates
Phillips, Jeff M., Tai, Wai Ming
We construct near-optimal coresets for kernel density estimate for points in $\mathbb{R^d}$ when the kernel is positive definite. Specifically we show a polynomial time construction for a coreset of size $O(\sqrt{d\log (1/\epsilon)}/\epsilon)$, and we show a near-matching lower bound of size $\Omega(\sqrt{d}/\epsilon)$. The upper bound is a polynomial in $1/\epsilon$ improvement when $d \in [3,1/\epsilon^2)$ (for all kernels except the Gaussian kernel which had a previous upper bound of $O((1/\epsilon) \log^d (1/\epsilon))$) and the lower bound is the first known lower bound to depend on $d$ for this problem. Moreover, the upper bound restriction that the kernel is positive definite is significant in that it applies to a wide-variety of kernels, specifically those most important for machine learning. This includes kernels for information distances and the sinc kernel which can be negative.
Mar-14-2018
- Country:
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- Genre:
- Research Report (0.64)
- Technology: