Minimax Optimal Estimation of KL Divergence for Continuous Distributions
Estimating Kullback-Leibler divergence from independent and identically distributed samples is an important problem in various domains. One simple and effective estimator is based on the k nearest neighbor (kNN) distances between these samples. In this paper, we analyze the convergence rates of the bias and variance of this estimator. Furthermore, we derive a lower bound on the minimax mean square error and show that the kNN method is asymptotically rate optimal.

I. INTRODUCTION

Kullback-Leibler (KL) divergence has a broad range of applications in information theory, statistics and machine learning. For example, KL divergence can be used in hypothesis testing [1], text classification [2], outlying sequence detection [3], multimedia classification [4], speech recognition [5], etc. In many applications, we want to know the value of the KL divergence, but the distributions are unknown. Therefore, it is important to estimate KL divergence based only on independent and identically distributed (i.i.d.) samples. This problem has been widely studied [6-13].

The estimation method depends on whether the underlying distributions are discrete or continuous. For discrete distributions, an intuitive method is the plug-in estimator, which first estimates the probability mass function (PMF) by counting the number of occurrences of each possible value and then calculates the KL divergence from the estimated PMF. However, since it is always possible that the number of occurrences at some locations is zero, this method has infinite bias and variance for any sample size, however large. As a result, it is necessary to design new estimators for which both the bias and the variance converge to zero. Several such methods have been proposed in [11-13].
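The failure mode of the counting-based estimator for discrete distributions can be illustrated with a minimal Python sketch. The function name `plugin_kl` and the choice to return infinity when a symbol seen under P never appears in the sample from Q are assumptions made for this illustration; they are not taken from the paper or from [11-13].

```python
import math
from collections import Counter


def plugin_kl(x_samples, y_samples):
    """Counting-based estimate of D(P||Q) from x_samples ~ P and y_samples ~ Q.

    Hypothetical helper for illustration: the PMFs are estimated by
    relative frequencies and substituted into the KL divergence formula.
    """
    n, m = len(x_samples), len(y_samples)
    p_counts = Counter(x_samples)
    q_counts = Counter(y_samples)

    total = 0.0
    for symbol, count in p_counts.items():
        p_hat = count / n
        q_hat = q_counts.get(symbol, 0) / m
        if q_hat == 0.0:
            # A symbol observed under P but never under Q makes the
            # estimate infinite, no matter how large the samples are.
            return math.inf
        total += p_hat * math.log(p_hat / q_hat)
    return total


# Both samples cover the same symbols: the estimate is finite.
finite_estimate = plugin_kl(["a", "a", "b", "b"], ["a", "b", "b", "b"])

# The symbol "c" never occurs in the second sample: the estimate is infinite.
infinite_estimate = plugin_kl(["a", "c"], ["a", "b"])
```

Because the second case occurs with positive probability at every finite sample size, the expectation of this estimate is infinite, which is the reason modified estimators are needed.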
Feb-26-2020