Tight Bounds on Profile Redundancy and Distinguishability
Jayadev Acharya, Hirakendu Das, Alon Orlitsky
The minimax KL-divergence of any distribution from all distributions in a collection P has several practical implications. In compression, it is called redundancy and represents the least additional number of bits over the entropy needed to encode the output of any distribution in P. In online estimation and learning, it is the lowest expected log-loss regret when guessing a sequence of random values generated by a distribution in P. In hypothesis testing, it upper bounds the largest number of distinguishable distributions in P. Motivated by problems ranging from population estimation to text classification and speech recognition, several machine-learning and information-theory researchers have recently considered label-invariant observations and properties induced by i.i.d.
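For reference, a standard formulation of the minimax quantity described above is sketched below; the notation (a collection P of distributions over a common alphabet, a worst-case source p, and a single approximating distribution q) is generic and not taken verbatim from the paper.

\[
\bar{R}(\mathcal{P}) \;=\; \min_{q}\, \max_{p \in \mathcal{P}}\, D(p \,\|\, q)
\;=\; \min_{q}\, \max_{p \in \mathcal{P}}\, \sum_{x} p(x) \log \frac{p(x)}{q(x)}
\]

Read left to right: the best single coding or estimating distribution q is pitted against the worst distribution p in P, and the resulting KL divergence is the redundancy (in compression), the minimax log-loss regret (in online estimation), and an upper bound on the log of the number of distinguishable distributions (in hypothesis testing).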
Neural Information Processing Systems
Dec-31-2012