Classification via Minimum Incremental Coding Length (MICL)

John Wright, Yangyu Tao, Zhouchen Lin, Yi Ma, Heung-Yeung Shum

Neural Information Processing Systems 

We present a simple new criterion for classification, based on principles from lossy data compression. The criterion assigns a test sample to the class that requires the minimum number of additional bits to code the test sample, subject to an allowable distortion. We prove asymptotic optimality of this criterion for Gaussian data and analyze its relationships to classical classifiers. The theoretical results provide new insights into the relationships among popular classifiers such as MAP and RDA, as well as unsupervised clustering methods based on lossy compression [13]. Minimizing the lossy coding length induces a regularization effect that stabilizes the (implicit) density estimate in a small-sample setting. Compression also provides a uniform means of handling classes of varying dimension. This simple classification criterion and its kernel and local versions perform competitively against existing classifiers on both synthetic examples and real imagery data such as handwritten digits and human faces, without requiring domain-specific information.
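The decision rule described above can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: it assumes the Gaussian lossy coding length estimate from the cited compression-based clustering work [13], with the distortion parameter `eps`, the helper names, and the toy two-class data all chosen here for illustration.

```python
import numpy as np

def coding_length(X, eps=1.0):
    """Approximate number of bits to code the columns of X (d x m)
    up to mean-squared distortion eps^2, using the Gaussian
    lossy coding length estimate from the cited clustering work [13]."""
    d, m = X.shape
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu
    # Bits for the centered data: ((m + d)/2) * log2 det(I + d/(eps^2 m) Xc Xc^T)
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (eps**2 * m)) * (Xc @ Xc.T))
    bits_data = (m + d) / 2.0 * logdet / np.log(2.0)
    # Bits for the class mean.
    bits_mean = d / 2.0 * np.log2(1.0 + float(mu.T @ mu) / eps**2)
    return bits_data + bits_mean

def micl_classify(x, classes, eps=1.0):
    """Assign x to the class whose coding length grows the least when x
    is appended, discounted by the bits needed to code the class label
    (the empirical log-prior)."""
    total = sum(Xj.shape[1] for Xj in classes.values())
    best_label, best_cost = None, np.inf
    for label, Xj in classes.items():
        delta = (coding_length(np.column_stack([Xj, x]), eps)
                 - coding_length(Xj, eps))
        cost = delta - np.log2(Xj.shape[1] / total)  # label-coding term
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label

# Hypothetical demo: two well-separated 2-D Gaussian classes.
rng = np.random.default_rng(0)
train = {"A": rng.normal(0.0, 1.0, size=(2, 50)),
         "B": rng.normal(6.0, 1.0, size=(2, 50))}
label = micl_classify(np.array([5.8, 6.2]), train)
```

A test point near the second cluster incurs a small coding-length increment for that class and a large one for the other, so it is assigned accordingly; this is the regularized, dimension-aware behavior the abstract attributes to the criterion.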
