Using Vocabulary Knowledge in Bayesian Multinomial Estimation

Griffiths, Thomas L., Tenenbaum, Joshua B.

Neural Information Processing Systems 

Recent approaches have used uncertainty over the vocabulary of symbols in a multinomial distribution as a means of accounting for sparsity. We present a Bayesian approach that allows weak prior knowledge, in the form of a small set of approximate candidate vocabularies, to be used to dramatically improve the resulting estimates. We demonstrate these improvements in applications to text compression and estimating distributions over words in newsgroup data.
