Optimal Confidence Regions for the Multinomial Parameter

Malloy, Matthew L., Tripathy, Ardhendu, Nowak, Robert D.

arXiv.org Machine Learning 

A confidence region for p is a subset of the k -simplex that depends on null p, and includes the unknown true distribution p with a specified confidence. More precisely, C δ( null p) k is a confidence region at confidence level 1 δ if P p (p null C δ( null p)) δ (1) holds for all p k, where k denotes the k -simplex, and P p(·) is the multinomial probability measure under p . Construction of tight confidence regions for categorical distributions is a long standing problem dating back nearly a hundred years [1]. The goal is to construct regions that are as small as possible, but still satisfy (1). Broadly speaking, approaches for constructing confidence regions can be classified into i) approximate methods that fail to guarantee coverage (i.e, (1) fails to hold for all p) and ii) methods that succeed in guaranteeing coverage, but have excessive volume - for example, approaches based on Sanov or Hoeffding-Bernstein type inequalities. Recent approaches based on combinations of methods [2] have shown improvement through numerical experiment, but do not provide theoretical guarantees on the volume of the confidence regions. To the best of our knowledge, construction of confidence regions for the multinomial parameter that have minimal volume and guarantee coverage is an open problem. One construction that has shown promise empirically is the level-set approach of [3]. The level set confidence regions are similar to'exact' and Clopper-Pearson 1 regions [1] as they involve inverting tail Authors are with the Electrical & Computer Engineering Department at University of Wisconsin-Madison.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found