$k$-Variance: A Clustered Notion of Variance
Solomon, Justin, Greenewald, Kristjan, Nagaraja, Haikady N.
We introduce $k$-variance, a generalization of variance built on the machinery of random bipartite matchings. $k$-variance measures the expected cost of matching two sets of $k$ samples from a distribution to each other, capturing local rather than global information about a measure as $k$ increases; it is easily approximated stochastically using sampling and linear programming. In addition to defining $k$-variance and proving its basic properties, we provide in-depth analysis of this quantity in several key cases, including one-dimensional measures, clustered measures, and measures concentrated on low-dimensional subsets of $\mathbb R^n$. We conclude with experiments and open problems motivated by this new way to summarize distributional shape.
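The sampling-based approximation described in the abstract can be illustrated with a minimal sketch. It is restricted to one-dimensional samples, where the optimal matching under a squared cost pairs the sorted samples, so no linear programming solver is needed. The squared cost, the averaging over the $k$ pairs, and the function names are all assumptions made for illustration; the paper's exact exponent and normalization constant are not reproduced here.

```python
import random

def matching_cost_1d(xs, ys):
    """Mean squared cost of the optimal matching between two equal-size 1D samples.

    In one dimension with a convex cost, matching the i-th smallest x to the
    i-th smallest y is optimal, so sorting replaces a general assignment solver.
    """
    xs, ys = sorted(xs), sorted(ys)
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def estimate_expected_matching_cost(sampler, k, trials=2000, seed=0):
    """Monte Carlo estimate of the expected matching cost between two k-samples.

    `sampler` is a hypothetical callable taking a random.Random instance and
    returning one draw from the distribution of interest.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [sampler(rng) for _ in range(k)]
        ys = [sampler(rng) for _ in range(k)]
        total += matching_cost_1d(xs, ys)
    return total / trials

def standard_normal(rng):
    """Example sampler: one draw from a standard normal distribution."""
    return rng.gauss(0.0, 1.0)
```

With $k = 1$ and this unnormalized squared cost, the estimate approaches $\mathbb E|X - Y|^2 = 2\,\mathrm{Var}(X)$, which is the sense in which the quantity generalizes variance. In higher dimensions the sorting step would be replaced by an assignment solver such as SciPy's `linear_sum_assignment`.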
Dec-12-2020
- Country:
- North America > United States
- Massachusetts > Middlesex County
- Cambridge (0.14)
- Ohio > Franklin County
- Columbus (0.04)
- Genre:
- Research Report (0.64)