$k$-Variance: A Clustered Notion of Variance

Solomon, Justin, Greenewald, Kristjan, Nagaraja, Haikady N.

Dec-12-2020–arXiv.org Machine Learning

We introduce $k$-variance, a generalization of variance built on the machinery of random bipartite matchings. $k$-variance measures the expected cost of matching two sets of $k$ samples from a distribution to each other, capturing local rather than global information about a measure as $k$ increases; it is easily approximated stochastically using sampling and linear programming. In addition to defining $k$-variance and proving its basic properties, we provide in-depth analysis of this quantity in several key cases, including one-dimensional measures, clustered measures, and measures concentrated on low-dimensional subsets of $\mathbb R^n$. We conclude with experiments and open problems motivated by this new way to summarize distributional shape.
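
The stochastic approximation mentioned in the abstract can be illustrated with a minimal sketch for the one-dimensional case. It rests on several assumptions not stated in the abstract: the matching cost is taken to be squared distance, the result is left unnormalized (the paper's exact normalization is not given here), and the function names `matching_cost_1d` and `estimate_expected_matching_cost` are hypothetical. In one dimension the optimal matching for a convex cost pairs the sorted samples in order, so sorting stands in for the linear program; in higher dimensions an assignment or LP solver would be needed instead.

```python
import random


def matching_cost_1d(xs, ys):
    """Mean squared distance under the optimal one-to-one matching
    of two equal-size 1-D samples (assumed cost: squared distance).

    In 1-D, pairing the sorted samples in order is optimal for any
    convex cost, so no LP solver is required here.
    """
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same size k")
    paired = zip(sorted(xs), sorted(ys))
    return sum((a - b) ** 2 for a, b in paired) / len(xs)


def estimate_expected_matching_cost(sampler, k, trials=200, seed=0):
    """Monte Carlo estimate of the expected matching cost between
    two independent sets of k samples drawn with `sampler(rng)`.

    This is an unnormalized stand-in for k-variance, not the
    paper's exact definition.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [sampler(rng) for _ in range(k)]
        ys = [sampler(rng) for _ in range(k)]
        total += matching_cost_1d(xs, ys)
    return total / trials


if __name__ == "__main__":
    def gauss(rng):
        # Standard normal sampler (hypothetical helper for the demo).
        return rng.gauss(0.0, 1.0)

    for k in (1, 5, 50):
        print(k, estimate_expected_matching_cost(gauss, k, trials=2000))
```

With $k = 1$ the estimate reduces to the expected squared difference of two independent draws, which is twice the ordinary variance; as $k$ grows, matched points lie closer together, so the cost increasingly reflects local structure.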

empirical measure, experiment, variance

Country:
- North America > United States
  - Ohio > Franklin County
    - Columbus (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.14)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (0.68)
  - Representation & Reasoning > Optimization (0.48)
