Testing Properties of Multiple Distributions with Few Samples

Nov-17-2019–arXiv.org Machine Learning

Statistical tests are a crucial tool in scientific endeavors to analyze data: we routinely model data as a set of samples from an unknown distribution, and use statistical tests to infer or verify the properties of the underlying distribution. While these tests typically operate under the assumption that data points are drawn from a single underlying distribution, in applications the data is usually gathered from multiple sources. Furthermore, in many situations the dataset contains only a few data points from each source. For example, an online shop may have the purchase history of thousands of customers, while each customer may shop at the store only a small number of times. Alternatively, a medical dataset might record the lifestyle behaviors of patients with a particular disease while having only a few data points from any specific demographic (such as age). On the other hand, data that comes from multiple sources may result in a dataset consisting of a collection of unconnected and unrelated data points.
