Active Distribution Learning from Indirect Samples

Gupta, Samarth, Joshi, Gauri, Yağan, Osman

Aug-15-2018–arXiv.org Machine Learning

This paper studies the problem of learning the probability distribution $P_X$ of a discrete random variable $X$ from indirect, sequential samples. At each time step, we choose one of $K$ possible functions $g_1, \ldots, g_K$ and observe the corresponding sample $g_i(X)$. The goal is to estimate the probability distribution of $X$ using as few such samples as possible. This problem has several real-world applications, including inference under imprecise information and privacy-preserving statistical estimation. We establish necessary and sufficient conditions on the functions $g_1, \ldots, g_K$ under which asymptotically consistent estimation is possible. We also derive lower bounds on the estimation error as a function of the total number of samples and show that they are order-wise achievable. Leveraging these results, we propose an iterative algorithm that (i) chooses the function to observe at each step based on past observations and (ii) combines the obtained samples to estimate $P_X$. The performance of this algorithm is investigated numerically under various scenarios and is shown to outperform baseline approaches.
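The setup in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a simple linear-algebra view in which each $g_i$ maps $P_X$ linearly to the distribution of $g_i(X)$, uses full column rank of the stacked 0/1 matrices as a stand-in identifiability check (the paper's exact conditions are not given in this listing), and samples every function equally often instead of choosing adaptively. All names (`stacked_matrix`, `rank`, `solve`, `estimate`) and the example functions are hypothetical.

```python
import random
from fractions import Fraction

def stacked_matrix(gs, n):
    """One 0/1 row per (function, output value); entry is 1 when g(x) equals that output."""
    rows = []
    for g in gs:
        for out in sorted(set(g)):
            rows.append([1 if g[x] == out else 0 for x in range(n)])
    return rows

def rank(rows):
    """Matrix rank via exact Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(v) for v in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def solve(M, b):
    """Solve the square system M z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(v) for v in M[i]] + [float(b[i])] for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for i in range(n):
            if i != c:
                f = aug[i][c] / aug[c][c]
                aug[i] = [a - f * v for a, v in zip(aug[i], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def estimate(p_true, gs, samples_per_function, seed=0):
    """Non-adaptive baseline (an assumption of this sketch): sample each g equally,
    then fit P_X by least squares on the empirical output frequencies."""
    rng = random.Random(seed)
    n = len(p_true)
    A = stacked_matrix(gs, n)
    q_hat = []
    for g in gs:
        xs = rng.choices(range(n), weights=p_true, k=samples_per_function)
        ys = [g[x] for x in xs]  # only g(X) is observed, never X itself
        for out in sorted(set(g)):
            q_hat.append(ys.count(out) / samples_per_function)
    # Normal equations (A^T A) p = A^T q_hat; requires A to have full column rank.
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    Atq = [sum(row[i] * q for row, q in zip(A, q_hat)) for i in range(n)]
    p = [max(v, 0.0) for v in solve(AtA, Atq)]
    total = sum(p)
    return [v / total for v in p]  # clip and renormalize onto the simplex

# Hypothetical example: X takes 4 values, each g is listed as [g(0), g(1), g(2), g(3)].
g1, g2, g3 = [0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0]
p_true = [0.1, 0.2, 0.3, 0.4]
print(rank(stacked_matrix([g1, g2], 4)))      # rank below 4: P_X not pinned down
print(rank(stacked_matrix([g1, g2, g3], 4)))  # full column rank
print(estimate(p_true, [g1, g2, g3], 20000))
```

In this toy case two functions leave one direction of $P_X$ undetermined, while adding the third makes the stacked system full rank, so the least-squares fit can recover $P_X$ up to sampling noise. An adaptive scheme of the kind the abstract describes would replace the equal sample split with a choice based on past observations.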

artificial intelligence, data mining, machine learning

Country:
- North America > United States
  - Pennsylvania > Allegheny County
    - Pittsburgh (0.14)
  - New Jersey > Mercer County
    - Princeton (0.04)
- Asia > India
  - West Bengal > Kolkata (0.04)

Genre:
- Research Report (0.70)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.67)
  - Artificial Intelligence > Machine Learning
    - Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
