Learning from aggregated data with a maximum entropy model

Gilotte, Alexandre; Yahmed, Ahmed Ben; Rohde, David

Oct-5-2022, arXiv.org Artificial Intelligence

Aggregating a dataset and then injecting some noise is a simple and common way to release differentially private data. However, aggregated data, even without noise, is not an appropriate input for machine learning classifiers. In this work, we show how a new model, similar to a logistic regression, may be learned from aggregated data only, by approximating the unobserved feature distribution with a maximum entropy hypothesis. The resulting model is a Markov Random Field (MRF), and we detail how to apply, modify and scale an MRF training algorithm to our setting. Finally, we present empirical evidence on several public datasets that the model learned this way can achieve performance comparable to that of a logistic model trained with the full unaggregated data.
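
The "aggregate, then inject noise" release described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the aggregation scheme (one count per feature value and label), the use of the Laplace mechanism, and the names `aggregate`, `laplace_noise`, and `noisy_release` are all assumptions made for illustration.

```python
import random
from collections import Counter


def aggregate(rows, labels):
    """Count records per (feature_index, feature_value, label) cell.

    Assumed aggregation scheme for illustration only.
    """
    counts = Counter()
    for features, label in zip(rows, labels):
        for j, value in enumerate(features):
            counts[(j, value, label)] += 1
    return counts


def laplace_noise(scale, rng):
    # The difference of two i.i.d. Exp(1) draws, multiplied by `scale`,
    # follows a Laplace(0, scale) distribution.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_release(counts, keys, epsilon, sensitivity, rng):
    """Add Laplace noise of scale sensitivity/epsilon to every cell in `keys`.

    `keys` must be a fixed list of cells chosen independently of the data
    (including cells whose true count is zero); otherwise the set of
    released cells would itself reveal information.
    """
    scale = sensitivity / epsilon
    return {key: counts.get(key, 0) + laplace_noise(scale, rng) for key in keys}


# Toy data: two categorical features and a binary label (hypothetical values).
rows = [("a", "x"), ("a", "y"), ("b", "x")]
labels = [1, 0, 1]
domain = [("a", "b"), ("x", "y")]
keys = [(j, v, y) for j, values in enumerate(domain) for v in values for y in (0, 1)]

counts = aggregate(rows, labels)
# Adding or removing one record changes exactly len(domain) cells by 1,
# so the L1 sensitivity of this table is the number of features.
noisy = noisy_release(counts, keys, epsilon=1.0, sensitivity=len(domain),
                      rng=random.Random(0))
```

The released table only contains per-cell counts, so the joint distribution of the features is not observed; this is the gap that the abstract proposes to fill with a maximum entropy hypothesis.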

artificial intelligence, dataset, machine learning

Country:
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - France > Île-de-France
    - Paris > Paris (0.04)
- Asia
  - Middle East > Jordan (0.04)
  - China > Jiangsu Province
    - Nanjing (0.04)

Genre:
- Research Report > New Finding (0.50)

Industry:
- Information Technology > Security & Privacy (0.88)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning > Maximum Entropy (0.61)
  - Learning Graphical Models > Directed Networks
    - Bayesian Learning (0.95)