Distributed, communication-efficient, and differentially private estimation of KL divergence
Mary Scott, Sayan Biswas, Graham Cormode, Carsten Maple
arXiv.org Artificial Intelligence
Modern applications in data analysis and machine learning work with high-dimensional data to support inferences and provide recommendations [1, 2]. Increasingly, the data for these tasks comes from individuals who hold it on personal devices such as smartphones and wearables. In the federated model of computation [3, 4], this data remains on the users' devices, which collaborate to build accurate models by performing computations and aggregations on their locally held information (e.g., training and fine-tuning small-scale models). A key primitive needed is the ability to compare the distribution of data held by these clients with a reference distribution. For instance, a platform or service provider may wish to know whether the overall behavior of the data is consistent over time, so that the best-fitting and most relevant model can be deployed. If the data distribution has changed, it may be necessary to trigger model rebuilding or fine-tuning; if there is no change, the current model can continue to be used.
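The distribution-comparison primitive described above can be illustrated with a minimal sketch: compute the KL divergence between an aggregated client histogram and a reference distribution, and flag drift when it exceeds a threshold. All names, values, and the threshold here are illustrative assumptions, not the paper's protocol (which additionally involves communication-efficient and differentially private aggregation).

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as aligned probability lists.

    Terms with p_i = 0 contribute 0 by the usual convention; q is assumed
    to have full support over the bins where p is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Reference distribution the deployed model was built against (assumed uniform here).
reference = [0.25, 0.25, 0.25, 0.25]

# Aggregated histogram of current client data (hypothetical counts).
counts = [30, 20, 40, 10]
total = sum(counts)
current = [c / total for c in counts]

drift = kl_divergence(current, reference)

# Trigger model rebuilding when divergence exceeds a chosen threshold (an assumption;
# in practice the threshold would be calibrated to the application).
THRESHOLD = 0.05
needs_rebuild = drift > THRESHOLD
```

In a federated deployment, the `counts` would be assembled from per-client contributions under the privacy and communication constraints the paper studies, rather than observed directly by the server.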
Nov-28-2024
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Technology:
- Information Technology
- Artificial Intelligence > Machine Learning (1.00)
- Communications (0.87)
- Data Science (1.00)
- Security & Privacy (1.00)