A Privacy-Friendly Approach to Data Valuation

Jan-19-2025, 20:47:28 GMT–Neural Information Processing Systems

Data valuation, a growing field that aims to quantify the usefulness of individual data sources for training machine learning (ML) models, faces notable yet often overlooked privacy challenges. We first emphasize the inherent privacy risks of KNN-Shapley and demonstrate the significant technical challenges in adapting it to accommodate differential privacy (DP). To overcome these challenges, we introduce TKNN-Shapley, a refined, privacy-friendly variant of KNN-Shapley that allows straightforward modifications to incorporate a DP guarantee (DP-TKNN-Shapley). We show that DP-TKNN-Shapley has several advantages and offers a superior privacy-utility tradeoff compared to naively privatized KNN-Shapley. Moreover, even non-private TKNN-Shapley matches KNN-Shapley's performance in discerning data quality.
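To make the notion of a data value concrete, here is a minimal sketch of a Shapley value computed under a KNN utility, the general idea behind KNN-Shapley. It is a brute-force illustration on a toy one-dimensional dataset, not the paper's algorithm: the function names, the toy data, and the choice of utility (the fraction of the k nearest neighbours whose label matches the test label) are assumptions made for this example, and it includes neither the TKNN variant nor any DP mechanism.

```python
from itertools import permutations
from math import factorial


def knn_utility(subset, train, test_point, k=1):
    """Assumed utility: fraction of the k nearest points in `subset`
    whose label matches the test label (0 for an empty subset)."""
    x_test, y_test = test_point
    if not subset:
        return 0.0
    # Sort the selected training indices by distance to the test input.
    nearest = sorted(subset, key=lambda i: abs(train[i][0] - x_test))[:k]
    return sum(train[i][1] == y_test for i in nearest) / k


def shapley_values(train, test_point, k=1):
    """Exact Shapley values by enumerating every ordering of the data.
    Cost grows as n!, so this is only usable for a handful of points."""
    n = len(train)
    values = [0.0] * n
    for perm in permutations(range(n)):
        chosen = []
        prev = 0.0
        for i in perm:
            chosen.append(i)
            cur = knn_utility(chosen, train, test_point, k)
            values[i] += cur - prev  # marginal contribution of point i
            prev = cur
    return [v / factorial(n) for v in values]


# Hypothetical toy data: (feature, label) pairs and one test point.
train = [(0.0, 1), (1.0, 0), (5.0, 1)]
test_point = (0.2, 1)
values = shapley_values(train, test_point, k=1)
print(values)
```

In this toy setup, the nearby training point with the wrong label receives a negative value, while the closest correctly labelled point receives the largest one. Each value depends directly on individual neighbours of the test point, which gives some intuition for why releasing such scores can leak information about the underlying data.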

data valuation, knn-shapley, privacy-friendly approach

Conferences Web Page

Genre:
- Research Report (0.43)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.83)