Towards Data Valuation via Asymmetric Data Shapley
Zheng, Xi, Chang, Xiangyu, Jia, Ruoxi, Tan, Yong
arXiv.org Artificial Intelligence
Data valuation, which measures the contribution of individual data sources to machine learning (ML) model performance, plays a crucial role in improving algorithmic transparency and in creating incentive mechanisms for data sharing and monetization (Liu et al., 2023). Its importance is particularly evident in sectors such as healthcare and finance, where explainable ML is increasingly being adopted for high-stakes decision-making (Sahoh and Choksuriwong, 2023). The recent rise of data marketplaces further highlights the need for accurate data valuation (Ghorbani and Zou, 2019; Jia et al., 2019a). By integrating diverse data sources, these marketplaces enhance ML tasks and unlock significant business value (Agarwal et al., 2019). In such contexts, data creators should be compensated fairly according to the value of their data, which makes equitable data valuation a key issue (Altman, 2023). Data Shapley has recently gained widespread recognition as a way to quantify the contribution of individual data points to ML models (Ghorbani and Zou, 2019; Jia et al., 2019b). It is uniquely defined by four axioms (see Axioms 2.1-2.4 in Section 2).
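To make the idea of Data Shapley concrete, the following is a minimal sketch of the standard (symmetric) Shapley computation over data points, not the asymmetric variant this paper proposes. It assumes the textbook definition, in which a point's value is its marginal contribution to a utility function averaged over all subsets of the remaining points. The names `data_shapley` and `toy_utility`, and the utility numbers, are hypothetical and chosen only for illustration; a real utility would be something like the validation accuracy of a model trained on the subset.

```python
from itertools import combinations
from math import comb


def data_shapley(points, utility):
    """Exact Shapley value of each data point under a set-valued utility.

    Enumerates every subset, so it is only feasible for a handful of points.
    """
    n = len(points)
    values = {}
    for i in points:
        others = [p for p in points if p != i]
        total = 0.0
        for k in range(n):  # k = size of a subset S that excludes point i
            weight = 1.0 / (n * comb(n - 1, k))
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of i when added to S.
                total += weight * (utility(s | {i}) - utility(s))
        values[i] = total
    return values


def toy_utility(subset):
    """Hypothetical utility: 'a' and 'b' are redundant copies, 'c' is distinct."""
    score = 0.0
    if "a" in subset or "b" in subset:
        score += 0.6
    if "c" in subset:
        score += 0.4
    return score


shapley_values = data_shapley(["a", "b", "c"], toy_utility)
```

In this toy setting the redundant points `a` and `b` each receive 0.3 and `c` receives 0.4 (up to floating-point rounding), and the three values sum to the utility of the full dataset, which is the efficiency property usually listed among the Shapley axioms.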
Nov-20-2024