TDDBench: A Benchmark for Training data detection

Nov-5-2024–arXiv.org Artificial Intelligence

Metric-based methods analyze statistical properties of a target model's output, such as confidence scores, prediction probabilities, or loss values, to distinguish training data from non-training data. Specifically, Metric-loss (Yeom et al., 2018) is the first metric-based detection method; it predicts that data points with a loss below a certain threshold are part of the target model's training data. Similarly, other works have proposed using the maximum confidence of the target model output (denoted as Metric-conf (Song et al., 2019)), the correctness of the target model output (denoted as Metric-corr (Leino & Fredrikson, 2020)), the entropy of the prediction probability distribution (denoted as Metric-ent (Shokri et al., 2017; Song & Mittal, 2021)), and the modified entropy of the prediction (denoted as Metric-ment (Song & Mittal, 2021)).

Learning-based methods train an auxiliary classifier (meta-classifier) to distinguish training data from non-training data. In the literature, neural networks (NNs) are often employed as the auxiliary classifier. Learning-based TDD methods differ primarily in the input features they give the auxiliary classifier. Earlier work (Shokri et al., 2017) proposed using the original prediction vector of the target model (denoted as Learn-original). Other works have suggested using the top-3 prediction confidences (denoted as Learn-top3 (Salem et al., 2019)), the sorted prediction vector (denoted as Learn-sorted (Salem et al., 2019)), the true label of the example combined with the prediction vector (denoted as Learn-label
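
The scores and input features named above can be illustrated with a minimal Python sketch. It is not the benchmark's implementation: the function names, the `EPS` floor, the 0.5 threshold, and the example probability vectors are all assumptions made for illustration. The modified-entropy function follows the commonly cited form from Song & Mittal (2021); the excerpt does not specify how a threshold is chosen, so the sketch only shows how one would be applied.

```python
import math

EPS = 1e-12  # assumed numerical floor to avoid log(0)

def metric_loss(probs, label):
    """Cross-entropy loss of one prediction vector; low loss suggests training data."""
    return -math.log(max(probs[label], EPS))

def metric_conf(probs):
    """Maximum confidence of the target model output."""
    return max(probs)

def metric_corr(probs, label):
    """1.0 if the predicted class matches the true label, else 0.0."""
    return 1.0 if probs.index(max(probs)) == label else 0.0

def metric_ent(probs):
    """Shannon entropy of the prediction probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def metric_ment(probs, label):
    """Modified entropy: weights the true class and the other classes differently."""
    total = 0.0
    for i, p in enumerate(probs):
        p = min(max(p, EPS), 1.0 - EPS)
        if i == label:
            total -= (1.0 - p) * math.log(p)
        else:
            total -= p * math.log(1.0 - p)
    return total

def is_member(score, threshold, lower_is_member=True):
    """Apply a threshold; the direction depends on the score (assumed convention)."""
    return score <= threshold if lower_is_member else score >= threshold

def learn_features(probs, kind):
    """Build the meta-classifier input for three of the feature choices."""
    ranked = sorted(probs, reverse=True)
    if kind == "original":
        return list(probs)
    if kind == "sorted":
        return ranked
    if kind == "top3":
        return ranked[:3]
    raise ValueError(f"unknown feature kind: {kind}")

# Hypothetical prediction vectors: one confident, one uncertain.
confident = [0.85, 0.05, 0.05, 0.05]
uncertain = [0.40, 0.30, 0.20, 0.10]

print(is_member(metric_loss(confident, 0), threshold=0.5))
print(is_member(metric_loss(uncertain, 1), threshold=0.5))
print(learn_features(uncertain, "top3"))
```

Loss, entropy, and modified entropy are thresholded from below in this sketch (a lower score suggests a training example), while confidence and correctness would be thresholded from above by passing `lower_is_member=False`.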



