CAP: Detecting Unauthorized Data Usage in Generative Models via Prompt Generation

Gallo, Daniela; Liguori, Angelica; Ritacco, Ettore; Caviglione, Luca; Durante, Fabrizio; Manco, Giuseppe

arXiv.org Artificial Intelligence 

The success of modern Machine Learning (ML) systems depends on the quality and quantity of the data used for training, which directly influence model performance and generalization capabilities. High-quality, diverse, and representative datasets are therefore essential for accurate and unbiased predictions. Conversely, insufficient or biased data can lead to poor model performance, inaccuracies, and unintended consequences. Ethical and legal aspects are critical, too.