Scalable Fingerprinting of Large Language Models
Nasery, Anshul, Hayase, Jonathan, Brooks, Creston, Sheng, Peiyao, Tyagi, Himanshu, Viswanath, Pramod, Oh, Sewoong
arXiv.org Artificial Intelligence
Model fingerprinting has emerged as a powerful tool for model owners to identify their shared model given API access. However, to lower the false discovery rate, fight fingerprint leakage, and defend against coalitions of model users attempting to bypass detection, we argue that scalability is critical, i.e., scaling up the number of fingerprints one can embed into a model. Hence, we pose scalability as a crucial requirement for fingerprinting schemes. We experiment with fingerprint design at a scale significantly larger than previously considered.

In typical use-cases, existing methods focus on Harmlessness and Persistence (Xu et al., 2024a; Russinovich & Salem, 2024) of fingerprints. Fingerprinting is Harmless if the utility of the fingerprinted model does not degrade from the base model, and it is Persistent if performing supervised fine-tuning (SFT) on the fingerprinted model with other data does not make the model forget the fingerprints (Jagielski et al., 2023; Chen et al., 2024). While these properties are important, we argue that there is another important criterion for a good fingerprinting scheme not captured by prior work: Scalability. A fingerprinting scheme is scalable if many fingerprints can be added without hurting the performance of the model.
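To make the verification setting concrete, below is a minimal sketch of how an owner might check fingerprints through black-box API access: the model is queried with secret key prompts and the fraction of matching target responses is compared to a decision threshold. All names (`verify_fingerprints`, `toy_model`) and the threshold value are illustrative assumptions, not the paper's actual protocol.

```python
# Hypothetical fingerprint check: the owner embeds secret (key, response)
# pairs during training, then later verifies ownership by querying the
# suspect model's API. The threshold and matching rule are assumptions.

def verify_fingerprints(query_model, fingerprints, threshold=0.8):
    """Return True if enough fingerprint keys elicit their target responses.

    query_model:  callable mapping a prompt string to the model's reply.
    fingerprints: list of (key_prompt, expected_response) pairs.
    threshold:    fraction of matches required to claim ownership; embedding
                  more fingerprints lowers the false discovery rate, which
                  is one motivation for scalability.
    """
    matches = sum(
        expected in query_model(key)  # substring match tolerates extra text
        for key, expected in fingerprints
    )
    return matches / len(fingerprints) >= threshold


# Toy stand-in for a fingerprinted model's API.
def toy_model(prompt):
    table = {"k1": "r1", "k2": "r2", "k3": "r3"}
    return table.get(prompt, "unrelated output")


print(verify_fingerprints(toy_model, [("k1", "r1"), ("k2", "r2"), ("k3", "r3")]))
```

With more fingerprints embedded, the same threshold test remains reliable even if an adversary discovers and suppresses a subset of the keys, which is the intuition behind treating scalability as a first-class requirement.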
Feb-11-2025