Scalable Fingerprinting of Large Language Models
Nasery, Anshul, Hayase, Jonathan, Brooks, Creston, Sheng, Peiyao, Tyagi, Himanshu, Viswanath, Pramod, Oh, Sewoong
arXiv.org Artificial Intelligence
Model fingerprinting has emerged as a powerful tool for model owners to identify their shared model given API access. However, to lower the false discovery rate, fight fingerprint leakage, and defend against coalitions of model users attempting to bypass detection, we argue that scalability is critical, i.e., scaling up the number of fingerprints one can embed into a model. Hence, we pose scalability as a crucial requirement for fingerprinting schemes. We experiment with fingerprint design at a scale significantly larger than previously considered.

In typical use-cases, existing methods focus on Harmlessness and Persistence (Xu et al., 2024a; Russinovich & Salem, 2024) of fingerprints. Fingerprinting is Harmless if the utility of the fingerprinted model does not degrade from the base model, and it is Persistent if performing supervised fine-tuning (SFT) on the fingerprinted model with other data does not make the model forget the fingerprints (Jagielski et al., 2023; Chen et al., 2024). While these properties are important, we argue that there is another important criterion for a good fingerprinting scheme not captured by prior work: Scalability. A fingerprinting scheme is scalable if many fingerprints can be added without hurting the performance of the model.
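To make the verification setting concrete, below is a minimal sketch of how an owner might check fingerprints through black-box API access: the model is queried with secret key prompts and the fraction of matching target responses is compared to a decision threshold. All names (`verify_fingerprints`, `toy_model`) and the threshold value are illustrative assumptions, not the paper's actual protocol.

```python
# Hypothetical fingerprint check: the owner embeds secret (key, response)
# pairs during training, then later verifies ownership by querying the
# suspect model's API. The threshold and matching rule are assumptions.

def verify_fingerprints(query_model, fingerprints, threshold=0.8):
    """Return True if enough fingerprint keys elicit their target responses.

    query_model:  callable mapping a prompt string to the model's reply.
    fingerprints: list of (key_prompt, expected_response) pairs.
    threshold:    fraction of matches required to claim ownership; embedding
                  more fingerprints lowers the false discovery rate, which
                  is one motivation for scalability.
    """
    matches = sum(
        expected in query_model(key)  # substring match tolerates extra text
        for key, expected in fingerprints
    )
    return matches / len(fingerprints) >= threshold


# Toy stand-in for a fingerprinted model's API.
def toy_model(prompt):
    table = {"k1": "r1", "k2": "r2", "k3": "r3"}
    return table.get(prompt, "unrelated output")


print(verify_fingerprints(toy_model, [("k1", "r1"), ("k2", "r2"), ("k3", "r3")]))
```

With more fingerprints embedded, the same threshold test remains reliable even if an adversary discovers and suppresses a subset of the keys, which is the intuition behind treating scalability as a first-class requirement.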
Feb-11-2025