Sparsity-Preserving Differentially Private Training of Large Embedding Models

Apr-25-2026, 22:32:00 GMT–Neural Information Processing Systems

As the use of large embedding models in recommendation systems and language applications increases, concerns over user data privacy have also risen. DP-SGD, a training algorithm that combines differential privacy with stochastic gradient descent, has been the workhorse in protecting user privacy without compromising model accuracy by much. However, applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency. To address this issue, we present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models. Our algorithms achieve substantial reductions (10^6×) in gradient size, while maintaining comparable levels of accuracy, on benchmark real-world datasets.
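
The sparsity problem named in the abstract can be illustrated with a minimal sketch in plain Python. It assumes the standard DP-SGD recipe (clip each per-example gradient, sum, add Gaussian noise to every coordinate); the vocabulary size, embedding dimension, batch contents, and all function names are hypothetical choices for illustration. It shows only vanilla DP-SGD and is not an implementation of DP-FEST or DP-AdaFEST, whose details are not given on this page.

```python
import math
import random


def clip_per_example(grad, clip_norm):
    """Scale one example's sparse gradient (row -> vector) to L2 norm <= clip_norm."""
    norm = math.sqrt(sum(v * v for vec in grad.values() for v in vec))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return {row: [scale * v for v in vec] for row, vec in grad.items()}


def sum_sparse(grads, dim):
    """Sum sparse per-example gradients; only touched rows appear in the result."""
    total = {}
    for grad in grads:
        for row, vec in grad.items():
            acc = total.setdefault(row, [0.0] * dim)
            for j, v in enumerate(vec):
                acc[j] += v
    return total


def add_dense_noise(sparse_sum, vocab_size, dim, sigma, rng):
    """Vanilla DP-SGD noise step: Gaussian noise on every coordinate of the table."""
    dense = [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(vocab_size)]
    for row, vec in sparse_sum.items():
        for j, v in enumerate(vec):
            dense[row][j] += v
    return dense


def nonzero_rows(dense):
    """Count rows that contain at least one nonzero entry."""
    return sum(1 for vec in dense if any(v != 0.0 for v in vec))


rng = random.Random(0)
vocab_size, dim = 1000, 4                 # hypothetical embedding table shape
clip_norm, noise_multiplier = 1.0, 1.0    # hypothetical DP-SGD hyperparameters

# Hypothetical batch: each example activates two embedding rows.
batch = [[3, 17], [17, 256], [512, 3]]
per_example = [
    {row: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for row in example}
    for example in batch
]

clipped = [clip_per_example(g, clip_norm) for g in per_example]
summed = sum_sparse(clipped, dim)
noisy = add_dense_noise(summed, vocab_size, dim, noise_multiplier * clip_norm, rng)

rows_before = len(summed)          # rows touched by the non-private gradient
rows_after = nonzero_rows(noisy)   # rows touched after dense noise is added
print(rows_before, rows_after)
```

In this sketch the clipped, summed gradient touches only the rows that the batch activated, whereas the noised gradient has nonzero entries in every row of the table. That loss of sparsity is the efficiency cost the abstract attributes to naive DP-SGD.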

artificial intelligence, machine learning, natural language

Country:
- North America > United States (0.28)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence
    - Natural Language (1.00)
    - Representation & Reasoning > Personal Assistant Systems (0.67)
    - Machine Learning
      - Neural Networks (1.00)
      - Statistical Learning > Gradient Descent (0.68)

Sparsity-Preserving Differentially Private Training of Large Embedding Models. Badih Ghazi, Google Research, Mountain View, CA
