Iron: Private Inference on Transformers

Oct-11-2024, 09:09:45 GMT–Neural Information Processing Systems

We initiate the study of private inference on Transformer-based models in the client-server setting, where clients have private inputs and servers hold proprietary models. Our main contribution is a set of new secure protocols for matrix multiplication and for complex non-linear functions such as Softmax, GELU, and LayerNorm, which are critical components of Transformers. Specifically, we first propose a customized homomorphic encryption-based protocol for matrix multiplication that crucially relies on a novel compact packing technique. This design achieves $\sqrt{m}\times$ less communication than the most efficient prior work, where $m$ is the number of rows of the output matrix. Second, we design efficient protocols for the three non-linear functions by integrating advanced underlying protocols with specialized optimizations.
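To make the client-server setting concrete, the following is a minimal sketch of the input/output behavior a secure matrix multiplication aims for, not the protocol itself. It assumes, as is common in two-party inference but not stated in the abstract, that the result is held as additive shares over a ring. The ring size `RING_BITS` and the names `ideal_secure_matmul` and `reconstruct` are hypothetical, and the product is computed in the clear here, whereas a real protocol would compute it under homomorphic encryption so that neither party sees the other's input.

```python
import secrets

RING_BITS = 32          # hypothetical ring size; the abstract does not state one
MOD = 1 << RING_BITS


def matmul_mod(a, b):
    """Plain matrix product with every entry reduced modulo MOD."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) % MOD for j in range(cols)]
        for i in range(len(a))
    ]


def ideal_secure_matmul(client_x, server_w):
    """Ideal functionality only: return additive shares of X @ W.

    This stand-in sees both inputs in the clear. It models what the two
    parties end up holding, not how a real protocol hides the inputs.
    """
    product = matmul_mod(client_x, server_w)
    # The server's share is a uniformly random mask over the ring.
    server_share = [[secrets.randbelow(MOD) for _ in row] for row in product]
    # The client's share is the product minus that mask, modulo MOD.
    client_share = [
        [(p - r) % MOD for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(product, server_share)
    ]
    return client_share, server_share


def reconstruct(share_a, share_b):
    """Add two shares entrywise modulo MOD to recover the shared matrix."""
    return [
        [(a + b) % MOD for a, b in zip(a_row, b_row)]
        for a_row, b_row in zip(share_a, share_b)
    ]


x = [[1, 2, 3], [4, 5, 6]]        # client's private input (2 x 3)
w = [[7, 8], [9, 10], [11, 12]]   # server's proprietary weights (3 x 2)
client_share, server_share = ideal_secure_matmul(x, w)
result = reconstruct(client_share, server_share)
```

Each share on its own is uniformly distributed over the ring, so neither party learns the product from its share alone. Only the sum of the two shares reveals it.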

private inference, protocol, transformer

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.85)