Drucker, Nir
Efficient Pruning for Machine Learning Under Homomorphic Encryption
Aharoni, Ehud, Baruch, Moran, Bose, Pradip, Buyuktosunoglu, Alper, Drucker, Nir, Pal, Subhankar, Pelleg, Tomer, Sarpatwar, Kanthi, Shaul, Hayim, Soceanu, Omri, Vaculin, Roman
Privacy-preserving machine learning (PPML) solutions are gaining widespread popularity. Among these, many rely on homomorphic encryption (HE) that offers confidentiality of the model and the data, but at the cost of large latency and memory requirements. Pruning neural network (NN) parameters improves latency and memory in plaintext ML but has little impact if directly applied to HE-based PPML. We introduce a framework called HE-PEx that comprises new pruning methods, on top of a packing technique called tile tensors, for reducing the latency and memory of PPML inference. HE-PEx uses permutations to prune additional ciphertexts, and expansion to recover inference loss. We demonstrate the effectiveness of our methods for pruning fully-connected and convolutional layers in NNs on PPML tasks, namely, image compression, denoising, and classification, with autoencoders, multilayer perceptrons (MLPs) and convolutional neural networks (CNNs). We implement and deploy our networks atop a framework called HElayers, which shows a 10-35% improvement in inference speed and a 17-35% decrease in memory requirement over the unpruned network, corresponding to 33-65% fewer ciphertexts, within a 2.5% degradation in inference accuracy over the unpruned network. Compared to the state-of-the-art pruning technique for PPML, our techniques generate networks with 70% fewer ciphertexts, on average, for the same degradation limit.
Power-Softmax: Towards Secure LLM Inference over Encrypted Data
Zimerman, Itamar, Adir, Allon, Aharoni, Ehud, Avitan, Matan, Baruch, Moran, Drucker, Nir, Lerner, Jenny, Masalha, Ramy, Meiri, Reut, Soceanu, Omri
Modern cryptographic methods for implementing privacy-preserving LLMs such as Homomorphic Encryption (HE) require the LLMs to have a polynomial form. Forming such a representation is challenging because Transformers include non-polynomial components, such as Softmax and layer normalization. Previous approaches have either directly approximated pre-trained models with large-degree polynomials, which are less efficient over HE, or replaced non-polynomial components with easier-to-approximate primitives before training, e.g., Softmax with pointwise attention. The latter approach might introduce scalability challenges. We present a new HE-friendly variant of self-attention that offers a stable form for training and is easy to approximate with polynomials for secure inference. Our work introduces the first polynomial LLMs with 32 layers and over a billion parameters, exceeding the size of previous models by more than tenfold. The resulting models demonstrate reasoning and in-context learning (ICL) capabilities comparable to standard transformers of the same size, representing a breakthrough in the field. Finally, we provide a detailed latency breakdown for each computation over encrypted data, paving the way for further optimization, and explore the differences in inductive bias between transformers relying on our HE-friendly variant and standard transformers. Our code is attached as a supplement.
Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption
Zimerman, Itamar, Baruch, Moran, Drucker, Nir, Ezov, Gilad, Soceanu, Omri, Wolf, Lior
Designing privacy-preserving deep learning models is a major challenge within the deep learning community. Homomorphic Encryption (HE) has emerged as one of the most promising approaches in this realm, enabling the decoupling of knowledge between the model owner and the data owner. Despite extensive research and application of this technology, primarily in convolutional neural networks, incorporating HE into transformer models has been challenging because of the difficulties in converting these models into a polynomial form. We break new ground by introducing the first polynomial transformer, providing the first demonstration of secure inference over HE with transformers. This includes a transformer architecture tailored for HE, alongside a novel method for converting operators to their polynomial equivalent. This innovation enables us to perform secure inference on LMs with WikiText-103. It also allows us to perform image classification with CIFAR-100 and Tiny-ImageNet. Our models yield results comparable to traditional methods, bridging the performance gap with transformers of similar scale and underscoring the viability of HE for state-of-the-art applications. Finally, we assess the stability of our models and conduct a series of ablations to quantify the contribution of each model component.
Privacy-Preserving Federated Learning over Vertically and Horizontally Partitioned Data for Financial Anomaly Detection
Kadhe, Swanand Ravindra, Ludwig, Heiko, Baracaldo, Nathalie, King, Alan, Zhou, Yi, Houck, Keith, Rawat, Ambrish, Purcell, Mark, Holohan, Naoise, Takeuchi, Mikio, Kawahara, Ryo, Drucker, Nir, Shaul, Hayim, Kushnir, Eyal, Soceanu, Omri
The effective detection of evidence of financial anomalies requires collaboration among multiple entities who own a diverse set of data, such as a payment network system (PNS) and its partner banks. Trust among these financial institutions is limited by regulation and competition. Federated learning (FL) enables entities to collaboratively train a model when data is either vertically or horizontally partitioned across the entities. However, in real-world financial anomaly detection scenarios, the data is partitioned both vertically and horizontally and hence it is not possible to use existing FL approaches in a plug-and-play manner. Our novel solution, PV4FAD, combines fully homomorphic encryption (HE), secure multi-party computation (SMPC), differential privacy (DP), and randomization techniques to balance privacy and accuracy during training and to prevent inference threats at model deployment time. Our solution provides input privacy through HE and SMPC, and output privacy against inference time attacks through DP. Specifically, we show that, in the honest-but-curious threat model, banks do not learn any sensitive features about PNS transactions, and the PNS does not learn any information about the banks' dataset but only learns prediction labels. We also develop and analyze a DP mechanism to protect output privacy during inference. Our solution generates high-utility models by significantly reducing the per-bank noise level while satisfying distributed DP. To ensure high accuracy, our approach produces an ensemble model, in particular, a random forest. This enables us to take advantage of the well-known properties of ensembles to reduce variance and increase accuracy. Our solution won second prize in the first phase of the U.S. Privacy Enhancing Technologies (PETs) Prize Challenge.
Efficient Skip Connections Realization for Secure Inference on Encrypted Data
Drucker, Nir, Zimerman, Itamar
Homomorphic Encryption (HE) is a cryptographic tool that allows performing computation under encryption, which is used by many privacy-preserving machine learning solutions, for example, to perform secure classification. Modern deep learning applications yield good performance for example in image processing tasks benchmarks by including many skip connections. The latter appears to be very costly when attempting to execute model inference under HE. In this paper, we show that by replacing (mid-term) skip connections with (short-term) Dirac parameterization and (long-term) shared-source skip connection we were able to reduce the skip connections burden for HE-based solutions, achieving x1.3 computing power improvement for the same accuracy.
Training Large Scale Polynomial CNNs for E2E Inference over Homomorphic Encryption
Baruch, Moran, Drucker, Nir, Ezov, Gilad, Goldberg, Yoav, Kushnir, Eyal, Lerner, Jenny, Soceanu, Omri, Zimerman, Itamar
Training large-scale CNNs that during inference can be run under Homomorphic Encryption (HE) is challenging due to the need to use only polynomial operations. This limits HE-based solutions adoption. We address this challenge and pioneer in providing a novel training method for large polynomial CNNs such as ResNet-152 and ConvNeXt models, and achieve promising accuracy on encrypted samples on large-scale dataset such as ImageNet. Additionally, we provide optimization insights regarding activation functions and skip-connection latency impacts, enhancing HE-based evaluation efficiency. Finally, to demonstrate the robustness of our method, we provide a polynomial adaptation of the CLIP model for secure zero-shot prediction, unlocking unprecedented capabilities at the intersection of HE and transfer learning.
HeLayers: A Tile Tensors Framework for Large Neural Networks on Encrypted Data
Aharoni, Ehud, Adir, Allon, Baruch, Moran, Drucker, Nir, Ezov, Gilad, Farkash, Ariel, Greenberg, Lev, Masalha, Ramy, Moshkowich, Guy, Murik, Dov, Shaul, Hayim, Soceanu, Omri
Privacy-preserving solutions enable companies to offload confidential data to third-party services while fulfilling their government regulations. To accomplish this, they leverage various cryptographic techniques such as Homomorphic Encryption (HE), which allows performing computation on encrypted data. Most HE schemes work in a SIMD fashion, and the data packing method can dramatically affect the running time and memory costs. Finding a packing method that leads to an optimal performant implementation is a hard task. We present a simple and intuitive framework that abstracts the packing decision for the user. We explain its underlying data structures and optimizer, and propose a novel algorithm for performing 2D convolution operations. We used this framework to implement an HE-friendly version of AlexNet, which runs in three minutes, several orders of magnitude faster than other state-of-the-art solutions that only use HE.
A methodology for training homomorphicencryption friendly neural networks
Baruch, Moran, Drucker, Nir, Greenberg, Lev, Moshkowich, Guy
Privacy-preserving deep neural network (DNN) inference is a necessity in different regulated industries such as healthcare, finance and retail. Recently, homomorphic encryption (HE) has been used as a method to enable analytics while addressing privacy concerns. HE enables secure predictions over encrypted data. However, there are several challenges related to the use of HE, including DNN size limitations and the lack of support for some operation types. Most notably, the commonly used ReLU activation is not supported under some HE schemes. We propose a structured methodology to replace ReLU with a quadratic polynomial activation. To address the accuracy degradation issue, we use a pre-trained model that trains another HE-friendly model, using techniques such as trainable activation functions and knowledge distillation. We demonstrate our methodology on the AlexNet architecture, using the chest X-Ray and CT datasets for COVID-19 detection. Experiments using our approach reduced the gap between the F1 score and accuracy of the models trained with ReLU and the HE-friendly model to within a mere 0.32-5.3 percent degradation. We also demonstrate our methodology using the SqueezeNet architecture, for which we observed 7 percent accuracy and F1 improvements over training similar networks with other HE-friendly training methods.