Privacy-Preserving Collaborative Learning through Feature Extraction
Alireza Sarmadi, Hao Fu, Prashanth Krishnamurthy, Siddharth Garg, Farshad Khorrami
arXiv.org Artificial Intelligence
We propose a framework in which multiple entities collaborate to build a machine learning model while preserving privacy of their data. The approach utilizes feature embeddings from shared/per-entity feature extractors transforming data into a feature space for cooperation between entities. We propose two specific methods and compare them with a baseline method. In Shared Feature Extractor (SFE) Learning, the entities use a shared feature extractor to compute feature embeddings of samples. In Locally Trained Feature Extractor (LTFE) Learning, each entity uses a separate feature extractor and models are trained using concatenated features from all entities. As a baseline, in Cooperatively Trained Feature Extractor (CTFE) Learning, the entities train models by sharing raw data. Secure multi-party algorithms are utilized to train models without revealing data or features in plain text. We investigate the trade-offs among SFE, LTFE, and CTFE in regard to performance, privacy leakage (using an off-the-shelf membership inference attack), and computational cost. LTFE provides the most privacy, followed by SFE, and then CTFE. Computational cost is lowest for SFE and the relative speed of CTFE and LTFE depends on network architecture. CTFE and LTFE provide the best accuracy. We use MNIST, a synthetic dataset, and a credit card fraud detection dataset for evaluations.
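The difference between how SFE and LTFE assemble feature embeddings can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the extractors are fixed toy linear maps with a ReLU (hypothetical helper `make_linear_extractor`), not the networks from the paper, and the sketch assumes that in LTFE each sample is passed through every entity's extractor before concatenation. The secure multi-party training step is omitted entirely.

```python
# Toy illustration only: the extractors below are hypothetical fixed linear
# maps, not the trained networks or the secure protocol from the paper.
from typing import Callable, List, Sequence

Vector = List[float]
Extractor = Callable[[Sequence[float]], Vector]


def make_linear_extractor(weights: Sequence[Sequence[float]]) -> Extractor:
    """Return an extractor computing ReLU(W x) for a fixed matrix W."""
    def extract(x: Sequence[float]) -> Vector:
        return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]
    return extract


def sfe_embedding(x: Sequence[float], shared: Extractor) -> Vector:
    """SFE: every entity applies the same shared extractor to its samples."""
    return shared(x)


def ltfe_embedding(x: Sequence[float], local_extractors: Sequence[Extractor]) -> Vector:
    """LTFE (as assumed here): concatenate features from each entity's extractor."""
    features: Vector = []
    for extractor in local_extractors:
        features.extend(extractor(x))
    return features


# Example: one shared extractor versus two per-entity extractors.
shared = make_linear_extractor([[1.0, 0.0], [0.0, -1.0]])
entity_a = make_linear_extractor([[1.0, 1.0]])
entity_b = make_linear_extractor([[0.5, -0.5], [2.0, 0.0]])

sample = [2.0, 3.0]
sfe_features = sfe_embedding(sample, shared)
ltfe_features = ltfe_embedding(sample, [entity_a, entity_b])
```

In this sketch the LTFE embedding grows with the number of entities, since each extractor contributes its own block of features, while the SFE embedding keeps a fixed size.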
Dec-12-2022
- Country:
- North America
- Canada (0.04)
- Barbados > Christ Church (0.04)
- United States
- Ohio (0.04)
- District of Columbia > Washington (0.04)
- Texas > Dallas County
- Dallas (0.04)
- New York
- Kings County > New York City (0.04)
- New York County > New York City (0.04)
- Maryland > Montgomery County
- Bethesda (0.04)
- California
- San Francisco County > San Francisco (0.14)
- Santa Barbara County > Santa Barbara (0.04)
- Los Angeles County > Long Beach (0.04)
- Santa Clara County
- Santa Clara (0.04)
- San Jose (0.04)
- Europe
- Austria > Vienna (0.14)
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- Italy
- France > Île-de-France
- Asia
- South Korea > Seoul
- Seoul (0.04)
- Middle East
- Republic of Türkiye > Istanbul Province
- Istanbul (0.04)
- Iran > Tehran Province
- Tehran (0.04)
- India > Tamil Nadu
- Chennai (0.04)
- China
- Guangdong Province > Shenzhen (0.04)
- Anhui Province > Hefei (0.04)
- Genre:
- Overview (0.46)
- Research Report (0.40)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Technology:
- Information Technology
- Data Science > Data Mining (1.00)
- Communications (1.00)
- Artificial Intelligence > Machine Learning
- Neural Networks (1.00)
- Statistical Learning (0.68)
- Performance Analysis > Accuracy (0.46)