VertiBench: Advancing Feature Distribution Diversity in Vertical Federated Learning Benchmarks

Jul-5-2023, arXiv.org Artificial Intelligence

Vertical Federated Learning (VFL) is a crucial paradigm for training machine learning models on feature-partitioned, distributed data. However, due to privacy restrictions, few public real-world VFL datasets exist for algorithm evaluation, and these represent a limited array of feature distributions. Existing benchmarks often resort to synthetic datasets derived from arbitrary feature splits of a global set, which capture only a subset of feature distributions and lead to inadequate assessment of algorithm performance. This paper addresses these shortcomings by introducing two key factors affecting VFL performance, feature importance and feature correlation, and by proposing associated evaluation metrics and dataset splitting methods. Additionally, we introduce a real VFL dataset to address the deficit in image-image VFL scenarios. Our comprehensive evaluation of cutting-edge VFL algorithms provides valuable insights for future research in the field.

artificial intelligence, machine learning, optimization problem

Country:
- Europe > Greece (0.04)
- North America > United States
  - Virginia (0.04)
  - Ohio (0.04)
  - New York > New York County
    - New York City (0.04)
- Asia
  - Singapore (0.04)
  - China > Guangxi Province
    - Nanning (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (0.67)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks (0.93)
