Investigating Privacy Leakage in Dimensionality Reduction Methods via Reconstruction Attack

Lumbut, Chayadon; Ponnoprat, Donlapark

arXiv.org Artificial Intelligence 

Machine learning (ML) models have become essential tools for solving complex real-world problems across various domains, including image processing, natural language processing, and business analytics. However, learning from high-dimensional data can be difficult due to the curse of dimensionality and increased computational requirements. To address these issues, dimensionality reduction methods are employed to reduce training costs and improve training efficiency. Popular dimensionality reduction methods include principal component analysis (PCA), t-SNE [vdMH08], and UMAP [MHM18]. These methods aim to reduce the number of dimensions while preserving global and local properties of the original data, so that relationships between data points in the high-dimensional space are still reflected in the lower-dimensional representations. The information retained from the original data is crucial for effective data analysis and visualization.
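To make the idea of retained information concrete, the following is a minimal sketch of principal component analysis, assuming NumPy is available. The function names `pca_reduce` and `pca_reconstruct` and the synthetic data are hypothetical and chosen only for illustration. The sketch is not intended to represent the reconstruction attack investigated in this work; it only shows that a linear low-dimensional projection can keep enough information to approximately recover the original points when the data lie near a low-dimensional subspace.

```python
import numpy as np


def pca_reduce(X, k):
    """Project the rows of X onto the top-k principal components.

    Hypothetical helper for illustration; returns the low-dimensional
    embedding, the principal directions, and the feature-wise mean.
    """
    mean = X.mean(axis=0)
    Xc = X - mean  # center the data
    # SVD of the centered data: the rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    Z = Xc @ components.T  # shape (n_samples, k)
    return Z, components, mean


def pca_reconstruct(Z, components, mean):
    """Map low-dimensional points back to the original space (linear inverse)."""
    return Z @ components + mean


# Synthetic data (an assumption for this sketch): 200 points in 50 dimensions
# lying close to a 3-dimensional linear subspace, plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

Z, components, mean = pca_reduce(X, k=3)
X_hat = pca_reconstruct(Z, components, mean)

# Relative reconstruction error in the Frobenius norm.
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(f"reduced shape: {Z.shape}, relative reconstruction error: {rel_err:.4f}")
```

In this sketch the reconstruction uses the principal directions directly, which is only possible for a linear method such as PCA. Nonlinear methods such as t-SNE and UMAP do not offer such a simple closed-form inverse.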