Disentangling Interpretable Factors with Supervised Independent Subspace Principal Component Analysis
David A. Knowles, Raul Rabadan (Program for Mathematical Genomics)
Neural Information Processing Systems
The success of machine learning models relies heavily on effectively representing high-dimensional data. However, ensuring that data representations capture human-understandable concepts remains difficult, often requiring the incorporation of prior knowledge and the decomposition of data into multiple subspaces. Traditional linear methods fall short in modeling more than one space, while more expressive deep learning approaches lack interpretability. Here, we introduce Supervised Independent Subspace Principal Component Analysis (sisPCA), a PCA extension designed for multi-subspace learning. Leveraging the Hilbert-Schmidt Independence Criterion (HSIC), sisPCA incorporates supervision while simultaneously ensuring subspace disentanglement. We demonstrate sisPCA's connections with autoencoders and regularized linear regression, and showcase its ability to identify and separate hidden data structures through extensive applications, including breast cancer diagnosis from image features, learning aging-associated DNA methylation changes, and single-cell analysis of malaria infection. Our results reveal distinct functional pathways associated with malaria colonization, underscoring the essential role of explainable representations in high-dimensional data analysis.
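The HSIC mentioned above is a kernel-based measure of statistical dependence between two sets of samples: it is (near) zero for independent variables and grows with dependence, which is what makes it usable both as a supervision signal and as a disentanglement penalty between subspaces. A minimal sketch of the standard biased empirical HSIC estimator with RBF kernels is shown below; this is a generic illustration, not the paper's sisPCA implementation, and the kernel choice and bandwidth `sigma` are assumptions.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """RBF (Gaussian) kernel matrix for rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # pairwise squared distances
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC estimate: tr(K H L H) / (n-1)^2,
    where H = I - (1/n) 11^T is the centering matrix."""
    n = X.shape[0]
    K = rbf_kernel(X, sigma)
    L = rbf_kernel(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

In a multi-subspace setting, a penalty of this form between the projections onto each pair of subspaces encourages the learned subspaces to capture statistically independent factors.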