On Socially Fair Low-Rank Approximation and Column Subset Selection

Song, Zhao; Vakilian, Ali; Woodruff, David P.; Zhou, Samson

Dec-8-2024–arXiv.org Machine Learning

Low-rank approximation and column subset selection are two fundamental, closely related problems used across a wide range of machine learning applications. In this paper, we study socially fair low-rank approximation and socially fair column subset selection, where the goal is to minimize the worst-case loss across all sub-populations of the data. We show that, surprisingly, even a constant-factor approximation to fair low-rank approximation requires exponential time under certain standard complexity hypotheses. On the positive side, we give an algorithm for fair low-rank approximation that, for a constant number of groups and constant-factor accuracy, runs in $2^{\text{poly}(k)}$ time rather than the naïve $n^{\text{poly}(k)}$, a substantial improvement when the dataset has a large number $n$ of observations. We then show that there exist bicriteria approximation algorithms for fair low-rank approximation and fair column subset selection that run in polynomial time.
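
To make the fairness objective concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the loss of a group is the Frobenius norm of its residual after projecting onto a rank-1 subspace spanned by a unit vector, and that the fair cost is the maximum of these losses over groups. The function names (`group_cost`, `fair_cost`, `best_fair_direction`), the toy data, and the brute-force angle search are all hypothetical choices made for illustration in two dimensions.

```python
import math


def group_cost(rows, v):
    """Frobenius norm of the residual of `rows` after projecting onto unit vector v."""
    total = 0.0
    for a in rows:
        dot = sum(x * y for x, y in zip(a, v))
        # Residual of row a is a - (a . v) v; accumulate its squared length.
        total += sum((x - dot * y) ** 2 for x, y in zip(a, v))
    return math.sqrt(total)


def fair_cost(groups, v):
    """Assumed fair objective: the largest per-group cost."""
    return max(group_cost(rows, v) for rows in groups)


def best_fair_direction(groups, steps=1800):
    """Brute-force search over directions in the plane (illustration only)."""
    best_v, best = None, math.inf
    for i in range(steps):
        t = math.pi * i / steps
        v = (math.cos(t), math.sin(t))
        c = fair_cost(groups, v)
        if c < best:
            best_v, best = v, c
    return best_v, best


# Toy data: a larger group lying on the x-axis and a smaller one on the y-axis.
groups = [[(3.0, 0.0), (4.0, 0.0)], [(0.0, 2.0)]]

# The x-axis minimizes the total squared error but ignores the second group.
c_aggregate = fair_cost(groups, (1.0, 0.0))

# The fair direction tilts toward the second group to balance the two losses.
v_fair, c_fair = best_fair_direction(groups)
```

In this toy instance the direction that is best for the pooled data leaves the smaller group with all of the error, while the searched direction trades some accuracy on the larger group for a lower worst-case cost. The exhaustive search is only feasible here because the data is two-dimensional and $k = 1$.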

approximation, data mining, machine learning

Country:
- North America > United States
  - Texas (0.04)
  - California (0.04)
  - Pennsylvania > Allegheny County
    - Pittsburgh (0.04)
  - Illinois > Cook County
    - Chicago (0.04)

Genre:
- Research Report (0.82)

Industry:
- Law (0.45)

Technology:
- Information Technology
  - Data Science > Data Mining (0.93)
  - Artificial Intelligence
    - Machine Learning > Statistical Learning (1.00)
    - Representation & Reasoning > Optimization (0.93)