Representer Point Selection for Explaining Regularized High-dimensional Models
Che-Ping Tsai, Jiong Zhang, Eli Chien, Hsiang-Fu Yu, Cho-Jui Hsieh, Pradeep Ravikumar
arXiv.org Artificial Intelligence
We introduce a novel class of sample-based explanations, which we term high-dimensional representers, that can be used to explain the predictions of a regularized high-dimensional model in terms of importance weights for each of the training samples. Our workhorse is a novel representer theorem for general regularized high-dimensional models, which decomposes the model prediction into contributions from each of the training samples: positive (negative) values correspond to training samples with a positive (negative) impact on the model's prediction. We derive consequences for the canonical instances of $\ell_1$-regularized sparse models and nuclear-norm-regularized low-rank models. As a case study, we further investigate the application of low-rank models in the context of collaborative filtering, where we instantiate high-dimensional representers for specific popular classes of models. Finally, we study the empirical performance of our proposed methods on three real-world binary classification datasets and two recommender system datasets. We also showcase the utility of high-dimensional representers in explaining model recommendations.
Jun-30-2023
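The abstract describes decomposing a regularized model's prediction into per-training-sample contributions. The paper's theorem covers general high-dimensional regularizers, but the flavor of such a decomposition can be seen in the classic $\ell_2$-regularized case that it generalizes. The sketch below is purely illustrative (it is not the paper's method): for logistic regression with $\ell_2$ penalty $\lambda\|\theta\|^2$, the optimum satisfies $\theta = \sum_i \alpha_i x_i$ with $\alpha_i = -\ell'(y_i, \theta^\top x_i)/(2\lambda n)$, so a test prediction splits into representer values $\alpha_i\, x_i^\top x_{\text{test}}$, one per training sample, with sign indicating positive or negative impact. All variable names and the toy data are assumptions for illustration.

```python
import numpy as np

# Illustrative sketch: classic l2-regularized representer decomposition
# (the setting the paper generalizes to l1 / nuclear-norm regularizers).
rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.normal(size=(n, d))
y = np.sign(X @ rng.normal(size=d) + 0.1 * rng.normal(size=n))

lam = 0.1
theta = np.zeros(d)
for _ in range(5000):  # plain gradient descent on logistic loss + l2 penalty
    margins = y * (X @ theta)
    grad_loss = -(y / (1 + np.exp(margins))) @ X / n
    theta -= 0.1 * (grad_loss + 2 * lam * theta)

# At optimality, theta = sum_i alpha_i * x_i with
# alpha_i = -loss'(y_i, theta^T x_i) / (2 * lam * n).
margins = y * (X @ theta)
alpha = (y / (1 + np.exp(margins))) / (2 * lam * n)

# Per-training-sample representer values for one test point:
# positive entries push the prediction up, negative entries push it down.
x_test = rng.normal(size=d)
contributions = alpha * (X @ x_test)
assert np.allclose(contributions.sum(), theta @ x_test, atol=1e-3)
```

Sorting `contributions` by magnitude then surfaces the training samples most responsible for this prediction, which is the kind of sample-based explanation the abstract refers to.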