Feature Selection in the Contrastive Analysis Setting

Neural Information Processing Systems 

Contrastive analysis (CA) refers to the exploration of variations uniquely enriched in a target dataset as compared to a corresponding background dataset generated from sources of variation that are irrelevant to a given task. For example, a biomedical data analyst may wish to find a small set of genes to use as a proxy for variations in genomic data present only among patients with a given disease (target) as opposed to healthy control subjects (background). However, the problem of feature selection in the CA setting has so far received little attention from the machine learning community.