Missing Value Knockoffs

Feb-25-2022–arXiv.org Machine Learning

Coping with an increasing number of variables, optimizing predictive performance, and selecting among candidate scientific hypotheses are all valid reasons for using a variable selection algorithm. Another reality of today's datasets is missing values. Although methods for handling missing values exist, applying them directly can interfere with the assumptions of variable selection algorithms. In this work, we discuss how model-x knockoffs (Candes et al. 2017), a new approach to principled variable selection, can be applied to datasets that contain missing values. By principled variable selection we refer to algorithms that aim to identify the Markov Blanket (MB) of a response variable (Tsamardinos and Aliferis 2003) while controlling false selections. Identifying the MB is optimal by definition, as the MB is the smallest subset of variables that is sufficient to describe the conditional distribution of the response variable. Controlling false selections means limiting the variables that are selected due to random chance, which is especially important in applications where a selected variable corresponds to a scientific discovery. Model-x knockoffs provide a framework for repurposing existing statistical/machine learning feature scorers for MB discovery. When the assumptions of the model-x framework hold, the expected fraction of selections that are conditionally pairwise independent of the response variable is controlled.
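To make the notion of controlling false selections concrete, the following is a minimal sketch of the selection step commonly used with knockoff statistics (the "knockoff+" threshold). It assumes that a feature scorer has already produced one statistic W_j per variable, where a large positive value means the original variable scored higher than its knockoff copy. The function names, the example values of W, and the target level q are illustrative assumptions; the abstract does not state which threshold rule or scorer this work uses, and the handling of missing values is not shown here.

```python
import numpy as np

def knockoff_plus_threshold(W, q):
    """Return the smallest t > 0 such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q,
    or infinity if no such t exists (assumed knockoff+ rule)."""
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Negative statistics estimate how many false selections pass t.
        estimated_false = 1 + np.sum(W <= -t)
        n_selected = max(1, np.sum(W >= t))
        if estimated_false / n_selected <= q:
            return float(t)
    return float("inf")

def select_variables(W, q):
    """Indices of variables whose statistic reaches the threshold."""
    W = np.asarray(W, dtype=float)
    t = knockoff_plus_threshold(W, q)
    return np.flatnonzero(W >= t)

# Hypothetical statistics for ten variables, target level q = 0.2.
W_example = [5.0, 4.0, 3.0, 2.5, -1.0, 2.0, 0.5, -0.3, 1.5, 3.5]
selected = select_variables(W_example, q=0.2)
print(selected)
```

Under this rule, statistics with negative sign act as an estimate of the number of false selections above the threshold, which is what allows the expected fraction of false selections to be bounded by q when the knockoff assumptions hold.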

Country:
- North America > United States
  - New York > Rensselaer County
    - Troy (0.04)
  - California > San Francisco County
    - San Francisco (0.14)
- Europe > United Kingdom
  - England > Oxfordshire > Oxford (0.04)

Genre:
- Research Report > New Finding (0.68)

Industry:
- Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning > Learning Graphical Models
    - Undirected Networks > Markov Models (0.46)
