Imputation and low-rank estimation with Missing Non At Random data

Sportisse, Aude, Boyer, Claire, Josse, Julie

Jan-7-2019–arXiv.org Machine Learning

Preprint, January 8, 2019

... the use of the Expectation-Maximization (EM) algorithm [8], which yields maximum likelihood estimators in various incomplete-data problems [21]. The theoretical guarantees of these methods, whether for correctly predicting missing values or for correctly estimating parameters of interest, hold only under assumptions about how the data came to be missing. Rubin [31] introduced three types of missing-data mechanisms: (i) missing completely at random (MCAR), the most restrictive assumption; (ii) missing at random (MAR), where missingness may depend only on the observed variables; and (iii) missing not at random (MNAR), the most general assumption, where the unavailability of a value may depend on the values of other variables and on the value itself. MNAR data are the focus of the paper. A classic example is a survey in which wealthy respondents are less willing to disclose their income, or in which people are less inclined to answer sensitive questions about their addictions. Another example is the diagnosis of Alzheimer's disease, which can be based on a patient's score on a specific test: a patient who has the disease has difficulty answering the questions and is more likely to abandon the test before it ends.
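The three mechanisms can be illustrated with a small simulation. The sketch below is not taken from the paper: the variables `age` and `income`, the logistic missingness model, and all coefficients are assumptions chosen only to mirror the income-survey example. It generates one mask per mechanism and reports how far the mean of the observed incomes drifts from the mean of the full sample.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, used here as an assumed missingness model."""
    return 1.0 / (1.0 + np.exp(-z))


def observed_mean_bias(n=20_000, seed=0):
    """Return (mean of observed income) - (mean of all income) per mechanism.

    All distributions and coefficients are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # `age` is always observed; `income` is the variable that may be missing.
    age = rng.normal(45.0, 12.0, size=n)
    income = 20.0 + 0.8 * age + rng.normal(0.0, 10.0, size=n)

    age_z = (age - age.mean()) / age.std()
    income_z = (income - income.mean()) / income.std()

    # Probability that each income value is missing under each mechanism.
    p_missing = {
        "MCAR": np.full(n, 0.5),    # independent of all data values
        "MAR": sigmoid(age_z),      # depends only on the observed variable
        "MNAR": sigmoid(income_z),  # depends on the missing value itself
    }

    biases = {}
    for name, p in p_missing.items():
        missing = rng.random(n) < p
        biases[name] = income[~missing].mean() - income.mean()
    return biases


if __name__ == "__main__":
    for name, bias in observed_mean_bias().items():
        print(f"{name}: bias of observed mean = {bias:+.2f}")
```

In this toy setting the MCAR bias stays close to zero, while under MNAR the observed mean is pulled downward because high incomes are the ones most likely to be withheld. Under MAR the mask depends only on `age`, which remains available for every respondent.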


