C-mix: a high dimensional mixture model for censored durations, with applications to genetic data

Bussy, Simon, Guilloux, Agathe, Gaïffas, Stéphane, Jannot, Anne-Sophie

Nov-25-2017, arXiv.org Machine Learning

Predicting subgroups of patients with different prognoses is a key challenge for personalized medicine; see for instance Alizadeh et al. [2000] and Rosenwald et al. [2002], where subgroups of patients with different survival rates are identified based on gene expression data. A substantial number of techniques can be found in the literature to predict the subgroup of a given patient in a classification setting, namely when subgroups are known in advance [Golub et al., 1999, Hastie et al., 2001, Tibshirani et al., 2002]. In the present paper we consider the much more difficult case where subgroups are unknown. In this situation, a widespread first approach consists of applying unsupervised learning techniques to the covariates, for instance to the gene expression data [Bhattacharjee et al., 2001, Beer et al., 2002, Sørlie et al., 2001], to define subsets of patients, and then estimating the risk in each subset. The problem with such techniques is that, because the subsets are built from the covariates alone, there is no guarantee that the identified subgroups will have different risks.


Country:
- Europe (0.46)
- North America > United States (0.28)

Genre:
- Research Report > Experimental Study (0.68)

Industry:
- Health & Medicine
  - Therapeutic Area > Oncology (1.00)
  - Pharmaceuticals & Biotechnology (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (0.93)
  - Machine Learning
    - Statistical Learning (1.00)
    - Performance Analysis > Accuracy (1.00)
