Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-wide Association Studies

Nov-29-2016, 11:15:26 GMT – #artificialintelligence

The goal of genome-wide association studies (GWAS) (e.g., the WTCCC study [1]) is to examine the relationship between genetic markers such as single-nucleotide polymorphisms (SNPs) and individual traits, which are usually complex diseases or behavioral characteristics. Generally, a large number of statistical tests are performed in parallel, with each SNP individually tested for association [2,3,4]. The standard approach consists of computing individual, SNP-specific p-values from a statistical association test and comparing these p-values against a given significance threshold t*, so that precisely those SNPs with p-values smaller than t* are declared to be associated with the trait [4,5,6]. We refer to this approach as raw p-value thresholding (RPVT) and review some standard methods for choosing t* for the purpose of controlling multiple type I error rates (in particular, the family-wise error rate (FWER) and the expected number of false rejections (ENFR)) in the Methods Section. According to the GWAS catalog [7,8] (last accessed 03-07-2015), the more than 1,400 GWAS published so far have led to the identification of more than 11,000 SNPs associated with about 800 human diseases and anthropometric traits, with p-values thresholded at t* = 1 × 10⁻⁵.
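As a rough illustration of raw p-value thresholding, the sketch below computes one p-value per SNP and keeps the SNPs whose p-value falls below t*. Everything in it is an assumption made for illustration and is not taken from the study: the function names are hypothetical, the association test is a simple 1-degree-of-freedom allelic chi-square test on a 2x2 table of allele counts, and the two threshold helpers use the textbook Bonferroni-style bounds (t* = alpha / m keeps the FWER at or below alpha, and t* = k / m keeps the expected number of false rejections at or below k), which may differ from the choices reviewed in the Methods Section.

```python
import math


def allelic_chi2_p_value(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """P-value of a 1-df chi-square test on a 2x2 allele-count table.

    Hypothetical helper: rows are cases/controls, columns are
    alternative/reference allele counts for a single SNP.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    denom = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
             * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    if denom == 0:
        return 1.0  # degenerate table: treat as no evidence of association
    chi2 = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2 / denom
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni_threshold(alpha, n_tests):
    """Threshold t* = alpha / m, which bounds the FWER by alpha."""
    return alpha / n_tests


def enfr_threshold(max_false_rejections, n_tests):
    """Threshold t* = k / m, which bounds the ENFR by k."""
    return max_false_rejections / n_tests


def rpvt(p_values, t_star):
    """Return the indices of SNPs whose p-value is strictly below t_star."""
    return [i for i, p in enumerate(p_values) if p < t_star]


# Made-up allele counts for three SNPs (illustrative only, not real data).
tables = [(50, 50, 50, 50), (90, 10, 10, 90), (55, 45, 50, 50)]
p_values = [allelic_chi2_p_value(*t) for t in tables]
selected = rpvt(p_values, bonferroni_threshold(0.05, len(tables)))
```

Here `selected` holds the positions of the SNPs declared associated at the chosen threshold. Because each SNP is tested on its own, the only tuning knob in this scheme is t* itself.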
