Gene Selection and Clustering of Breast Cancer Data
Bhuiyan, Farzana Ahamed (Tennessee Technological University) | Sharif, MD Bulbul (Tennessee Technological University) | Tinker, Paul Joshua (Tennessee Technological University) | Eberle, William (Tennessee Technological University) | Talbert, Douglas A. (Tennessee Technological University) | Ghafoor, Sheikh Khaled (Tennessee Technological University) | Frey, Lewis (Medical University of South Carolina)
In this work, we first attempt to replicate an earlier study on gene selection and clustering, and then we extend this work by applying a different type of hierarchical clustering to discover interesting subsets of genes from breast cancer data. Replication of such studies is a known challenge and an active area of research in bioinformatics. The work presented in this paper is three-fold. First, we replicate a study conducted at the University of North Carolina to generate an initial set of genes. Second, we apply an approach called Distance Weighted Discrimination to fuse multiple, disparate breast cancer datasets into a single validation set. Third, we perform hierarchical clustering and k-means clustering on this validation set to discover natural groupings and compare the clusters generated by both methods. While applying the hierarchical clustering is part of the reproduction step, we extend the research by trying two different forms of hierarchical clustering. We also apply k-means clustering for the same purpose and compare all three methods using Kaplan-Meier estimation and Cox proportional hazards regression. We discover that among the three methods, k-means clustering gives us the best results.
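The comparison step in the abstract relies on Kaplan-Meier estimation of survival within each cluster. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. The function names (`kaplan_meier`, `curves_by_cluster`) and the toy follow-up times, event flags, and cluster labels are all hypothetical and chosen only for illustration; in practice the labels would come from whichever clustering method is being evaluated.

```python
from collections import defaultdict


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  -- follow-up time for each patient
    events -- 1 if the event was observed at that time, 0 if censored
    """
    deaths = defaultdict(int)  # observed events per time point
    exits = defaultdict(int)   # events plus censorings per time point
    for t, e in zip(times, events):
        exits[t] += 1
        if e:
            deaths[t] += 1

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths[t]
        if d > 0:
            # Product-limit update: multiply by the fraction surviving t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone who exited at t leaves the risk set afterwards.
        at_risk -= exits[t]
    return curve


def curves_by_cluster(times, events, labels):
    """Compute one Kaplan-Meier curve per cluster label."""
    groups = defaultdict(lambda: ([], []))
    for t, e, g in zip(times, events, labels):
        groups[g][0].append(t)
        groups[g][1].append(e)
    return {g: kaplan_meier(ts, es) for g, (ts, es) in groups.items()}


# Hypothetical toy data: ten patients split into two clusters.
times = [5, 8, 12, 12, 20, 3, 4, 6, 9, 15]
events = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

curves = curves_by_cluster(times, events, labels)
for cluster, curve in sorted(curves.items()):
    print(cluster, curve)
```

Curves that separate clearly between clusters suggest that a clustering captures groups with different outcomes. The abstract also mentions Cox proportional hazards regression, which this sketch does not cover.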
May-15-2019
- Country:
- North America > United States
- North Carolina (0.24)
- Tennessee > Putnam County
- Cookeville (0.04)
- South Carolina > Charleston County
- Charleston (0.14)
- Asia
- Genre:
- Research Report
- Experimental Study (0.90)
- New Finding (0.68)
- Industry:
- Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.83)