An image representation based convolutional network for DNA classification

Yin, Bojian, Balvert, Marleen, Zambrano, Davide, Schönhuth, Alexander, Bohte, Sander

arXiv.org Machine Learning 

DNA is commonly viewed as a sequence over the letters {A, C, G, T}, the alphabet of nucleotides. This sequence constitutes the code that acts as a blueprint for all processes taking place in a cell. But DNA is more than its primary sequence: it is a molecule, and as such it assumes spatial structure and shape. The spatial organization of DNA is achieved by integrating ("recruiting") other molecules, the histone proteins, which help it assume the correct spatial configuration. The combination of DNA and these helper molecules is called chromatin; the spatial configuration of the chromatin, in turn, defines the functional properties of local areas of the DNA [de Graaf and van Steensel, 2013]. Chromatin can assume several function-defining epigenetic states, and these states vary along the genome [Ernst et al., 2011]. The key determinant of spatial configuration is the underlying primary DNA sequence: sequential patterns are responsible for recruiting histone proteins and their chemical modifications, which in turn give rise to or even define the chromatin states. The exact configuration of the chromatin and its interplay with the underlying raw DNA sequence are the subject of active research.
