Kernels for gene regulatory regions

Vert, Jean-Philippe, Thurman, Robert, Noble, William S.

Neural Information Processing Systems 

We describe a hierarchy of motif-based kernels for multiple alignments of biological sequences, particularly suited to processing regulatory regions of genes. The kernels incorporate progressively more information, with the most complex kernel accounting for a multiple alignment of orthologous regions, the phylogenetic tree relating the species, and the prior knowledge that relevant sequence patterns occur in conserved motif blocks. These kernels can be used in the presence of a library of known transcription factor binding sites, or de novo by iterating over all k-mers of a given length. In the latter mode, a discriminative classifier built from such a kernel not only recognizes a given class of promoter regions, but as a side effect also identifies a collection of relevant, discriminative sequence motifs. We demonstrate the utility of the motif-based multiple alignment kernels by using a collection of aligned promoter regions from five yeast species to recognize classes of cell-cycle regulated genes.
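
To make the de novo mode concrete, the following is a minimal sketch of a k-mer count kernel on single, unaligned sequences. It is a deliberately simplified stand-in and not the kernels of the paper: it ignores the multiple alignment, the phylogenetic tree, and motif conservation. The function names `all_kmers`, `motif_features`, and `motif_kernel` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Enumerate every k-mer over the alphabet (4**k strings for DNA)."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]


def motif_features(seq, k):
    """Map a sequence to a vector of overlapping k-mer occurrence counts.

    Assumption: each k-mer is treated as a candidate motif and matched
    exactly; no alignment or phylogeny is used.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts[kmer] for kmer in all_kmers(k)]


def motif_kernel(seq_a, seq_b, k):
    """Kernel value: inner product of the two k-mer count vectors."""
    features_a = motif_features(seq_a, k)
    features_b = motif_features(seq_b, k)
    return sum(x * y for x, y in zip(features_a, features_b))
```

Because each feature corresponds to one k-mer, the weights a linear classifier learns over these features can be read as scores for individual candidate motifs, which is the sense in which such a classifier identifies discriminative motifs as a side effect.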
