A fast numerical method for max-convolution and the application to efficient max-product inference in Bayesian networks

Serang, Oliver

arXiv.org Machine Learning 

In many fields it is common to have access to information about sums of random variables and to desire information about those variables themselves. In mass spectrometry, when two (or more) analytes with similar mass-to-charge 1 are measured, the intensity of the resulting peak is a function of the sum of abundances of those analytes (this problem occurs not only in the mass spectrometry of small molecules, but also in measuring isotope measurement in elemental and nuclear mass spectrometry). In transcriptomics, the abundance of a particular non-unique read (i.e., an RNA sequence that maps to multiple locations in the transcriptome or genome) provides information about the sum of the abundances of all transcripts that contain the read (each transcript weighted by how many copies of the read it carries). Proteomics has its own version of non-unique reads, shared peptides which can be found in multiple proteins (not only are shared peptides the principal source of difficulty in protein inference [14, 17, 18], they are also responsible for the difficulty evaluating putatative sets of discovered proteins [16, 19]). In population genetics, the prior knowledge about population structure can suggest an expected number of individuals with a particular genotype, which in turn yields probabilistic information about the individuals whose aggregate genotypes are expected to produce that sum (inference is particularly pronounced in polyploids, which increase the dimensionality of the problem [15]).

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found