mec score
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- North America > United States > California (0.04)
- Asia > Singapore (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Graph Coloring via Neural Networks for Haplotype Assembly and Viral Quasispecies Reconstruction
The pseudocode for NeurHap-refine is as follows: Algorithm 1: The Local Refinement Algorithm NeurHap-refine. Two categories of datasets are used in the paper: polyploid species and viral quasispecies. BWA-MEM [Li, 2013] is used to align reads to the reference genome. The detailed command (taking the 15-strain ZIKV as an example) is: $ ./bwa … [… Vikalo, 2020a,b] to derive the SNP matrix from the above alignment to ensure a fair comparison.
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- North America > United States > California (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Graph Coloring via Neural Networks for Haplotype Assembly and Viral Quasispecies Reconstruction
Xue, Hansheng, Rajan, Vaibhav, Lin, Yu
Understanding genetic variation, e.g., through mutations, in organisms is crucial to unravel their effects on the environment and human health. A fundamental characterization can be obtained by solving the haplotype assembly problem, which yields the variation across multiple copies of chromosomes. Variations among fast-evolving viruses that lead to different strains (called quasispecies) are also deciphered with similar approaches. In both cases, high-throughput sequencing technologies, which provide oversampled mixtures of large noisy fragments (reads) of genomes, are used to infer the constituent components (haplotypes or quasispecies). The problem is harder for polyploid species, which have more than two copies of chromosomes. State-of-the-art neural approaches to this NP-hard problem do not adequately model relations among the reads that are important for deconvolving the input signal. We address this problem by developing a new method, called NeurHap, that combines graph representation learning with combinatorial optimization. Our experiments demonstrate substantially better performance of NeurHap on real and synthetic datasets compared to competing approaches.
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- North America > United States > California (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
A Graph Auto-Encoder for Haplotype Assembly and Viral Quasispecies Reconstruction
Reconstructing components of a genomic mixture from data obtained by means of DNA sequencing is a challenging problem encountered in a variety of applications including single individual haplotyping and studies of viral communities. High-throughput DNA sequencing platforms oversample mixture components to provide massive amounts of reads whose relative positions can be determined by mapping the reads to a known reference genome; assembly of the components, however, requires discovery of the reads' origin, an NP-hard problem that existing methods struggle to solve with the required level of accuracy. In this paper, we present a learning framework based on a graph auto-encoder designed to exploit structural properties of sequencing data. The algorithm is a neural network which essentially trains to ignore sequencing errors and infers the posterior probabilities of the origin of sequencing reads. Mixture components are then reconstructed by finding the consensus of the reads determined to originate from the same genomic component. Results on realistic synthetic as well as experimental data demonstrate that the proposed framework reliably assembles haplotypes and reconstructs viral communities, often significantly outperforming state-of-the-art techniques.
- Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
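
The consensus step described in the abstract above, together with the minimum error correction (MEC) score commonly used to evaluate haplotype assemblies, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: reads are equal-length strings over SNP positions, `-` marks a position a read does not cover, the cluster labels are taken as given (no neural network is involved), and the names `consensus` and `mec_score` are hypothetical.

```python
from collections import Counter

GAP = "-"  # assumed placeholder for SNP positions a read does not cover


def consensus(reads, labels, k):
    """Majority-vote consensus string for each of k read clusters.

    A column with no covering read in a cluster stays GAP; ties are
    broken by whichever allele was seen first.
    """
    n_cols = len(reads[0])
    haplotypes = []
    for c in range(k):
        members = [r for r, lab in zip(reads, labels) if lab == c]
        hap = []
        for j in range(n_cols):
            counts = Counter(r[j] for r in members if r[j] != GAP)
            hap.append(counts.most_common(1)[0][0] if counts else GAP)
        haplotypes.append("".join(hap))
    return haplotypes


def mec_score(reads, labels, haplotypes):
    """Total number of covered positions where a read disagrees with
    the consensus of the cluster it is assigned to."""
    total = 0
    for r, lab in zip(reads, labels):
        h = haplotypes[lab]
        total += sum(1 for a, b in zip(r, h)
                     if a != GAP and b != GAP and a != b)
    return total


# Toy input (made up for illustration): six reads over four SNP
# positions, already assigned to two clusters.
reads = ["AC-T", "ACG-", "-CGT", "TG-A", "TGC-", "AGCA"]
labels = [0, 0, 0, 1, 1, 1]

haps = consensus(reads, labels, k=2)
print(haps)                            # ['ACGT', 'TGCA']
print(mec_score(reads, labels, haps))  # 1
```

In the toy input, the last read disagrees with its cluster consensus at the first position, which accounts for the single mismatch counted by the score. A lower MEC score means the read assignment explains the observed reads with fewer assumed sequencing errors.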