PhyloGFN: Phylogenetic inference with generative flow networks
Mingyang Zhou, Zichao Yan, Elliot Layne, Nikolay Malkin, Dinghuai Zhang, Moksh Jain, Mathieu Blanchette, Yoshua Bengio
Phylogenetics is a branch of computational biology that studies the evolutionary relationships among biological entities. Its long history and numerous applications notwithstanding, inference of phylogenetic trees from sequence data remains challenging: the high complexity of tree space poses a significant obstacle for the current combinatorial and probabilistic techniques. In this paper, we adopt the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and Bayesian phylogenetic inference. Because GFlowNets are well-suited for sampling complex combinatorial structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies and evolutionary distances. We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets. PhyloGFN is competitive with prior works in marginal likelihood estimation and achieves a closer fit to the target distribution than state-of-the-art variational inference methods.
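To make the parsimony objective mentioned in the abstract concrete, here is a minimal sketch of scoring a fixed tree with Fitch's algorithm, a standard textbook method. It is only an illustration of what a parsimony score is, not PhyloGFN's sampler or the authors' code. The function names, the nested-tuple tree encoding, and the toy alignment are all assumptions made for this example.

```python
# Illustrative sketch only: Fitch's algorithm for the parsimony score of a
# fixed rooted binary tree. Names and data layout are hypothetical.

def fitch_site_score(tree, leaf_states):
    """Minimum number of state changes needed to explain one alignment site.

    tree: a leaf name (str) or a 2-tuple of subtrees.
    leaf_states: dict mapping each leaf name to its character at this site.
    """
    def visit(node):
        # A leaf contributes its observed state and no changes.
        if isinstance(node, str):
            return {leaf_states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            # The children can agree on a state: no extra change here.
            return common, left_cost + right_cost
        # The children disagree: one change is required on this node's edges.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def parsimony_score(tree, alignment):
    """Sum the per-site Fitch scores over all columns of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site_score(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )


# Toy data (assumed): four species, two aligned sites.
alignment = {"A": "AC", "B": "AC", "C": "GC", "D": "GT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(parsimony_score(tree_1, alignment))
print(parsimony_score(tree_2, alignment))
```

A lower score means the topology explains the sequences with fewer mutations. Parsimony-based inference searches tree space for topologies with low scores, and the size of that space is the difficulty the abstract refers to.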
Oct-12-2023
- Country: North America > Canada > Quebec (0.28)
- Genre: Research Report (1.00)