Protein sequence design with deep generative models

Wu, Zachary, Johnston, Kadina E., Arnold, Frances H., Yang, Kevin K.

arXiv.org Machine Learning 

Proteins are encoded as linear chains of amino acids, which then fold into dynamic three-dimensional structures that accomplish a staggering variety of functions. To improve proteins for human purposes, protein engineers have developed a variety of experimental and computational methods for designing sequences that fold to desired structures or perform desired functions [1, 2, 3, 4]. A developing paradigm, machine learning-guided protein engineering, promises to leverage the information obtained from wet-lab experiments with data-driven models to find desirable proteins more efficiently [5, 6, 7]. Much of the early work has focused on incorporating discriminative models trained on measured sequence-fitness pairs to guide protein engineering [5]. However, methods that can take advantage of unlabeled protein sequences are improving the protein engineering paradigm.
