Directed Evolution of Proteins via Bayesian Optimization in Embedding Space

Matouš Soldát, Jiří Kléma

arXiv.org Artificial Intelligence 

Abstract--Directed evolution is an iterative laboratory process for designing proteins with improved function: in each round, new protein variants are synthesized and their desired property is evaluated with expensive and time-consuming biochemical screening. Machine learning methods can help select informative or promising variants for screening, increasing the quality of the screened variants and reducing the amount of screening required. In this paper, we present a novel method for machine-learning-assisted directed evolution of proteins that combines Bayesian optimization with an informative representation of protein variants extracted from a pre-trained protein language model. We demonstrate that the new representation, based on sequence embeddings, significantly improves the performance of Bayesian optimization, yielding better results with the same total amount of screening. At the same time, our method outperforms state-of-the-art machine-learning-assisted directed evolution methods with a regression objective.

Protein engineering (PE) is the process of designing proteins with desired properties, such as improved stability, catalytic function, or specific binding affinity [1]. PE can be leveraged in industrial and environmental applications, medicine, nanobiotechnology, and other fields [1]. Because the functional properties of a protein are determined by its sequence of amino acids [2], the task of PE translates to finding an amino acid sequence with the desired properties or function.
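To make the idea in the abstract concrete, the following is a minimal sketch of a Bayesian optimization loop over a fixed pool of variant embeddings. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the surrogate model, the acquisition function, or the protein language model, so the sketch assumes a zero-mean Gaussian process with an RBF kernel and an upper-confidence-bound acquisition rule. The names `rbf_kernel`, `gp_posterior`, `run_bo`, and `screen` are hypothetical, and random vectors stand in for real sequence embeddings.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


def gp_posterior(x_train, y_train, x_cand, noise=1e-3, length_scale=1.0):
    """Posterior mean and standard deviation of a zero-mean GP at x_cand."""
    k_tt = rbf_kernel(x_train, x_train, length_scale)
    k_tt += noise * np.eye(len(x_train))
    k_tc = rbf_kernel(x_train, x_cand, length_scale)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_tc.T @ alpha
    v = np.linalg.solve(chol, k_tc)
    var = 1.0 - np.sum(v**2, axis=0)  # the RBF prior variance k(x, x) is 1
    return mean, np.sqrt(np.maximum(var, 1e-12))


def run_bo(embeddings, screen, n_init=5, n_rounds=20, beta=2.0,
           length_scale=1.0, seed=0):
    """Screen n_init random variants, then n_rounds variants chosen by UCB.

    embeddings: (n_variants, dim) array, one row per candidate variant.
    screen: hypothetical stand-in for the wet-lab assay; maps a variant
            index to a measured fitness value.
    """
    rng = np.random.default_rng(seed)
    screened = [int(i) for i in
                rng.choice(len(embeddings), size=n_init, replace=False)]
    fitness = [screen(i) for i in screened]
    for _ in range(n_rounds):
        y = np.asarray(fitness, dtype=float)
        y_mean = y.mean()  # center targets so the zero-mean prior is sensible
        mean, std = gp_posterior(embeddings[screened], y - y_mean,
                                 embeddings, length_scale=length_scale)
        ucb = mean + y_mean + beta * std
        ucb[screened] = -np.inf  # never re-screen a variant
        nxt = int(np.argmax(ucb))
        screened.append(nxt)
        fitness.append(screen(nxt))
    best = int(np.argmax(fitness))
    return screened[best], fitness[best], screened


if __name__ == "__main__":
    # Synthetic stand-in data: random "embeddings" and a toy fitness function.
    rng = np.random.default_rng(42)
    emb = rng.normal(size=(200, 8))
    target = emb[17]
    toy_screen = lambda i: -float(np.sum((emb[i] - target) ** 2))
    idx, fit, history = run_bo(emb, toy_screen, length_scale=3.0)
    print(f"best variant index: {idx}, fitness: {fit:.3f}, "
          f"screened: {len(history)}")
```

In a real setting, the rows of `embeddings` would come from a pre-trained protein language model and `screen` would be replaced by the biochemical assay; the screening budget is `n_init + n_rounds` variants. The kernel length scale and the exploration weight `beta` are illustrative values that would need tuning for actual data.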
