e2efold
Machine Learning Tool May Help Us Better Understand RNA Viruses
E2Efold is an end-to-end deep learning model developed at Georgia Tech that can predict RNA secondary structures, an important task in virus analysis, drug design, and other public health applications. Although the model has yet to be used in real-life applications, in research testing it has shown at least a 10 percent improvement in structure prediction accuracy over previous state-of-the-art methods, according to Xinshi Chen, a Georgia Tech Ph.D. student specializing in machine learning and co-developer of the new tool. "The model uses an unrolled algorithm for solving a constrained optimization as a component in the neural network architecture, so that it can directly incorporate a solution constraint, or prior knowledge, to predict the RNA base-pairing matrix," said Chen. E2Efold is not only more accurate but also considerably faster than current techniques. Current methods are based on dynamic programming, which is much slower when predicting structures for longer RNA sequences, such as the genomic RNA of a virus. E2Efold overcomes this drawback by using a gradient-based unrolled algorithm.
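To make the cost of dynamic programming concrete, here is a minimal sketch of a classical Nussinov-style recurrence that maximizes the number of base pairs. It is an illustration only: the article does not say which dynamic programming methods were compared, and the function names, the allowed pair set, and the `min_loop` parameter are assumptions made for this example. The three nested loops give cubic running time in the sequence length, which is why this family of methods slows down on long sequences.

```python
# Illustrative Nussinov-style dynamic program (not the method from the article).
# Allowed pairs: Watson-Crick plus G-U wobble (an assumption for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):          # O(n) subsequence lengths
        for i in range(n - span):                # O(n) start positions
            j = i + span
            score = best[i][j - 1]               # case: base j stays unpaired
            for k in range(i, j - min_loop):     # O(n) partners for base j
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


pairs_found = max_pairs("GGGAAAUCC")
```

Note that this recurrence only considers nested pairs, so it cannot represent pseudoknots.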
RNA Secondary Structure Prediction By Learning Unrolled Algorithms
Xinshi Chen, Yu Li, Ramzan Umarov, Xin Gao, Le Song
In this paper, we propose an end-to-end deep learning model, called E2Efold, for RNA secondary structure prediction which can effectively take into account the inherent constraints in the problem. The key idea of E2Efold is to directly predict the RNA base-pairing matrix, and use an unrolled algorithm for constrained programming as the template for deep architectures to enforce constraints. With comprehensive experiments on benchmark datasets, we demonstrate the superior performance of E2Efold: it predicts significantly better structures compared to previous SOTA (especially for pseudoknotted structures), while being as efficient as the fastest algorithms in terms of inference time.
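The idea of using an unrolled algorithm to enforce constraints can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names, the constraint set (symmetric matrix, entries in [0, 1], allowed base types, a minimum loop length, at most one partner per base), and the fixed step sizes `lr` and `dual_lr` are all hypothetical choices for this example. In a trained model the pairing scores would come from a neural network and the step sizes would presumably be learned; here they are plain constants.

```python
# Simplified sketch of an unrolled primal-dual solver (assumptions throughout).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pairing_mask(seq, min_loop=3):
    """1.0 where bases i and j may pair (allowed types, far enough apart)."""
    n = len(seq)
    return [[1.0 if abs(i - j) > min_loop and (seq[i], seq[j]) in PAIRS else 0.0
             for j in range(n)] for i in range(n)]


def unrolled_solver(scores, mask, steps=50, lr=0.1, dual_lr=0.1):
    """Run a fixed number of projected primal-dual gradient steps.

    Hard constraints (symmetry, box [0, 1], mask) hold by construction.
    The "at most one partner per base" row-sum constraint is only encouraged
    through the multipliers lam, so it may be slightly violated on exit.
    """
    n = len(scores)
    # Symmetrize the scores and zero out disallowed positions.
    s = [[0.5 * (scores[i][j] + scores[j][i]) * mask[i][j] for j in range(n)]
         for i in range(n)]
    a = [[0.0] * n for _ in range(n)]   # relaxed base-pairing matrix
    lam = [0.0] * n                     # one multiplier per row-sum constraint
    for _ in range(steps):              # the "unrolled" iterations
        # Primal step: ascend the Lagrangian, then project onto [0, 1] and mask.
        for i in range(n):
            for j in range(n):
                grad = s[i][j] - 0.5 * (lam[i] + lam[j])
                a[i][j] = min(1.0, max(0.0, a[i][j] + lr * grad)) * mask[i][j]
        # Dual step: raise lam[i] when row i has more than one unit of pairing.
        for i in range(n):
            lam[i] = max(0.0, lam[i] + dual_lr * (sum(a[i]) - 1.0))
    return a


# Usage with toy scores: one favored pair (0, 8), everything else penalized.
seq = "GGGAAAUCC"
n = len(seq)
toy_scores = [[-1.0] * n for _ in range(n)]
toy_scores[0][8] = toy_scores[8][0] = 1.0
pairing = unrolled_solver(toy_scores, pairing_mask(seq))
```

Because every step is a simple differentiable operation (apart from the clipping), a loop of this kind can be placed after a score-predicting network and trained end to end, which is the general design the abstract describes. Unlike the nested recurrences of dynamic programming, nothing here forbids crossing pairs, so pseudoknotted structures remain representable.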