Feature Selection Methods for Improving Protein Structure Prediction with Rosetta
Blum, Ben, Baker, David, Jordan, Michael I., Bradley, Philip, Das, Rhiju, Kim, David E.
Neural Information Processing Systems
Rosetta is one of the leading algorithms for protein structure prediction today. It is a Monte Carlo energy minimization method requiring many random restarts to find structures with low energy. In this paper we present a resampling technique for structure prediction of small alpha/beta proteins using Rosetta. From an initial round of Rosetta sampling, we learn properties of the energy landscape that guide a subsequent round of sampling toward lower-energy structures. Rather than attempt to fit the full energy landscape, we use feature selection methods (both L1-regularized linear regression and decision trees) to identify structural features that give rise to low energy. We then enrich these structural features in the second sampling round. Results are presented across a benchmark set of nine small alpha/beta proteins, demonstrating that our methods seldom impair, and frequently improve, Rosetta's performance.
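As an illustration of the feature selection step only, the following is a minimal sketch of L1-regularized linear regression fit by coordinate descent, written in plain Python. It is not the authors' implementation and does not call Rosetta. The data are synthetic, and the binary feature encoding, the energy model, the regularization strength `lam`, and the names `lasso_fit` and `soft_threshold` are all assumptions made for this example. Each row stands for one decoy from the first sampling round, each column for a hypothetical structural feature that is either present (1) or absent (0), and the regression target is the decoy's energy.

```python
import random


def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values inside [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_fit(X, y, lam, n_sweeps=100):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + lam*||w||_1.

    Columns of X and the target y are mean-centered first, so no
    intercept term is needed.
    """
    n, d = len(X), len(X[0])
    col_mean = [sum(row[j] for row in X) / n for j in range(d)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_mean[j] for j in range(d)] for row in X]
    resid = [yi - y_mean for yi in y]  # residual for the initial w = 0
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    w = [0.0] * d
    for _ in range(n_sweeps):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # constant feature carries no information
            # Correlation of feature j with the residual that excludes
            # feature j's own current contribution.
            rho = sum(Xc[i][j] * (resid[i] + Xc[i][j] * w[j])
                      for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                w[j] = new_wj
    return w


# Synthetic stand-in for a first round of sampling (assumed, not real data):
# 200 decoys, 6 hypothetical binary structural features.
rng = random.Random(0)
n_decoys, n_features = 200, 6
X = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_decoys)]
# Assumed toy energy model: features 0 and 2 lower the energy, the rest are noise.
energy = [-3.0 * row[0] - 2.0 * row[2] + rng.gauss(0.0, 0.5) for row in X]

w = lasso_fit(X, energy, lam=0.2)
# Features with negative weight are associated with lower energy and would be
# the candidates to enrich in a second sampling round.
selected = [j for j, wj in enumerate(w) if wj < 0.0]
print("weights:", [round(wj, 3) for wj in w])
print("features associated with low energy:", selected)
```

The L1 penalty drives the weights of uninformative features to exactly zero, which is what makes the fitted model usable as a feature selector rather than as a fit to the full energy landscape. How the selected features are then enforced during resampling is not specified in the abstract and is left out of this sketch.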
Dec-31-2008