MotifBench: A standardized protein design benchmark for motif-scaffolding problems
Zheng, Zhuoqi, Zhang, Bo, Didi, Kieran, Yang, Kevin K., Yim, Jason, Watson, Joseph L., Chen, Hai-Feng, Trippe, Brian L.
arXiv.org Artificial Intelligence
The motif-scaffolding problem is a central task in computational protein design: given the coordinates of atoms in a geometry chosen to confer a desired biochemical function (a motif), the task is to identify diverse protein structures (scaffolds) that include the motif and maintain its geometry. Significant recent progress on motif-scaffolding has been made possible by computational evaluation with reliable protein structure prediction and fixed-backbone sequence design methods [1-17]. However, wide variability in evaluation strategies across publications has hindered comparability of results, challenged reproducibility, and impeded robust progress. In response, we introduce MotifBench, comprising (1) a precisely specified pipeline and evaluation metrics, (2) a collection of 30 benchmark problems, and (3) an implementation of this benchmark and leaderboard at github.com/blt2114/MotifBench. The MotifBench test cases are more difficult than those of earlier benchmarks (e.g. …
A motif-scaffolding method takes a motif as input and returns a set of putatively compatible scaffolds as output. This section details how motifs and scaffolds in MotifBench are specified, proposes metrics by which a scaffold set is evaluated, and describes how these metrics are computed. Appendix A describes the considerations on which these specifications and metrics were chosen.
Motif specification (inputs): A motif is specified by the coordinates of the backbone atoms of several residues and (in some cases) the amino acid types of a subset of those residues.
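To make the motif specification concrete, the following is a minimal Python sketch of one way such an input could be represented in memory. It is not the MotifBench file format or API: the class names MotifResidue and Motif, the helper fixed_types, and the choice of four backbone atoms (N, CA, C, O) are all assumptions made for illustration, since the text only states that a motif consists of backbone-atom coordinates plus amino acid types for a subset of residues. The coordinates in the example are made up.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Coord = Tuple[float, float, float]

# Assumption: which atoms count as "backbone" is not stated in the text;
# N, CA, C and O is a common convention and is used here for illustration.
BACKBONE_ATOMS = ("N", "CA", "C", "O")


@dataclass(frozen=True)
class MotifResidue:
    """One motif residue (hypothetical container, not a MotifBench type)."""

    backbone: Dict[str, Coord]      # atom name -> (x, y, z) coordinates
    aa_type: Optional[str] = None   # one-letter code, or None if unconstrained

    def __post_init__(self) -> None:
        # Every residue must carry a coordinate for each backbone atom.
        missing = [a for a in BACKBONE_ATOMS if a not in self.backbone]
        if missing:
            raise ValueError(f"missing backbone atoms: {missing}")


@dataclass(frozen=True)
class Motif:
    """A motif: backbone coordinates for several residues, with amino acid
    types fixed for only a subset of them."""

    residues: Tuple[MotifResidue, ...]

    def fixed_types(self) -> Dict[int, str]:
        """Return {residue index: amino acid type} for constrained residues."""
        return {
            i: r.aa_type
            for i, r in enumerate(self.residues)
            if r.aa_type is not None
        }


# Made-up coordinates, purely to show the shape of the data.
example = Motif(
    residues=(
        MotifResidue(
            backbone={"N": (0.0, 0.0, 0.0), "CA": (1.5, 0.0, 0.0),
                      "C": (2.0, 1.4, 0.0), "O": (1.3, 2.4, 0.0)},
            aa_type="H",  # amino acid type specified for this residue
        ),
        MotifResidue(
            backbone={"N": (3.3, 1.5, 0.0), "CA": (4.0, 2.8, 0.0),
                      "C": (5.5, 2.6, 0.0), "O": (6.0, 1.5, 0.0)},
            # aa_type left as None: only the backbone geometry is constrained
        ),
    )
)
```

Keeping the amino acid type optional per residue mirrors the statement that types are given only "in some cases" and only for a subset of residues; a scaffolding method consuming such an object would be free to choose the identity of the unconstrained positions.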
Feb-17-2025