Kermut: Composite kernel regression for protein variant effects
Peter Mørch Groth, Mads Herbert Kerrn, Lars Olsen, Jesper Salomon, Wouter Boomsma
arXiv.org Artificial Intelligence
Reliable prediction of protein variant effects is crucial both for protein optimization and for advancing biological understanding. For practical use in protein engineering, it is important that predictions also come with reliable uncertainty estimates. While prediction accuracy has seen much progress in recent years, uncertainty metrics are rarely reported. Here we present Kermut, a Gaussian process regression model with a novel composite kernel for modelling mutation similarity, which obtains state-of-the-art performance for protein variant effect prediction while also offering estimates of uncertainty through its posterior. An analysis of the quality of the uncertainty estimates demonstrates that our model provides meaningful levels of overall calibration, but that instance-specific uncertainty calibration remains more challenging. We hope that this will encourage future work in this promising direction.
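To make the idea of a composite kernel with posterior uncertainty concrete, the following is a minimal NumPy sketch of Gaussian process regression with a kernel built from sums and products of simpler kernels. It is not Kermut's kernel: the feature split, the weights, the lengthscales, and every function name (`rbf`, `composite_kernel`, `gp_posterior`) are illustrative assumptions, and the data are synthetic.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def composite_kernel(xa, xb, weight=0.5):
    """Hypothetical composite kernel (an assumption, not the paper's definition).

    A product kernel over two feature groups is combined with an RBF over all
    features. Sums and products of valid kernels are again valid kernels.
    """
    k_local = rbf(xa[:, :2], xb[:, :2]) * rbf(xa[:, 2:], xb[:, 2:])
    k_global = rbf(xa, xb, lengthscale=2.0)
    return weight * k_local + (1.0 - weight) * k_global

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and latent variance of a zero-mean GP at x_test."""
    K = composite_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = composite_kernel(x_train, x_test)
    K_ss_diag = np.diag(composite_kernel(x_test, x_test))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = K_ss_diag - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative values from round-off

# Synthetic stand-in data: 4 made-up features per "variant".
rng = np.random.default_rng(0)
x_train = rng.normal(size=(30, 4))
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.normal(size=30)
x_test = rng.normal(size=(5, 4))

mean, var = gp_posterior(x_train, y_train, x_test)
std = np.sqrt(var)  # per-prediction uncertainty from the posterior
```

The posterior standard deviation `std` is the kind of per-prediction uncertainty whose calibration the abstract discusses; in this sketch it shrinks near training points and approaches the prior value of 1 far from them.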
Jul-9-2024
- Country:
  - Asia > Middle East
    - Lebanon > Keserwan-Jbeil Governorate > Blat (0.05)
  - Europe
    - Denmark > Capital Region
      - Copenhagen (0.04)
      - Kongens Lyngby (0.14)
    - France (0.04)
  - North America > United States
    - California > San Francisco County
      - San Francisco (0.14)
    - Massachusetts > Middlesex County
      - Cambridge (0.14)
    - New York > New York County
      - New York City (0.04)
- Genre:
  - Research Report (1.00)