Flexible Kernels for Protein Property Prediction
Abstract
Despite its importance to applications in protein design, predicting protein properties like binding affinity and thermostability from sparse experimental data remains a significant challenge. Accordingly, we introduce a class of sequence kernels that exploit evolutionary substitution matrices as well as local linearity and demonstrate that the resulting Gaussian processes provide data-efficient models of protein property landscapes, frequently outperforming alternatives that rely on foundation model embeddings. Furthermore, by learning what are in effect structure-aware substitution matrices, we show that our kernels can readily incorporate structural information from foundation models. We demonstrate that these structure-conditioned kernels are well suited to multi-task learning across multiple protein property landscapes and can decisively outperform local supervised learning methods.
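To make the idea of a substitution-matrix kernel concrete, the sketch below builds a simple per-site additive kernel, k(x, y) = sum_i S[x_i, y_i], and uses it for Gaussian process regression. It is an illustration under stated assumptions, not the construction from the paper: the four-letter alphabet, the score matrix (which is not BLOSUM or any published matrix), the sequences, and the measurements are all hypothetical toy values, and the positive semidefinite projection is just one common way to turn a symmetric score matrix into a valid kernel.

```python
import numpy as np

ALPHABET = "ACDE"  # toy alphabet (assumption); real proteins use 20 residues
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}

# Hypothetical symmetric similarity scores, made up for illustration.
RAW_SCORES = np.array([
    [4.0, 0.0, -2.0, -1.0],
    [0.0, 9.0, -3.0, -4.0],
    [-2.0, -3.0, 6.0, 2.0],
    [-1.0, -4.0, 2.0, 5.0],
])


def psd_projection(m):
    """Symmetrize and clip negative eigenvalues so m is a valid residue kernel."""
    sym = (m + m.T) / 2.0
    vals, vecs = np.linalg.eigh(sym)
    return (vecs * np.clip(vals, 0.0, None)) @ vecs.T


def encode(seqs):
    """Map equal-length sequences to an integer array of shape (n, length)."""
    return np.array([[INDEX[aa] for aa in s] for s in seqs])


def substitution_kernel(xa, xb, s):
    """Per-site additive kernel: k(x, y) = sum_i s[x_i, y_i]."""
    # Broadcasting the two index arrays yields shape (n_a, n_b, length).
    return s[xa[:, None, :], xb[None, :, :]].sum(axis=-1)


def gp_posterior(k_train, k_cross, k_test_diag, y, noise=1e-2):
    """Zero-mean GP posterior mean and variance via a Cholesky factorization."""
    chol = np.linalg.cholesky(k_train + noise * np.eye(len(y)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    v = np.linalg.solve(chol, k_cross.T)
    return k_cross @ alpha, k_test_diag - (v ** 2).sum(axis=0)


S = psd_projection(RAW_SCORES)

# Toy training set: sequences with made-up property values.
X_train = encode(["ACDE", "ACDA", "CCDE", "AEDE", "DCAE"])
y_train = np.array([1.2, 0.7, 1.5, 0.3, -0.4])
X_test = encode(["ACDD", "CCDA"])

K = substitution_kernel(X_train, X_train, S)
K_cross = substitution_kernel(X_test, X_train, S)
K_test_diag = np.diag(substitution_kernel(X_test, X_test, S))

# Center the targets so the zero-mean GP assumption is reasonable.
offset = y_train.mean()
mean, var = gp_posterior(K, K_cross, K_test_diag, y_train - offset)
mean = mean + offset

print("posterior mean:", mean)
print("posterior variance:", var)
```

Because this kernel is a sum of independent per-site terms, it cannot capture interactions between positions. Learning the entries of S from data, or conditioning them on structure as the abstract describes, would go beyond this sketch.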
Source: arXiv:2606.11057v1 - http://arxiv.org/abs/2606.11057v1 PDF: https://arxiv.org/pdf/2606.11057v1
Jun 10, 2026
Pharmaceutical Research
Biochemistry