Research Paper | Researchia:202603.23021

Reinforcement-guided generative protein language models enable de novo design of highly diverse AAV capsids

Lucas Ferraz


Submitted: March 23, 2026
Subjects: Biochemistry; Pharmaceutical Research

Description / Details

Adeno-associated viral (AAV) vectors are widely used delivery platforms in gene therapy, and the design of improved capsids is key to expanding their therapeutic potential. A central challenge in AAV bioengineering, as in protein design more broadly, is the vast sequence design space relative to the scale of feasible experimental screening. Machine-guided generative approaches provide a powerful means of navigating this landscape and proposing novel protein sequences that satisfy functional constraints. Here, we develop a generative design framework based on protein language models and reinforcement learning to generate highly novel yet functionally plausible AAV capsids. A pretrained model was fine-tuned on experimentally validated capsid sequences to learn patterns associated with viability. Reinforcement learning was then used to guide sequence generation, with a reward function that jointly promoted predicted viability and sequence novelty, thereby enabling exploration beyond regions represented in the training data. Comparative analyses showed that fine-tuning alone produces sequences with high predicted viability but remains biased toward the training distribution, whereas reinforcement learning-guided generation reaches more distant regions of sequence space while maintaining high predicted viability. Finally, we propose a candidate selection strategy that integrates predicted viability, sequence novelty, and biophysical properties to prioritize variants for downstream evaluation. This work establishes a framework for the generative exploration of protein sequence space and advances the application of generative protein language models to AAV bioengineering.
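
The abstract describes a reward that jointly promotes predicted viability and sequence novelty but does not give its functional form. As an illustration only, the sketch below assumes a weighted sum of the two terms, with novelty taken as the normalized edit distance to the nearest reference sequence. The names `novelty`, `reward`, `predict_viability`, `w_viability`, and `w_novelty` are hypothetical and do not come from the paper. The viability predictor is passed in as a plain callable returning a score in [0, 1].

```python
from typing import Callable, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def novelty(seq: str, references: Sequence[str]) -> float:
    """Normalized distance to the closest reference sequence, in [0, 1].

    Assumption: novelty is measured against a reference set such as the
    training sequences; the paper's actual novelty measure may differ.
    """
    if not references:
        return 1.0
    return min(edit_distance(seq, r) / max(len(seq), len(r), 1)
               for r in references)


def reward(seq: str,
           references: Sequence[str],
           predict_viability: Callable[[str], float],
           w_viability: float = 0.5,
           w_novelty: float = 0.5) -> float:
    """Hypothetical reward: weighted sum of predicted viability and novelty."""
    return (w_viability * predict_viability(seq)
            + w_novelty * novelty(seq, references))
```

With equal weights, a sequence that stays close to the reference set can earn at most about half the maximum reward even when its predicted viability is high, which is one simple way to push generation away from the training distribution. Other combinations, such as a product of the two terms or a novelty threshold, would fit the abstract's description equally well.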


Source: arXiv:2603.19473v1 (http://arxiv.org/abs/2603.19473v1)
PDF: https://arxiv.org/pdf/2603.19473v1
