Explorerโ€บPharmaceutical Researchโ€บBiochemistry
Research PaperResearchia:202604.14022

Biologically-Grounded Multi-Encoder Architectures as Developability Oracles for Antibody Design

Simon J. Crouzet

Abstract

Generative models can now propose thousands of \emph{de novo} antibody sequences, yet translating these designs into viable therapeutics remains constrained by the cost of biophysical characterization. Here we present CrossAbSense, a framework of property-specific neural oracles that combine frozen protein language model encoders with configurable attention decoders, identified through a systematic hyperparameter campaign totaling over 200 runs per property. On the GDPa1 benchmark of 242 therape...

Submitted: April 14, 2026Subjects: Biochemistry; Pharmaceutical Research

Description / Details

Generative models can now propose thousands of \emph{de novo} antibody sequences, yet translating these designs into viable therapeutics remains constrained by the cost of biophysical characterization. Here we present CrossAbSense, a framework of property-specific neural oracles that combine frozen protein language model encoders with configurable attention decoders, identified through a systematic hyperparameter campaign totaling over 200 runs per property. On the GDPa1 benchmark of 242 therapeutic IgGs, our oracles achieve notable improvements of 12--20% over established baselines on three of five developability assays and competitive performance on the remaining two. The central finding is that optimal decoder architectures \emph{invert} our initial biological hypotheses: self-attention alone suffices for aggregation-related properties (hydrophobic interaction chromatography, polyreactivity), where the relevant sequence signatures -- such as CDR-H3 hydrophobic patches -- are already fully resolved within single-chain embeddings by the high-capacity 6B encoder. Bidirectional cross-attention, by contrast, is required for expression yield and thermal stability -- properties that inherently depend on the compatibility between heavy and light chains. Learned chain fusion weights independently confirm heavy-chain dominance in aggregation (wH=0.62w_H = 0.62) versus balanced contributions for stability (wH=0.51w_H = 0.51). We demonstrate practical utility by deploying CrossAbSense on 100 IgLM-generated antibody designs, illustrating a path toward substantial reduction in experimental screening costs.


Source: arXiv:2604.09369v1 - http://arxiv.org/abs/2604.09369v1 PDF: https://arxiv.org/pdf/2604.09369v1 Original Link: http://arxiv.org/abs/2604.09369v1

Please sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Access Paper
View Source PDF
Submission Info
Date:
Apr 14, 2026
Topic:
Pharmaceutical Research
Area:
Biochemistry
Comments:
0
Bookmark