Research Paper
Researchia:202605.11025

GoForth: Language Models for RNA Design under Structure, Sequence, and Coding Constraints

Michael Lindsey

Submitted: May 11, 2026
Subjects: Biochemistry; Pharmaceutical Research

Description / Details

RNA inverse sequence design has broad biological and engineering applications, but computational methods for practical design queries remain limited. Such queries may impose several constraints at once, including target folds or motifs, fixed bases, and coding restrictions, while leaving arbitrary sequence and structure in unspecified regions. Because these constraints may permit many acceptable sequences, we study RNA design as a conditional generative modeling problem. The basic object is a conditional law over RNA sequences given a user-specified condition, with full inverse folding as a special case. We introduce GoForth, a forward-trained RNA language model that conditions on structure, sequence, and coding targets. The formulation separates three ingredients that are often entangled in RNA design: a sequence prior, a forward folding sampler, and a reward or likelihood oracle. We train encoder-decoder models on witnessed folds rather than on outputs from an inverse-design teacher, and we validate our methodology on full inverse-folding benchmarks as well as on tasks involving constraints on structure, sequence, and coding. The resulting models achieve fast, high-quality candidate generation for mixed RNA design specifications. Moreover, they furnish useful semantic embeddings of design tasks and a robust learned notion of designability.
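To make the separation of ingredients concrete, the following is a minimal sketch of how "witnessed fold" training pairs could be produced, assuming a setup that the abstract does not spell out. Everything here is hypothetical and is not taken from GoForth: the uniform i.i.d. prior in `sample_prior`, the deterministic Nussinov-style base-pair maximization in `toy_fold` (standing in for a real forward folding sampler), the `?` wildcard format for partial conditions, and the `satisfies` check playing the role of a reward oracle. Coding constraints are omitted for brevity.

```python
import random

BASES = "ACGU"
CAN_PAIR = {"AU", "UA", "GC", "CG", "GU", "UG"}  # Watson-Crick plus wobble pairs


def sample_prior(length, rng):
    """Sequence prior (assumption): i.i.d. uniform bases."""
    return "".join(rng.choice(BASES) for _ in range(length))


def toy_fold(seq, min_loop=3):
    """Stand-in forward folder: Nussinov-style maximum base pairing.

    Returns a dot-bracket string. A real pipeline would use a
    thermodynamic folder or sampler instead of this toy.
    """
    n = len(seq)
    # best[i][j] = maximum number of nested pairs within seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j is unpaired
            for k in range(i, j - min_loop):  # case: k pairs with j
                if seq[k] + seq[j] in CAN_PAIR:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Trace back one optimal structure.
    struct = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if seq[k] + seq[j] in CAN_PAIR:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    struct[k], struct[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(struct)


def make_condition(seq, struct, rng, keep=0.5):
    """Hypothetical partial specification: '?' marks an unconstrained position."""
    return {
        "bases": "".join(b if rng.random() < keep else "?" for b in seq),
        "structure": "".join(s if rng.random() < keep else "?" for s in struct),
    }


def satisfies(seq, cond):
    """Toy oracle: fold the candidate and compare it with the condition."""
    if len(seq) != len(cond["bases"]):
        return False
    struct = toy_fold(seq)
    ok_bases = all(c in ("?", b) for c, b in zip(cond["bases"], seq))
    ok_struct = all(c in ("?", s) for c, s in zip(cond["structure"], struct))
    return ok_bases and ok_struct


def witnessed_pairs(num, length, seed=0):
    """Yield (condition, sequence) pairs derived from forward folds only."""
    rng = random.Random(seed)
    for _ in range(num):
        seq = sample_prior(length, rng)
        struct = toy_fold(seq)
        yield make_condition(seq, struct, rng), seq


if __name__ == "__main__":
    for cond, seq in witnessed_pairs(3, 24, seed=1):
        print(cond["structure"], cond["bases"], seq, satisfies(seq, cond))
```

In this sketch every training pair satisfies its own condition by construction, because the condition is read off a fold that was actually observed for that sequence. No inverse-design solver is needed to produce the data, which is the point of training in the forward direction.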


Source: arXiv:2605.07608v1 - http://arxiv.org/abs/2605.07608v1
PDF: https://arxiv.org/pdf/2605.07608v1
