Research Paper · Researchia:202603.12018 [Biotechnology > Biology]

Continuous Diffusion Transformers for Designing Synthetic Regulatory Elements

Jonathan Liu

Abstract

We present a parameter-efficient Diffusion Transformer (DiT) for generating 200bp cell-type-specific regulatory DNA sequences. By replacing the U-Net backbone of DNA-Diffusion with a transformer denoiser equipped with a 2D CNN input encoder, our model matches the U-Net's best validation loss in 13 epochs (60× fewer) and converges to a validation loss 39% lower, while reducing memorization, measured as the fraction of generated sequences that align to training data via BLAT, from 5.3% to 1.7%. Ablations show the CNN encoder is essential: without it, validation loss increases by 70% regardless of positional embedding choice. We further apply DDPO finetuning using Enformer as a reward model, achieving a 38× improvement in predicted regulatory activity. Cross-validation against DRAKES on an independent prediction task confirms that the improvements reflect genuine regulatory signal rather than reward model overfitting.
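As a rough illustration of what "continuous diffusion" over DNA can look like, the sketch below one-hot encodes a 200bp sequence into a 4 x 200 matrix and applies the standard DDPM forward noising step. This is a minimal sketch under stated assumptions, not the paper's implementation: the channel order `ACGT`, the value of `alpha_bar`, the random stand-in sequence, and the helper names `one_hot`, `forward_noise`, and `decode` are all hypothetical choices made for illustration.

```python
import math
import random

BASES = "ACGT"  # channel order is an assumption for this sketch


def one_hot(seq):
    """Encode a DNA string as a 4 x L matrix of 0.0/1.0 (rows follow BASES)."""
    return [[1.0 if base == b else 0.0 for base in seq] for b in BASES]


def forward_noise(x0, alpha_bar, rng):
    """Standard DDPM forward step: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps, eps ~ N(0, 1)."""
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [[signal * v + noise * rng.gauss(0.0, 1.0) for v in row] for row in x0]


def decode(x):
    """Map a (possibly noisy) 4 x L matrix back to bases by per-position argmax."""
    length = len(x[0])
    return "".join(
        BASES[max(range(4), key=lambda c: x[c][pos])] for pos in range(length)
    )


rng = random.Random(0)
# Random stand-in for a 200bp regulatory element (not real data).
seq = "".join(rng.choice(BASES) for _ in range(200))
x0 = one_hot(seq)
# alpha_bar = 0.9 is an arbitrary illustrative noise level.
x_t = forward_noise(x0, alpha_bar=0.9, rng=rng)
```

A denoiser, whether a U-Net or a transformer, would be trained to recover the clean signal from inputs like `x_t`; generated matrices can then be mapped back to sequences with a per-position argmax as in `decode`.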


Source: arXiv:2603.10885v1 - http://arxiv.org/abs/2603.10885v1
PDF: https://arxiv.org/pdf/2603.10885v1

Submission: 3/12/2026
Subjects: Biology; Biotechnology