Research Paper | Researchia:202602.20018 | Biotechnology > Biology

JEPA-DNA: Grounding Genomic Foundation Models through Joint-Embedding Predictive Architectures

Ariel Larey

Abstract

Genomic Foundation Models (GFMs) have largely relied on Masked Language Modeling (MLM) or Next Token Prediction (NTP) to learn the language of life. While these paradigms excel at capturing local genomic syntax and fine-grained motif patterns, they often fail to capture broader functional context, yielding representations that lack a global biological perspective. We introduce JEPA-DNA, a pre-training framework that integrates the Joint-Embedding Predictive Architecture (JEPA) with traditional generative objectives. JEPA-DNA adds latent grounding: it couples token-level recovery with a predictive objective in latent space, applied by supervising a CLS token. This forces the model to predict the high-level functional embeddings of masked genomic segments rather than focusing solely on individual nucleotides. JEPA-DNA extends both the NTP and MLM paradigms and can be deployed either as a standalone from-scratch objective or as a continual pre-training enhancement for existing GFMs. Our evaluations across a diverse suite of genomic benchmarks show that JEPA-DNA consistently outperforms generative-only baselines on supervised and zero-shot tasks. By providing a more robust and biologically grounded representation, JEPA-DNA offers a scalable path toward foundation models that understand not only the genomic alphabet but also the underlying functional logic of the sequence.
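
The coupling of token-level recovery with a latent-space objective on a CLS token can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mean-squared-error distance, and the `latent_weight` parameter are all assumptions made for illustration, and the target CLS embedding is simply passed in (JEPA-style setups often obtain it from a separate target encoder, which the abstract does not specify).

```python
import math


def cross_entropy(logits, target_index):
    """Negative log-likelihood of the target class under a softmax over logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]


def token_recovery_loss(token_logits, token_targets, masked_positions):
    """Mean cross-entropy over masked positions only (MLM-style token recovery)."""
    losses = [cross_entropy(token_logits[i], token_targets[i]) for i in masked_positions]
    return sum(losses) / len(losses)


def latent_loss(predicted_cls, target_cls):
    """Mean squared error between predicted and target CLS embeddings.

    MSE is an assumed choice of distance; the abstract does not name one.
    """
    return sum((p - t) ** 2 for p, t in zip(predicted_cls, target_cls)) / len(predicted_cls)


def combined_loss(token_logits, token_targets, masked_positions,
                  predicted_cls, target_cls, latent_weight=1.0):
    """Token-level recovery plus a weighted latent prediction term.

    latent_weight is a hypothetical hyperparameter balancing the two objectives.
    """
    recovery = token_recovery_loss(token_logits, token_targets, masked_positions)
    latent = latent_loss(predicted_cls, target_cls)
    return recovery + latent_weight * latent


if __name__ == "__main__":
    # Toy example: 3 positions over the alphabet A, C, G, T; positions 1 and 2 are masked.
    logits = [[2.0, 0.1, 0.1, 0.1], [0.1, 0.1, 1.5, 0.1], [0.1, 0.1, 0.1, 0.9]]
    targets = [0, 2, 3]
    print(combined_loss(logits, targets, [1, 2], [0.2, -0.1], [0.25, 0.0], latent_weight=0.5))
```

The point of the sketch is only the shape of the objective: the recovery term looks at individual nucleotides, while the latent term compares whole-segment embeddings, so the gradient carries both local and global signal.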


Source: arXiv:2602.17162v1 (http://arxiv.org/abs/2602.17162v1)
PDF: https://arxiv.org/pdf/2602.17162v1

Submission: 2/20/2026
Subjects: Biology; Biotechnology
