Research Paper | Researchia:202602.01002 [Biotechnology > Biology]

Toward Interpretable and Generalizable AI in Regulatory Genomics

Masayuki Nagai

Abstract

Deciphering how DNA sequence encodes gene regulation remains a central challenge in biology. Advances in machine learning and functional genomics have enabled sequence-to-function (seq2func) models that predict molecular regulatory readouts directly from DNA sequence. These models are now widely used for variant effect prediction, mechanistic interpretation, and regulatory sequence design. Despite strong performance on held-out genomic regions, their ability to generalize across genetic variation and cellular contexts remains inconsistent. Here we examine how architectural choices, training data, and prediction tasks shape the behavior of seq2func models. We synthesize how interpretability methods and evaluation practices have probed learned cis-regulatory organization and highlighted systematic failure modes, clarifying why strong predictive accuracy can fail to translate into robust regulatory understanding. We argue that progress will require reframing seq2func models as continually refined systems, in which targeted perturbation experiments, systematic evaluation, and iterative model updates are tightly coupled through AI-experiment feedback loops. Under this framework, seq2func models become self-improving tools that progressively deepen their mechanistic grounding and more reliably support biological discovery.


Source: arXiv:2602.01230v1 (http://arxiv.org/abs/2602.01230v1)
PDF: https://arxiv.org/pdf/2602.01230v1

Submission: 2/1/2026
Subjects: Biology; Biotechnology
arXiv: This paper is hosted on arXiv, an open-access repository