Back to Explorer
Research PaperResearchia:202601.21008[Genomics > Biology]

SAGE-FM: A lightweight and interpretable spatial transcriptomics foundation model

Xianghao Zhan

Abstract

Spatial transcriptomics enables spatial gene expression profiling, motivating computational models that capture spatially conditioned regulatory relationships. We introduce SAGE-FM, a lightweight spatial transcriptomics foundation model based on graph convolutional networks (GCNs) trained with a masked central spot prediction objective. Trained on 416 human Visium samples spanning 15 organs, SAGE-FM learns spatially coherent embeddings that robustly recover masked genes, with 91% of masked genes showing significant correlations (p < 0.05). The embeddings generated by SAGE-FM outperform MOFA and existing spatial transcriptomics methods in unsupervised clustering and preservation of biological heterogeneity. SAGE-FM generalizes to downstream tasks, enabling 81% accuracy in pathologist-defined spot annotation in oropharyngeal squamous cell carcinoma and improving glioblastoma subtype prediction relative to MOFA. In silico perturbation experiments further demonstrate that the model captures directional ligand-receptor and upstream-downstream regulatory effects consistent with ground truth. These results demonstrate that simple, parameter-efficient GCNs can serve as biologically interpretable and spatially aware foundation models for large-scale spatial transcriptomics.


Source: arXiv:2601.15504v1 - http://arxiv.org/abs/2601.15504v1 PDF: https://arxiv.org/pdf/2601.15504v1 Original Link: http://arxiv.org/abs/2601.15504v1

Submission:1/21/2026
Comments:0 comments
Subjects:Biology; Genomics
Original Source:
View Original PDF
arXiv: This paper is hosted on arXiv, an open-access repository
Was this helpful?

Discussion (0)

Please sign in to join the discussion.

No comments yet. Be the first to share your thoughts!