RareCollab -- An Agentic System Diagnosing Mendelian Disorders with Integrated Phenotypic and Molecular Evidence
Abstract
Millions of children worldwide are affected by severe rare Mendelian disorders, yet exome and genome sequencing still fail to provide a definitive molecular diagnosis for a large fraction of patients, prolonging the diagnostic odyssey. Bridging this gap increasingly requires transitioning from DNA-only interpretation to multi-modal diagnostic reasoning that combines genomic data, transcriptomic sequencing (RNA-seq), and phenotype information; however, computational frameworks that coherently integrate these signals remain limited. Here we present RareCollab, an agentic diagnostic framework that pairs a stable quantitative Diagnostic Engine with Large Language Model (LLM)-based specialist modules that produce high-resolution, interpretable assessments from transcriptomic signals, phenotypes, variant databases, and the literature to prioritize potential diagnostic variants. In a rigorously curated benchmark of Undiagnosed Diseases Network (UDN) patients with paired genomic and transcriptomic data, RareCollab achieved 77% top-5 diagnostic accuracy and improved top-1 to top-5 accuracy by ~20% over widely used variant-prioritization approaches. RareCollab illustrates how modular artificial intelligence (AI) can operationalize multi-modal evidence for accurate, scalable rare disease diagnosis, offering a promising path toward reducing the diagnostic odyssey for affected families.
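To make the reported metric concrete, the following is a minimal sketch of how top-k diagnostic accuracy is conventionally computed: a case counts as a hit when its confirmed diagnostic variant appears among the k highest-ranked candidates. The function name, case identifiers, and variant labels are hypothetical and chosen only for illustration; they are not taken from RareCollab or the UDN benchmark, and the abstract does not specify how the authors' evaluation was implemented.

```python
from typing import Dict, List


def top_k_accuracy(ranked: Dict[str, List[str]], truth: Dict[str, str], k: int) -> float:
    """Fraction of cases whose true variant is among the top k ranked candidates."""
    if not truth:
        raise ValueError("no cases to evaluate")
    hits = sum(
        1
        for case, causal in truth.items()
        # A case with no ranking at all is treated as a miss.
        if causal in ranked.get(case, [])[:k]
    )
    return hits / len(truth)


# Hypothetical toy data: ranked candidate variants per case (best first)
# and the confirmed diagnostic variant for each case.
ranked = {
    "case_a": ["v1", "v2", "v3"],
    "case_b": ["v9", "v4", "v7", "v5"],
    "case_c": ["v6", "v8"],
    "case_d": ["v2", "v3", "v1"],
}
truth = {"case_a": "v1", "case_b": "v5", "case_c": "v0", "case_d": "v3"}

top1 = top_k_accuracy(ranked, truth, 1)
top5 = top_k_accuracy(ranked, truth, 5)
print(f"top-1: {top1:.2f}, top-5: {top5:.2f}")
```

In this toy example only one of four cases is ranked first, while three of four fall within the top five, which is why top-1 and top-5 figures are reported separately.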
Source: arXiv:2602.04058v1 - http://arxiv.org/abs/2602.04058v1 PDF: https://arxiv.org/pdf/2602.04058v1