Task Scarcity and Label Leakage in Relational Transfer Learning
Abstract
Training relational foundation models requires learning representations that transfer across tasks, yet the available supervision is typically limited to a small number of prediction targets per database. This task scarcity causes learned representations to encode task-specific shortcuts that degrade transfer even within the same schema, a problem we call label leakage. We study this problem with K-Space, a modular architecture that combines frozen pretrained tabular encoders with a lightweight message-passing core. To suppress leakage, we introduce a gradient projection method that removes label-predictive directions from representation updates. On RelBench, this improves within-dataset transfer by 0.145 AUROC on average, often nearly recovering single-task performance. Our results suggest that limited task diversity, not just limited data, constrains relational foundation models.
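The abstract does not specify how the label-predictive directions are obtained or how the projection is applied. The sketch below is one plausible instantiation, not the paper's method: it assumes the directions are the task head's weight rows and removes their span from the gradient flowing into the shared representations via a backward hook. The function name `project_out` and the hook-based usage are illustrative assumptions.

```python
import torch

def project_out(grad: torch.Tensor, directions: torch.Tensor) -> torch.Tensor:
    """Remove the components of `grad` lying in the span of `directions`.

    grad:       (n, d) gradient flowing into shared representations.
    directions: (k, d) assumed label-predictive directions.
    """
    # Orthonormalize the directions so the subtraction is a true projection.
    q, _ = torch.linalg.qr(directions.T)   # q: (d, k), orthonormal columns
    # Project grad onto the orthogonal complement of span(directions).
    return grad - (grad @ q) @ q.T

# Hypothetical usage: during backprop, strip the task head's weight
# directions out of the gradient that reaches the shared representations.
head = torch.nn.Linear(64, 1)
z = torch.randn(32, 64, requires_grad=True)  # stand-in for shared representations
z.register_hook(lambda g: project_out(g, head.weight.detach()))
head(z).sum().backward()  # z.grad now has no component along head.weight
```

Under these assumptions, representation updates can no longer move along the directions the task head uses to predict the label, which is one way to keep a single task's supervision from dominating the shared representation.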
Source: arXiv:2603.29914v1 (http://arxiv.org/abs/2603.29914v1; PDF: https://arxiv.org/pdf/2603.29914v1)