Research Paper · Researchia:202605.14041

Reducing cross-sample prediction churn in scientific machine learning

Gordan Prastalo

Abstract

Scientific machine learning reports predictive performance. It does not report whether the same prediction would survive a different draw of training data. Across 9 chemistry benchmarks, two classifiers trained on independent bootstraps of the same training set agree on aggregate accuracy to within 1.3–4.2 percentage points but disagree on the class label of 8.0–21.8% of test molecules. We call this gap cross-sample prediction churn. The standard parameter-side techniques (deep ensembles, MC dropout, stochastic weight averaging) do not reduce this gap; two data-side methods do. The first is K-bootstrap bagging, which cuts the rate 40–54% on every dataset at no accuracy cost (K×-ERM compute). The second is twin-bootstrap, our proposal: two networks trained jointly on independent bootstraps with a sym-KL consistency loss between their predictions, which at matched 2×-ERM compute reduces churn a further median 45% beyond bagging with K=2. Cross-sample prediction churn deserves a column alongside predictive performance in scientific-ML benchmark reports, because without it the parameter-side and data-side methods are indistinguishable on the metric they actually differ on.
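The churn statistic itself is simple: train two models on independent bootstrap resamples, then count the fraction of test points on which their hard labels disagree. A minimal sketch, with all function and variable names illustrative rather than taken from the paper's code:

```python
import numpy as np

def bootstrap_resample(X, y, rng):
    """Draw a bootstrap sample (with replacement) of the training set."""
    idx = rng.integers(0, len(X), size=len(X))
    return X[idx], y[idx]

def churn_rate(preds_a, preds_b):
    """Fraction of test points on which two models' labels disagree."""
    preds_a = np.asarray(preds_a)
    preds_b = np.asarray(preds_b)
    return float(np.mean(preds_a != preds_b))

# Toy illustration of the abstract's point: two prediction vectors with
# identical aggregate accuracy can still disagree on individual points.
y_true  = np.array([0, 1, 1, 0, 1, 0, 1, 1])
model_a = np.array([0, 1, 1, 0, 1, 0, 0, 1])  # wrong only at index 6
model_b = np.array([0, 1, 1, 0, 1, 1, 1, 1])  # wrong only at index 5
acc_a = float(np.mean(model_a == y_true))  # 0.875
acc_b = float(np.mean(model_b == y_true))  # 0.875
print(churn_rate(model_a, model_b))        # 0.25: same accuracy, 2/8 churn
```

The same-accuracy, nonzero-churn pair above is the phenomenon the paper measures at benchmark scale.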

Submitted: May 14, 2026 · Subjects: Chemistry

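The twin-bootstrap objective pairs two networks, each fit on its own bootstrap, and adds a symmetric-KL consistency term between their predictive distributions. A hedged sketch in NumPy; the weighting coefficient `lam` and all names are assumptions, not the paper's implementation:

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def sym_kl(p, q):
    """Symmetric KL: zero iff the two predictive distributions match."""
    return 0.5 * (kl(p, q) + kl(q, p))

def twin_loss(ce_a, ce_b, probs_a, probs_b, lam=1.0):
    """Joint objective: each network's cross-entropy on its own bootstrap
    plus a consistency penalty pulling their predictions together."""
    consistency = np.mean([sym_kl(pa, pb) for pa, pb in zip(probs_a, probs_b)])
    return ce_a + ce_b + lam * consistency

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.6, 0.3, 0.1])
print(sym_kl(p, q))  # small positive value; 0 when p == q
```

Because the consistency term is symmetric, neither network serves as a fixed teacher; both are pulled toward agreement while each still sees a different resample of the data.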


Source: arXiv:2605.13826v1 (http://arxiv.org/abs/2605.13826v1)
PDF: https://arxiv.org/pdf/2605.13826v1
