Research Paper - Researchia:202604.09071

How to sketch a learning algorithm

Sam Gunn

Submitted: April 9, 2026
Subjects: Machine Learning; Data Science

Description / Details

How does the choice of training data influence an AI model? This question is of central importance to interpretability, privacy, and basic science. At its core is the data deletion problem: after a reasonable amount of precomputation, quickly predict how the model would behave in a given situation if a given subset of training data had been excluded from the learning algorithm. We present a data deletion scheme capable of predicting model outputs with vanishing error $\varepsilon$ in the deep learning setting. Our precomputation and prediction algorithms are only $\mathrm{poly}(1/\varepsilon)$ factors slower than regular training and inference, respectively. The storage requirements are those of $\mathrm{poly}(1/\varepsilon)$ models. Our proof is based on an assumption that we call "stability." In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. Our code is available at https://github.com/SamSpo1/microgpt-sketch. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions. Forward-mode automatic differentiation allows cheap computation of these derivatives.
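The last two sentences of the abstract name a concrete building block: higher-order derivatives of an arithmetic circuit along a complex direction, computed in forward mode. The following is a minimal sketch of that building block only, not the paper's sketching scheme or its code. It propagates truncated Taylor series with complex coefficients through additions and multiplications. The names `Taylor`, `directional_derivatives`, `random_complex_direction`, and `toy_circuit` are hypothetical, and the choice of direction distribution (independent unit-modulus phases) is an assumption made here for illustration; the abstract does not specify it.

```python
import cmath
import random
from math import factorial


class Taylor:
    """Truncated series sum_k c[k] * t**k with complex coefficients."""

    def __init__(self, coeffs):
        self.c = [complex(a) for a in coeffs]

    @staticmethod
    def variable(x, v, order):
        # Represents x + t*v, padded with zeros up to degree `order` (order >= 1).
        return Taylor([x, v] + [0] * (order - 1))

    def _lift(self, other):
        # Treat plain numbers as constant series of the same length.
        if isinstance(other, Taylor):
            return other
        return Taylor([other] + [0] * (len(self.c) - 1))

    def __add__(self, other):
        other = self._lift(other)
        return Taylor([a + b for a, b in zip(self.c, other.c)])

    __radd__ = __add__

    def __mul__(self, other):
        # Cauchy product, dropping every term above the truncation degree.
        other = self._lift(other)
        n = len(self.c)
        out = [0j] * n
        for i, a in enumerate(self.c):
            for j, b in enumerate(other.c[: n - i]):
                out[i + j] += a * b
        return Taylor(out)

    __rmul__ = __mul__


def directional_derivatives(f, x, v, order):
    """Return [d^k/dt^k f(x + t*v) at t=0 for k = 0..order] in one forward pass."""
    inputs = [Taylor.variable(xi, vi, order) for xi, vi in zip(x, v)]
    out = f(*inputs)
    # The k-th Taylor coefficient equals the k-th derivative divided by k!.
    return [factorial(k) * out.c[k] for k in range(order + 1)]


def random_complex_direction(n, rng):
    # Assumption for illustration: each coordinate is a uniform random phase.
    return [cmath.exp(2j * cmath.pi * rng.random()) for _ in range(n)]


def toy_circuit(x0, x1):
    # A small arithmetic circuit built only from + and *.
    return x0 * x0 * x1 + x1


if __name__ == "__main__":
    rng = random.Random(0)
    v = random_complex_direction(2, rng)
    for k, d in enumerate(directional_derivatives(toy_circuit, [1.0, 2.0], v, 3)):
        print(k, d)
```

The cost of one pass grows with the square of the truncation order in this naive version, because each multiplication is a truncated polynomial product. The point of the sketch is only that all derivatives up to the chosen order come out of a single forward evaluation of the circuit, with no backward pass.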


Source: arXiv:2604.07328v1 - http://arxiv.org/abs/2604.07328v1
PDF: https://arxiv.org/pdf/2604.07328v1
