Research Paper | Researchia:202605.29018

Efficient Test-Time Finetuning of LLMs via Convex Reconstruction and Gradient Caching

Alaa Khamis

Submitted: May 29, 2026
Subjects: Machine Learning; Data Science

Description / Details

Test-time finetuning (TTFT) is a rapidly evolving paradigm that adapts a language model to each prompt by retrieving related sequences, updating the model on them, and then evaluating the prompt. However, TTFT is only practical if it is fast: selection and finetuning both happen per query, so each is a direct bottleneck. Existing methods trade speed for quality: fast retrieval often returns redundant sequences, while stronger diversity-aware selection adds prohibitive per-query cost. We introduce HullFT, a geometric approach to TTFT that addresses both bottlenecks. Given a query, HullFT first represents the query embedding as a sparse convex combination of the embeddings of a few training sequences, using efficient projection-free Frank-Wolfe optimization. This yields a support set that is inherently relevant and diverse. We then convert the fractional convex weights into an exact integer multiset for finetuning through a geometric integerization procedure. The resulting multiplicities naturally create repeated examples, which we exploit with Gradient Reuse to amortize forward-backward computation across repeated finetuning steps. Our experiments show that HullFT improves the quality-efficiency tradeoff over current state-of-the-art TTFT methods, achieving lower bits-per-byte at substantially lower total runtime.
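To make the selection step more concrete, the following is a minimal sketch of how a query embedding could be approximated by a sparse convex combination of candidate embeddings with Frank-Wolfe, and how the fractional weights could then be turned into integer repetition counts. It is not the paper's implementation. The function names `frank_wolfe_simplex` and `round_to_multiset`, the squared-error objective, the exact line search, and the nearest-neighbor initialization are all assumptions made for illustration. In particular, the rounding shown here is plain largest-remainder rounding, used as a stand-in because the abstract does not specify the geometric integerization procedure.

```python
import numpy as np


def frank_wolfe_simplex(X, q, num_iters=20):
    """Approximately minimize 0.5 * ||X.T @ w - q||^2 over the probability simplex.

    X: (n, d) array of candidate embeddings (hypothetical input layout).
    q: (d,) query embedding.
    Each iteration adds at most one new index to the support of w,
    so the result has at most num_iters + 1 nonzero weights.
    """
    X = np.asarray(X, dtype=float)
    q = np.asarray(q, dtype=float)
    w = np.zeros(X.shape[0])
    # Assumed initialization: start at the nearest candidate (a simplex vertex).
    w[np.argmin(np.linalg.norm(X - q, axis=1))] = 1.0
    recon = X.T @ w  # current reconstruction of the query
    for _ in range(num_iters):
        residual = recon - q
        grad = X @ residual          # gradient of the objective with respect to w
        i = int(np.argmin(grad))     # linear minimization oracle: best simplex vertex
        delta = X[i] - recon         # change in reconstruction when moving toward vertex i
        denom = delta @ delta
        if denom <= 1e-12:
            break
        # Exact line search for the quadratic objective, clipped to [0, 1].
        gamma = float(np.clip(-(residual @ delta) / denom, 0.0, 1.0))
        if gamma == 0.0:
            break                    # no descent direction left among the vertices
        w *= 1.0 - gamma
        w[i] += gamma
        recon += gamma * delta
    return w


def round_to_multiset(w, budget):
    """Largest-remainder rounding of convex weights to integer counts summing to budget.

    This is a simple stand-in, not the paper's geometric integerization.
    """
    scaled = np.asarray(w, dtype=float) * budget
    counts = np.floor(scaled).astype(int)
    remainder = int(budget - counts.sum())
    # Give the leftover units to the entries with the largest fractional parts.
    order = np.argsort(-(scaled - counts))
    counts[order[:remainder]] += 1
    return counts
```

Under these assumptions, calling `round_to_multiset(frank_wolfe_simplex(X, q), budget)` gives, for each candidate sequence, the number of times it would appear in a finetuning batch of size `budget`. Counts greater than one correspond to the repeated examples that the abstract says Gradient Reuse exploits.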


Source: arXiv:2605.30337v1 (http://arxiv.org/abs/2605.30337v1)
PDF: https://arxiv.org/pdf/2605.30337v1
