Back to Explorer
Research PaperResearchia:202601.29202[Statistics & ML > Statistics]

More Bang for the Buck: Improving the Inference of Large Language Models at a Fixed Budget using Reset and Discard (ReD)

Sagi Meir

Abstract

The performance of large language models (LLMs) on verifiable tasks is usually measured by pass@k, the probability of answering a question correctly at least once in k trials. At a fixed budget, a more suitable metric is coverage@cost, the average number of unique questions answered as a function of the total number of attempts. We connect the two metrics and show that the empirically-observed power-law behavior in pass@k leads to a sublinear growth of the coverage@cost (diminishing returns). To solve this problem, we propose Reset-and-Discard (ReD), a query method of LLMs that increases coverage@cost for any given budget, regardless of the pass@k form. Moreover, given a pass@k, we can quantitatively predict the savings in the total number of attempts using ReD. If pass@k is not available for the model, ReD can infer its power-law exponent. Experiments on three LLMs using HumanEval demonstrate that ReD substantially reduces the required attempts, tokens, and USD cost to reach a desired coverage, while also offering an efficient way to measure inference power-laws.


Source: arXiv:2601.21522v1 - http://arxiv.org/abs/2601.21522v1 PDF: https://arxiv.org/pdf/2601.21522v1 Original Link: http://arxiv.org/abs/2601.21522v1

Submission:1/29/2026
Comments:0 comments
Subjects:Statistics; Statistics & ML
Original Source:
View Original PDF
arXiv: This paper is hosted on arXiv, an open-access repository
Was this helpful?

Discussion (0)

Please sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

More Bang for the Buck: Improving the Inference of Large Language Models at a Fixed Budget using Reset and Discard (ReD) | Researchia