Back to Explorer
Research PaperResearchia:202602.26001[Artificial Intelligence > AI]

Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training

Anas Barakat

Abstract

Pass@k is a widely used performance metric for verifiable large language model tasks, including mathematical reasoning, code generation, and short-answer reasoning. It defines success if any of kk independently sampled solutions passes a verifier. This multi-sample inference metric has motivated inference-aware fine-tuning methods that directly optimize pass@kk. However, prior work reports a recurring trade-off: pass@k improves while pass@1 degrades under such methods. This trade-off is practically important because pass@1 often remains a hard operational constraint due to latency and cost budgets, imperfect verifier coverage, and the need for a reliable single-shot fallback. We study the origin of this trade-off and provide a theoretical characterization of when pass@k policy optimization can reduce pass@1 through gradient conflict induced by prompt interference. We show that pass@kk policy gradients can conflict with pass@1 gradients because pass@kk optimization implicitly reweights prompts toward low-success prompts; when these prompts are what we term negatively interfering, their upweighting can rotate the pass@k update direction away from the pass@1 direction. We illustrate our theoretical findings with large language model experiments on verifiable mathematical reasoning tasks.


Source: arXiv:2602.21189v1 - http://arxiv.org/abs/2602.21189v1 PDF: https://arxiv.org/pdf/2602.21189v1 Original Link: http://arxiv.org/abs/2602.21189v1

Submission:2/26/2026
Comments:0 comments
Subjects:AI; Artificial Intelligence
Original Source:
View Original PDF
arXiv: This paper is hosted on arXiv, an open-access repository
Was this helpful?

Discussion (0)

Please sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training | Researchia | Researchia