Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training

2026-02-24 • Machine Learning

Machine Learning, Artificial Intelligence

AI summary

The authors study a trade-off that arises when language models are scored on multiple attempts (pass@k) versus a single attempt (pass@1). They find that fine-tuning a model to raise pass@k can lower its pass@1. This happens because pass@k optimization implicitly places more weight on prompts the model rarely solves, so its learning updates can conflict with the updates that would improve pass@1. They support their explanation with experiments on math reasoning tasks using large language models.

pass@k, pass@1, large language models, policy gradient, prompt interference, inference-aware fine-tuning, verifiable tasks, mathematical reasoning, multi-sample inference, gradient conflict

Authors

Anas Barakat, Souradip Chakraborty, Khushbu Pahwa, Amrit Singh Bedi

Abstract

Pass@$k$ is a widely used performance metric for verifiable large language model tasks, including mathematical reasoning, code generation, and short-answer reasoning. It counts a prompt as solved if at least one of $k$ independently sampled solutions passes a verifier. This multi-sample inference metric has motivated inference-aware fine-tuning methods that directly optimize pass@$k$. However, prior work reports a recurring trade-off: pass@$k$ improves while pass@1 degrades under such methods. This trade-off is practically important because pass@1 often remains a hard operational constraint due to latency and cost budgets, imperfect verifier coverage, and the need for a reliable single-shot fallback. We study the origin of this trade-off and provide a theoretical characterization of when pass@$k$ policy optimization can reduce pass@1 through gradient conflict induced by prompt interference. We show that pass@$k$ policy gradients can conflict with pass@1 gradients because pass@$k$ optimization implicitly reweights prompts toward low-success prompts; when these prompts are what we term negatively interfering, their upweighting can rotate the pass@$k$ update direction away from the pass@1 direction. We illustrate our theoretical findings with large language model experiments on verifiable mathematical reasoning tasks.
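The reweighting described in the abstract can be illustrated with a small numerical sketch. It assumes the standard setting in which a prompt has single-sample success probability $p$ and the $k$ samples are independent, so that pass@$k$ equals $1 - (1 - p)^k$ and its gradient is $k(1 - p)^{k-1}$ times the gradient of $p$. The two prompts, their success probabilities, and their gradient vectors below are hypothetical values chosen for illustration; they are not taken from the paper, and the toy notion of a prompt whose gradient opposes the pass@1 direction is only an informal stand-in for the paper's definition of negative interference.

```python
# Toy sketch (hypothetical numbers): how pass@k reweights per-prompt gradients.
# Assumes k i.i.d. samples per prompt, so pass@k = 1 - (1 - p)**k and
# d(pass@k)/d(theta) = k * (1 - p)**(k - 1) * dp/d(theta).

def passk_weight(p, k):
    """Implicit weight that the pass@k objective puts on a prompt with success rate p."""
    return k * (1.0 - p) ** (k - 1)

def objective_gradient(prompts, k):
    """Average of per-prompt gradients of p, each scaled by its pass@k weight."""
    n = len(prompts)
    dim = len(prompts[0][1])
    grad = [0.0] * dim
    for p, g in prompts:
        w = passk_weight(p, k)
        for i in range(dim):
            grad[i] += w * g[i] / n
    return grad

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Each entry: (success probability p, hypothetical gradient of p w.r.t. two parameters).
prompts = [
    (0.90, (1.0, 0.0)),   # easy prompt
    (0.05, (-0.8, 0.3)),  # hard prompt whose gradient opposes the pass@1 direction
]

grad_pass1 = objective_gradient(prompts, k=1)  # k = 1 gives uniform weights
grad_pass8 = objective_gradient(prompts, k=8)  # k = 8 upweights the hard prompt
alignment = dot(grad_pass1, grad_pass8)

print("weight on easy prompt (k=8):", passk_weight(0.90, 8))
print("weight on hard prompt (k=8):", passk_weight(0.05, 8))
print("pass@1 gradient:", grad_pass1)
print("pass@8 gradient:", grad_pass8)
print("inner product:", alignment)
```

In this toy setup the inner product comes out negative: with $k = 8$ the easy prompt's weight is nearly zero, the hard prompt dominates the update, and a step along the pass@8 gradient moves against the pass@1 gradient. With $k = 1$ every prompt has weight 1 and the two objectives coincide.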
