Research Paper
Researchia:202602.03121 [Computational Linguistics > NLP]

Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing

Tong Zheng

Abstract

Parallel thinking has emerged as a promising paradigm for reasoning, yet it imposes significant computational burdens. Existing efficiency methods primarily rely on local, per-trajectory signals and lack principled mechanisms to exploit global dynamics across parallel branches. We introduce 2D probing, an interface that exposes the width-depth dynamics of parallel thinking by periodically eliciting intermediate answers from all branches. Our analysis reveals three key insights: non-monotonic scaling across width-depth allocations, heterogeneous reasoning branch lengths, and early stabilization of global consensus. Guided by these insights, we introduce Parallel-Probe, a training-free controller designed to optimize online parallel thinking. Parallel-Probe employs consensus-based early stopping to regulate reasoning depth and deviation-based branch pruning to dynamically adjust width. Extensive experiments across three benchmarks and multiple models demonstrate that Parallel-Probe establishes a superior Pareto frontier for test-time scaling. Compared to standard majority voting, it reduces sequential tokens by up to 35.8% and total token cost by over 25.8% while maintaining competitive accuracy.
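The abstract names two control mechanisms, consensus-based early stopping and deviation-based branch pruning, without giving their exact rules. The following is a minimal illustrative sketch of how such a controller could operate on probed intermediate answers. It is not the authors' implementation: the function names, the `patience` and `max_deviations` thresholds, and the use of a simple majority vote as the consensus are all assumptions made for illustration.

```python
from collections import Counter


def majority(answers):
    """Most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def run_controller(probes, patience=2, max_deviations=2):
    """Toy controller over a grid of probed answers (hypothetical design).

    probes[t][b] is the intermediate answer of branch b at probe step t.
    Stops once the majority answer is unchanged for `patience` consecutive
    probes, and prunes a branch after it disagrees with the majority for
    `max_deviations` consecutive probes.
    Returns (final_answer, stop_step, pruned_branches).
    """
    active = set(range(len(probes[0])))
    deviations = {b: 0 for b in active}
    pruned = []
    last, streak = None, 0
    for t, row in enumerate(probes):
        # Consensus is computed only over branches that are still active.
        consensus = majority([row[b] for b in sorted(active)])
        streak = streak + 1 if consensus == last else 1
        last = consensus
        if streak >= patience:  # consensus has stabilized: stop early
            return consensus, t, pruned
        for b in sorted(active):
            # Count consecutive disagreements; agreeing resets the count.
            deviations[b] = deviations[b] + 1 if row[b] != consensus else 0
            if deviations[b] >= max_deviations and len(active) > 1:
                active.remove(b)  # persistent outlier: reduce width
                pruned.append(b)
    return last, len(probes) - 1, pruned


# Four branches probed at three steps (made-up answers).
probes = [
    ["A", "B", "C", "B"],
    ["A", "A", "C", "A"],
    ["A", "A", "D", "A"],
]
result = run_controller(probes)
print(result)  # ('A', 2, [2])
```

In this toy run, branch 2 disagrees with the majority at two consecutive probes and is pruned, and decoding stops at the third probe once the majority answer "A" has held for two probes in a row. A real controller would operate online on live decoding branches rather than on a precomputed grid.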


Source: arXiv:2602.03845v1 - http://arxiv.org/abs/2602.03845v1 PDF: https://arxiv.org/pdf/2602.03845v1

Submission:2/3/2026
Subjects:NLP; Computational Linguistics
arXiv: This paper is hosted on arXiv, an open-access repository