Research Paper
Researchia:202601.12945434

ReinPool: Reinforcement Learning Pooling Multi-Vector Embeddings for Retrieval System

Sungguk Cha

Submitted: January 12, 2026
Subjects: Computer Science

Description / Details

Multi-vector embedding models have emerged as a powerful paradigm for document retrieval, preserving fine-grained visual and textual details through token-level representations. However, this expressiveness comes at a staggering cost: storing embeddings for every token inflates index sizes by over 1000× compared to single-vector approaches, severely limiting scalability. We introduce ReinPool, a reinforcement learning framework that learns to dynamically filter and pool multi-vector embeddings into compact, retrieval-optimized representations. By training with an inverse retrieval objective and NDCG-based rewards, ReinPool identifies and retains only the most discriminative vectors without requiring manual importance annotations. On the Vidore V2 benchmark across three vision-language embedding models, ReinPool compresses multi-vector representations by 746–1249× into single vectors while recovering 76–81% of full multi-vector retrieval performance. Compared to static mean pooling baselines, ReinPool achieves 22–33% absolute NDCG@3 improvement, demonstrating that learned selection significantly outperforms heuristic aggregation.
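To make the terms in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It contrasts static mean pooling with pooling over a selected subset of token vectors, and shows one common way to compute NDCG@k, the metric the abstract names as the basis of the reward. The function names, the boolean `keep` mask (a stand-in for whatever a learned policy would output), and the linear-gain form of DCG are all assumptions made for illustration; the abstract does not specify the policy or the exact reward.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Static baseline: average every token vector into a single vector."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def filtered_mean_pool(vectors: Sequence[Vector], keep: Sequence[bool]) -> Vector:
    """Average only the vectors flagged in `keep`.

    `keep` is a hypothetical stand-in for a learned selection policy.
    If nothing is selected, fall back to pooling all vectors.
    """
    kept = [v for v, flag in zip(vectors, keep) if flag]
    return mean_pool(kept if kept else vectors)


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain with linear gain (an assumed variant)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """NDCG@k for relevance labels listed in ranked order (best rank first)."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def compression_ratio(num_token_vectors: int) -> float:
    """Pooling N token vectors into one vector stores N times fewer vectors."""
    return num_token_vectors / 1
```

For example, `ndcg_at_k([0, 1, 0], 3)` scores a ranking that places the only relevant document second, and `compression_ratio(1000)` reflects a document whose 1000 token vectors are replaced by a single pooled vector. In a reinforcement learning setup of the kind the abstract describes, a score such as NDCG@k could serve as the reward signal for the selection policy, though the precise formulation here is assumed.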
