Research Paper | Researchia:202603.03058 [Data Science > Machine Learning]

Adaptive Combinatorial Experimental Design: Pareto Optimality for Decision-Making and Inference

Hongrui Xie

Abstract

In this paper, we provide the first investigation into adaptive combinatorial experimental design, focusing on the trade-off between regret minimization and statistical power in combinatorial multi-armed bandits (CMAB). Minimizing regret requires repeatedly exploiting high-reward arms, whereas accurate inference on reward gaps requires sufficient exploration of suboptimal actions. We formalize this trade-off through the concept of Pareto optimality and establish equivalent conditions for Pareto-efficient learning in CMAB. We consider two information structures, full-bandit feedback and semi-bandit feedback, and propose one algorithm for each: MixCombKL and MixCombUCB, respectively. We show that both algorithms are Pareto optimal, with finite-time guarantees on both regret and the estimation error of arm gaps. Our results further reveal that richer feedback significantly tightens the attainable Pareto frontier, with the primary gains arising from improved estimation accuracy under our proposed methods. Taken together, these findings establish a principled framework for adaptive combinatorial experimentation in multi-objective decision-making.
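
The tension between regret and gap estimation described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's MixCombUCB or MixCombKL. It is a generic top-m semi-bandit with Bernoulli arms, in which a hypothetical mixing parameter `alpha` forces uniform exploration on top of a standard UCB index. The arm means, the horizon, and the function name `run_mixed_topm` are all illustrative assumptions.

```python
import math
import random


def run_mixed_topm(means, m, horizon, alpha, seed=0):
    """Toy top-m semi-bandit with forced uniform exploration (illustrative only).

    With probability `alpha`, play a uniformly random subset of m base arms;
    otherwise play the m arms with the largest UCB indices.
    Returns (pseudo_regret, pull_counts).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k    # observations collected per base arm
    sums = [0.0] * k    # running sum of observed rewards per base arm
    best_value = sum(sorted(means, reverse=True)[:m])
    regret = 0.0

    def ucb_index(i, t):
        # Unpulled arms get an infinite index so each is tried at least once.
        if counts[i] == 0:
            return float("inf")
        return sums[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])

    for t in range(1, horizon + 1):
        if rng.random() < alpha:
            action = rng.sample(range(k), m)  # exploration round
        else:
            action = sorted(range(k), key=lambda i: ucb_index(i, t), reverse=True)[:m]

        # Semi-bandit feedback: the reward of every chosen base arm is observed.
        for i in action:
            reward = 1.0 if rng.random() < means[i] else 0.0
            counts[i] += 1
            sums[i] += reward

        # Pseudo-regret is measured against the true means of the best subset.
        regret += best_value - sum(means[i] for i in action)

    return regret, counts


means = [0.9, 0.8, 0.3, 0.2, 0.2, 0.1]
for alpha in (0.0, 0.5):
    regret, counts = run_mixed_topm(means, m=2, horizon=2000, alpha=alpha)
    print(f"alpha={alpha}: pseudo-regret={regret:.1f}, min pulls={min(counts)}")
```

In this toy setting, a larger `alpha` gives every base arm more observations, which helps estimate gaps, at the cost of higher pseudo-regret. How the paper's algorithms balance the two objectives, and what rates they achieve, is specified in the paper itself.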


Source: arXiv:2602.24231v1 - http://arxiv.org/abs/2602.24231v1 PDF: https://arxiv.org/pdf/2602.24231v1

Submission: 3/3/2026
Subjects: Machine Learning; Data Science
arXiv: This paper is hosted on arXiv, an open-access repository