Research Paper
Researchia:202606.15006

A Complexity Measure for Active Learning in Multi-group Mean Estimation

Abdellah Aznag

Abstract

We study a \emph{max-risk} objective for active learning in a multi-group mean estimation $d$-armed bandit: a learner adaptively allocates a budget of $T$ samples across $d$ groups to minimize the worst-case uncertainty index $\max_{k\in[d]}\sigma_k^2/n_k$, where $\sigma_k$ is the standard deviation of the distribution of arm $k$, and $n_k$ is the number of times arm $k$ is sampled. We develop a local minimax framework and prove the first general lower bound for this objective, valid for any finite-varia...

Submitted: June 15, 2026
Subjects: Machine Learning; Data Science

Description / Details

We study a \emph{max-risk} objective for active learning in a multi-group mean estimation $d$-armed bandit: a learner adaptively allocates a budget of $T$ samples across $d$ groups to minimize the worst-case uncertainty index $\max_{k\in[d]}\sigma_k^2/n_k$, where $\sigma_k$ is the standard deviation of the distribution of arm $k$, and $n_k$ is the number of times arm $k$ is sampled. We develop a local minimax framework and prove the first general lower bound for this objective, valid for any finite-variance hypothesis class. The bound separates difficulty into three orthogonal factors: a \emph{budget} term, a \emph{heteroscedasticity} index measuring how unevenly the uncertainty is spread across arms, and a model-dependent complexity measure, the \emph{Variance Local Curvature} ($\mathrm{VLC}$), which captures how much information a local change of variance creates inside the hypothesis class. For smooth classes, the $\mathrm{VLC}$ is a reparametrization of a variance--Fisher information, with closed-form values for common families. Benchmarking against the strongest available upper bound shows near-optimality up to logarithmic factors in broad regimes, and pinpoints a systematic gap in highly heterogeneous instances. Our proof introduces two key ingredients: a loss-induced $\ell_1$ geometry on the decision space, and a representation-based instance generator that reduces hard-instance construction to an explicit random matrix calculation.
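To make the objective concrete, the following is a minimal sketch (not the paper's algorithm) of the quantity $\max_{k\in[d]}\sigma_k^2/n_k$ and of the idealized allocation that minimizes it when the variances are known in advance. Under that assumption, and allowing real-valued sample counts, the minimizer equalizes the ratios $\sigma_k^2/n_k$, giving $n_k = T\sigma_k^2/\sum_j \sigma_j^2$ and a max-risk of $\sum_j \sigma_j^2/T$. The function names `max_risk` and `oracle_allocation` and the example variances are illustrative assumptions, not taken from the source.

```python
def max_risk(variances, counts):
    """Worst-case uncertainty index: max over arms of sigma_k^2 / n_k."""
    return max(v / n for v, n in zip(variances, counts))


def oracle_allocation(variances, budget):
    """Real-valued allocation n_k = T * sigma_k^2 / sum_j sigma_j^2.

    Assumes the variances are known, which an actual learner must
    instead estimate adaptively from samples.
    """
    total = sum(variances)
    return [budget * v / total for v in variances]


# Hypothetical instance: three arms with variances 1, 4, 9 and T = 1400.
variances = [1.0, 4.0, 9.0]
T = 1400

n_oracle = oracle_allocation(variances, T)
n_uniform = [T / len(variances)] * len(variances)

risk_oracle = max_risk(variances, n_oracle)
risk_uniform = max_risk(variances, n_uniform)

print("oracle allocation:", n_oracle)
print("oracle max-risk:  ", risk_oracle)
print("uniform max-risk: ", risk_uniform)
```

In this hypothetical instance the variance-proportional allocation puts most of the budget on the noisiest arm, and its max-risk is strictly smaller than that of uniform sampling. The more uneven the variances, the wider that difference becomes, which is one simple way to see why unevenness across arms matters for this objective.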


Source: arXiv:2606.14690v1 (http://arxiv.org/abs/2606.14690v1)
PDF: https://arxiv.org/pdf/2606.14690v1
