Robust and Computationally Efficient Linear Contextual Bandits under Adversarial Corruption and Heavy-Tailed Noise
Abstract
We study linear contextual bandits under adversarial corruption and heavy-tailed noise with finite (1+ε)-th moments for some ε ∈ (0, 1]. Existing work that addresses both adversarial corruption and heavy-tailed noise relies on a finite-variance (i.e., finite second-moment) assumption and suffers from computational inefficiency. We propose a computationally efficient algorithm based on online mirror descent that is robust to both adversarial corruption and heavy-tailed noise, and that substantially reduces the per-round computational cost of the existing algorithm. We establish an additive regret bound consisting of a term depending on the (1+ε)-th moment bound of the noise and a term depending on the total amount of corruption. In particular, when ε = 1, our result recovers existing guarantees under finite-variance assumptions. When no corruption is present, it matches the best-known rates for linear contextual bandits with heavy-tailed noise. Moreover, the algorithm requires no prior knowledge of the noise moment bound or the total amount of corruption and still guarantees sublinear regret.
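To make the two ingredients named above concrete, the following is a minimal, illustrative sketch (not the paper's algorithm) of how robustness to heavy-tailed noise is often obtained via reward truncation, combined with an online mirror descent update under the Euclidean mirror map (i.e., projected online gradient descent) on a squared loss. The step size `eta`, the truncation level `tau`, and the parameter-ball radius are assumed hyperparameters introduced here for illustration.

```python
import numpy as np

def truncate(r, tau):
    # Clip the observed reward to [-tau, tau]: a standard device for
    # controlling heavy-tailed noise with bounded (1+eps)-th moments.
    # (Illustrative; the truncation level tau is a tuning parameter.)
    return float(np.clip(r, -tau, tau))

def omd_step(theta, x, r, eta, radius=1.0):
    # One online mirror descent step with the Euclidean mirror map on the
    # squared loss (x . theta - r)^2. With this mirror map, OMD reduces to
    # projected online gradient descent; this is only a sketch, not the
    # paper's update rule.
    grad = 2.0 * (x @ theta - r) * x
    theta = theta - eta * grad
    # Bregman projection under the Euclidean mirror map = Euclidean
    # projection onto the ball of the given radius.
    norm = np.linalg.norm(theta)
    if norm > radius:
        theta = theta * (radius / norm)
    return theta
```

Each update costs O(d) time for a d-dimensional context, which is what makes mirror-descent-style updates attractive compared with methods that re-solve an estimation problem over all past rounds.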
Source: arXiv:2603.15596v1 - http://arxiv.org/abs/2603.15596v1 (PDF: https://arxiv.org/pdf/2603.15596v1)