Online Learning with Limited Information in the Sliding Window Model
Abstract
Motivated by recent work on the experts problem in the streaming model, we consider the experts problem in the sliding window model. The sliding window model is a well-studied model that captures applications such as traffic monitoring, epidemic tracking, and automated trading, where recent information is more valuable than older data. Formally, we have experts, days, the ability to query the predictions of experts on each day, a limited amount of memory, and should achieve the (near-)optimal regret regret over any window of the last days. While it is impossible to achieve such regret with query, we show that with queries we can achieve such regret and with only bits of memory. Not only are our algorithms optimal for sliding windows, but we also show for every interval of days that we achieve regret with queries and only bits of memory, providing an exponential improvement on the memory of previous interval regret algorithms. Building upon these techniques, we address the bandit problem in data streams, where , achieving regret with memory, which is the first sublinear regret in the streaming model in the bandit setting with polylogarithmic memory; this can be further improved to the optimal regret if the best expert's losses are in a random order.