Acceleration for Polyak-Ćojasiewicz Functions with a Gradient Aiming Condition
Abstract
It is known that when minimizing smooth Polyak-Ćojasiewicz (PL) functions, momentum algorithms cannot significantly improve the convergence bound of gradient descent, contrasting with the acceleration phenomenon occurring in the strongly convex case. To bridge this gap, the literature has proposed strongly quasar-convex functions as an intermediate non-convex class, for which accelerated bounds have been suggested to persist. We show that this is not true in general: the additional structure of strong quasar-convexity does not suffice to guaranty better worst-case bounds for momentum compared to gradient descent. As an alternative, we study PL functions under an aiming condition that measures how well the descent direction points toward a minimizer. This perspective clarifies the geometric ingredient enabling provable acceleration by momentum when minimizing PL functions.
Source: arXiv:2602.10022v1 - http://arxiv.org/abs/2602.10022v1 PDF: https://arxiv.org/pdf/2602.10022v1 Original Link: http://arxiv.org/abs/2602.10022v1