Research Paper Researchia:202603.05013 [Computer Science > Cybersecurity]

Adaptive Methods Are Preferable in High Privacy Settings: An SDE Perspective

Enea Monzio Compagnoni

Abstract

Differential Privacy (DP) is becoming central to large-scale training as privacy regulations tighten. We revisit how DP noise interacts with adaptivity in optimization through the lens of stochastic differential equations, providing the first SDE-based analysis of private optimizers. Focusing on DP-SGD and DP-SignSGD under per-example clipping, we show a sharp contrast under fixed hyperparameters: DP-SGD converges at a Privacy-Utility Trade-Off of $\mathcal{O}(1/\varepsilon^2)$ with speed independent of $\varepsilon$, while DP-SignSGD converges at a speed linear in $\varepsilon$ with an $\mathcal{O}(1/\varepsilon)$ trade-off, dominating in high-privacy or large batch noise regimes. By contrast, under optimal learning rates, both methods achieve comparable theoretical asymptotic performance; however, the optimal learning rate of DP-SGD scales linearly with $\varepsilon$, while that of DP-SignSGD is essentially $\varepsilon$-independent. This makes adaptive methods far more practical, as their hyperparameters transfer across privacy levels with little or no re-tuning. Empirical results confirm our theory across training and test metrics, and empirically extend from DP-SignSGD to DP-Adam.
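To make the two update rules concrete, the following is a minimal sketch of one step of DP-SGD and one step of DP-SignSGD under per-example clipping. It is not taken from the paper: it assumes the common Gaussian-mechanism recipe (clip each example's gradient, sum, add Gaussian noise, average), and the names `clip_norm`, `noise_multiplier`, `private_gradient`, `dp_sgd_step`, and `dp_signsgd_step` are hypothetical. How `noise_multiplier` is derived from a target $\varepsilon$ depends on a privacy accountant and is left out here.

```python
import math
import random


def clip(grad, clip_norm):
    """Rescale one per-example gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    batch_size = len(per_example_grads)
    dim = len(per_example_grads[0])
    clipped = [clip(g, clip_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Assumption: noise standard deviation is noise_multiplier * clip_norm.
    sigma = noise_multiplier * clip_norm
    return [(s + rng.gauss(0.0, sigma)) / batch_size for s in summed]


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """DP-SGD: move along the noisy averaged gradient."""
    g = private_gradient(per_example_grads, clip_norm, noise_multiplier, rng)
    return [p - lr * gi for p, gi in zip(params, g)]


def dp_signsgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """DP-SignSGD: use only the coordinate-wise sign of the noisy gradient."""
    g = private_gradient(per_example_grads, clip_norm, noise_multiplier, rng)
    return [p - lr * sign(gi) for p, gi in zip(params, g)]


if __name__ == "__main__":
    rng = random.Random(0)
    params = [1.0, -1.0]
    grads = [[3.0, 4.0], [0.3, 0.4]]  # toy per-example gradients
    print(dp_sgd_step(params, grads, 0.1, 1.0, 1.0, rng))
    print(dp_signsgd_step(params, grads, 0.1, 1.0, 1.0, rng))
```

In this sketch the DP-SGD step length grows with the magnitude of the injected noise, whereas the DP-SignSGD step always has magnitude `lr` per coordinate. That is one informal way to read the abstract's claim that the sign-based learning rate is less sensitive to the privacy level, though the paper's SDE analysis is what establishes it.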


Source: arXiv:2603.03226v1 - http://arxiv.org/abs/2603.03226v1
PDF: https://arxiv.org/pdf/2603.03226v1

Submission: 3/5/2026
Subjects: Cybersecurity; Computer Science