Exploding and vanishing gradients in deep neural networks: the effect of residual connections
Abstract
The well-known phenomenon of exploding and vanishing gradients in deep neural networks is analyzed using multiplicative ergodic theory. The effect of adding a residual connection is explained in this context. Specifically, a characterization of Liapunov exponents due to Furstenberg and Kifer is exploited to make a precise statement about the Liapunov spectrum and the effect of residual connections on it.
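As a rough numerical illustration of the quantity involved (not the paper's method), the sketch below estimates the top Liapunov exponent of a product of random matrices, with and without an identity skip term. Everything in it is an assumption made for illustration: the layers are purely linear with i.i.d. Gaussian weights of variance `sigma**2 / d`, activations are ignored, and the names `random_matrix` and `top_exponent` are hypothetical. A negative exponent corresponds to norms shrinking exponentially with depth (vanishing), a positive one to exponential growth (exploding).

```python
import math
import random


def random_matrix(d, sigma, rng):
    """d x d matrix with i.i.d. N(0, sigma^2 / d) entries (toy layer weights)."""
    scale = sigma / math.sqrt(d)
    return [[rng.gauss(0.0, scale) for _ in range(d)] for _ in range(d)]


def top_exponent(d, depth, sigma, residual, seed=0):
    """Estimate the top Liapunov exponent of a product of `depth` random matrices.

    Each factor is W (plain) or I + W (residual). The estimate is the average
    log growth of a vector that is renormalized after every step.
    """
    rng = random.Random(seed)
    v = [1.0 / math.sqrt(d)] * d          # unit-norm starting vector
    log_growth = 0.0
    for _ in range(depth):
        w = random_matrix(d, sigma, rng)
        u = [sum(w[i][j] * v[j] for j in range(d)) for i in range(d)]
        if residual:
            u = [u[i] + v[i] for i in range(d)]   # skip connection: (I + W) v
        norm = math.sqrt(sum(x * x for x in u))
        log_growth += math.log(norm)
        v = [x / norm for x in u]         # renormalize to avoid under/overflow
    return log_growth / depth


plain = top_exponent(d=8, depth=2000, sigma=0.5, residual=False)
skip = top_exponent(d=8, depth=2000, sigma=0.5, residual=True)
print(f"plain layers:    {plain:+.3f}")
print(f"residual layers: {skip:+.3f}")
```

With these assumed settings the plain product should give a clearly negative exponent, while the residual product should stay much closer to zero. The exact values depend on `sigma`, `d`, and the seed, and this toy model says nothing about the full Liapunov spectrum that the paper addresses.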
Source: arXiv:2606.17013v1 - http://arxiv.org/abs/2606.17013v1
PDF: https://arxiv.org/pdf/2606.17013v1
Jun 16, 2026
Mathematics