The Token Is a Group Element: On Lie-Algebra Attention over Matrix Lie Groups
Abstract
We place the attention token on the group: a token is an element $g_i$ of a matrix Lie group $G$ -- a bare transformation, with no feature payload and no external action $\rho(g)$ carrying it. To our knowledge this is the first attention construction whose tokens are bare matrix Lie group elements: their score is the closed-form algebra norm of the relative pose rather than a learned kernel, and it reaches the affine full-frame groups that every irrep- or surjective-exp-based method must exclude. W...
Description / Details
We place the attention token on the group: a token is an element $g_i$ of a matrix Lie group $G$ -- a bare transformation, with no feature payload and no external action carrying it. To our knowledge this is the first attention construction whose tokens are bare matrix Lie group elements: its score is the closed-form algebra norm of the relative pose rather than a learned kernel, and it reaches the affine full-frame groups that every irrep- or surjective-exp-based method must exclude. We call it Lie-Algebra Attention. Once tokens are group elements, the rest follows with none of the usual representation-theoretic machinery. The relative geometry of a pair is canonical, $g_i^{-1} g_j$, so the pairwise invariant is intrinsic rather than designed; equivariance under the diagonal $G$-action is tautological, and the cocycle condition holds automatically. The attention score is the negative squared algebra norm, $-\lVert \log(g_i^{-1} g_j) \rVert^2$: the canonical proximity kernel under a block-weighted Frobenius inner product, with no irreducible representations, spherical harmonics, Clebsch-Gordan products, or learned kernel. The construction applies to any matrix Lie group on a chosen logarithm chart containing the relative poses, including the non-compact, non-abelian affine groups with scale and shear that no vector-token attention method reaches, whether from the irrep tradition or from surjective-exp methods. Three sequence-completion experiments, on SE(2), SO(3), and Aff(2), bear this out: the closed-form score matches a learned MLP kernel on the same invariant and outperforms it on SE(2), using 50 to 80 times fewer score parameters, while a vector-token baseline breaks invariance by five to twelve orders of magnitude.
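As an illustration of the score described above, here is a minimal NumPy sketch on SE(2). It is not the authors' implementation. It assumes the relative pose is taken as $g_i^{-1} g_j$, uses the principal logarithm on SE(2), and normalizes scores with a row-wise softmax. The function names (`se2`, `se2_log`, `score`, `attention_weights`) and the block weights `w_rot` and `w_trans` are hypothetical, introduced only for this example.

```python
import numpy as np

def se2(theta, x, y):
    """Build a 3x3 homogeneous SE(2) matrix from an angle and a translation."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

def se2_log(g):
    """Principal logarithm of an SE(2) matrix, returned as (theta, u)."""
    theta = np.arctan2(g[1, 0], g[0, 0])
    t = g[:2, 2]
    if abs(theta) < 1e-9:
        # Near the identity rotation the translation part is unchanged.
        return theta, t.copy()
    a = np.sin(theta) / theta
    b = (1.0 - np.cos(theta)) / theta
    V = np.array([[a, -b], [b, a]])
    # The translation block of the exponential is V @ u, so solve for u.
    return theta, np.linalg.solve(V, t)

def score(g_i, g_j, w_rot=1.0, w_trans=1.0):
    """Negative squared block-weighted Frobenius norm of log(g_i^{-1} g_j)."""
    rel = np.linalg.inv(g_i) @ g_j  # assumed convention for the relative pose
    theta, u = se2_log(rel)
    # The rotation block [[0, -theta], [theta, 0]] has squared Frobenius norm 2*theta^2.
    return -(w_rot * 2.0 * theta**2 + w_trans * (u @ u))

def attention_weights(tokens, **weights):
    """Row-wise softmax over pairwise scores (the softmax is an assumption)."""
    n = len(tokens)
    S = np.array([[score(tokens[i], tokens[j], **weights) for j in range(n)]
                  for i in range(n)])
    S = S - S.max(axis=1, keepdims=True)  # subtract row maxima for stability
    W = np.exp(S)
    return W / W.sum(axis=1, keepdims=True)

# Moving every token by the same group element h leaves each relative pose unchanged.
tokens = [se2(0.1, 0.0, 0.0), se2(0.5, 1.0, 2.0), se2(-0.8, -1.0, 0.5)]
h = se2(0.7, 1.0, -2.0)
A = attention_weights(tokens)
A_moved = attention_weights([h @ g for g in tokens])
print(np.allclose(A, A_moved))
```

Because $(h g_i)^{-1}(h g_j) = g_i^{-1} g_j$, the two weight matrices should agree up to floating-point error, which is the sense in which invariance under the diagonal action comes for free. The score itself has no learned parameters in this sketch; the two block weights are fixed constants.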
Source: arXiv:2606.20547v1 - http://arxiv.org/abs/2606.20547v1 PDF: https://arxiv.org/pdf/2606.20547v1
Jun 19, 2026
Robotics