Research Paper | Researchia:202604.08009 | Artificial Intelligence > AI

PoM: A Linear-Time Replacement for Attention with the Polynomial Mixer

David Picard

Abstract

This paper introduces the Polynomial Mixer (PoM), a novel token mixing mechanism with linear complexity that serves as a drop-in replacement for self-attention. PoM aggregates input tokens into a compact representation through a learned polynomial function, from which each token retrieves contextual information. We prove that PoM satisfies the contextual mapping property, ensuring that transformers equipped with PoM remain universal sequence-to-sequence approximators. We replace standard self-attention with PoM across five diverse domains: text generation, handwritten text recognition, image generation, 3D modeling, and Earth observation. PoM matches the performance of attention-based models while drastically reducing computational cost when working with long sequences. The code is available at https://github.com/davidpicard/pom.
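The aggregate-then-retrieve idea described in the abstract can be illustrated with a small sketch. This is not the paper's formulation or the code from the linked repository; it is a minimal illustration under stated assumptions. The function name `pom_sketch`, the weight names `W_h`, `W_g`, `W_o`, the use of elementwise monomials averaged over tokens, and the sigmoid gate are all hypothetical choices made here for clarity. The point it shows is the cost structure: one pass over the tokens builds a fixed-size state and a second pass lets each token read from it, so the work grows linearly with sequence length.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def pom_sketch(tokens, W_h, W_g, W_o, degree=2):
    """Illustrative polynomial token mixer (assumed form, not the paper's exact one).

    tokens: list of n vectors of size d
    W_h, W_g: (m x d) projections; W_o: (d x degree*m) output projection
    """
    n = len(tokens)
    m = len(W_h)
    # Pass 1 (aggregate): average the elementwise monomials h, h^2, ..., h^degree
    # over all tokens into one state vector whose size does not depend on n.
    state = [0.0] * (degree * m)
    for x in tokens:
        h = matvec(W_h, x)
        for k in range(1, degree + 1):
            for j in range(m):
                state[(k - 1) * m + j] += h[j] ** k / n
    # Pass 2 (retrieve): each token gates the shared state with its own
    # sigmoid selector, then projects the result back to dimension d.
    out = []
    for x in tokens:
        g = [1.0 / (1.0 + math.exp(-v)) for v in matvec(W_g, x)]
        gated = [g[j % m] * s for j, s in enumerate(state)]
        out.append(matvec(W_o, gated))
    return out


def rand_matrix(rows, cols):
    """Small random weights, standing in for learned parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
n, d, m, degree = 5, 4, 3, 2
tokens = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
W_h, W_g, W_o = rand_matrix(m, d), rand_matrix(m, d), rand_matrix(d, degree * m)
out = pom_sketch(tokens, W_h, W_g, W_o, degree)
```

Because the state is an average over tokens, reordering the input reorders the output rows in the same way; any positional information would have to come from elsewhere in the model. Both loops visit each token once, in contrast with the pairwise token comparisons of standard self-attention.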


Source: arXiv:2604.06129v1 (http://arxiv.org/abs/2604.06129v1)
PDF: https://arxiv.org/pdf/2604.06129v1

Submission: 4/8/2026
Subjects: AI; Artificial Intelligence