
Sharp Capacity Thresholds in Linear Associative Memory: From Winner-Take-All to Listwise Retrieval

Nicholas Barnfield

Abstract

How many key-value associations can a $d\times d$ linear memory store? We show that the answer depends not only on the $d^2$ degrees of freedom in the memory matrix, but also on the retrieval criterion. In an isotropic Gaussian model for the stored pairs, we show that top-1 retrieval, where every signal must beat its largest distractor, requires the logarithmic model-size scale $d^2\asymp n\log n$. We prove that the correlation matrix memory construction, which stores associations by superposing key-target outer products, achieves this scale through a sharp phase transition, and that the same scaling is necessary for any linear memory. Thus the logarithm is the intrinsic extreme-value price of winner-take-all decoding. We next consider listwise retrieval, where the correct target need not be the unique top-scoring item but should remain among the strongest candidates. To formalize this regime, we propose the Tail-Average Margin (TAM), a convex upper-tail criterion that certifies inclusion of the correct target in a controlled candidate list. Under this listwise retrieval criterion, the capacity follows the quadratic scale $d^2\asymp n$. At load $n/d^2\to\alpha$, we develop an exact asymptotic theory for the TAM empirical-risk minimizer through a two-parameter scalar variational principle. The theory has a rich phenomenology: in the ridgeless limit it yields a closed-form critical load separating satisfiable and unsatisfiable phases, and it predicts the limiting laws of true scores, competitor scores, margins, and percentile profiles. Finally, a small-tail extrapolation further leads to the conjectural sharp top-1 threshold $d^2\sim 2n\log n$.
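The correlation matrix memory construction the abstract refers to is easy to illustrate numerically. Below is a minimal NumPy sketch, assuming the isotropic Gaussian model for the stored pairs and scoring candidate $j$ against query key $k_i$ by $v_j^\top M k_i$; the $1/\sqrt{d}$ normalization and the specific sizes are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 200  # memory dimension and number of stored pairs

# Isotropic Gaussian keys and values, scaled so each vector has unit norm
# in expectation (an illustrative normalization).
K = rng.standard_normal((n, d)) / np.sqrt(d)   # keys k_1, ..., k_n
V = rng.standard_normal((n, d)) / np.sqrt(d)   # values v_1, ..., v_n

# Correlation matrix memory: superposition of key-target outer products,
# M = sum_i v_i k_i^T.
M = V.T @ K

# Top-1 (winner-take-all) retrieval: for query key k_i, score every stored
# value v_j by v_j^T M k_i and require the true index i to beat all others.
scores = V @ (M @ K.T)                          # scores[j, i] = v_j^T M k_i
top1 = scores.argmax(axis=0)
print("top-1 accuracy:", (top1 == np.arange(n)).mean())
```

Sweeping $n$ against $d^2/\log n$ in this toy setup is a quick empirical way to see the sharp satisfiable/unsatisfiable transition the abstract describes.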
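The logarithmic price of winner-take-all decoding has a standard extreme-value heuristic behind it, which the abstract's conjectured constant matches. The sketch below assumes the $n-1$ distractor scores behave like independent standard Gaussians after normalization and that the per-pair signal-to-noise ratio of a superposed linear memory scales like $\sqrt{d^2/n}$; both are back-of-the-envelope assumptions, not statements from the paper.

```latex
% Maximum of m i.i.d. standard Gaussians concentrates at sqrt(2 log m):
\[
  \max_{1 \le j \le m} G_j \;=\; \sqrt{2\log m}\,\bigl(1 + o(1)\bigr),
  \qquad G_j \stackrel{\mathrm{iid}}{\sim} \mathcal{N}(0,1).
\]
% With n - 1 distractors per stored pair, each true score must clear a
% margin of order sqrt(2 log n). If the per-pair signal-to-noise ratio of
% the superposed memory scales like sqrt(d^2 / n), this forces
\[
  \sqrt{\frac{d^2}{n}} \;\gtrsim\; \sqrt{2\log n}
  \quad\Longleftrightarrow\quad
  d^2 \;\gtrsim\; 2\,n\log n,
\]
% consistent with the conjectural sharp top-1 threshold d^2 ~ 2 n log n
% quoted at the end of the abstract.
```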
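The abstract names the Tail-Average Margin but does not give its formula. As a purely illustrative guess at what a "convex upper-tail criterion" of this name could look like, and explicitly not the paper's definition, here is a CVaR-style sketch in which the function name `tail_average_margin` and the parameter `tail_frac` are hypothetical.

```python
import numpy as np

def tail_average_margin(true_score: float,
                        competitor_scores: np.ndarray,
                        tail_frac: float = 0.05) -> float:
    """Illustrative tail-average margin (NOT the paper's definition).

    Averages the largest `tail_frac` fraction of competitor scores
    (a CVaR-style upper-tail statistic) and returns the gap between the
    true score and that tail average. A positive value certifies that
    the true target sits among the strongest ~tail_frac candidates,
    mirroring the listwise guarantee the abstract describes.
    """
    m = max(1, int(np.ceil(tail_frac * competitor_scores.size)))
    upper_tail = np.sort(competitor_scores)[-m:]   # top m competitors
    return true_score - upper_tail.mean()
```

A criterion of this shape stays convex in the score vector, since the sum of the top $m$ entries of a vector is a convex function of it, which is consistent with the abstract's description of TAM as a convex criterion amenable to empirical-risk minimization.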

Submitted: May 7, 2026 · Subjects: Machine Learning; Data Science



Source: arXiv:2605.05189v1 - http://arxiv.org/abs/2605.05189v1
PDF: https://arxiv.org/pdf/2605.05189v1

