Research Paper (Researchia:202605.13016)

MEME: Multi-entity & Evolving Memory Evaluation

Seokwon Jung

Submitted: May 13, 2026
Subjects: NLP; Computational Linguistics

Description / Details

LLM-based agents increasingly operate in persistent environments where they must store, update, and reason over information across many sessions. Whereas prior benchmarks evaluate only single-entity updates, MEME defines six tasks that cover the full space spanned by the multi-entity and evolving axes, including three not scored by prior work: Cascade and Absence (dependency reasoning) and Deletion (post-removal state). Evaluating six memory systems from three memory paradigms on 100 controlled episodes, we find that all systems collapse on dependency reasoning under the default configuration (average accuracy of 3% on Cascade and 1% on Absence) despite adequate static retrieval performance. Prompt optimization, deeper retrieval, reduced filler noise, and most stronger LLMs fail to close this gap. Only a file-based agent using Claude Opus 4.7 as its internal LLM partially closes it, but at roughly 70x the baseline cost, which indicates that closure currently depends on configurations that are not practical at scale. Code and data are available on the project page: https://seokwonjung-jay.github.io/meme-eval/.
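The abstract names the dependency-reasoning and deletion tasks without defining them, so the following is only a toy illustration of the kind of behavior such tasks might probe: a fact derived from another entity's fact should change when its source is updated and should become unknown once the source is removed. The `ToyMemory` class, its methods, and the example keys are all hypothetical and are not taken from the MEME code or data.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Hypothetical key-value memory with derived facts (not the MEME code)."""
    facts: dict = field(default_factory=dict)
    # Maps a derived key to (source key, function computing the derived value).
    rules: dict = field(default_factory=dict)

    def write(self, key, value):
        self.facts[key] = value

    def delete(self, key):
        self.facts.pop(key, None)

    def derive(self, key, source, fn):
        self.rules[key] = (source, fn)

    def read(self, key):
        # Derived facts are recomputed from the current source value,
        # so an upstream update or deletion propagates downstream.
        if key in self.rules:
            source, fn = self.rules[key]
            value = self.read(source)
            return None if value is None else fn(value)
        return self.facts.get(key)


memory = ToyMemory()
memory.write("alice.city", "Paris")
# Assumed dependency: the timezone is determined by the city.
memory.derive("alice.timezone", "alice.city", {"Paris": "CET", "Tokyo": "JST"}.get)

before = memory.read("alice.timezone")
memory.write("alice.city", "Tokyo")    # the source fact evolves
after_update = memory.read("alice.timezone")
memory.delete("alice.city")            # the source fact is removed
after_delete = memory.read("alice.timezone")
```

In this sketch the dependency is resolved at read time, which makes propagation trivial. A memory system that stores the derived value as an independent record would instead have to detect and rewrite it after each update, which is one plausible reason such tasks are harder than static retrieval.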


Source: arXiv:2605.12477v1, http://arxiv.org/abs/2605.12477v1
PDF: https://arxiv.org/pdf/2605.12477v1
