Agent Memory
The collective set of information-persistence mechanisms available to an AI agent—spanning in-context working memory, episodic long-term memory, and semantic knowledge stores—that maintain continuity across turns and sessions.
Agent memory is typically stratified into three layers. Working memory is the current context window—fast, fully accessible, but strictly bounded. Episodic memory stores past interactions and task histories in an external database, retrievable by time or semantic similarity. Semantic memory stores factual knowledge, embeddings of documents, and summarized insights, searchable by content.
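The three layers can be sketched as minimal data structures. This is an illustrative skeleton, not a production design: the class names are invented here, episodic retrieval is by recency only, and semantic search is plain keyword overlap standing in for embedding similarity.

```python
import time
from collections import deque

class WorkingMemory:
    """Current context window: fast, fully visible, strictly bounded."""
    def __init__(self, max_items=8):
        self.buffer = deque(maxlen=max_items)  # oldest entries fall off the end

    def add(self, item):
        self.buffer.append(item)

    def contents(self):
        return list(self.buffer)

class EpisodicMemory:
    """Past interactions and task histories; retrieved here by time."""
    def __init__(self):
        self.episodes = []  # list of (timestamp, text)

    def record(self, text, ts=None):
        self.episodes.append((ts if ts is not None else time.time(), text))

    def most_recent(self, k=3):
        return [text for _, text in sorted(self.episodes, reverse=True)[:k]]

class SemanticMemory:
    """Factual knowledge, searchable by content.
    Keyword overlap is a stand-in for embedding-based search."""
    def __init__(self):
        self.facts = []

    def store(self, fact):
        self.facts.append(fact)

    def search(self, query):
        q = set(query.lower().split())
        return [f for f in self.facts if q & set(f.lower().split())]
```

In a real agent the episodic store would also support semantic-similarity lookup, and the semantic store would hold document embeddings rather than raw strings.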
Write policies determine what gets stored and when. Naively writing everything leads to noisy retrieval; writing nothing loses important context. Effective agents use importance scoring, event triggers (task completion, error occurrence, novel information), and periodic summarization to maintain a compact, high-signal memory store.
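A write policy combining event triggers, importance scoring, and periodic compaction might look like the following sketch. The trigger set, scoring weights, threshold, and compaction rule are all hypothetical; in practice the summarization step would call an LLM rather than truncate strings.

```python
# Event triggers from the text: task completion, errors, novel information.
TRIGGER_EVENTS = {"task_completed", "error", "novel_information"}

def importance(entry):
    """Hypothetical importance score: trigger events dominate,
    with a small bonus for longer (more informative) entries."""
    score = 1.0 if entry.get("event") in TRIGGER_EVENTS else 0.0
    score += min(len(entry["text"]) / 200, 1.0) * 0.5
    return score

def should_write(entry, threshold=0.75):
    """Gate writes on an importance threshold (value chosen arbitrarily)."""
    return importance(entry) >= threshold

class MemoryStore:
    """Writes only high-importance entries; compacts periodically."""
    def __init__(self, threshold=0.75, compact_every=4):
        self.entries = []
        self.threshold = threshold
        self.compact_every = compact_every

    def observe(self, entry):
        if should_write(entry, self.threshold):
            self.entries.append(entry["text"])
        if len(self.entries) >= self.compact_every:
            self._compact()

    def _compact(self):
        # Stand-in for LLM summarization: merge entries into one
        # truncated summary line, keeping the store compact.
        self.entries = [" | ".join(e[:40] for e in self.entries)]
```

The gate discards low-signal entries at write time, so retrieval later operates over a smaller, higher-signal store.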
Read policies determine what gets recalled and injected into context for each new task. Hybrid retrieval (dense semantic search + BM25 keyword) with cross-encoder re-ranking typically yields better precision than either retrieval method alone. The amount recalled is constrained by the available context budget—agents should prioritize recent, highly relevant memories and discard low-relevance background.
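A read policy along these lines can be sketched as two rankings fused with reciprocal rank fusion, then packed greedily into a context budget. The similarity functions are toy stand-ins: bag-of-words cosine replaces a real embedding model, term-occurrence counting replaces BM25, and the re-ranking stage is omitted; the fusion constant k=60 is a conventional default.

```python
import math

def cosine_bow(a, b):
    """Dense-retrieval stand-in: cosine over bag-of-words counts.
    A real system would use an embedding model here."""
    ta, tb = a.lower().split(), b.lower().split()
    va = {t: ta.count(t) for t in set(ta)}
    vb = {t: tb.count(t) for t in set(tb)}
    dot = sum(va[t] * vb.get(t, 0) for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def keyword_overlap(query, doc):
    """BM25 stand-in: total count of query-term occurrences in the doc."""
    terms = set(query.lower().split())
    words = doc.lower().split()
    return sum(words.count(t) for t in terms)

def hybrid_recall(query, memories, token_budget=20, k=60):
    """Fuse dense and keyword rankings with reciprocal rank fusion,
    then pack the fused order into the context budget greedily."""
    dense = sorted(memories, key=lambda m: -cosine_bow(query, m))
    sparse = sorted(memories, key=lambda m: -keyword_overlap(query, m))
    fused = {}
    for ranking in (dense, sparse):
        for rank, m in enumerate(ranking):
            fused[m] = fused.get(m, 0.0) + 1.0 / (k + rank + 1)
    selected, used = [], 0
    for m in sorted(fused, key=fused.get, reverse=True):
        cost = len(m.split())  # crude token count
        if used + cost <= token_budget:
            selected.append(m)
            used += cost
    return selected
```

The greedy packing step is what enforces the context budget: once the budget is exhausted, lower-ranked (low-relevance) memories are simply dropped.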