Cloudflare · 2026-04-17 · notable
Cloudflare Agent Memory — Managed Persistent Memory Service for AI Agents (Private Beta)
Cloudflare Agent Memory gives Workers-based AI agents persistent cross-session memory. Retrieval runs five channels in parallel (full-text search, exact key lookup, raw message search, direct vector search, and HyDE embeddings) and fuses their results with Reciprocal Rank Fusion, so no external memory store is needed.

Managed persistent memory for Cloudflare AI agents — agents can remember facts, events, instructions, and tasks across sessions without bloating the context window.
What is it?
Agent Memory is a Cloudflare-managed service that gives AI agents built on Cloudflare Workers a persistent memory layer that survives across sessions. Rather than stuffing everything into the context window, agents offload important information to Agent Memory and retrieve what's relevant on demand. It's currently in private beta as part of Cloudflare Agents Week.
How does it work?
Memories are classified into four types: Facts, Events, Instructions, and Tasks. When the agent compacts its context, a multi-stage ingestion pipeline extracts, verifies, classifies, and stores memories. Retrieval runs five channels in parallel (full-text search, exact key lookup, raw message search, direct vector search, and HyDE, or Hypothetical Document Embeddings), then merges their ranked results with Reciprocal Rank Fusion so that memories ranked highly by several channels surface first. The service is built on Durable Objects (isolation), Vectorize (semantic search), and Workers AI (model inference). Agents can also call memory tools directly: `ingest`, `remember`, `recall`, `list`, and `forget`.
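To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion in TypeScript. It is not Cloudflare's implementation: the function name, the `ChannelResult` shape, the memory ids, and the constant `k = 60` (a commonly used default for RRF) are all assumptions for illustration. Each channel contributes `1 / (k + rank)` for every memory it returns, and the contributions are summed per memory.

```typescript
// Hypothetical shape for one retrieval channel's output; not a Cloudflare type.
interface ChannelResult {
  channel: string; // label only, not used in scoring
  ids: string[];   // memory ids, best match first, assumed free of duplicates
}

// Reciprocal Rank Fusion: score(id) = sum over channels of 1 / (k + rank),
// where rank is 1-based. k = 60 is an assumed default, not a documented value.
function reciprocalRankFusion(
  results: ChannelResult[],
  k = 60
): Array<[string, number]> {
  const scores: { [id: string]: number } = {};
  results.forEach(({ ids }) => {
    ids.forEach((id, index) => {
      // index is 0-based, so the rank is index + 1
      scores[id] = (scores[id] || 0) + 1 / (k + index + 1);
    });
  });
  // Highest fused score first
  return Object.keys(scores)
    .map((id): [string, number] => [id, scores[id] || 0])
    .sort((a, b) => b[1] - a[1]);
}

// Made-up rankings from the five channels named above
const fused = reciprocalRankFusion([
  { channel: "full-text", ids: ["m2", "m1", "m3"] },
  { channel: "exact-key", ids: ["m1"] },
  { channel: "raw-message", ids: ["m3", "m1"] },
  { channel: "direct-vector", ids: ["m1", "m2"] },
  { channel: "hyde", ids: ["m2", "m1", "m4"] },
]);

console.log(fused.map(([id]) => id)); // ids ordered by fused score
```

In this made-up data, `m1` appears in all five lists and ends up first, even though two channels rank `m2` above it. Because RRF uses only rank positions, the channels' raw scores never need to be put on a common scale.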
Why does it matter?
Context window degradation is one of the core unsolved problems in long-running AI agents: as the context fills up, important early information gets compressed away or lost. Agent Memory addresses this by letting agents externalize memory and retrieve it selectively. Fusing five retrieval channels with RRF means a memory missed by one channel (for example, an exact identifier that embeds poorly) can still be surfaced by another, which a single vector store cannot offer. It also runs on the same Workers infrastructure as the agent, so there is no extra service to manage and no added cross-region latency.
Who is it for?
Developers building long-running or multi-session AI agents on Cloudflare Workers.
Try it
https://developers.cloudflare.com/agents/concepts/memory/