mem0 on X: “如何在Codex CLI中处理记忆” / X

TL;DR · AI 摘要
文章详细介绍了Codex CLI中的记忆机制及其工作流程。
核心要点
- Codex CLI 使用固定markdown文件存储记忆,不使用SQLite或索引。
- 记忆分为两阶段处理:第一阶段在会话空闲六小时后进行,第二阶段全局锁定。
- 记忆摘要在会话开始时预加载,后续查询通过直接搜索MEMORY.md实现。
结构提纲
按章节快速跳转。
思维导图
用一张图看清主题之间的关系。
查看大纲文本(无障碍 / 无 JS 友好)
- Codex CLI 记忆机制
- 记忆存储
- 记忆处理流程
金句 / Highlights
值得收藏与分享的关键句。
记忆存储在固定markdown文件中,不使用SQLite或索引。
Article

How Memory Works in Codex CLI
Codex CLI ships a generated, async-summarized memory store at ~/.codex/memories/
It's a first-party feature:
- documented at
- and fully open-sourced at
.
In this In Context article, we walk through how memory is handled in Codex CLI.
Codex is OpenAI's coding agent and it ships in three variants: a CLI, an IDE extension, and a cloud workspace inside ChatGPT.
The CLI is the most heavily documented surface and its memory internals are open-sourced so that's the one we look at here. Most of what follows applies transitively to the IDE extension, which shares the same ~/.codex/ config and memory layout.
Codex's memory lives in one directory: ~/.codex/memories/. The directory holds a fixed set of markdown files. No SQLite, no embedding index or opaque blobs.
The files that matter:
- memory_summary.md: the consolidated view the next session reads first
- MEMORY.md: the long-form merged memory file, grepped on demand
- raw_memories.md: pre-consolidation extraction output
- skills/<name>/SKILL.md: skill-specific memories
- rollout_summaries/<slug>.md: per-session summaries that feed the consolidation pass
That's the whole memory layer. Everything else is the pipeline that fills it and the read path that pulls from it.
The write path splits into two phases.
Phase 1 is per-rollout. When a session has been idle long enough (six hours by default), Codex claims a startup job, samples the conversation against a strict-schema extraction prompt, runs every output through a secret-redactor, and stores the result in a local state database. Nothing reaches the memory directory yet.
Phase 2 is global. It acquires a global lock (held in the state DB) so two consolidation passes never run in parallel. It loads recent Phase 1 outputs, syncs them to disk, checks whether the resulting workspace actually differs from what's there, and only then spawns a separate consolidation sub-agent. The sub-agent reads the candidates, decides what to merge, what to patch, what to drop, and writes the diff back to ~/.codex/memories/.
Both phases use configurable models (extract_model for Phase 1, consolidation_model for Phase 2), and the consolidation pass runs as a separate sub-agent rather than inline with the user's chat.
At session start, Codex reads memory_summary.md whole, truncates it to fit a fixed 5,000-token budget (a constant defined in the read-path crate, not in the public docs), and injects the result into the developer instructions. That truncated summary is the entirety of what the agent gets pre-loaded.
For anything not in the summary, the agent is instructed to grep MEMORY.md directly. The read-path template tells the agent: extract keywords from the summary, search MEMORY.md, open relevant rollout-summary files if pointed to, stop if nothing matches. The whole search budget is kept under a small number of steps.
There is no embedding store, no similarity search, no rerank step. It's plain text and substring matching, by design. The trade is predictability and zero retrieval-time cost in exchange for not finding facts whose query and stored phrasing don't share keywords.
A handful of configuration knobs control everything above. Worth knowing before relying on the system.
- Off by default. A master features.memories flag has to be turned on. Once on, two sub-switches govern the runtime: generate_memories (write) and use_memories (read). Both default to true after the feature is enabled.
- Idle threshold. min_rollout_idle_hours (default 6) sets the minimum idle time before a session is eligible for Phase 1. Active sessions never trigger it.
- Consolidation caps. max_raw_memories_for_consolidation (default 256, capped at 4,096) bounds how many recent rollouts the consolidation pass considers per run.
- Aging. max_rollout_age_days (default 30) stops considering threads older than that for memory generation. max_unused_days (default 30) makes memories that haven't been recalled in that window ineligible for the next consolidation pass.
- Rate-limit-aware. min_rate_limit_remaining_percent (default 25) backs off consolidation when the user's API quota is low, so background work never starves foreground requests.
- Secret redaction. Built-in scrubbing for credentials before any memory hits disk.
- Geographic limit. At launch, Memories is not available in the EEA, UK, or Switzerland. Users in those regions don't have access at all.
The shortest framing: this memory system is not robust. It's a single user's markdown files inside a fixed token budget, and most of the failure modes follow from that.
- Hard token cap on what's loaded. memory_summary.md is truncated to 5,000 tokens at every session start. Anything past the cap is silently dropped. No warning, no log. The agent simply doesn't see it.
- Keyword-only fallback. Anything that didn't make it into the summary has to be findable in MEMORY.md by literal substring match. Paraphrased facts are invisible. "What's our deploy command?" will not find "production deploys go through make ship-prod" unless one of those exact terms is stored.
- Grep cost grows with the file. MEMORY.md is searched linearly. For a long-lived user with months of accumulated memories, every miss-then-grep costs more.
- Generated state only, not editable. The docs frame ~/.codex/memories/ as Codex-managed. Hand-editing memories is not the supported path. Anything the user actually wants the agent to always know goes in AGENTS.md instead.
- Idle-gate fragility. Sessions need to sit idle six hours before they become eligible for consolidation. A developer working in back-to-back coding sprints may never trigger it.
- Local-only state. ~/.codex/memories/ is per-user, per-machine filesystem state. No cross-machine sync, no team sharing, no remote backing store.
- Geographic gate. Memories is blocked in EEA, UK, and Switzerland at launch.
Production scenarios that hit these limits
The structural limits matter most in concrete cases that show up in real teams.
Mem0 is a memory layer that sits between the agent and a persistent store. For Codex specifically, there is a MCP integration documented at
.
Codex CLI supports MCP servers natively. They're configured in ~/.codex/config.toml under [mcp_servers.<name>] (
). The minimal Mem0 setup is one block:
[mcp_servers.mem0]
url = "https://mcp.mem0.ai/mcp"
bearer_token_env_var = "MEM0_API_KEY"That single block exposes nine MCP tools to Codex: add_memory, search_memories, get_memories, get_memory, update_memory, delete_memory, delete_all_memories, delete_entities, list_entities.
Codex's agent picks them up the same way it picks up any other MCP tool surface.
What changes underneath:
- Persistent, cross-machine memory. The local ~/.codex/memories/ ceiling is gone. The same Mem0 store follows the user across laptops, servers, and CI environments. No regeneration from scratch on a new machine.
- Cross-tool memory. A user running Codex CLI for terminal work and Cursor for editor work can point both at the same Mem0 backend. A fact captured during a Codex CLI session ("the staging deploy script swallows errors silently, prefer the production-style output for the same diagnostic") surfaces inside the next Cursor Agent run on the same project. The user doesn't have to copy anything between the two tools.
- Semantic recall. Mem0 retrieves by meaning, via an embedding-backed search. Codex's native recall reads memory_summary.md whole and falls back to grep over MEMORY.md: fast and predictable, but unable to find a stored fact whose phrasing differs from the query. With Mem0, asking "what's our deploy command?" finds "production deploys go through make ship-prod" even though the words don't overlap.
- Per-user isolation. Each user's userId scopes their memory. Three teammates with the same Mem0 backend keep separate stores.
- No 5,000-token summary cap. Retrieval is ranked semantic search over the actual store, not a whole-file load of a single summary file. Large memory bases stay usable instead of being truncated to fit the developer-instructions budget.
- Real-time updates. Mem0 writes memories as the conversation happens, not on a six-hour idle gate. The next session sees what the last one taught it.
- Available where Memories isn't. EEA, UK, and Switzerland users get a memory layer that doesn't depend on Codex's regional rollout.
Codex CLI's memory layer is carefully designed in the coding agent space today: two-phase background consolidation, secret redaction, a grep-based read path with predictable token cost, and a pipeline that's open-source end-to-end.
If you're running Codex CLI today and any of those gaps look familiar, the Mem0 integration takes about a minute [
]
In Context #9
This blog is part of In Context, a
blog series covering AI Agent memory and context engineering.
is an intelligent, open-source memory layer designed for LLMs and AI agents to provide long-term, personalized, and context-aware interactions across sessions.
- Get your free API key here:
- or self-host mem0 from our
.