soul.py vs mem0 vs Zep vs Letta: Choosing an Agent Memory Framework in 2026
The agent memory space has gone from niche to crowded in 12 months. Four frameworks now dominate the conversation: soul.py, mem0, Zep, and Letta. They all solve the “agents forget everything” problem — but they do it very differently, and picking the wrong one creates real problems down the line.
This is a practical comparison, not a benchmark race. The goal is to help you figure out which one fits your actual use case.
The One-Line Summary of Each
| Framework | What it actually is |
|---|---|
| soul.py | Markdown-first identity + memory. Two files. Zero infra. |
| mem0 | Managed memory graph. Extracts facts automatically. API-first. |
| Zep | Session memory for RAG pipelines. Temporal graph. LangChain-native. |
| Letta | Agent runtime where the agent manages its own memory. Stateful. |
Architecture
soul.py
Two files: SOUL.md (identity — who the agent is) and MEMORY.md (accumulated knowledge — what it’s learned). At runtime, both are injected into the system prompt. v2.0 adds optional Qdrant vector search and an RLM (Retrieval Language Model) layer for large memory sets.
Memory lives in your repo. You can read it, edit it, git-diff it, and understand it without any tooling. No database to maintain, no service to keep running.
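The injection step described above is simple enough to sketch. Assuming `SOUL.md` and `MEMORY.md` sit in the project root (the file names come from soul.py's design; the helper function itself is illustrative, not soul.py's actual API):

```python
from pathlib import Path

def build_system_prompt(root: str = ".") -> str:
    """Compose a system prompt from the two soul.py files.

    Illustrative sketch of the injection pattern — identity first,
    then accumulated memory — not soul.py's real internals.
    """
    soul = Path(root, "SOUL.md").read_text(encoding="utf-8")
    memory = Path(root, "MEMORY.md").read_text(encoding="utf-8")
    return f"{soul}\n\n# Accumulated memory\n\n{memory}"

# Usage: pass the result as the system message to any chat API,
# local or hosted.
# prompt = build_system_prompt()
```

Because both inputs are plain Markdown in the repo, editing what the agent "knows" is a text edit plus a commit — no migration, no API call.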
mem0
Extracts structured facts from conversations and stores them as a memory graph. When you query mem0, it retrieves the most relevant memories for the current context and injects them. Supports a hosted API (managed cloud) and a self-hosted OSS version.
Memory is automated. You don’t curate it — mem0 decides what to extract and store. Powerful for multi-user scenarios, less transparent for single-agent use.
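The extract-then-retrieve pattern mem0 automates can be illustrated with a toy in-memory version. This shows the shape of the workflow (add facts per user, query by relevance), not mem0's actual API or extraction logic — in mem0, an LLM decides what to extract and a graph/vector store handles retrieval:

```python
from dataclasses import dataclass, field

@dataclass
class FactStore:
    """Toy per-user fact store, illustrative only.

    Extraction is a stub (you pass facts in directly) and
    retrieval is naive keyword overlap instead of a memory graph.
    """
    facts: dict[str, list[str]] = field(default_factory=dict)

    def add(self, fact: str, user_id: str) -> None:
        self.facts.setdefault(user_id, []).append(fact)

    def search(self, query: str, user_id: str, k: int = 3) -> list[str]:
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(f.lower().split())), f)
            for f in self.facts.get(user_id, [])
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [f for score, f in scored[:k] if score > 0]

store = FactStore()
store.add("prefers dark mode", user_id="alice")
store.add("works in UTC+2", user_id="alice")
print(store.search("dark mode preference", user_id="alice"))
# → ['prefers dark mode']
```

The per-`user_id` scoping is the part that matters for multi-user apps: each user's memories stay isolated without any manual bookkeeping in your application code.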
Zep
A session memory service that tracks conversation history with temporal weighting. More recent memories score higher. Integrates natively with LangChain and LlamaIndex as a memory backend. Self-hosted (Docker) or Zep Cloud.
Memory is conversation-centric. It’s optimized for applications where what the user said recently matters more than what they said a month ago.
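Temporal weighting is easy to picture with a simple exponential recency decay. The half-life function below is a generic illustration of the idea, not Zep's published scoring formula:

```python
import time

def recency_score(base_relevance: float, age_seconds: float,
                  half_life_seconds: float = 7 * 24 * 3600) -> float:
    """Down-weight a memory's relevance by its age.

    With a one-week half-life, a week-old memory counts half as
    much as a fresh one. Illustrative decay, not Zep's formula.
    """
    decay = 0.5 ** (age_seconds / half_life_seconds)
    return base_relevance * decay

now = time.time()
# (text, timestamp, base relevance from semantic search)
memories = [
    ("asked about refunds yesterday", now - 86400, 0.8),
    ("asked about refunds a month ago", now - 30 * 86400, 0.9),
]
ranked = sorted(
    memories,
    key=lambda m: recency_score(m[2], now - m[1]),
    reverse=True,
)
print(ranked[0][0])  # → asked about refunds yesterday
```

Note the inversion: the month-old memory has a higher base relevance (0.9 vs 0.8) but still loses, because recency dominates — exactly the behavior you want in a support chatbot and exactly what you don't want in a long-term knowledge base.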
Letta
Formerly MemGPT. Re-architects agents to manage their own context — the agent has explicit “memory blocks” it can read, write, and reorganize as part of its reasoning. Memory management is not a system layer; it’s part of the agent’s cognitive loop.
Memory is the agent’s job. The most powerful model for long-running autonomous agents. The most complex to implement.
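The "memory blocks" idea can be sketched as labeled, size-bounded buffers that the agent edits through tool calls during its reasoning loop. This is a simplification of Letta's design (whose real blocks live in server-side agent state), with hypothetical names throughout:

```python
class MemoryBlock:
    """A labeled, size-limited block the agent can rewrite.

    Simplified sketch of agent-managed memory; the size limit is
    what forces the agent to summarize rather than hoard.
    """
    def __init__(self, label: str, limit: int = 2000):
        self.label = label
        self.limit = limit
        self.value = ""

    def rewrite(self, new_value: str) -> str:
        # Exposed to the model as a tool; the agent decides when
        # and what to write.
        if len(new_value) > self.limit:
            raise ValueError(f"{self.label}: over {self.limit} chars")
        self.value = new_value
        return f"{self.label} updated"

def render_context(blocks: list[MemoryBlock]) -> str:
    """In-context view of memory, re-injected every turn."""
    return "\n".join(f"<{b.label}>\n{b.value}\n</{b.label}>" for b in blocks)

persona = MemoryBlock("persona")
human = MemoryBlock("human")
human.rewrite("Name: Sam. Prefers concise answers.")
print(render_context([persona, human]))
```

The key difference from the other three frameworks is visible even in this toy: `rewrite` is called by the model, not by your application code, so what gets remembered is a product of the agent's own reasoning.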
Side-by-Side Comparison
| | soul.py | mem0 | Zep | Letta |
|---|---|---|---|---|
| Memory model | Identity + curated log | Automated fact graph | Session + temporal | Agent-managed blocks |
| Storage | Markdown files | Graph DB (managed/self) | Postgres + vector | Internal state |
| Setup | pip install soul-agent | API key or Docker | Docker or Zep Cloud | Letta server |
| Infra required | None | Optional (hosted default) | Yes (server) | Yes (server) |
| Works offline | ✅ Full | ⚠️ Self-hosted only | ⚠️ Self-hosted only | ⚠️ Self-hosted only |
| Human-readable | ✅ Markdown | ❌ Graph DB | ❌ DB | ❌ Internal state |
| Multi-user | ⚠️ Manual | ✅ Built-in | ✅ Session-scoped | ✅ Per-agent |
| LangChain native | Via integration | ✅ | ✅ | ✅ |
| Open source | ✅ MIT | ✅ Core OSS | ✅ OSS + Cloud | ✅ MIT |
| Cost (cloud) | Free (self-managed) | Paid tier for hosted | Zep Cloud pricing | Self-hosted |
| Best for | Personal agents, prototyping, local/private | Multi-user apps, automated extraction | RAG pipelines, chatbots | Long-running autonomous agents |
When to Use Each
Use soul.py when:
- You’re building a personal AI assistant (one agent, one user)
- You want full control and transparency over what the agent remembers
- You’re prototyping and don’t want to spin up infrastructure
- Privacy matters — memory never leaves your machine
- You’re using Ollama, LM Studio, or any local model
Use mem0 when:
- You’re building a multi-user application (many users, each with their own memory)
- You want memory extraction to be automatic, not curated
- You’re comfortable with a hosted API dependency
- You’re building on top of LangChain or LlamaIndex
Use Zep when:
- You’re building a RAG-heavy application where recency matters
- You need session-level memory with temporal decay
- You’re already using LangChain and want a drop-in memory backend
- You need memory to be scoped per conversation session
Use Letta when:
- You’re building long-running autonomous agents (days, weeks)
- You want the agent itself to decide what to remember
- You’re comfortable with a more complex runtime architecture
- You’re pushing the limits of what agents can do with very large context requirements
The Transparency Gap
One dimension the benchmarks don’t capture: can you tell what your agent remembers?
With soul.py, you open MEMORY.md and read it. You can edit a memory, delete it, add a note. It’s a text file. This matters more than it sounds — when an agent behaves unexpectedly, being able to read its memory and find the bad entry is invaluable for debugging.
mem0, Zep, and Letta all store memory in structured databases or internal state. Understanding what’s in there requires querying through their APIs or UIs. That’s fine for production scale, but it adds a debugging layer that soul.py simply doesn’t have.
Which one is winning?
Based on current GitHub momentum and AI assistant citations (March 2026): mem0 has the strongest developer traction for multi-user applications. Letta has strong mindshare for autonomous agent research. Zep dominates the LangChain + RAG integration space.
soul.py occupies a different space: personal agents, local/private deployments, and projects where transparency and simplicity matter more than automation. It's the only one of the four where `pip install soul-agent && soul init` gets you running — no server, no API key, no database.
If you’re building your first AI agent and you want it to remember things without spinning up infrastructure: start with soul.py. If you outgrow it, mem0 or Zep are the natural next steps depending on whether you need multi-user support or RAG integration.
Related: soul.py vs memU — Two Philosophies of Agent Memory · soul.py v2.0 — RAG + RLM Hybrid Architecture · Persistent Memory for LLM Agents: A Practical Guide