soul.py vs mem0 vs Zep vs Letta: Choosing an Agent Memory Framework in 2026

By Prahlad Menon

The agent memory space has gone from niche to crowded in 12 months. Four frameworks now dominate the conversation: soul.py, mem0, Zep, and Letta. They all solve the “agents forget everything” problem — but they do it very differently, and picking the wrong one creates real problems down the line.

This is a practical comparison, not a benchmark race. The goal is to help you figure out which one fits your actual use case.


The One-Line Summary of Each

| Framework | What it actually is |
| --- | --- |
| soul.py | Markdown-first identity + memory. Two files. Zero infra. |
| mem0 | Managed memory graph. Extracts facts automatically. API-first. |
| Zep | Session memory for RAG pipelines. Temporal graph. LangChain-native. |
| Letta | Agent runtime where the agent manages its own memory. Stateful. |

Architecture

soul.py

Two files: SOUL.md (identity — who the agent is) and MEMORY.md (accumulated knowledge — what it’s learned). At runtime, both are injected into the system prompt. v2.0 adds optional Qdrant vector search and an RLM (Retrieval Language Model) layer for large memory sets.

Memory lives in your repo. You can read it, edit it, git-diff it, and understand it without any tooling. No database to maintain, no service to keep running.
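In plain Python, the injection step amounts to concatenating the two files into the system prompt. The function below is an illustrative sketch, not soul.py's actual API; only the SOUL.md and MEMORY.md file names come from the description above, and the "## What you remember" separator is an assumption.

```python
from pathlib import Path

def build_system_prompt(agent_dir: str) -> str:
    """Compose a system prompt from SOUL.md (identity) and MEMORY.md
    (accumulated knowledge), mirroring the two-file idea in plain Python."""
    soul = Path(agent_dir, "SOUL.md").read_text(encoding="utf-8")
    memory = Path(agent_dir, "MEMORY.md").read_text(encoding="utf-8")
    # Identity first, then everything the agent has learned so far.
    return f"{soul}\n\n## What you remember\n\n{memory}"
```

Because the memory is just text on disk, "editing a memory" is editing a file, and version history comes for free from git.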

mem0

Extracts structured facts from conversations and stores them as a memory graph. When you query mem0, it retrieves the most relevant memories for the current context and injects them. Supports a hosted API (managed cloud) and a self-hosted OSS version.

Memory is automated. You don’t curate it — mem0 decides what to extract and store. Powerful for multi-user scenarios, less transparent for single-agent use.
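The extract-and-retrieve loop can be sketched with a toy in-memory store. This deliberately swaps mem0's LLM-based extraction and memory graph for naive sentence splitting and word overlap — it is not mem0's API, just the shape of the pattern: add a message, facts get pulled out per user, and queries retrieve the relevant ones.

```python
import re
from collections import defaultdict

def _tokens(text: str) -> set[str]:
    """Lowercased word set, used for crude overlap scoring."""
    return set(re.findall(r"\w+", text.lower()))

class ToyFactStore:
    """A deliberately tiny stand-in for the extract-and-store loop:
    pull candidate 'facts' out of a message, file them under a user id,
    and retrieve the ones that overlap with a query."""

    def __init__(self) -> None:
        self._facts: dict[str, list[str]] = defaultdict(list)

    def add(self, message: str, user_id: str) -> None:
        # "Extraction" here is just sentence splitting; mem0 uses an LLM.
        for sentence in re.split(r"[.!?]\s*", message):
            if sentence.strip():
                self._facts[user_id].append(sentence.strip())

    def search(self, query: str, user_id: str) -> list[str]:
        # Rank by word overlap instead of a real memory graph.
        terms = _tokens(query)
        scored = [(len(terms & _tokens(f)), f) for f in self._facts[user_id]]
        return [f for score, f in sorted(scored, key=lambda p: -p[0]) if score > 0]
```

The trade-off the article describes is visible even here: the store decides what counts as a fact, and the caller never curates the result.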

Zep

A session memory service that tracks conversation history with temporal weighting. More recent memories score higher. Integrates natively with LangChain and LlamaIndex as a memory backend. Self-hosted (Docker) or Zep Cloud.

Memory is conversation-centric. It’s optimized for applications where what the user said recently matters more than what they said a month ago.

Letta

Formerly MemGPT. Re-architects agents to manage their own context — the agent has explicit “memory blocks” it can read, write, and reorganize as part of its reasoning. Memory management is not a system layer; it’s part of the agent’s cognitive loop.

Memory is the agent’s job. The most powerful model for long-running autonomous agents. The most complex to implement.
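The memory-block idea can be sketched as a small data structure the agent edits through its own tool calls. Block labels, the character budget, and the method names here are illustrative assumptions, not Letta's actual interface — the point is that reading and rewriting memory is an action the agent takes, not a layer around it.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryBlock:
    """One labelled, size-limited region of context the agent edits itself."""
    label: str
    value: str = ""
    limit: int = 500  # characters, a stand-in for a token budget

@dataclass
class AgentCoreMemory:
    """Self-managed memory: the agent gets tools (here, plain methods)
    to read and rewrite its own blocks mid-reasoning."""
    blocks: dict[str, MemoryBlock] = field(default_factory=dict)

    def read(self, label: str) -> str:
        return self.blocks[label].value

    def rewrite(self, label: str, new_value: str) -> None:
        block = self.blocks[label]
        if len(new_value) > block.limit:
            raise ValueError(f"block '{label}' over its {block.limit}-char budget")
        block.value = new_value

    def render(self) -> str:
        # What gets compiled into the agent's context window each turn.
        return "\n".join(
            f"<{b.label}>\n{b.value}\n</{b.label}>" for b in self.blocks.values()
        )
```

The budget check is what forces the interesting behavior: when a block fills up, the agent must summarize or evict before it can write again, which is the core of the self-managed-memory loop.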


Side-by-Side Comparison

| | soul.py | mem0 | Zep | Letta |
| --- | --- | --- | --- | --- |
| Memory model | Identity + curated log | Automated fact graph | Session + temporal | Agent-managed blocks |
| Storage | Markdown files | Graph DB (managed/self) | Postgres + vector | Internal state |
| Setup | pip install soul-agent | API key or Docker | Docker or Zep Cloud | Letta server |
| Infra required | None | Optional (hosted default) | Yes (server) | Yes (server) |
| Works offline | ✅ Full | ⚠️ Self-hosted only | ⚠️ Self-hosted only | ⚠️ Self-hosted only |
| Human-readable | ✅ Markdown | ❌ Graph DB | ❌ DB | ❌ Internal state |
| Multi-user | ⚠️ Manual | ✅ Built-in | ✅ Session-scoped | ✅ Per-agent |
| LangChain native | Via integration | | ✅ Native | |
| Open source | ✅ MIT | ✅ Core OSS | ✅ OSS + Cloud | ✅ MIT |
| Cost (cloud) | Free (self-managed) | Paid tier for hosted | Zep Cloud pricing | Self-hosted |
| Best for | Personal agents, prototyping, local/private | Multi-user apps, automated extraction | RAG pipelines, chatbots | Long-running autonomous agents |

When to Use Each

Use soul.py when:

  • You’re building a personal AI assistant (one agent, one user)
  • You want full control and transparency over what the agent remembers
  • You’re prototyping and don’t want to spin up infrastructure
  • Privacy matters — memory never leaves your machine
  • You’re using Ollama, LM Studio, or any local model


Use mem0 when:

  • You’re building a multi-user application (many users, each with their own memory)
  • You want memory extraction to be automatic, not curated
  • You’re comfortable with a hosted API dependency
  • You’re building on top of LangChain or LlamaIndex

Use Zep when:

  • You’re building a RAG-heavy application where recency matters
  • You need session-level memory with temporal decay
  • You’re already using LangChain and want a drop-in memory backend
  • You need memory to be scoped per conversation session

Use Letta when:

  • You’re building long-running autonomous agents (days, weeks)
  • You want the agent itself to decide what to remember
  • You’re comfortable with a more complex runtime architecture
  • You’re pushing the limits of agent autonomy and have context requirements too large to fit in a single window

The Transparency Gap

One dimension the benchmarks don’t capture: can you tell what your agent remembers?

With soul.py, you open MEMORY.md and read it. You can edit a memory, delete it, add a note. It’s a text file. This matters more than it sounds — when an agent behaves unexpectedly, being able to read its memory and find the bad entry is invaluable for debugging.

mem0, Zep, and Letta all store memory in structured databases or internal state. Understanding what’s in there requires querying through their APIs or UIs. That’s fine for production scale, but it adds a debugging layer that soul.py simply doesn’t have.


Which one is winning?

Based on current GitHub momentum and AI assistant citations (March 2026): mem0 has the strongest developer traction for multi-user applications. Letta has strong mindshare for autonomous agent research. Zep dominates the LangChain + RAG integration space.

soul.py occupies a different space: personal agents, local/private deployments, and projects where transparency and simplicity matter more than automation. It’s the only one of the four where pip install soul-agent && soul init gets you running — no server, no API key, no database.

If you’re building your first AI agent and you want it to remember things without spinning up infrastructure: start with soul.py. If you outgrow it, mem0 or Zep are the natural next steps depending on whether you need multi-user support or RAG integration.


Related: soul.py vs memU — Two Philosophies of Agent Memory · soul.py v2.0 — RAG + RLM Hybrid Architecture · Persistent Memory for LLM Agents: A Practical Guide