soul.py vs mem0 vs Zep vs Letta: Choosing an Agent Memory Framework in 2026

By Prahlad Menon

The agent memory space has gone from niche to crowded in 12 months. Four frameworks now dominate the conversation: soul.py, mem0, Zep, and Letta. They all solve the “agents forget everything” problem — but they do it very differently, and picking the wrong one creates real problems down the line.

This is a practical comparison, not a benchmark race. The goal is to help you figure out which one fits your actual use case.


The One-Line Summary of Each

| Framework | What it actually is |
| --- | --- |
| soul.py | Markdown-first identity + memory. Two files. Zero infra. |
| mem0 | Managed memory graph. Extracts facts automatically. API-first. |
| Zep | Session memory for RAG pipelines. Temporal graph. LangChain-native. |
| Letta | Agent runtime where the agent manages its own memory. Stateful. |

Architecture

soul.py

Two files: SOUL.md (identity — who the agent is) and MEMORY.md (accumulated knowledge — what it’s learned). At runtime, both are injected into the system prompt. v2.0 adds optional Qdrant vector search and an RLM (Retrieval Language Model) layer for large memory sets.

Memory lives in your repo. You can read it, edit it, git-diff it, and understand it without any tooling. No database to maintain, no service to keep running.
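In plain Python, the injection step amounts to concatenating the two files into the system prompt. The function below is an illustrative sketch, not soul.py's actual API; only the SOUL.md and MEMORY.md file names come from the description above, and the "## What you remember" separator is an assumption.

```python
from pathlib import Path

def build_system_prompt(agent_dir: str) -> str:
    """Compose a system prompt from SOUL.md (identity) and MEMORY.md
    (accumulated knowledge), mirroring the two-file idea in plain Python."""
    soul = Path(agent_dir, "SOUL.md").read_text(encoding="utf-8")
    memory = Path(agent_dir, "MEMORY.md").read_text(encoding="utf-8")
    # Identity first, then everything the agent has learned so far.
    return f"{soul}\n\n## What you remember\n\n{memory}"
```

Because the memory is just text on disk, "editing a memory" is editing a file, and version history comes for free from git.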

mem0

Extracts structured facts from conversations and stores them as a memory graph. When you query mem0, it retrieves the most relevant memories for the current context and injects them. Supports a hosted API (managed cloud) and a self-hosted OSS version.

Memory is automated. You don’t curate it — mem0 decides what to extract and store. Powerful for multi-user scenarios, less transparent for single-agent use.
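The extract-and-retrieve loop can be sketched with a toy in-memory store. This deliberately swaps mem0's LLM-based extraction and memory graph for naive sentence splitting and word overlap — it is not mem0's API, just the shape of the pattern: add a message, facts get pulled out per user, and queries retrieve the relevant ones.

```python
import re
from collections import defaultdict

def _tokens(text: str) -> set[str]:
    """Lowercased word set, used for crude overlap scoring."""
    return set(re.findall(r"\w+", text.lower()))

class ToyFactStore:
    """A deliberately tiny stand-in for the extract-and-store loop:
    pull candidate 'facts' out of a message, file them under a user id,
    and retrieve the ones that overlap with a query."""

    def __init__(self) -> None:
        self._facts: dict[str, list[str]] = defaultdict(list)

    def add(self, message: str, user_id: str) -> None:
        # "Extraction" here is just sentence splitting; mem0 uses an LLM.
        for sentence in re.split(r"[.!?]\s*", message):
            if sentence.strip():
                self._facts[user_id].append(sentence.strip())

    def search(self, query: str, user_id: str) -> list[str]:
        # Rank by word overlap instead of a real memory graph.
        terms = _tokens(query)
        scored = [(len(terms & _tokens(f)), f) for f in self._facts[user_id]]
        return [f for score, f in sorted(scored, key=lambda p: -p[0]) if score > 0]
```

The trade-off the article describes is visible even here: the store decides what counts as a fact, and the caller never curates the result.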

Zep

A session memory service that tracks conversation history with temporal weighting. More recent memories score higher. Integrates natively with LangChain and LlamaIndex as a memory backend. Self-hosted (Docker) or Zep Cloud.

Memory is conversation-centric. It’s optimized for applications where what the user said recently matters more than what they said a month ago.

Letta

Formerly MemGPT. Re-architects agents to manage their own context — the agent has explicit “memory blocks” it can read, write, and reorganize as part of its reasoning. Memory management is not a system layer; it’s part of the agent’s cognitive loop.

Memory is the agent’s job. The most powerful model for long-running autonomous agents. The most complex to implement.
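The memory-block idea can be sketched as a small data structure the agent edits through its own tool calls. Block labels, the character budget, and the method names here are illustrative assumptions, not Letta's actual interface — the point is that reading and rewriting memory is an action the agent takes, not a layer around it.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryBlock:
    """One labelled, size-limited region of context the agent edits itself."""
    label: str
    value: str = ""
    limit: int = 500  # characters, a stand-in for a token budget

@dataclass
class AgentCoreMemory:
    """Self-managed memory: the agent gets tools (here, plain methods)
    to read and rewrite its own blocks mid-reasoning."""
    blocks: dict[str, MemoryBlock] = field(default_factory=dict)

    def read(self, label: str) -> str:
        return self.blocks[label].value

    def rewrite(self, label: str, new_value: str) -> None:
        block = self.blocks[label]
        if len(new_value) > block.limit:
            raise ValueError(f"block '{label}' over its {block.limit}-char budget")
        block.value = new_value

    def render(self) -> str:
        # What gets compiled into the agent's context window each turn.
        return "\n".join(
            f"<{b.label}>\n{b.value}\n</{b.label}>" for b in self.blocks.values()
        )
```

The budget check is what forces the interesting behavior: when a block fills up, the agent must summarize or evict before it can write again, which is the core of the self-managed-memory loop.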


Side-by-Side Comparison

| | soul.py | mem0 | Zep | Letta |
| --- | --- | --- | --- | --- |
| Memory model | Identity + curated log | Automated fact graph | Session + temporal | Agent-managed blocks |
| Storage | Markdown files | Graph DB (managed/self) | Postgres + vector | Internal state |
| Setup | pip install soul-agent | API key or Docker | Docker or Zep Cloud | Letta server |
| Infra required | None | Optional (hosted default) | Yes (server) | Yes (server) |
| Works offline | ✅ Full | ⚠️ Self-hosted only | ⚠️ Self-hosted only | ⚠️ Self-hosted only |
| Human-readable | ✅ Markdown | ❌ Graph DB | ❌ DB | ❌ Internal state |
| Multi-user | ⚠️ Manual | ✅ Built-in | ✅ Session-scoped | ✅ Per-agent |
| LangChain native | Via integration | | ✅ Native | |
| Open source | ✅ MIT | ✅ Core OSS | ✅ OSS + Cloud | ✅ MIT |
| Cost (cloud) | Free (self-managed) | Paid tier for hosted | Zep Cloud pricing | Self-hosted |
| Best for | Personal agents, prototyping, local/private | Multi-user apps, automated extraction | RAG pipelines, chatbots | Long-running autonomous agents |

When to Use Each

Use soul.py when:

  • You’re building a personal AI assistant (one agent, one user)
  • You want full control and transparency over what the agent remembers
  • You’re prototyping and don’t want to spin up infrastructure
  • Privacy matters — memory never leaves your machine
  • You’re using Ollama, LM Studio, or any local model


Use mem0 when:

  • You’re building a multi-user application (many users, each with their own memory)
  • You want memory extraction to be automatic, not curated
  • You’re comfortable with a hosted API dependency
  • You’re building on top of LangChain or LlamaIndex

Use Zep when:

  • You’re building a RAG-heavy application where recency matters
  • You need session-level memory with temporal decay
  • You’re already using LangChain and want a drop-in memory backend
  • You need memory to be scoped per conversation session

Use Letta when:

  • You’re building long-running autonomous agents (days, weeks)
  • You want the agent itself to decide what to remember
  • You’re comfortable with a more complex runtime architecture
  • You’re pushing the limits of agent autonomy and have context requirements too large to fit in a single window

The Transparency Gap

One dimension the benchmarks don’t capture: can you tell what your agent remembers?

With soul.py, you open MEMORY.md and read it. You can edit a memory, delete it, add a note. It’s a text file. This matters more than it sounds — when an agent behaves unexpectedly, being able to read its memory and find the bad entry is invaluable for debugging.

mem0, Zep, and Letta all store memory in structured databases or internal state. Understanding what’s in there requires querying through their APIs or UIs. That’s fine for production scale, but it adds a debugging layer that soul.py simply doesn’t have.


Which one is winning?

Based on current GitHub momentum and AI assistant citations (March 2026): mem0 has the strongest developer traction for multi-user applications. Letta has strong mindshare for autonomous agent research. Zep dominates the LangChain + RAG integration space.

soul.py occupies a different space: personal agents, local/private deployments, and projects where transparency and simplicity matter more than automation. It’s the only one of the four where pip install soul-agent && soul init gets you running — no server, no API key, no database.

If you’re building your first AI agent and you want it to remember things without spinning up infrastructure: start with soul.py. If you outgrow it, mem0 or Zep are the natural next steps depending on whether you need multi-user support or RAG integration.


Related: soul.py vs memU — Two Philosophies of Agent Memory · soul.py v2.0 — RAG + RLM Hybrid Architecture · Persistent Memory for LLM Agents: A Practical Guide