How do I add persistent memory to n8n AI workflows?

Run soul.py from a small Python script in your n8n workflow. Install it with pip install soul-agent, then call HybridAgent from a script launched by an Execute Command node so that memory persists between workflow executions.

Are n8n AI nodes stateless?

Yes, by default. Each n8n workflow execution starts fresh with no memory of previous conversations. soul.py fixes this by persisting memory to markdown files.

Can soul.py work with n8n self-hosted?

Yes. soul.py stores memory in local markdown files, so a self-hosted n8n instance needs no cloud service for memory storage.

What version of soul.py should I use with n8n?

For simple prototyping, v0.1 (file-based) works great. For production with large memory, use v2.0 with HybridAgent for automatic RAG+RLM routing.

How does soul.py persist memory in n8n?

soul.py writes all interactions to MEMORY.md in your working directory. As long as that directory is on persistent storage, the file survives between n8n executions and the agent can draw on earlier exchanges.

Can I edit soul.py memories manually in n8n setups?

Yes. Memory is stored in plain MEMORY.md files — open them in any text editor to read, edit, or delete specific memories.

Does soul.py work with n8n Cloud?

With limitations. n8n Cloud has ephemeral file storage, so you'd need to configure soul.py to use an external persistence layer or the SoulMate API.

Adding Persistent Memory to n8n AI Workflows with soul.py

By Prahlad Menon · Published 2026-03-01 · 1 min read

Someone asked how to integrate soul.py into an n8n pipeline. The short answer: it works beautifully as a drop-in Python node.

Quick Start: Try It Locally First

Before wiring into n8n, test it from the terminal:

pip install soul-agent
soul init  # creates SOUL.md and MEMORY.md

from hybrid_agent import HybridAgent

agent = HybridAgent()

while True:
    q = input("You: ")
    result = agent.ask(q)
    print(f"Agent: {result['answer']}\n")

Memory persists automatically between runs — kill the script, restart it, and the agent picks up exactly where it left off. Everything’s stored in plain MEMORY.md in your working directory, so you can read or edit it with any text editor.

Once you’ve confirmed it works, wire it into n8n.

The Problem

n8n’s AI nodes are stateless by default. Each workflow execution starts fresh — your agent has no memory of previous conversations. For simple automations, that’s fine. For anything resembling a persistent assistant, it’s a dealbreaker.

The Solution: HybridAgent

soul.py’s HybridAgent automatically picks the right retrieval strategy per query:

RAG (~90% of queries): Fast semantic search for specific lookups
RLM (~10% of queries): Recursive synthesis for questions like “summarize everything we’ve discussed”

You don’t configure anything — the router decides.
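To make the routing idea concrete, here is a toy sketch of a query router. It is not soul.py's actual router: the cue list and the pick_route name are assumptions made up for illustration, and a real router would presumably use something more robust than keyword matching.

```python
import re

# Toy illustration only: NOT soul.py's actual routing logic.
# Hypothetical cue words suggesting the query needs a pass over all memory.
SYNTHESIS_CUES = re.compile(
    r"\b(summari[sz]e|everything|overall|all of|so far|recap|themes?)\b",
    re.IGNORECASE,
)


def pick_route(query: str) -> str:
    """Return "RLM" for broad synthesis questions, "RAG" for specific lookups."""
    return "RLM" if SYNTHESIS_CUES.search(query) else "RAG"
```

Specific lookups fall through to RAG, and broad questions like the "summarize everything" example above go to RLM.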

n8n Integration

Create a Python wrapper script on your n8n server:

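# soul_node.py
import json
import sys

from hybrid_agent import HybridAgent

# Fail with a clear message if n8n did not pass a query argument.
if len(sys.argv) < 2:
    sys.exit("usage: python soul_node.py <query>")

agent = HybridAgent()  # auto-detects RAG vs RLM per query
query = sys.argv[1]
result = agent.ask(query)

# Print a single JSON object for the next n8n node to parse.
print(json.dumps({
    "response": result["answer"],
    "route": result["route"],  # "RAG" or "RLM"
}))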

In your n8n workflow, use an Execute Command node:

python /path/to/soul_node.py "{{ $json.message }}"
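
One caveat with interpolating {{ $json.message }} straight into a shell command: a message containing quotes or shell metacharacters can break the command or be interpreted by the shell. One possible mitigation is to have the workflow base64-encode the message before it reaches the Execute Command node and decode it inside the script. The sketch below covers only the Python side, and decode_message is a hypothetical helper, not part of soul.py.

```python
import base64
import binascii


def decode_message(arg: str) -> str:
    """Decode a base64-encoded UTF-8 message passed as a command-line argument."""
    try:
        return base64.b64decode(arg, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError) as exc:
        raise ValueError(f"argument is not valid base64 UTF-8: {exc}") from exc
```

In soul_node.py this would mean replacing query = sys.argv[1] with query = decode_message(sys.argv[1]).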

The agent:

Reads SOUL.md for identity and MEMORY.md for context
Routes to RAG or RLM based on query type
Responds with the answer and which route it took
Appends the exchange to memory

Next workflow execution? It remembers everything.
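
If you are curious what the read-then-append cycle looks like at the file level, here is a minimal sketch. It is not soul.py's implementation: the function names and the entry layout (a timestamp heading followed by Q: and A: lines) are assumptions for illustration, and the real MEMORY.md format may differ.

```python
from datetime import datetime, timezone
from pathlib import Path


def load_context(soul_path: Path, memory_path: Path) -> str:
    """Concatenate the identity and memory files; a missing file is skipped."""
    parts = []
    for path in (soul_path, memory_path):
        if path.exists():
            parts.append(path.read_text(encoding="utf-8"))
    return "\n\n".join(parts)


def append_exchange(memory_path: Path, question: str, answer: str) -> None:
    """Append one question/answer pair to the memory file with a UTC timestamp."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    entry = f"\n## {stamp}\nQ: {question}\nA: {answer}\n"
    # Append mode creates the file on first use and never rewrites old entries.
    with memory_path.open("a", encoding="utf-8") as f:
        f.write(entry)
```

Because the file is only ever appended to, a later execution that calls load_context sees every earlier exchange in order.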

Forcing a Mode

For high-volume pipelines where you want consistent latency, you can force a specific mode:

agent = HybridAgent(mode="rag")   # always RAG (faster)
agent = HybridAgent(mode="rlm")   # always RLM (exhaustive)
agent = HybridAgent(mode="auto")  # router decides (default)
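
If you want to switch modes per deployment without editing the script, one option is to read the mode from an environment variable. This is a sketch: SOUL_MODE and resolve_mode are hypothetical names, not something soul.py defines.

```python
import os

VALID_MODES = {"auto", "rag", "rlm"}  # the three values shown above


def resolve_mode(env=os.environ) -> str:
    """Read SOUL_MODE (a hypothetical variable name) and validate it."""
    mode = env.get("SOUL_MODE", "auto").strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(
            f"SOUL_MODE must be one of {sorted(VALID_MODES)}, got {mode!r}"
        )
    return mode

# Usage, assuming HybridAgent accepts mode= as shown above:
# agent = HybridAgent(mode=resolve_mode())
```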

Vector Store Setup (Optional)

HybridAgent works best with Qdrant + Azure embeddings for semantic search:

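import os

from hybrid_agent import HybridAgent

# Connection details come from environment variables; unset ones are None.
agent = HybridAgent(
    qdrant_url=os.environ.get("QDRANT_URL"),
    qdrant_api_key=os.environ.get("QDRANT_API_KEY"),
    azure_embedding_endpoint=os.environ.get("AZURE_EMBEDDING_ENDPOINT"),
    azure_embedding_key=os.environ.get("AZURE_EMBEDDING_KEY"),
)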

No Qdrant? It automatically falls back to BM25 (keyword-based retrieval). Not as good as vector search, but still works.
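
For a sense of what keyword-based retrieval means here, the following is a compact sketch of standard Okapi BM25 scoring in plain Python. It is a generic textbook version, not the code soul.py ships; the tokenizer and the k1 and b defaults are common choices assumed for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: list[str],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Return one Okapi BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = (sum(len(t) for t in tokenized) / n_docs) or 1.0

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

BM25 only rewards exact term overlap, so a query about a "puppy" will not match a memory that says "dog". That is the gap vector search closes.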

n8n Cloud (No Filesystem)

If you’re on n8n Cloud without persistent filesystem access, store the MEMORY.md contents in a database or n8n variable and pass them to the script alongside the query:

# soul_cloud_node.py
import sys, json

memory_content = sys.argv[1]
query = sys.argv[2]

# Write memory to temp file
with open("/tmp/MEMORY.md", "w") as f:
    f.write(memory_content)

from hybrid_agent import HybridAgent
agent = HybridAgent(memory_path="/tmp/MEMORY.md")
result = agent.ask(query)

# Return both response and updated memory
with open("/tmp/MEMORY.md") as f:
    updated_memory = f.read()
print(json.dumps({
    "response": result["answer"],
    "route": result["route"],
    "memory": updated_memory
}))

Then write the returned memory field back to your storage after each call, for example by using n8n’s Set node to pass it to whichever node saves to your database or variable.
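
One thing to watch with a fixed /tmp/MEMORY.md path: two executions running at the same time would overwrite each other's file. A possible variation, sketched below, gives each call its own temporary directory. run_with_memory and make_agent are hypothetical names; the agent is passed in as a factory so the helper makes no assumptions about soul.py beyond the memory_path argument and ask() result shown above.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable


def run_with_memory(memory_content: str, query: str,
                    make_agent: Callable[[str], Any]) -> str:
    """Run one query against a private copy of the memory; return JSON."""
    with tempfile.TemporaryDirectory() as tmp:
        memory_path = Path(tmp) / "MEMORY.md"
        memory_path.write_text(memory_content, encoding="utf-8")

        agent = make_agent(str(memory_path))
        result = agent.ask(query)

        # Read the file back before the temporary directory is deleted.
        updated = memory_path.read_text(encoding="utf-8")

    return json.dumps({
        "response": result["answer"],
        "route": result["route"],
        "memory": updated,
    })

# Usage, assuming HybridAgent accepts memory_path= as shown above:
# print(run_with_memory(sys.argv[1], sys.argv[2],
#                       lambda p: HybridAgent(memory_path=p)))
```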

Lite Option: Simple Agent (v0.1)

For prototyping or when your memory will stay small (<1500 tokens), the simple Agent class skips the routing layer entirely:

from soul import Agent

agent = Agent(provider="anthropic")  # or "openai"
result = agent.ask(query)

This injects the full MEMORY.md into the system prompt. Zero infrastructure, zero configuration. Great for learning, but won’t scale to large memory files.
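
To check whether your memory file is still in simple-Agent territory, a rough size estimate is enough. The sketch below assumes about four characters per token, which is only a rule of thumb for English text, and the prompt layout and function names are illustrative guesses rather than soul.py's actual template.

```python
TOKEN_BUDGET = 1500      # the threshold mentioned above
CHARS_PER_TOKEN = 4      # rough rule of thumb for English text, not exact


def estimate_tokens(text: str) -> int:
    """Estimate the token count by character length, rounding up."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def build_system_prompt(soul: str, memory: str) -> str:
    """Join identity and memory into one prompt (hypothetical layout)."""
    return f"{soul.strip()}\n\n# Memory\n{memory.strip()}"


def fits_simple_agent(memory: str, budget: int = TOKEN_BUDGET) -> bool:
    """True while the estimated memory size stays within the budget."""
    return estimate_tokens(memory) <= budget
```

Once fits_simple_agent starts returning False for your MEMORY.md, that is a reasonable signal to move to HybridAgent.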

Which Should You Use?

Use Case	Recommendation
Production workflows	`HybridAgent` (auto mode)
High-volume pipelines	`HybridAgent(mode="rag")`
Prototyping / learning	`Agent` (simple mode)
Large memory files	`HybridAgent` with Qdrant

Both use the same SOUL.md and MEMORY.md format — upgrade from simple to hybrid without changing your data.

Try It

pip install soul-agent
soul init

Live demos:

soulv2.themenonlab.com — HybridAgent with RAG+RLM routing
soul.themenonlab.com — Simple Agent

GitHub: github.com/menonpg/soul.py

Update: Want the fastest path to n8n + soul.py? Try soul-stack — a single Docker command that spins up n8n, soul.py, and Jupyter pre-configured and ready to go.

Your AI workflows deserve memory. soul.py gives them one.