DeerFlow: ByteDance's Open-Source SuperAgent — What It Is, What It Costs, How to Run It
Most “AI agent” tools are chatbots with a search button. DeerFlow is something different — and understanding why requires looking at one design decision that everything else follows from.
DeerFlow gives the AI its own computer.
Not figuratively. A real isolated Docker container with a full filesystem. The agent reads and writes files, executes bash commands, runs Python scripts, and the results persist across steps. When you ask it to build a website, it doesn’t generate code for you to copy. It builds the website, inside its sandbox, and delivers the finished file.
That’s the core idea. Everything else — the sub-agents, the memory, the skills system — is built on top of that foundation.
What DeerFlow Actually Is
DeerFlow (Deep Exploration and Efficient Research Flow) started as a deep research tool from ByteDance. The community took it further than intended — data pipelines, automated content workflows, full application scaffolding — and ByteDance concluded they’d accidentally built a general-purpose agent harness. So they scrapped the original codebase entirely.
DeerFlow 2.0 launched February 28, 2026. It hit #1 on GitHub Trending the same day and now sits at 27,900+ stars and 3,300+ forks. MIT licensed. Built on LangGraph and LangChain. The original v1 research framework still lives on the 1.x branch if you need it.
What It Can Do
A lead agent takes your task, breaks it into sub-tasks, and spawns parallel sub-agents to handle each one. Results converge. You get a finished deliverable.
Out of the box:
- Deep web research with cited sources and structured reports
- Code generation and execution inside the sandbox
- Full website and web app scaffolding
- Slide deck creation from scratch
- Image and video generation
- Python script execution
- Multi-agent parallelization for complex tasks
- Long-term memory across sessions
Example: “Research the top 10 AI startups in 2026 and build me a presentation.”
DeerFlow’s lead agent decomposes that: one sub-agent per company, one pulling funding data, one on competitor analysis — all in parallel. A final agent assembles the slide deck with generated visuals. One prompt. One deliverable.
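The fan-out/fan-in pattern behind that decomposition can be sketched in a few lines of Python. This is a toy illustration, not DeerFlow's actual LangGraph implementation; the hardcoded task list and the `sub_agent` stub stand in for real LLM-driven planning and tool calls:

```python
import asyncio

async def sub_agent(task: str) -> str:
    # Stand-in for a real sub-agent: research, funding data, competitor analysis.
    await asyncio.sleep(0)  # simulates I/O-bound LLM and tool calls
    return f"findings for {task!r}"

async def lead_agent(goal: str) -> str:
    # 1. Decompose the goal into sub-tasks (hardcoded here; an LLM does this in practice).
    tasks = [f"{goal}: company #{i}" for i in range(1, 4)] + [f"{goal}: funding data"]
    # 2. Fan out: run every sub-agent concurrently.
    results = await asyncio.gather(*(sub_agent(t) for t in tasks))
    # 3. Fan in: a final step assembles the deliverable from all results.
    return "\n".join(results)

report = asyncio.run(lead_agent("top AI startups"))
print(report)
```

The point of the pattern is that the sub-agents never wait on each other; only the final assembly step depends on all of them.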
The agent has four execution modes you can choose:
- Flash — fast, single-pass
- Standard — default
- Pro — with upfront planning
- Ultra — full sub-agent spawning for complex multi-step tasks
What It Costs
DeerFlow itself is free and open source. You pay for the LLM API calls — nothing else.
It works with any OpenAI-compatible API: GPT-4, Claude, Gemini, DeepSeek, Ollama (free, local), or any custom endpoint. You set the model in config.yaml.
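Pointing DeerFlow at a local Ollama model, for instance, is plausibly just another entry in the models list. A sketch that mirrors the schema of the generated config.yaml (the `langchain_ollama` provider path and the `base_url` field name are assumptions; verify them against the template `make config` produces):

```yaml
models:
  - name: llama3.2
    use: langchain_ollama:ChatOllama   # assumes the langchain-ollama provider is installed
    model: llama3.2
    base_url: http://localhost:11434   # Ollama's default local endpoint
```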
Rough cost estimates for a typical research task that produces a full report:
| Model | Approx. cost per task |
|---|---|
| GPT-4o | $0.10 – $0.50 |
| Claude Sonnet | $0.08 – $0.40 |
| DeepSeek V3 | $0.01 – $0.05 |
| Ollama (local) | $0 |
These vary heavily by task complexity and how many sub-agents spawn. A simple research task with one agent is cheap. A full multi-agent run building a web app with generated visuals can run several dollars with a premium model. For budget-conscious use, DeepSeek or a local Ollama model are worth exploring — DeerFlow is model-agnostic and doesn’t care which you pick.
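A back-of-the-envelope way to sanity-check those ranges: multiply token counts by per-token prices and by the number of sub-agents that spawn. The prices below are illustrative assumptions (USD per 1M tokens), not vendor quotes:

```python
# Illustrative per-1M-token prices; check your provider's current pricing.
PRICES = {
    "gpt-4o":      {"in": 2.50, "out": 10.00},
    "deepseek-v3": {"in": 0.27, "out": 1.10},
    "ollama":      {"in": 0.00, "out": 0.00},
}

def estimate_cost(model: str, in_tokens: int, out_tokens: int, agents: int = 1) -> float:
    """Rough cost of a run where each agent consumes similar token volumes."""
    p = PRICES[model]
    per_agent = (in_tokens * p["in"] + out_tokens * p["out"]) / 1_000_000
    return round(per_agent * agents, 4)

# One research agent reading ~60k tokens and writing ~8k:
print(estimate_cost("gpt-4o", 60_000, 8_000))            # single-agent run
print(estimate_cost("gpt-4o", 60_000, 8_000, agents=5))  # Ultra-mode fan-out
```

The fan-out multiplier is the part people underestimate: five sub-agents at premium-model prices turn a sub-dollar task into a multi-dollar one.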
A Tavily API key (for web search) is also required unless you swap in a different search tool. Tavily’s free tier covers 1,000 searches/month; paid plans start at $9/month.
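If you want to verify a Tavily key independently of DeerFlow, Tavily exposes a JSON-over-HTTPS search endpoint. A hedged sketch using only the standard library; the endpoint URL and body fields follow Tavily's commonly documented shape and should be re-checked against their current API reference:

```python
import json
from urllib.request import Request

def build_search_request(api_key: str, query: str, max_results: int = 5) -> Request:
    # Field names here are assumptions based on Tavily's documented request body.
    body = json.dumps({"api_key": api_key, "query": query,
                       "max_results": max_results}).encode()
    return Request("https://api.tavily.com/search", data=body,
                   headers={"Content-Type": "application/json"})

req = build_search_request("tvly-...", "DeerFlow ByteDance")
print(req.full_url, req.get_method())  # a Request with a body defaults to POST
```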
How to Run It
Prerequisites: Docker, or Node.js 22+ and Python (for local dev)
Option 1: Docker (fastest)
```bash
git clone https://github.com/bytedance/deer-flow.git
cd deer-flow
make config   # generates config.yaml and .env from templates
```
Edit config.yaml — set your model:
```yaml
models:
  - name: gpt-4o
    use: langchain_openai:ChatOpenAI
    model: gpt-4o
    api_key: $OPENAI_API_KEY
```
Edit .env — add your API keys:
```
OPENAI_API_KEY=sk-...
TAVILY_API_KEY=tvly-...
```
Then start:
```bash
make docker-init    # pull sandbox image (once)
make docker-start   # start everything
```
Open http://localhost:2026. Done.
Option 2: Local dev
```bash
make check     # verifies Node.js 22+, pnpm, uv, nginx
make install   # installs dependencies
make dev       # starts backend + frontend
```
Same result, http://localhost:2026.
Skills: How the Agent Learns New Capabilities
Skills are Markdown files that define a workflow, best practices, and tools for a specific task type. DeerFlow ships with built-in skills for research, report generation, slide creation, web page building, and image/video generation.
Skills live in the sandbox at /mnt/skills/ and load progressively — only when the task needs them. This keeps context lean and avoids the token bloat that kills most long-running agent sessions.
You can add your own custom skills in /mnt/skills/custom/ — a Markdown file describing what the skill does and how to execute it is all you need.
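As an illustration, a custom skill file might look something like this. The skill name, section headings, and output path below are all invented; use DeerFlow's built-in skills under /mnt/skills/ as the authoritative format reference:

```markdown
# Skill: changelog-writer

## What it does
Turns a list of merged pull requests into a grouped, human-readable changelog.

## How to execute
1. Read the input file the user provides (JSON or plain text).
2. Group entries into Features, Fixes, and Chores.
3. Write the result to /mnt/outputs/CHANGELOG.md and summarize it for the user.
```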
Connecting It to Telegram, Slack, or Feishu
DeerFlow has a built-in channel system. Once configured, you can send tasks from a messaging app and get results back — no public IP required.
In config.yaml:
```yaml
channels:
  telegram:
    enabled: true
    bot_token: $TELEGRAM_BOT_TOKEN
    allowed_users: []   # empty = anyone with the bot can use it
```
Commands from chat:
| Command | What it does |
|---|---|
| `/new` | Start a fresh conversation |
| `/models` | List available models |
| `/memory` | View what the agent remembers |
| `/status` | Current thread info |
Regular messages (no slash prefix) are treated as tasks — DeerFlow opens a thread and responds.
Using It with Claude Code
There’s a dedicated skill that lets you interact with a running DeerFlow instance directly from Claude Code:
```bash
npx skills add https://github.com/bytedance/deer-flow --skill claude-to-deerflow
```
With DeerFlow running at localhost:2026, use /claude-to-deerflow in Claude Code to send research tasks, manage threads, and upload files for analysis — without leaving your terminal.
The One Thing Worth Knowing
The reason DeerFlow is worth attention isn’t the star count. It’s that the sandbox is the foundation, not a feature. Most agent frameworks treat code execution as something the LLM calls when it needs to. DeerFlow inverts that: the execution environment is where the agent lives.
That’s a different design philosophy — and it’s why the community found uses for v1 that ByteDance never anticipated, and why the team felt the right response was to rebuild from scratch rather than patch the old architecture.
GitHub: github.com/bytedance/deer-flow — MIT License
Docs + demos: deerflow.tech
Running DeerFlow on Railway (Free with MiniMax + Ollama)
There’s no official Railway deployment yet — so here’s how to do it yourself. The approach uses MiniMax’s free cloud API (no local GPU needed) alongside Ollama on Railway as a second model option.
What you need
- A Railway account (free tier works)
- A MiniMax API key — free at minimaxi.com/en
- Optionally: a Tavily API key for web search (1,000 free searches/month)
Architecture on Railway
```
Railway Project
├── backend service  — gateway (8001) + LangGraph (2024) in one container
├── frontend service — Next.js UI
└── ollama service   — optional, free local inference via Railway marketplace
```
A persistent volume is required. LangGraph stores thread checkpoints and the agent stores memory in backend/.deer-flow/ and memory.json. Without a volume mounted at /data, all state resets on every deploy.
Deploy steps
1. Fork bytedance/deer-flow — or use menonpg/deer-flow, which already has the Railway files.
2. Railway → New Project → Deploy from GitHub → select your fork.
3. Add a volume to the backend service: mount path /data, 5GB.
4. Set env vars on the backend service:

   ```
   MINIMAX_API_KEY=<your key>
   DEER_FLOW_CONFIG_PATH=/app/config.yaml
   TAVILY_API_KEY=<optional>
   ```

5. Set env vars on the frontend service once the backend URL is known:

   ```
   NEXT_PUBLIC_BACKEND_BASE_URL=https://<backend>.railway.app
   NEXT_PUBLIC_LANGGRAPH_BASE_URL=https://<backend>.railway.app/api/langgraph
   BETTER_AUTH_SECRET=<any 32+ char random string>
   ```

6. Deploy both services (~3 min build time).
Adding Ollama
Add the Ollama service from the Railway marketplace, then:
```
OLLAMA_URL=https://<ollama-service>.railway.app
```

(set on the backend service)
Pull a model via the Railway shell on the Ollama service:
```bash
ollama pull llama3.2   # 2GB, good general-purpose model
```
Ollama then appears alongside MiniMax as a model choice in the DeerFlow UI.
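To sanity-check that the backend can actually reach the Ollama service, you can hit Ollama's standard `/api/tags` endpoint, which lists pulled models. A minimal sketch; the `model_names` helper is ours, while the endpoint and response shape come from Ollama's REST API:

```python
def model_names(tags_payload: dict) -> list[str]:
    # Ollama's /api/tags returns {"models": [{"name": "llama3.2:latest", ...}, ...]}
    return [m["name"] for m in tags_payload.get("models", [])]

# Live check against a running service would look like:
#   import json
#   from urllib.request import urlopen
#   with urlopen("https://<ollama-service>.railway.app/api/tags") as r:
#       print(model_names(json.load(r)))

# Offline illustration with a canned response:
sample = {"models": [{"name": "llama3.2:latest", "size": 2019393189}]}
print(model_names(sample))  # → ['llama3.2:latest']
```

An empty list means the service is up but no model has been pulled yet.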
Cost estimate
| Setup | Est. monthly |
|---|---|
| Backend + Frontend only | ~$8–15/mo |
| + Ollama (8GB for 7B models) | ~$28–50/mo |
| AI inference (MiniMax free tier) | $0 |
| AI inference (Ollama) | $0 |
The Railway files are in the railway/ directory of the fork — Dockerfile.backend, Dockerfile.frontend, config.railway.yaml, and a railway.toml. A PR to the upstream repo is in progress.