DeerFlow: ByteDance's Open-Source SuperAgent — What It Is, What It Costs, How to Run It
Most “AI agent” tools are chatbots with a search button. DeerFlow is something different — and understanding why requires looking at one design decision that everything else follows from.
DeerFlow gives the AI its own computer.
Not figuratively. A real isolated Docker container with a full filesystem. The agent reads and writes files, executes bash commands, runs Python scripts, and the results persist across steps. When you ask it to build a website, it doesn’t generate code for you to copy. It builds the website, inside its sandbox, and delivers the finished file.
That’s the core idea. Everything else — the sub-agents, the memory, the skills system — is built on top of that foundation.
What DeerFlow Actually Is
DeerFlow (Deep Exploration and Efficient Research Flow) started as a deep research tool from ByteDance. The community took it further than intended — data pipelines, automated content workflows, full application scaffolding — and ByteDance concluded they’d accidentally built a general-purpose agent harness. So they scrapped the original codebase entirely.
DeerFlow 2.0 launched February 28, 2026. It hit #1 on GitHub Trending the same day and now sits at 27,900+ stars and 3,300+ forks. MIT licensed. Built on LangGraph and LangChain. The original v1 research framework still lives on the 1.x branch if you need it.
What It Can Do
A lead agent takes your task, breaks it into sub-tasks, and spawns parallel sub-agents to handle each one. Results converge. You get a finished deliverable.
Out of the box:
- Deep web research with cited sources and structured reports
- Code generation and execution inside the sandbox
- Full website and web app scaffolding
- Slide deck creation from scratch
- Image and video generation
- Python script execution
- Multi-agent parallelization for complex tasks
- Long-term memory across sessions
Example: “Research the top 10 AI startups in 2026 and build me a presentation.”
DeerFlow’s lead agent decomposes that: one sub-agent per company, one pulling funding data, one on competitor analysis — all in parallel. A final agent assembles the slide deck with generated visuals. One prompt. One deliverable.
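The fan-out/fan-in pattern behind that decomposition can be sketched in a few lines of Python. This is a toy illustration, not DeerFlow's actual LangGraph implementation; the hardcoded task list and the `sub_agent` stub stand in for real LLM-driven planning and tool calls:

```python
import asyncio

async def sub_agent(task: str) -> str:
    # Stand-in for a real sub-agent: research, funding data, competitor analysis.
    await asyncio.sleep(0)  # simulates I/O-bound LLM and tool calls
    return f"findings for {task!r}"

async def lead_agent(goal: str) -> str:
    # 1. Decompose the goal into sub-tasks (hardcoded here; an LLM does this in practice).
    tasks = [f"{goal}: company #{i}" for i in range(1, 4)] + [f"{goal}: funding data"]
    # 2. Fan out: run every sub-agent concurrently.
    results = await asyncio.gather(*(sub_agent(t) for t in tasks))
    # 3. Fan in: a final step assembles the deliverable from all results.
    return "\n".join(results)

report = asyncio.run(lead_agent("top AI startups"))
print(report)
```

The point of the pattern is that the sub-agents never wait on each other; only the final assembly step depends on all of them.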
The agent has four execution modes you can choose:
- Flash — fast, single-pass
- Standard — default
- Pro — with upfront planning
- Ultra — full sub-agent spawning for complex multi-step tasks
What It Costs
DeerFlow itself is free and open source. You pay for the LLM API calls — nothing else.
It works with any OpenAI-compatible API: GPT-4, Claude, Gemini, DeepSeek, Ollama (free, local), or any custom endpoint. You set the model in config.yaml.
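Pointing DeerFlow at a local Ollama model, for instance, is plausibly just another entry in the models list. A sketch that mirrors the schema of the generated config.yaml (the `langchain_ollama` provider path and the `base_url` field name are assumptions; verify them against the template `make config` produces):

```yaml
models:
  - name: llama3.2
    use: langchain_ollama:ChatOllama   # assumes the langchain-ollama provider is installed
    model: llama3.2
    base_url: http://localhost:11434   # Ollama's default local endpoint
```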
Rough cost estimates for a typical research task that produces a full report:
| Model | Approx. cost per task |
|---|---|
| GPT-4o | $0.10 – $0.50 |
| Claude Sonnet | $0.08 – $0.40 |
| DeepSeek V3 | $0.01 – $0.05 |
| Ollama (local) | $0 |
These vary heavily by task complexity and how many sub-agents spawn. A simple research task with one agent is cheap. A full multi-agent run building a web app with generated visuals can run several dollars with a premium model. For budget-conscious use, DeepSeek or a local Ollama model are worth exploring — DeerFlow is model-agnostic and doesn’t care which you pick.
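A back-of-the-envelope way to sanity-check those ranges: multiply token counts by per-token prices and by the number of sub-agents that spawn. The prices below are illustrative assumptions (USD per 1M tokens), not vendor quotes:

```python
# Illustrative per-1M-token prices; check your provider's current pricing.
PRICES = {
    "gpt-4o":      {"in": 2.50, "out": 10.00},
    "deepseek-v3": {"in": 0.27, "out": 1.10},
    "ollama":      {"in": 0.00, "out": 0.00},
}

def estimate_cost(model: str, in_tokens: int, out_tokens: int, agents: int = 1) -> float:
    """Rough cost of a run where each agent consumes similar token volumes."""
    p = PRICES[model]
    per_agent = (in_tokens * p["in"] + out_tokens * p["out"]) / 1_000_000
    return round(per_agent * agents, 4)

# One research agent reading ~60k tokens and writing ~8k:
print(estimate_cost("gpt-4o", 60_000, 8_000))            # single-agent run
print(estimate_cost("gpt-4o", 60_000, 8_000, agents=5))  # Ultra-mode fan-out
```

The fan-out multiplier is the part people underestimate: five sub-agents at premium-model prices turn a sub-dollar task into a multi-dollar one.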
A Tavily API key (for web search) is also required unless you swap in a different search tool. Tavily’s free tier covers 1,000 searches/month; paid plans start at $9/month.
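If you want to verify a Tavily key independently of DeerFlow, Tavily exposes a JSON-over-HTTPS search endpoint. A hedged sketch using only the standard library; the endpoint URL and body fields follow Tavily's commonly documented shape and should be re-checked against their current API reference:

```python
import json
from urllib.request import Request

def build_search_request(api_key: str, query: str, max_results: int = 5) -> Request:
    # Field names here are assumptions based on Tavily's documented request body.
    body = json.dumps({"api_key": api_key, "query": query,
                       "max_results": max_results}).encode()
    return Request("https://api.tavily.com/search", data=body,
                   headers={"Content-Type": "application/json"})

req = build_search_request("tvly-...", "DeerFlow ByteDance")
print(req.full_url, req.get_method())  # a Request with a body defaults to POST
```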
How to Run It
Prerequisites: Docker, or Node.js 22+ and Python (for local dev)
Option 1: Docker (fastest)
```bash
git clone https://github.com/bytedance/deer-flow.git
cd deer-flow
make config   # generates config.yaml and .env from templates
```
Edit config.yaml — set your model:
```yaml
models:
  - name: gpt-4o
    use: langchain_openai:ChatOpenAI
    model: gpt-4o
    api_key: $OPENAI_API_KEY
```
Edit .env — add your API keys:
```
OPENAI_API_KEY=sk-...
TAVILY_API_KEY=tvly-...
```
Then start:
```bash
make docker-init    # pull sandbox image (once)
make docker-start   # start everything
```
Open http://localhost:2026. Done.
Option 2: Local dev
```bash
make check     # verifies Node.js 22+, pnpm, uv, nginx
make install   # installs dependencies
make dev       # starts backend + frontend
```
Same result, http://localhost:2026.
Skills: How the Agent Learns New Capabilities
Skills are Markdown files that define a workflow, best practices, and tools for a specific task type. DeerFlow ships with built-in skills for research, report generation, slide creation, web page building, and image/video generation.
Skills live in the sandbox at /mnt/skills/ and load progressively — only when the task needs them. This keeps context lean and avoids the token bloat that kills most long-running agent sessions.
You can add your own custom skills in /mnt/skills/custom/ — a Markdown file describing what the skill does and how to execute it is all you need.
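As an illustration, a custom skill file might look something like this. The skill name, section headings, and output path below are all invented; use DeerFlow's built-in skills under /mnt/skills/ as the authoritative format reference:

```markdown
# Skill: changelog-writer

## What it does
Turns a list of merged pull requests into a grouped, human-readable changelog.

## How to execute
1. Read the input file the user provides (JSON or plain text).
2. Group entries into Features, Fixes, and Chores.
3. Write the result to /mnt/outputs/CHANGELOG.md and summarize it for the user.
```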
Connecting It to Telegram, Slack, or Feishu
DeerFlow has a built-in channel system. Once configured, you can send tasks from a messaging app and get results back — no public IP required.
In config.yaml:
```yaml
channels:
  telegram:
    enabled: true
    bot_token: $TELEGRAM_BOT_TOKEN
    allowed_users: []   # empty = anyone with the bot can use it
```
Commands from chat:
| Command | What it does |
|---|---|
| `/new` | Start a fresh conversation |
| `/models` | List available models |
| `/memory` | View what the agent remembers |
| `/status` | Current thread info |
Regular messages (no slash prefix) are treated as tasks — DeerFlow opens a thread and responds.
Using It with Claude Code
There’s a dedicated skill that lets you interact with a running DeerFlow instance directly from Claude Code:
```bash
npx skills add https://github.com/bytedance/deer-flow --skill claude-to-deerflow
```
With DeerFlow running at localhost:2026, use /claude-to-deerflow in Claude Code to send research tasks, manage threads, and upload files for analysis — without leaving your terminal.
The One Thing Worth Knowing
The reason DeerFlow is worth attention isn’t the star count. It’s that the sandbox is the foundation, not a feature. Most agent frameworks treat code execution as something the LLM calls when it needs to. DeerFlow inverts that: the execution environment is where the agent lives.
That’s a different design philosophy — and it’s why the community found uses for v1 that ByteDance never anticipated, and why the team felt the right response was to rebuild from scratch rather than patch the old architecture.
GitHub: github.com/bytedance/deer-flow — MIT License
Docs + demos: deerflow.tech
Running DeerFlow on Railway (Free with MiniMax + Ollama)
There’s no official Railway deployment yet — so here’s how to do it yourself. The approach uses MiniMax’s free cloud API (no local GPU needed) alongside Ollama on Railway as a second model option.
What you need
- A Railway account (free tier works)
- A MiniMax API key — free at minimaxi.com/en
- Optionally: a Tavily API key for web search (1,000 free searches/month)
Architecture on Railway
```
Railway Project
├── backend service  — gateway (8001) + LangGraph (2024) in one container
├── frontend service — Next.js UI
└── ollama service   — optional, free local inference via Railway marketplace
```
A persistent volume is required. LangGraph stores thread checkpoints and the agent stores memory in backend/.deer-flow/ and memory.json. Without a volume mounted at /data, all state resets on every deploy.
Deploy steps
1. Fork bytedance/deer-flow — or use menonpg/deer-flow, which already has the Railway files.
2. Railway → New Project → Deploy from GitHub → select your fork.
3. Add a volume to the backend service: mount path /data, 5GB.
4. Set env vars on the backend service:

   ```
   MINIMAX_API_KEY=<your key>
   DEER_FLOW_CONFIG_PATH=/app/config.yaml
   TAVILY_API_KEY=<optional>
   ```

5. Set env vars on the frontend service once the backend URL is known:

   ```
   NEXT_PUBLIC_BACKEND_BASE_URL=https://<backend>.railway.app
   NEXT_PUBLIC_LANGGRAPH_BASE_URL=https://<backend>.railway.app/api/langgraph
   BETTER_AUTH_SECRET=<any 32+ char random string>
   ```

6. Deploy both services (~3 min build time).
Adding Ollama
Add the Ollama service from the Railway marketplace, then:
```
OLLAMA_URL=https://<ollama-service>.railway.app
```

(set on the backend service)
Pull a model via the Railway shell on the Ollama service:
```bash
ollama pull llama3.2   # 2GB, good general-purpose model
```
Ollama then appears alongside MiniMax as a model choice in the DeerFlow UI.
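To sanity-check that the backend can actually reach the Ollama service, you can hit Ollama's standard `/api/tags` endpoint, which lists pulled models. A minimal sketch; the `model_names` helper is ours, while the endpoint and response shape come from Ollama's REST API:

```python
def model_names(tags_payload: dict) -> list[str]:
    # Ollama's /api/tags returns {"models": [{"name": "llama3.2:latest", ...}, ...]}
    return [m["name"] for m in tags_payload.get("models", [])]

# Live check against a running service would look like:
#   import json
#   from urllib.request import urlopen
#   with urlopen("https://<ollama-service>.railway.app/api/tags") as r:
#       print(model_names(json.load(r)))

# Offline illustration with a canned response:
sample = {"models": [{"name": "llama3.2:latest", "size": 2019393189}]}
print(model_names(sample))  # → ['llama3.2:latest']
```

An empty list means the service is up but no model has been pulled yet.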
Cost estimate
| Setup | Est. monthly |
|---|---|
| Backend + Frontend only | ~$8–15/mo |
| + Ollama (8GB for 7B models) | ~$28–50/mo |
| AI inference (MiniMax free tier) | $0 |
| AI inference (Ollama) | $0 |
The Railway files are in the railway/ directory of the fork — Dockerfile.backend, Dockerfile.frontend, config.railway.yaml, and a railway.toml. A PR to the upstream repo is in progress.