Onyx: The Open-Source AI Chat That Beats Claude at Deep Research (24K Stars)
Another week, another open-source project making proprietary AI look expensive. This time it’s Onyx — a self-hostable AI chat platform with 24,000+ stars that just claimed the #1 spot on the DeepResearchBench leaderboard, beating Claude, Gemini, and OpenAI.
The LinkedIn crowd is predictably losing it. But is this actually a Claude killer, or is it solving a different problem entirely?
What Onyx Actually Is
Onyx is an enterprise AI chat platform that sits on top of your company’s data. Think of it as a ChatGPT-style interface, but one that:
- Works with any LLM — Claude, GPT, Gemini, DeepSeek, or local models via Ollama/vLLM
- Continuously indexes your data from 50+ sources (Slack, Drive, Confluence, Jira, GitHub, Salesforce)
- Self-hosts via Docker, Kubernetes, or Terraform — can run fully offline
- Respects existing permissions — users only see data they’re authorized to access
It’s not a model. It’s an application layer that makes any model smarter about your data.
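The permission point above is worth making concrete. The idea is that each indexed document carries the access-control list mirrored from its source system, and search results are filtered by the querying user's groups before anything reaches the LLM. Here is a toy sketch of that pattern; the schema, group names, and documents are invented for illustration, not Onyx's actual data model:

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    doc_id: str
    text: str
    allowed_groups: set = field(default_factory=set)  # mirrored from the source system

# A tiny stand-in index; real connectors would populate this continuously.
INDEX = [
    Doc("hr-1", "compensation bands", {"hr"}),
    Doc("eng-7", "incident runbook", {"engineering", "sre"}),
    Doc("all-3", "holiday schedule", {"everyone"}),
]

def search(query, user_groups):
    # Match on text, then drop anything the user's groups can't see.
    hits = [d for d in INDEX if query in d.text]
    return [d.doc_id for d in hits if d.allowed_groups & (user_groups | {"everyone"})]

print(search("schedule", {"engineering"}))      # ['all-3']: visible to everyone
print(search("compensation", {"engineering"}))  # []: HR-only doc filtered out
```

The key property is that filtering happens at retrieval time, so a user can never coax the chat interface into summarizing documents they couldn't open directly.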
The DeepResearchBench Numbers
Here’s the actual leaderboard (as of the latest results):
| Rank | Model | Overall Score |
|---|---|---|
| 🥇 1 | Onyx | 54.92 |
| 🥈 2 | Cellcog | 54.54 |
| 3 | Qianfan-DeepResearch Pro | 54.22 |
| 8 | LangChain + GPT-5 | 50.60 |
| 9 | Gemini 2.5 Pro Deep Research | 49.71 |
| 11 | OpenAI Deep Research | 46.45 |
| 12 | Claude Research | 45.00 |
| 17 | Perplexity Deep Research | 40.46 |
Onyx leads in comprehensiveness, insight, and instruction following. Claude comes in 12th. That’s a real gap.
But context matters: this benchmark evaluates report generation from web research — a specific task where Onyx’s hybrid RAG pipeline and multi-step research flow give it structural advantages. It doesn’t test conversational reasoning, coding ability, creative writing, or the hundred other things people use Claude for daily.
Onyx vs Claude: Honest Comparison
| Feature | Onyx | Claude |
|---|---|---|
| Models | Any LLM (Claude, GPT, Gemini, local) | Claude models only |
| Deep Research | #1 on DeepResearchBench | #12 on DeepResearchBench |
| Data Connectors | 50+ continuous indexing connectors | Runtime API tool calls (MCP) |
| Hosting | Self-hosted (Docker/K8s/Terraform) | Anthropic’s servers |
| Offline | Yes, with local LLMs | No |
| Code Execution | Sandboxed | Sandboxed (Artifacts) |
| Artifact Generation | Craft: web apps, dashboards from company data | Artifacts: from conversation context |
| Permissions | Mirrors existing org permissions | Per-project access |
| Setup | Docker deploy + connector config | Sign up and go |
| Coding Assistance | Not the focus | Best-in-class |
| Creative/Conversational | Depends on underlying LLM | Native strength |
What Onyx Gets Right
The connector model is genuinely better for enterprise search. Claude’s MCP approach queries tools at runtime — you ask a question, it calls an API, waits for results. Onyx pre-indexes everything asynchronously. When you search, it’s hitting a local hybrid index (keyword + semantic), not making live API calls. Faster, more reliable, and no risk of connector timeouts killing your query.
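To make the hybrid-index idea concrete, here is a minimal sketch of one common way to combine a keyword ranking with a semantic ranking: reciprocal rank fusion (RRF). This is an illustration of the general technique, not Onyx's actual scoring code; the document IDs and ranked lists are invented for the example.

```python
def rrf_fuse(rankings, k=60):
    """Combine multiple ranked lists of doc IDs into one hybrid ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Each list contributes 1 / (k + rank); k damps the head of the list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Keyword (BM25-style) ranking vs. semantic (embedding) ranking for one query.
keyword_hits = ["jira-1042", "slack-77", "wiki-9"]
semantic_hits = ["wiki-9", "jira-1042", "drive-3"]

fused = rrf_fuse([keyword_hits, semantic_hits])
print(fused)  # documents ranked highly by both lists float to the top
```

Because both rankings come from a pre-built local index, the fusion step is microseconds of arithmetic rather than a round-trip to an external API.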
Craft is more powerful than Claude Artifacts. Claude Artifacts generate content from your conversation. Onyx Craft generates interactive apps and dashboards from all your indexed company data — Slack conversations, Confluence docs, Jira tickets, the works. It runs in an isolated sandbox, and the output is genuinely useful for internal dashboards and reports.
Model flexibility is a real advantage. If Anthropic raises prices, changes rate limits, or deprecates a model you depend on, you’re stuck. Onyx lets you swap between providers — or run local models entirely — without changing your workflow.
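The provider-swap idea reduces to routing every chat call through one interface so the backend model is a configuration value, not a code change. A minimal sketch of that pattern follows; the provider functions are stand-in stubs (real clients would call the Anthropic, OpenAI, or Ollama APIs), and all names here are hypothetical:

```python
PROVIDERS = {}

def register(name):
    """Decorator that adds a chat backend to the provider registry."""
    def deco(fn):
        PROVIDERS[name] = fn
        return fn
    return deco

@register("anthropic")
def _claude(prompt):
    return f"[claude] {prompt}"  # stub; a real client would hit the API here

@register("ollama")
def _local(prompt):
    return f"[local-llm] {prompt}"  # stub for a self-hosted model

def chat(prompt, provider="anthropic"):
    # Swapping providers is a one-argument change, not a workflow rewrite.
    return PROVIDERS[provider](prompt)

print(chat("summarize Q3 incidents"))                     # served by the Claude stub
print(chat("summarize Q3 incidents", provider="ollama"))  # same call, local model
```

This is the insurance policy the paragraph describes: if one provider raises prices or deprecates a model, the registry entry changes and everything downstream stays the same.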
Where Onyx Falls Short
It’s an enterprise tool, not a personal assistant. If you’re a developer who wants to chat with Claude about code, brainstorm ideas, or get help writing — Onyx isn’t replacing that. It’s solving a fundamentally different problem (organizational knowledge access).
The “open source” label needs an asterisk. The main repo (24K stars) uses a custom license — not MIT, not Apache, not GPL. There’s a separate onyx-foss repo under MIT with 253 stars. If license freedom matters to your org, read the fine print.
Setup isn’t trivial. One-command Docker install gets you running, but configuring 50+ connectors, managing permissions, and maintaining infrastructure is enterprise-grade ops work. Claude’s “sign up and chat” simplicity has real value.
Deep research performance depends on the underlying LLM. Onyx’s benchmark results use a specific LLM configuration. Swap in a weaker model, and those numbers change. The platform amplifies whatever model you feed it — it doesn’t replace model quality.
Who Should Actually Care
- Enterprise teams drowning in Slack/Confluence/Drive who need unified search → Onyx is built for this
- Security-conscious orgs that can’t send data to Anthropic → self-hosted Onyx with local LLMs
- Teams already paying for multiple LLM APIs → Onyx gives you one interface for all of them
If you’re an individual developer, researcher, or writer who loves Claude’s conversational depth — keep using Claude. Onyx isn’t competing for that use case, no matter what the LinkedIn posts say.
The Bigger Picture
The real story isn’t “Onyx beats Claude.” It’s that the application layer is separating from the model layer. Models are becoming commodities. The value is moving to how you connect them to real data, real permissions, and real workflows.
Onyx, OpenClaw, Dify, n8n — they’re all betting on this thesis. The model providers know it too, which is why Anthropic built MCP and Claude’s tool-use system. But open-source platforms that index your data continuously will always have a structural search advantage over runtime-query approaches.
The question isn’t whether to use Onyx or Claude. It’s whether your organization’s knowledge is accessible to AI at all — and if so, who controls that access.
Onyx is open source (with caveats) on GitHub. Deploy with `curl -fsSL https://onyx.app/install_onyx.sh | bash`.