Onyx: The Open-Source AI Chat That Beats Claude at Deep Research (24K Stars)
Another week, another open-source project making proprietary AI look expensive. This time it’s Onyx — a self-hostable AI chat platform with 24,000+ stars that just claimed the #1 spot on the DeepResearchBench leaderboard, beating Claude, Gemini, and OpenAI.
The LinkedIn crowd is predictably losing it. But is this actually a Claude killer, or is it solving a different problem entirely?
What Onyx Actually Is
Onyx is an enterprise AI chat platform that sits on top of your company’s data. Think of it as a ChatGPT-style interface, but one that:
- Works with any LLM — Claude, GPT, Gemini, DeepSeek, or local models via Ollama/vLLM
- Continuously indexes your data from 50+ sources (Slack, Drive, Confluence, Jira, GitHub, Salesforce)
- Self-hosts via Docker, Kubernetes, or Terraform — can run fully offline
- Respects existing permissions — users only see data they’re authorized to access
It’s not a model. It’s an application layer that makes any model smarter about your data.
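The permission point above is worth making concrete. The idea is that each indexed document carries the access-control list mirrored from its source system, and search results are filtered by the querying user's groups before anything reaches the LLM. Here is a toy sketch of that pattern; the schema, group names, and documents are invented for illustration, not Onyx's actual data model:

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    doc_id: str
    text: str
    allowed_groups: set = field(default_factory=set)  # mirrored from the source system

# A tiny stand-in index; real connectors would populate this continuously.
INDEX = [
    Doc("hr-1", "compensation bands", {"hr"}),
    Doc("eng-7", "incident runbook", {"engineering", "sre"}),
    Doc("all-3", "holiday schedule", {"everyone"}),
]

def search(query, user_groups):
    # Match on text, then drop anything the user's groups can't see.
    hits = [d for d in INDEX if query in d.text]
    return [d.doc_id for d in hits if d.allowed_groups & (user_groups | {"everyone"})]

print(search("schedule", {"engineering"}))      # ['all-3']: visible to everyone
print(search("compensation", {"engineering"}))  # []: HR-only doc filtered out
```

The key property is that filtering happens at retrieval time, so a user can never coax the chat interface into summarizing documents they couldn't open directly.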
The DeepResearchBench Numbers
Here’s the actual leaderboard (as of the latest results):
| Rank | Model | Overall Score |
|---|---|---|
| 🥇 1 | Onyx | 54.92 |
| 🥈 2 | Cellcog | 54.54 |
| 3 | Qianfan-DeepResearch Pro | 54.22 |
| 8 | LangChain + GPT-5 | 50.60 |
| 9 | Gemini 2.5 Pro Deep Research | 49.71 |
| 11 | OpenAI Deep Research | 46.45 |
| 12 | Claude Research | 45.00 |
| 17 | Perplexity Deep Research | 40.46 |
Onyx leads in comprehensiveness, insight, and instruction following. Claude comes in 12th. That’s a real gap.
But context matters: this benchmark evaluates report generation from web research — a specific task where Onyx’s hybrid RAG pipeline and multi-step research flow give it structural advantages. It doesn’t test conversational reasoning, coding ability, creative writing, or the hundred other things people use Claude for daily.
Onyx vs Claude: Honest Comparison
| Feature | Onyx | Claude |
|---|---|---|
| Models | Any LLM (Claude, GPT, Gemini, local) | Claude models only |
| Deep Research | #1 on DeepResearchBench | #12 on DeepResearchBench |
| Data Connectors | 50+ continuous indexing connectors | Runtime API tool calls (MCP) |
| Hosting | Self-hosted (Docker/K8s/Terraform) | Anthropic’s servers |
| Offline | Yes, with local LLMs | No |
| Code Execution | Sandboxed | Sandboxed (Artifacts) |
| Artifact Generation | Craft: web apps, dashboards from company data | Artifacts: from conversation context |
| Permissions | Mirrors existing org permissions | Per-project access |
| Setup | Docker deploy + connector config | Sign up and go |
| Coding Assistance | Not the focus | Best-in-class |
| Creative/Conversational | Depends on underlying LLM | Native strength |
What Onyx Gets Right
The connector model is genuinely better for enterprise search. Claude’s MCP approach queries tools at runtime — you ask a question, it calls an API, waits for results. Onyx pre-indexes everything asynchronously. When you search, it’s hitting a local hybrid index (keyword + semantic), not making live API calls. Faster, more reliable, and no risk of connector timeouts killing your query.
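To make the hybrid-index idea concrete, here is a minimal sketch of one common way to combine a keyword ranking with a semantic ranking: reciprocal rank fusion (RRF). This is an illustration of the general technique, not Onyx's actual scoring code; the document IDs and ranked lists are invented for the example.

```python
def rrf_fuse(rankings, k=60):
    """Combine multiple ranked lists of doc IDs into one hybrid ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Each list contributes 1 / (k + rank); k damps the head of the list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Keyword (BM25-style) ranking vs. semantic (embedding) ranking for one query.
keyword_hits = ["jira-1042", "slack-77", "wiki-9"]
semantic_hits = ["wiki-9", "jira-1042", "drive-3"]

fused = rrf_fuse([keyword_hits, semantic_hits])
print(fused)  # documents ranked highly by both lists float to the top
```

Because both rankings come from a pre-built local index, the fusion step is microseconds of arithmetic rather than a round-trip to an external API.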
Craft is more powerful than Claude Artifacts. Claude Artifacts generate content from your conversation. Onyx Craft generates interactive apps and dashboards from all your indexed company data — Slack conversations, Confluence docs, Jira tickets, the works. It runs in an isolated sandbox, and the output is genuinely useful for internal dashboards and reports.
Model flexibility is a real advantage. If Anthropic raises prices, changes rate limits, or deprecates a model you depend on, you’re stuck. Onyx lets you swap between providers — or run local models entirely — without changing your workflow.
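The provider-swap idea reduces to routing every chat call through one interface so the backend model is a configuration value, not a code change. A minimal sketch of that pattern follows; the provider functions are stand-in stubs (real clients would call the Anthropic, OpenAI, or Ollama APIs), and all names here are hypothetical:

```python
PROVIDERS = {}

def register(name):
    """Decorator that adds a chat backend to the provider registry."""
    def deco(fn):
        PROVIDERS[name] = fn
        return fn
    return deco

@register("anthropic")
def _claude(prompt):
    return f"[claude] {prompt}"  # stub; a real client would hit the API here

@register("ollama")
def _local(prompt):
    return f"[local-llm] {prompt}"  # stub for a self-hosted model

def chat(prompt, provider="anthropic"):
    # Swapping providers is a one-argument change, not a workflow rewrite.
    return PROVIDERS[provider](prompt)

print(chat("summarize Q3 incidents"))                     # served by the Claude stub
print(chat("summarize Q3 incidents", provider="ollama"))  # same call, local model
```

This is the insurance policy the paragraph describes: if one provider raises prices or deprecates a model, the registry entry changes and everything downstream stays the same.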
Where Onyx Falls Short
It’s an enterprise tool, not a personal assistant. If you’re a developer who wants to chat with Claude about code, brainstorm ideas, or get help writing — Onyx isn’t replacing that. It’s solving a fundamentally different problem (organizational knowledge access).
The “open source” label needs an asterisk. The main repo (24K stars) uses a custom license — not MIT, not Apache, not GPL. There’s a separate onyx-foss repo under MIT with 253 stars. If license freedom matters to your org, read the fine print.
Setup isn’t trivial. One-command Docker install gets you running, but configuring 50+ connectors, managing permissions, and maintaining infrastructure is enterprise-grade ops work. Claude’s “sign up and chat” simplicity has real value.
Deep research performance depends on the underlying LLM. Onyx’s benchmark results use a specific LLM configuration. Swap in a weaker model, and those numbers change. The platform amplifies whatever model you feed it — it doesn’t replace model quality.
Who Should Actually Care
- Enterprise teams drowning in Slack/Confluence/Drive who need unified search → Onyx is built for this
- Security-conscious orgs that can’t send data to Anthropic → self-hosted Onyx with local LLMs
- Teams already paying for multiple LLM APIs → Onyx gives you one interface for all of them
If you’re an individual developer, researcher, or writer who loves Claude’s conversational depth — keep using Claude. Onyx isn’t competing for that use case, no matter what the LinkedIn posts say.
The Bigger Picture
The real story isn’t “Onyx beats Claude.” It’s that the application layer is separating from the model layer. Models are becoming commodities. The value is moving to how you connect them to real data, real permissions, and real workflows.
Onyx, OpenClaw, Dify, n8n — they’re all betting on this thesis. The model providers know it too, which is why Anthropic built MCP and Claude’s tool-use system. But open-source platforms that index your data continuously will always have a structural search advantage over runtime-query approaches.
The question isn’t whether to use Onyx or Claude. It’s whether your organization’s knowledge is accessible to AI at all — and if so, who controls that access.
Onyx is open source (with caveats) on GitHub. Deploy with `curl -fsSL https://onyx.app/install_onyx.sh | bash`.