What problem does DeepDoc solve?

Most RAG systems answer specific questions, but can't help you understand a document collection holistically. DeepDoc performs actual research — extracting themes, identifying patterns, synthesizing findings — instead of just Q&A retrieval.

How does DeepDoc's research workflow work?

Upload files → semantic indexing in Qdrant → propose report structure → human refinement → multi-agent research per section (knowledge generation, targeted queries, retrieval, reflection, writing) → final report compilation into cohesive markdown.

When should I use DeepDoc vs regular RAG?

Use DeepDoc when you have a document collection to understand holistically with a research question spanning multiple sources. Use regular RAG for simple Q&A or real-time queries (DeepDoc's multi-agent pipeline has latency).

What does local-first mean for DeepDoc?

Your documents stay on your machine, research happens in your environment, and you get structured output you control. The embedding model (BGE-small) runs locally; only the reasoning LLM calls go to OpenAI/Mistral.

DeepDoc: Turn Your Local Files Into Research Reports With AI Agents

Q: What configuration options does DeepDoc offer?

max_queries (per section), search_depth (retrieval iterations), and num_reflections (refinement passes). More depth means better synthesis but slower generation. Keep shallow for quick overviews, crank up for serious research.

By Prahlad Menon Published 2026-02-26 3 min read

What if you could point an AI at a folder of documents and say “write me a research report on this”?

Not a summary. Not a Q&A chatbot. An actual structured research report with sections, citations, and synthesized insights.

That’s exactly what DeepDoc does.

The Problem With Document RAG

Most RAG systems work like this: chunk your documents, embed them, retrieve relevant chunks, and stuff them into a prompt. You ask a question, you get an answer.

But what if you don’t have a specific question? What if you want to understand a collection of documents — extract themes, identify patterns, and synthesize findings?

That’s a research workflow, not a Q&A workflow. And it requires a fundamentally different approach.

How DeepDoc Works

DeepDoc uses a multi-agent research pipeline that mimics how a human researcher would approach a document collection:

Upload your files — PDFs, DOCX, images, text files. The system extracts and chunks the content.
Semantic indexing — Chunks go into a Qdrant vector database for similarity search.
Structure generation — Based on your research query, the system proposes a report outline with sections and topics.
Human-in-the-loop — You can refine the structure before research begins.
Multi-agent research — For each section, specialized agents:
- Generate domain knowledge
- Create targeted search queries
- Run retrieval over your chunked data
- Use reflection agents to refine results
- Write the section content
Report compilation — Section content flows to a final report writer that produces a cohesive markdown document.

This isn’t RAG. It’s agentic document research.

The Architecture That Makes It Work

The key insight is separating retrieval from research. Traditional RAG conflates these — you retrieve to answer. DeepDoc retrieves to understand.

The reflection loop is particularly clever. After initial retrieval, agents evaluate whether they have enough information to write a section. If not, they generate new queries and search again. This iterative refinement mimics how researchers actually work: you don’t stop at the first source you find.

The configuration exposes useful knobs:

THREAD_CONFIG = {
    "configurable": {
        "max_queries": 3,      # Queries per section
        "search_depth": 2,     # Retrieval iterations
        "num_reflections": 2,  # Refinement passes
    }
}

More depth means better synthesis but slower generation. For a quick overview, keep it shallow. For serious research, crank it up.

Running It Locally

The setup is straightforward if you’re comfortable with Python:

git clone https://github.com/Oqura-ai/deepdoc.git
cd deepdoc
uv venv && source .venv/bin/activate
cp .env.example .env  # Add your API keys
uv pip install -r requirements.txt
docker-compose up --build  # Starts Qdrant
python main.py

You’ll need API keys for OpenAI (or configure another provider in configuration.py) and optionally Mistral. The embedding model (BGE-small) runs locally.

When To Use This

DeepDoc shines when you have:

A document collection you need to understand holistically
A research question that spans multiple sources
Time constraints that make manual review impractical

Think: legal discovery, literature reviews, competitive analysis, due diligence research. Anywhere you’d normally spend hours reading and synthesizing.

It’s less useful for simple Q&A (just use regular RAG) or real-time queries (the multi-agent pipeline has latency).

The Bigger Picture

We’re seeing a shift from “RAG as a feature” to “RAG as infrastructure.” Tools like DeepDoc don’t just retrieve — they reason, reflect, and synthesize. The vector database becomes one component in a larger agentic workflow.

This is where local-first AI gets interesting. Your documents stay on your machine. The research happens in your environment. You get structured output you can actually use.

If you’re sitting on a pile of documents and wondering what’s in them, give DeepDoc a try. The GitHub repo has everything you need to get started.

DeepDoc is MIT licensed and actively maintained by Oqura AI.