GitHub Copilot CLI Gets Parallel Subagents: How /fleet Works and Why It Matters for Ollama Users
Two things happened quietly this month that, together, are worth paying attention to.
First: GitHub shipped /fleet in Copilot CLI — a slash command that dispatches multiple AI subagents to work on different parts of your codebase simultaneously, coordinated by an orchestrator that manages dependencies and synthesizes results.
Second: Ollama added support for acting as a backend to GitHub Copilot CLI, meaning local models can now power the same GitHub-aware interface that references issues, PRs, and diffs inline.
The combination is a meaningful step toward AI coding workflows that are both capable and private.
What /fleet Does
The standard complaint about AI coding agents is that they work sequentially. You ask the agent to refactor auth, update tests, and fix docs. It does them one at a time, slowly, losing context as it goes.
/fleet changes the execution model. When you run /fleet <OBJECTIVE>, a behind-the-scenes orchestrator:
- Decomposes your task into discrete work items with explicit dependencies
- Identifies which items can run in parallel vs. which must wait
- Dispatches independent items as background subagents simultaneously
- Polls for completion, then dispatches the next wave
- Verifies outputs and synthesizes the final result
Each subagent gets its own context window but shares the same filesystem. They can’t communicate directly — only the orchestrator coordinates. Think of it as a project lead who assigns work to a team, checks on progress, and assembles the deliverable.
# Simple usage
/fleet Refactor the auth module, update tests, and fix docs in docs/auth/
# More parallelizable — explicit deliverables
/fleet Create API documentation:
- docs/authentication.md covering token flow and examples
- docs/endpoints.md with all REST endpoint schemas
- docs/errors.md with error codes and troubleshooting
- docs/index.md linking to all three (depends on others finishing first)
The second prompt gives the orchestrator four distinct artifacts. Three can run in parallel; one has an explicit dependency. That’s the structure that maximizes parallel execution.
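To make the scheduling model concrete, here is a minimal sketch of dependency-aware wave dispatch in Python. It illustrates the pattern described above, not Copilot's actual orchestrator: `run_subagent` is a stub, and the task names are the four deliverables from the prompt.

```python
from concurrent.futures import ThreadPoolExecutor

# The four deliverables from the docs prompt, with their dependencies.
tasks = {
    "docs/authentication.md": [],
    "docs/endpoints.md": [],
    "docs/errors.md": [],
    "docs/index.md": ["docs/authentication.md", "docs/endpoints.md", "docs/errors.md"],
}

def run_subagent(task: str) -> str:
    # Stand-in for "dispatch a background subagent and collect its output".
    return f"{task}: done"

done = set()
results = {}

with ThreadPoolExecutor() as pool:
    while len(done) < len(tasks):
        # A wave is every task whose dependencies have all completed.
        wave = [t for t, deps in tasks.items()
                if t not in done and all(d in done for d in deps)]
        if not wave:
            raise RuntimeError("dependency cycle: nothing can be dispatched")
        # Dispatch the whole wave in parallel, then block until it finishes.
        for task, output in zip(wave, pool.map(run_subagent, wave)):
            results[task] = output
            done.add(task)

# Synthesis step: assemble the per-agent outputs in task order.
print("\n".join(results[t] for t in tasks))
```

The first wave runs the three independent pages in parallel; docs/index.md only dispatches once they finish, which is exactly the structure the second prompt spells out.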
Writing /fleet Prompts That Work
The quality of the prompt determines how well work gets distributed.
Be specific about deliverables. Map every work item to a concrete artifact — a file, a test suite, a documentation section. “Refactor auth” is too vague for the orchestrator to parallelize. “Refactor src/auth/tokens.py, update tests/test_tokens.py, update docs/auth/tokens.md” gives it three parallel tracks.
Set explicit boundaries. Tell each track which files or directories it owns. Sub-agents that can stomp on each other’s files will create conflicts the orchestrator then has to resolve sequentially.
Declare dependencies explicitly. If one artifact depends on another, say so. The orchestrator builds a dependency graph — the more explicit you are, the more accurate it is.
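One cheap way to see why boundaries matter: if every work item declares the paths it owns, overlapping ownership is easy to detect before anything runs. This is an illustrative check, not something Copilot CLI exposes; the item names and paths come from the refactor example above.

```python
from itertools import combinations

# Each work item declares the paths it owns (the three tracks from above).
work_items = {
    "refactor": {"src/auth/tokens.py"},
    "tests": {"tests/test_tokens.py"},
    "docs": {"docs/auth/tokens.md"},
}

for (a, owns_a), (b, owns_b) in combinations(work_items.items(), 2):
    overlap = owns_a & owns_b
    if overlap:
        print(f"{a} and {b} both claim {sorted(overlap)}: serialize or re-scope")
    else:
        print(f"{a} and {b} are disjoint: safe to run in parallel")
```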
Non-interactive pipeline usage:
copilot -p "/fleet <TASK>" --no-ask-user
The --no-ask-user flag is required for CI/CD — there’s no interactive session to respond to prompts.
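If you drive this from a CI script rather than typing the command directly, a thin wrapper is enough. The sketch below is illustrative: only the copilot invocation and the --no-ask-user flag come from the docs, the objective text and failure handling are placeholders.

```python
import subprocess
import sys

# Hypothetical objective; adjust to your pipeline.
objective = "Update docs/ to match the current REST endpoint schemas"

result = subprocess.run(
    ["copilot", "-p", f"/fleet {objective}", "--no-ask-user"],
    capture_output=True,
    text=True,
)
print(result.stdout)
if result.returncode != 0:
    print(result.stderr, file=sys.stderr)
    sys.exit(result.returncode)  # fail the CI job if the fleet run failed
```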
GitHub Context Injection
Separately from /fleet, Copilot CLI’s GitHub context awareness is genuinely useful. Reference any issue or PR directly in your session:
tell me about issue #15291
Copilot brings in the issue description, comments, related diffs, and current status — all in your current context window. Your agent is now working with the full picture of what needs to be done, not just the code in front of it.
This is the right direction for coding agents. Most of the context that matters for a coding task isn’t in the code itself — it’s in the issue that motivated it, the PR review that caught the edge case, the comment thread that debated the approach. Copilot CLI surfaces all of that inline.
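For a sense of what that injection amounts to, here is a rough sketch that pulls the same kind of data (title, state, body, comments) from the public GitHub REST API. This is not how Copilot CLI does it internally; OWNER and REPO are placeholders, and GITHUB_TOKEN is assumed to be a personal access token in your environment.

```python
import os
import requests

OWNER, REPO, ISSUE = "OWNER", "REPO", 15291
headers = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}
base = f"https://api.github.com/repos/{OWNER}/{REPO}/issues/{ISSUE}"

issue = requests.get(base, headers=headers, timeout=30).json()
comments = requests.get(f"{base}/comments", headers=headers, timeout=30).json()

# Roughly the blob of context an agent ends up working with.
context = "\n\n".join(
    [f"{issue['title']} [{issue['state']}]", issue.get("body") or ""]
    + [c["body"] for c in comments]
)
print(context[:2000])
```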
The Ollama Connection
Ollama’s Copilot CLI support means the GitHub-aware interface and the /fleet orchestration can run against local models. You get:
- GitHub context (issues, PRs, diffs) in your session
- Parallel subagents via /fleet
- Local model inference via Ollama — nothing leaves your machine
- Existing GitHub Business/Enterprise policies enforced
For teams with data residency requirements or cost constraints on cloud inference, this is meaningful. The orchestration and GitHub API calls go through GitHub; the LLM calls stay local.
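If you want to sanity-check the local-inference half on its own, a direct call to Ollama's HTTP API on localhost is enough; the model name below is just an example of one you have already pulled.

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen2.5-coder",  # any model you have pulled with `ollama pull`
        "prompt": "Summarize what a dependency graph is in one sentence.",
        "stream": False,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])  # generated entirely on your machine
```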
How This Compares
| Tool | Parallel agents | GitHub context | Local models | Open-source |
|---|---|---|---|---|
| Copilot CLI + /fleet | ✅ | ✅ | ✅ (via Ollama) | ❌ |
| Claude Code | ❌ (sequential) | ✅ (via MCP) | ❌ | ❌ |
| OpenCode | ❌ | ❌ | ✅ | ✅ |
| Aider | ❌ | ❌ | ✅ | ✅ |
The /fleet orchestration is the piece no open-source tool has fully replicated yet. The dependency-aware parallel dispatch is nontrivial — it requires the orchestrator to build and maintain a task graph, manage shared state, and synthesize results across agents that ran in different contexts.
Resources
- Copilot CLI: github.com/features/copilot/cli
- GitHub blog — /fleet deep dive: github.blog/ai-and-ml/github-copilot/run-multiple-agents-at-once-with-fleet-in-copilot-cli
- Ollama: ollama.com