GitHub Copilot CLI Gets Parallel Subagents: How /fleet Works and Why It Matters for Ollama Users
Two things happened quietly this month that, together, are worth paying attention to.
First: GitHub shipped /fleet in Copilot CLI — a slash command that dispatches multiple AI subagents to work on different parts of your codebase simultaneously, coordinated by an orchestrator that manages dependencies and synthesizes results.
Second: Ollama added support for acting as a backend to GitHub Copilot CLI, meaning local models can now power the same GitHub-aware interface that references issues, PRs, and diffs inline.
The combination is a meaningful step toward AI coding workflows that are both capable and private.
What /fleet Does
The standard complaint about AI coding agents is that they work sequentially. You ask the agent to refactor auth, update tests, and fix docs. It does them one at a time, slowly, losing context as it goes.
/fleet changes the execution model. When you run /fleet <OBJECTIVE>, a behind-the-scenes orchestrator:
- Decomposes your task into discrete work items with explicit dependencies
- Identifies which items can run in parallel vs. which must wait
- Dispatches independent items as background subagents simultaneously
- Polls for completion, then dispatches the next wave
- Verifies outputs and synthesizes the final result
Each subagent gets its own context window but shares the same filesystem. They can’t communicate directly — only the orchestrator coordinates. Think of it as a project lead who assigns work to a team, checks on progress, and assembles the deliverable.
# Simple usage
/fleet Refactor the auth module, update tests, and fix docs in docs/auth/
# More parallelizable — explicit deliverables
/fleet Create API documentation:
- docs/authentication.md covering token flow and examples
- docs/endpoints.md with all REST endpoint schemas
- docs/errors.md with error codes and troubleshooting
- docs/index.md linking to all three (depends on others finishing first)
The second prompt gives the orchestrator four distinct artifacts. Three can run in parallel; one has an explicit dependency. That’s the structure that maximizes parallel execution.
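To make the scheduling model concrete, here is a minimal sketch of dependency-aware wave dispatch in Python. It illustrates the pattern described above, not Copilot's actual orchestrator: `run_subagent` is a stub, and the task names are the four deliverables from the prompt.

```python
from concurrent.futures import ThreadPoolExecutor

# The four deliverables from the docs prompt, with their dependencies.
tasks = {
    "docs/authentication.md": [],
    "docs/endpoints.md": [],
    "docs/errors.md": [],
    "docs/index.md": ["docs/authentication.md", "docs/endpoints.md", "docs/errors.md"],
}

def run_subagent(task: str) -> str:
    # Stand-in for "dispatch a background subagent and collect its output".
    return f"{task}: done"

done = set()
results = {}

with ThreadPoolExecutor() as pool:
    while len(done) < len(tasks):
        # A wave is every task whose dependencies have all completed.
        wave = [t for t, deps in tasks.items()
                if t not in done and all(d in done for d in deps)]
        if not wave:
            raise RuntimeError("dependency cycle: nothing can be dispatched")
        # Dispatch the whole wave in parallel, then block until it finishes.
        for task, output in zip(wave, pool.map(run_subagent, wave)):
            results[task] = output
            done.add(task)

# Synthesis step: assemble the per-agent outputs in task order.
print("\n".join(results[t] for t in tasks))
```

The first wave runs the three independent pages in parallel; docs/index.md only dispatches once they finish, which is exactly the structure the second prompt spells out.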
Writing /fleet Prompts That Work
The quality of the prompt determines how well work gets distributed.
Be specific about deliverables. Map every work item to a concrete artifact — a file, a test suite, a documentation section. “Refactor auth” is too vague for the orchestrator to parallelize. “Refactor src/auth/tokens.py, update tests/test_tokens.py, update docs/auth/tokens.md” gives it three parallel tracks.
Set explicit boundaries. Tell each track which files or directories it owns. Sub-agents that can stomp on each other’s files will create conflicts the orchestrator then has to resolve sequentially.
Declare dependencies explicitly. If one artifact depends on another, say so. The orchestrator builds a dependency graph — the more explicit you are, the more accurate it is.
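One cheap way to see why boundaries matter: if every work item declares the paths it owns, overlapping ownership is easy to detect before anything runs. This is an illustrative check, not something Copilot CLI exposes; the item names and paths come from the refactor example above.

```python
from itertools import combinations

# Each work item declares the paths it owns (the three tracks from above).
work_items = {
    "refactor": {"src/auth/tokens.py"},
    "tests": {"tests/test_tokens.py"},
    "docs": {"docs/auth/tokens.md"},
}

for (a, owns_a), (b, owns_b) in combinations(work_items.items(), 2):
    overlap = owns_a & owns_b
    if overlap:
        print(f"{a} and {b} both claim {sorted(overlap)}: serialize or re-scope")
    else:
        print(f"{a} and {b} are disjoint: safe to run in parallel")
```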
Non-interactive pipeline usage:
copilot -p "/fleet <TASK>" --no-ask-user
The --no-ask-user flag is required for CI/CD — there’s no interactive session to respond to prompts.
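If you drive this from a CI script rather than typing the command directly, a thin wrapper is enough. The sketch below is illustrative: only the copilot invocation and the --no-ask-user flag come from the docs, the objective text and failure handling are placeholders.

```python
import subprocess
import sys

# Hypothetical objective; adjust to your pipeline.
objective = "Update docs/ to match the current REST endpoint schemas"

result = subprocess.run(
    ["copilot", "-p", f"/fleet {objective}", "--no-ask-user"],
    capture_output=True,
    text=True,
)
print(result.stdout)
if result.returncode != 0:
    print(result.stderr, file=sys.stderr)
    sys.exit(result.returncode)  # fail the CI job if the fleet run failed
```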
GitHub Context Injection
Separately from /fleet, Copilot CLI’s GitHub context awareness is genuinely useful. Reference any issue or PR directly in your session:
tell me about issue #15291
Copilot brings in the issue description, comments, related diffs, and current status — all in your current context window. Your agent is now working with the full picture of what needs to be done, not just the code in front of it.
This is the right direction for coding agents. Most of the context that matters for a coding task isn’t in the code itself — it’s in the issue that motivated it, the PR review that caught the edge case, the comment thread that debated the approach. Copilot CLI surfaces all of that inline.
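For a sense of what that injection amounts to, here is a rough sketch that pulls the same kind of data (title, state, body, comments) from the public GitHub REST API. This is not how Copilot CLI does it internally; OWNER and REPO are placeholders, and GITHUB_TOKEN is assumed to be a personal access token in your environment.

```python
import os
import requests

OWNER, REPO, ISSUE = "OWNER", "REPO", 15291
headers = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}
base = f"https://api.github.com/repos/{OWNER}/{REPO}/issues/{ISSUE}"

issue = requests.get(base, headers=headers, timeout=30).json()
comments = requests.get(f"{base}/comments", headers=headers, timeout=30).json()

# Roughly the blob of context an agent ends up working with.
context = "\n\n".join(
    [f"{issue['title']} [{issue['state']}]", issue.get("body") or ""]
    + [c["body"] for c in comments]
)
print(context[:2000])
```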
The Ollama Connection
Ollama’s Copilot CLI support means the GitHub-aware interface and the /fleet orchestration can run against local models. You get:
- GitHub context (issues, PRs, diffs) in your session
- Parallel subagents via /fleet
- Local model inference via Ollama — nothing leaves your machine
- Existing GitHub Business/Enterprise policies enforced
For teams with data residency requirements or cost constraints on cloud inference, this is meaningful. The orchestration and GitHub API calls go through GitHub; the LLM calls stay local.
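If you want to sanity-check the local-inference half on its own, a direct call to Ollama's HTTP API on localhost is enough; the model name below is just an example of one you have already pulled.

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen2.5-coder",  # any model you have pulled with `ollama pull`
        "prompt": "Summarize what a dependency graph is in one sentence.",
        "stream": False,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])  # generated entirely on your machine
```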
How This Compares
| Tool | Parallel agents | GitHub context | Local models | Open-source |
|---|---|---|---|---|
| Copilot CLI + /fleet | ✅ | ✅ | ✅ (via Ollama) | ❌ |
| Claude Code | ❌ (sequential) | ✅ (via MCP) | ❌ | ❌ |
| OpenCode | ❌ | ❌ | ✅ | ✅ |
| Aider | ❌ | ❌ | ✅ | ✅ |
The /fleet orchestration is the piece no open-source tool has fully replicated yet. The dependency-aware parallel dispatch is nontrivial — it requires the orchestrator to build and maintain a task graph, manage shared state, and synthesize results across agents that ran in different contexts.
Resources
- Copilot CLI: github.com/features/copilot/cli
- GitHub blog — /fleet deep dive: github.blog/ai-and-ml/github-copilot/run-multiple-agents-at-once-with-fleet-in-copilot-cli
- Ollama: ollama.com