CLI-Anything: One Command Makes Any Software Agent-Native
There’s a gap in the AI agent ecosystem that doesn’t get talked about enough. Agents are great at calling APIs, reading files, and running shell commands. But much of the world’s most powerful creative software (GIMP, Blender, LibreOffice, OBS, DaVinci Resolve) is built around a GUI, with no uniform API an agent can call. And agents can’t use GUIs reliably.
CLI-Anything fixes this with a single command: wrap any desktop application in a structured CLI with JSON output, and it becomes agent-native overnight.
The Problem: GUI Software Is Agent-Blind
When an AI agent needs to edit an image, render a 3D scene, or convert a document, it has two bad options today:
- Screen scraping / computer use — fragile, slow, UI-dependent. Breaks when the application updates.
- Python scripting via application APIs — works for some apps (Blender has a Python API, LibreOffice has macros), but requires deep per-app knowledge and produces unstructured output.
CLI-Anything offers a third path: a universal adapter that converts any application’s capabilities into a clean, composable CLI interface the agent can call like any other tool.
How It Works
# Install
pip install cli-anything
# Wrap GIMP
cli-anything ./gimp
# Now agents can call:
gimp-cli resize --input photo.jpg --width 1920 --output resized.jpg --format json
Output:
{
"status": "success",
"output_file": "resized.jpg",
"dimensions": {"width": 1920, "height": 1080},
"processing_time_ms": 234
}
Every wrapped command produces JSON output — the native language of LLM tool calls. No parsing, no regex extraction from human-readable text. The agent gets structured data it can act on immediately.
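On the agent side, consuming that output can be as small as one `json.loads` call. The sketch below is an illustration, not CLI-Anything's own code: `parse_tool_result` is a hypothetical helper, and it assumes every wrapped command reports a `status` field like the sample above.

```python
import json


def parse_tool_result(stdout: str) -> dict:
    """Parse a wrapped command's JSON output and fail loudly on errors.

    Hypothetical helper: assumes a "status" field as in the sample output.
    """
    result = json.loads(stdout)
    if result.get("status") != "success":
        raise RuntimeError(f"command failed: {result}")
    return result


# Sample output copied from the resize example above.
sample = """
{
  "status": "success",
  "output_file": "resized.jpg",
  "dimensions": {"width": 1920, "height": 1080},
  "processing_time_ms": 234
}
"""

result = parse_tool_result(sample)
width = result["dimensions"]["width"]
print(result["output_file"], width)
```

In a real agent loop, the string passed in would typically come from `subprocess.run(..., capture_output=True, text=True).stdout` rather than a literal.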
Why CLI Is the Right Interface for Agents
The CLI-Anything team articulates this clearly:
“CLI is the universal interface for both humans and AI agents.”
The reasons hold up:
- Self-describing: --help flags provide automatic documentation. An agent can discover available commands without any prior knowledge.
- Composable: Commands chain naturally into complex workflows: resize, then watermark, then export is three CLI calls, not three separate GUI sessions.
- Deterministic: The same command produces the same result. No “click failed because the button moved” failures.
- Proven: Claude Code runs thousands of real workflows through CLI daily. The pattern works at scale.
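The self-describing and deterministic points can be illustrated with a toy CLI. This is a minimal sketch using the standard library's `argparse` as a stand-in (the project itself lists click as a requirement); the `gimp-cli` name and `resize` flags mirror the earlier example, and the parser is hypothetical rather than anything CLI-Anything generates.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for a generated CLI; names mirror the gimp-cli example.
    parser = argparse.ArgumentParser(
        prog="gimp-cli", description="Hypothetical image CLI"
    )
    sub = parser.add_subparsers(dest="command", required=True)
    resize = sub.add_parser("resize", help="Resize an image")
    resize.add_argument("--input", required=True, help="Source image path")
    resize.add_argument("--width", type=int, required=True, help="Target width in pixels")
    resize.add_argument("--output", required=True, help="Destination image path")
    return parser


parser = build_parser()

# Self-describing: the same text an agent would get from `gimp-cli --help`.
help_text = parser.format_help()

# Deterministic: the same argv always parses to the same structured call.
args = parser.parse_args(
    ["resize", "--input", "photo.jpg", "--width", "1920", "--output", "resized.jpg"]
)
call = vars(args)
print(json.dumps(call))
```

An agent with no prior knowledge of the tool could read `help_text`, pick the `resize` subcommand, and construct a valid call from the flag descriptions alone.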
What It Wraps
CLI-Anything ships with 8 demonstrated integrations:
| Application | What agents can do |
|---|---|
| GIMP | Resize, crop, filters, format conversion, batch processing |
| Blender | Scene rendering, model export, animation frames |
| LibreOffice | Document conversion, spreadsheet manipulation, PDF export |
| OBS | Scene switching, recording control, stream management |
| FFmpeg | Video transcoding, audio extraction, thumbnail generation |
| ImageMagick | Image manipulation, compositing, optimization |
| Inkscape | SVG manipulation, vector export, batch conversion |
| Pandoc | Document format conversion, metadata extraction |
The architecture is not limited to these 8, which serve as the reference implementations. If your software has a CLI or scripting interface, CLI-Anything can wrap it.
Real-World Agent Workflows This Enables
Content pipeline agent:
User: "Resize all product photos in /raw to 800x800, add our watermark, export as WebP"
Agent:
→ gimp-cli batch-resize --dir /raw --size 800x800
→ gimp-cli watermark --template brand.xcf --output-dir /processed
→ gimp-cli convert --dir /processed --format webp
Three tool calls. No GUI interaction. Runs in seconds.
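A chain like this is easy to drive from code. The following is a rough sketch under stated assumptions: `run_json` and `run_pipeline` are hypothetical helpers, each step is assumed to print JSON with a `status` field, and `fake_runner` stands in for the real commands so the example runs without GIMP installed.

```python
import json
import subprocess
from typing import Callable, Sequence


def run_json(argv: Sequence[str]) -> dict:
    """Run one command and parse its stdout as JSON (assumes JSON output)."""
    proc = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)


def run_pipeline(
    steps: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], dict] = run_json,
) -> list:
    """Run steps in order, stopping at the first one that does not succeed."""
    results = []
    for argv in steps:
        result = runner(argv)
        results.append(result)
        if result.get("status") != "success":
            break  # later steps would operate on missing or bad input
    return results


# The three calls from the content pipeline example, as argv lists.
steps = [
    ["gimp-cli", "batch-resize", "--dir", "/raw", "--size", "800x800"],
    ["gimp-cli", "watermark", "--template", "brand.xcf", "--output-dir", "/processed"],
    ["gimp-cli", "convert", "--dir", "/processed", "--format", "webp"],
]


def fake_runner(argv: Sequence[str]) -> dict:
    # Dry-run stand-in so the sketch executes without the real tool.
    return {"status": "success", "command": argv[1]}


results = run_pipeline(steps, runner=fake_runner)
print([r["command"] for r in results])
```

Swapping `fake_runner` for the default `run_json` would execute the real commands; stopping on the first non-success status keeps a failed resize from feeding the watermark step.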
Document processing agent:
User: "Convert all the Word docs in this folder to PDF and extract the title and author from each"
Agent:
→ libreoffice-cli convert --dir ./docs --format pdf
→ libreoffice-cli extract-metadata --dir ./docs --fields title,author --format json
3D render agent:
User: "Render the product model from three angles: front, side, top"
Agent:
→ blender-cli render --file product.blend --camera front --output front.png
→ blender-cli render --file product.blend --camera side --output side.png
→ blender-cli render --file product.blend --camera top --output top.png
Compatibility
CLI-Anything explicitly targets the major agent frameworks: OpenClaw, nanobot, Cursor, Claude Code. The JSON output format maps directly to tool call response schemas used by all of them.
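To make the mapping concrete, here is one way a wrapped command could be exposed as a tool. Everything in this sketch is an assumption for illustration: the `RESIZE_TOOL` definition follows the JSON Schema style that many agent frameworks accept, and `to_argv` is a hypothetical helper, not part of CLI-Anything or any framework listed above.

```python
# Hypothetical tool definition in the JSON Schema style many frameworks accept.
RESIZE_TOOL = {
    "name": "gimp_resize",
    "description": "Resize an image with the wrapped gimp-cli command.",
    "parameters": {
        "type": "object",
        "properties": {
            "input": {"type": "string"},
            "width": {"type": "integer"},
            "output": {"type": "string"},
        },
        "required": ["input", "width", "output"],
    },
}


def to_argv(arguments: dict) -> list:
    """Turn tool-call arguments into the argv for the resize command shown earlier."""
    params = RESIZE_TOOL["parameters"]
    missing = [key for key in params["required"] if key not in arguments]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    argv = ["gimp-cli", "resize"]
    # Iterate in schema order so the generated command is stable.
    for key in params["properties"]:
        if key in arguments:
            argv += [f"--{key}", str(arguments[key])]
    argv += ["--format", "json"]
    return argv


argv = to_argv({"input": "photo.jpg", "width": 1920, "output": "resized.jpg"})
print(" ".join(argv))
```

The model fills in the arguments object, the host turns it into a command line, and the command's JSON output goes back as the tool result.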
Requirements: Python ≥3.10, click ≥8.0. MIT license. 1,298 passing tests (unit + e2e).
pip install cli-anything
Repo: github.com/HKUDS/CLI-Anything
The Bigger Picture
We spend a lot of time making AI agents better at calling APIs that were designed for machines. CLI-Anything flips the script: it makes software that was designed for humans callable by machines.
The world has thousands of powerful, battle-tested desktop applications with no agent-friendly programmatic interface. CLI-Anything is the bridge. If it gains traction, the constraint of “agents can only use API-native software” effectively disappears.
“Today’s software serves humans. Tomorrow’s users will be agents.”
That’s the thesis. CLI-Anything is infrastructure for making it true.