CLI-Anything: One Command Makes Any Software Agent-Native

By Prahlad Menon · 2 min read

There’s a gap in the AI agent ecosystem that doesn’t get talked about enough. Agents are great at calling APIs, reading files, and running shell commands. But much of the world’s most powerful creative software (GIMP, Blender, LibreOffice, OBS, DaVinci Resolve) is built around a GUI, with at best an app-specific scripting layer rather than a uniform, agent-friendly API. And agents can’t use GUIs reliably.

CLI-Anything fixes this with a single command: wrap any desktop application in a structured CLI with JSON output, and it becomes agent-native overnight.


The Problem: GUI Software Is Agent-Blind

When an AI agent needs to edit an image, render a 3D scene, or convert a document, it has two bad options today:

  1. Screen scraping / computer use — fragile, slow, UI-dependent. Breaks when the application updates.
  2. Python scripting via application APIs — works for some apps (Blender has a Python API, LibreOffice has macros), but requires deep per-app knowledge and produces unstructured output.

CLI-Anything offers a third path: a universal adapter that converts any application’s capabilities into a clean, composable CLI interface the agent can call like any other tool.


How It Works

# Install
pip install cli-anything

# Wrap GIMP
cli-anything ./gimp

# Now agents can call:
gimp-cli resize --input photo.jpg --width 1920 --output resized.jpg --format json

Output:

{
  "status": "success",
  "output_file": "resized.jpg",
  "dimensions": {"width": 1920, "height": 1080},
  "processing_time_ms": 234
}

Every wrapped command produces JSON output, the format LLM tool calls already speak. There is no regex extraction from human-readable text: the agent decodes a single JSON document and gets structured data it can act on immediately.
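To make the consuming side concrete, here is a minimal sketch of how an agent harness might call such a command and decode its output. The helper name run_json_tool is hypothetical, and a child Python process stands in for gimp-cli so the example runs without GIMP installed; it assumes the wrapped command prints exactly one JSON document to stdout.

```python
import json
import subprocess
import sys


def run_json_tool(argv, timeout=60):
    """Run a command and parse its stdout as a single JSON document."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if proc.returncode != 0:
        # Report failures as structured data too, so the agent can react to them.
        return {
            "status": "error",
            "exit_code": proc.returncode,
            "stderr": proc.stderr.strip(),
        }
    return json.loads(proc.stdout)


if __name__ == "__main__":
    # Stand-in for `gimp-cli resize ...`: a child process that prints the
    # same JSON shape as the example output above.
    payload = {
        "status": "success",
        "output_file": "resized.jpg",
        "dimensions": {"width": 1920, "height": 1080},
    }
    fake_cli = [sys.executable, "-c", f"import json; print(json.dumps({payload!r}))"]
    result = run_json_tool(fake_cli)
    print(result["dimensions"]["width"])  # prints 1920
```

Returning a dictionary on failure instead of raising is a design choice for this sketch: the agent sees the same kind of object whether the call worked or not.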


Why CLI Is the Right Interface for Agents

The CLI-Anything team articulates this clearly:

“CLI is the universal interface for both humans and AI agents.”

The reasons hold up:

  • Self-describing: --help flags provide automatic documentation. An agent can discover available commands without any prior knowledge.
  • Composable: Commands chain naturally into complex workflows — resize, then watermark, then export is three CLI calls, not three separate GUI sessions.
  • Deterministic: The same command produces the same result. No “click failed because the button moved” failures.
  • Proven: Claude Code runs thousands of real workflows through CLI daily. The pattern works at scale.
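The self-describing and structured-output properties can be illustrated with a small sketch. It uses argparse from the Python standard library as a stand-in so it runs anywhere; it does not claim to mirror how CLI-Anything generates its CLIs. The program name demo-cli and the fields it echoes back are assumptions made for illustration.

```python
import argparse
import json


def build_parser():
    """Build a tiny CLI whose help text doubles as tool documentation."""
    parser = argparse.ArgumentParser(
        prog="demo-cli", description="Hypothetical wrapper CLI (illustration only)"
    )
    sub = parser.add_subparsers(dest="command", required=True)
    resize = sub.add_parser("resize", help="Resize an image to a target width")
    resize.add_argument("--input", required=True, help="Source image path")
    resize.add_argument("--width", type=int, required=True, help="Target width in pixels")
    resize.add_argument("--output", required=True, help="Destination image path")
    return parser


def main(argv):
    """Parse argv and return the JSON string a wrapper would print."""
    args = build_parser().parse_args(argv)
    # A real wrapper would invoke the application here; this sketch only
    # echoes the request back in a structured form.
    result = {
        "status": "success",
        "command": args.command,
        "output_file": args.output,
        "width": args.width,
    }
    return json.dumps(result)


if __name__ == "__main__":
    # What an agent would read when it asks for --help.
    print(build_parser().format_help())
    print(main(["resize", "--input", "photo.jpg", "--width", "1920", "--output", "resized.jpg"]))
```

The help text lists the resize subcommand and its description, which is all an agent needs to discover the command without prior knowledge.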

What It Wraps

CLI-Anything ships with 8 demonstrated integrations:

Application    What agents can do
GIMP           Resize, crop, filters, format conversion, batch processing
Blender        Scene rendering, model export, animation frames
LibreOffice    Document conversion, spreadsheet manipulation, PDF export
OBS            Scene switching, recording control, stream management
FFmpeg         Video transcoding, audio extraction, thumbnail generation
ImageMagick    Image manipulation, compositing, optimization
Inkscape       SVG manipulation, vector export, batch conversion
Pandoc         Document format conversion, metadata extraction

These 8 are the reference implementations; the architecture is not limited to them. If your software has a CLI or scripting interface, CLI-Anything can wrap it.
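As a rough picture of what "wrapping" an existing CLI means, the sketch below runs a command and reports the outcome as a JSON-ready dictionary. The function names wrap_command and pandoc_convert are hypothetical and this is not CLI-Anything's actual implementation; the only real invocation used is Pandoc's standard `pandoc SRC -o DST` form. The runner parameter is an assumption added so the wrapper can be exercised without Pandoc installed.

```python
import json
import subprocess
import time


def wrap_command(argv, output_file, runner=subprocess.run):
    """Run an existing CLI and describe the outcome as a JSON-serializable dict."""
    start = time.perf_counter()
    proc = runner(argv, capture_output=True, text=True)
    elapsed_ms = round((time.perf_counter() - start) * 1000)
    if proc.returncode != 0:
        return {
            "status": "error",
            "exit_code": proc.returncode,
            "message": proc.stderr.strip(),
            "processing_time_ms": elapsed_ms,
        }
    return {
        "status": "success",
        "output_file": output_file,
        "processing_time_ms": elapsed_ms,
    }


def pandoc_convert(src, dst, runner=subprocess.run):
    """Convert a document with Pandoc; formats are inferred from the file extensions."""
    return wrap_command(["pandoc", src, "-o", dst], dst, runner)


if __name__ == "__main__":
    # A fake runner keeps the demo independent of a local Pandoc install.
    def fake_runner(argv, **kwargs):
        return subprocess.CompletedProcess(argv, 0, stdout="", stderr="")

    print(json.dumps(pandoc_convert("notes.md", "notes.html", runner=fake_runner)))
```

Dropping the runner argument makes pandoc_convert call the real binary through subprocess.run.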


Real-World Agent Workflows This Enables

Content pipeline agent:

User: "Resize all product photos in /raw to 800x800, add our watermark, export as WebP"
Agent:
  → gimp-cli batch-resize --dir /raw --size 800x800
  → gimp-cli watermark --template brand.xcf --output-dir /processed
  → gimp-cli convert --dir /processed --format webp

Three tool calls. No GUI interaction. Runs in seconds.

Document processing agent:

User: "Convert all the Word docs in this folder to PDF and extract the title and author from each"
Agent:
  → libreoffice-cli convert --dir ./docs --format pdf
  → libreoffice-cli extract-metadata --dir ./docs --fields title,author --format json

3D render agent:

User: "Render the product model from three angles: front, side, top"
Agent:
  → blender-cli render --file product.blend --camera front --output front.png
  → blender-cli render --file product.blend --camera side --output side.png
  → blender-cli render --file product.blend --camera top --output top.png
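Chained workflows like these need one piece of glue: run the calls in order and stop when a step fails. Below is a minimal sketch of that loop. The run_pipeline helper is hypothetical, the argv lists simply restate the content pipeline commands from this article, and a fake step runner stands in for the real tools.

```python
def run_pipeline(steps, run_step):
    """Run steps in order; stop at the first one that does not report success."""
    results = []
    for argv in steps:
        result = run_step(argv)
        results.append({"argv": argv, "result": result})
        if result.get("status") != "success":
            break  # later steps depend on earlier output, so do not continue
    return results


if __name__ == "__main__":
    # The content pipeline from above, expressed as argv lists.
    steps = [
        ["gimp-cli", "batch-resize", "--dir", "/raw", "--size", "800x800"],
        ["gimp-cli", "watermark", "--template", "brand.xcf", "--output-dir", "/processed"],
        ["gimp-cli", "convert", "--dir", "/processed", "--format", "webp"],
    ]

    # Fake runner for illustration: pretends the watermark step fails.
    def fake_step(argv):
        if argv[1] == "watermark":
            return {"status": "error", "message": "template not found"}
        return {"status": "success"}

    for entry in run_pipeline(steps, fake_step):
        print(entry["argv"][1], entry["result"]["status"])
```

With the fake runner above, the loop records the resize and watermark steps and never attempts the convert step.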

Compatibility

CLI-Anything explicitly targets the major agent frameworks: OpenClaw, nanobot, Cursor, Claude Code. The JSON output format maps directly to tool call response schemas used by all of them.
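On the request side, a framework usually describes each tool with a name, a description, and a JSON Schema for its arguments. The sketch below shows one way such a definition could be turned into the resize command line from earlier. The tool name gimp_resize, the field layout, and the to_argv helper are all assumptions for illustration; each framework has its own exact schema, and none of this is taken from CLI-Anything itself.

```python
# Hypothetical tool definition: a name, a description, and a JSON Schema
# describing the arguments the model is allowed to supply.
RESIZE_TOOL = {
    "name": "gimp_resize",
    "description": "Resize an image to a target width.",
    "input_schema": {
        "type": "object",
        "properties": {
            "input": {"type": "string"},
            "width": {"type": "integer"},
            "output": {"type": "string"},
        },
        "required": ["input", "width", "output"],
    },
}


def to_argv(tool, arguments, executable="gimp-cli", subcommand="resize"):
    """Turn tool-call arguments into an argv list for the wrapped CLI."""
    schema = tool["input_schema"]
    missing = [key for key in schema["required"] if key not in arguments]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    argv = [executable, subcommand]
    # Emit flags in schema order so the same call always yields the same argv.
    for name in schema["properties"]:
        if name in arguments:
            argv += [f"--{name}", str(arguments[name])]
    argv += ["--format", "json"]
    return argv


if __name__ == "__main__":
    call = {"input": "photo.jpg", "width": 1920, "output": "resized.jpg"}
    print(" ".join(to_argv(RESIZE_TOOL, call)))
```

Building an argv list rather than a shell string avoids quoting problems when a file name contains spaces.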

Requirements: Python ≥3.10, click ≥8.0. MIT license. 1,298 passing tests (unit + e2e).

pip install cli-anything

Repo: github.com/HKUDS/CLI-Anything


The Bigger Picture

We spend a lot of time making AI agents better at calling APIs that were designed for machines. CLI-Anything flips the script: it makes software that was designed for humans callable by machines.

The world has thousands of powerful, battle-tested desktop applications whose only practical interface is a GUI or an app-specific scripting layer. CLI-Anything is the bridge. If it gains traction, the constraint of “agents can only use API-native software” largely disappears.

“Today’s software serves humans. Tomorrow’s users will be agents.”

That’s the thesis. CLI-Anything is infrastructure for making it true.