Video-Use: Drop Raw Footage in a Folder, Let Claude Code Edit It

By Prahlad Menon 2 min read

The team behind browser-use, the popular open-source browser automation agent, just released Video-Use, and it’s a different kind of video editor.

There’s no timeline. No drag-and-drop. No manual trimming. You drop raw footage into a folder, open Claude Code, and say “edit these into a launch video.” It does the rest.

How It Works

The core insight is elegant: the LLM never watches the video. Instead, it reads it through two layers:

Layer 1: Audio transcript. ElevenLabs Scribe produces word-level timestamps, speaker diarization, and audio events (laughter, applause, pauses). All takes are packed into a single ~12KB takes_packed.md file. That’s the LLM’s primary view.

Layer 2: Visual composite (on demand). A timeline_view tool generates filmstrip + waveform + word-label PNGs for specific time ranges. It is called only at decision points: ambiguous pauses, retake comparisons, cut-point checks.
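To make the Layer 1 idea concrete, here is a minimal sketch of how word-level timestamps might be packed into compact text. The Word fields, the pack_take helper, the 0.6-second pause threshold, and the output line format are all assumptions for illustration; the actual layout of takes_packed.md may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the take
    end: float    # seconds from the start of the take
    speaker: str


def pack_take(name: str, words: list[Word], pause_gap: float = 0.6) -> str:
    """Render one take as text, starting a new timestamped line on a
    speaker change or a pause of at least pause_gap seconds.
    (Hypothetical format, not the tool's real one.)"""
    lines = [f"## {name}"]
    current: list[str] = []
    line_start = 0.0
    prev: Optional[Word] = None
    for w in words:
        starts_new_line = (
            prev is None
            or w.speaker != prev.speaker
            or w.start - prev.end >= pause_gap
        )
        if starts_new_line:
            if current:  # flush the line collected so far
                lines.append(
                    f"[{line_start:.2f}-{prev.end:.2f}] {prev.speaker}: {' '.join(current)}"
                )
            current, line_start = [], w.start
        current.append(w.text)
        prev = w
    if current and prev is not None:  # flush the final line
        lines.append(
            f"[{line_start:.2f}-{prev.end:.2f}] {prev.speaker}: {' '.join(current)}"
        )
    return "\n".join(lines)
```

A few hundred lines like these stay small enough to read in one pass, while every line still carries the timestamps needed to place a cut.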

The naive approach would be 30,000 frames × 1,500 tokens = 45 million tokens of noise. Video-Use does it in 12KB of text + a handful of PNGs. It’s the same philosophy as browser-use giving an LLM a structured DOM instead of raw screenshots, but for video.
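The arithmetic is easy to check. The frame and per-frame token counts below come from the comparison above; the 4-bytes-per-token ratio is only a rough assumption for English text, and the on-demand PNGs are left out of the count.

```python
FRAMES = 30_000
TOKENS_PER_FRAME = 1_500
naive_tokens = FRAMES * TOKENS_PER_FRAME

PACKED_BYTES = 12 * 1024   # the ~12KB transcript file
BYTES_PER_TOKEN = 4        # assumption: rough average for English text
packed_tokens = PACKED_BYTES // BYTES_PER_TOKEN

print(naive_tokens)                   # 45000000
print(packed_tokens)                  # 3072
print(naive_tokens // packed_tokens)  # 14648
```

Under those assumptions, the transcript view is roughly four orders of magnitude smaller than feeding every frame to the model.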

The Pipeline

Transcribe → Pack → LLM Reasons → EDL → Render → Self-Eval
                                                      │
                                                      └─ issue? fix + re-render (max 3)

The self-eval loop is key. After rendering, it runs timeline_view on the output at every cut boundary, catching visual jumps, audio pops, and hidden subtitles. You only see the preview after it passes its own quality check.
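The loop can be sketched in a few lines. Everything named here is hypothetical: the Segment type, the cut_boundaries helper, and the render, inspect, and fix callables stand in for whatever Video-Use does internally, and “max 3” is read as at most three re-renders after the first pass.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Segment:
    source: str   # source clip
    start: float  # in-point in the source, seconds
    end: float    # out-point in the source, seconds


def cut_boundaries(edl: list[Segment]) -> list[float]:
    """Output-timeline timestamps where one kept segment ends and the next begins."""
    boundaries, t = [], 0.0
    for seg in edl[:-1]:  # no cut after the final segment
        t += seg.end - seg.start
        boundaries.append(round(t, 3))
    return boundaries


def render_with_self_eval(
    edl: list[Segment],
    render: Callable[[list[Segment]], Any],
    inspect: Callable[[Any, float], list[str]],
    fix: Callable[[list[Segment], list[str]], list[Segment]],
    max_rerenders: int = 3,
) -> tuple[Any, int]:
    """Render, inspect every cut boundary, and re-render until clean.
    Returns the output and the number of re-renders that were needed."""
    output = render(edl)
    for attempt in range(max_rerenders + 1):
        issues = [i for t in cut_boundaries(edl) for i in inspect(output, t)]
        if not issues:
            return output, attempt
        if attempt == max_rerenders:
            break  # budget exhausted, do not render again
        edl = fix(edl, issues)
        output = render(edl)
    raise RuntimeError(f"still failing after {max_rerenders} re-renders: {issues}")
```

Capping the retries keeps a stubborn cut from looping forever, and checking only at cut boundaries keeps each pass cheap, since those are the only places an edit can introduce a jump or a pop.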

What It Actually Does

  • Cuts filler words: removes “umm,” “uh,” false starts, and dead space between takes
  • Auto color grades: warm cinematic, neutral punch, or any custom FFmpeg chain
  • 30ms audio fades at every cut (no pops)
  • Burns subtitles: 2-word uppercase chunks by default, fully customizable
  • Generates animations via Manim, Remotion, or PIL, spawned as parallel sub-agents
  • Session memory: persists state in project.md so next week’s session picks up where you left off
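Two of these are simple enough to sketch. The helpers below are illustrative assumptions, not Video-Use’s code: subtitle_chunks groups timed words into 2-word uppercase captions, and fade_filter builds an FFmpeg afade filter string that fades a kept segment in and out over 30ms.

```python
def subtitle_chunks(words, size=2):
    """words: list of (text, start, end) tuples in seconds.
    Returns (TEXT, start, end) captions of up to `size` words each."""
    chunks = []
    for i in range(0, len(words), size):
        group = words[i:i + size]
        text = " ".join(w[0] for w in group).upper()
        # A caption spans from its first word's start to its last word's end.
        chunks.append((text, group[0][1], group[-1][2]))
    return chunks


def fade_filter(duration: float, fade: float = 0.030) -> str:
    """FFmpeg audio filter chain: fade in at 0s, fade out ending at `duration`."""
    out_start = max(duration - fade, 0.0)
    return (
        f"afade=t=in:st=0:d={fade:.3f},"
        f"afade=t=out:st={out_start:.3f}:d={fade:.3f}"
    )


print(subtitle_chunks([("drop", 0.0, 0.3), ("raw", 0.3, 0.5), ("footage", 0.5, 1.0)]))
# [('DROP RAW', 0.0, 0.5), ('FOOTAGE', 0.5, 1.0)]
print(fade_filter(2.0))
# afade=t=in:st=0:d=0.030,afade=t=out:st=1.970:d=0.030
```

The filter string would be passed to FFmpeg’s -af option for each kept segment. A fade that short is inaudible as a fade, but it removes the click a hard cut can leave in the waveform.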

Getting Started

git clone https://github.com/browser-use/video-use
cd video-use
ln -s "$(pwd)" ~/.claude/skills/video-use
pip install -e .
brew install ffmpeg

Then point Claude Code at your footage folder and describe what you want.

How It Compares to HyperFrames

We covered HyperFrames two days ago. They’re complementary, not competing:

  • HyperFrames generates videos from scratch: HTML compositions with GSAP animations, rendered to MP4
  • Video-Use edits existing footage: cuts, grades, subtitles, and polishes raw clips

One creates. The other edits. Both use Claude Code as the brain.

Source: github.com/browser-use/video-use (2.4K+ stars)