MetaClaw: Dynamic Skill Injection for OpenClaw Agents — What's Real and What's Hype
A new open-source project called MetaClaw dropped this week and the AI agent community noticed — 580 stars and 72 forks within days of launch. The LinkedIn hype machine cranked up immediately: “R.I.P. static AI agents.”
The real story is more nuanced — and actually more interesting — than the marketing copy suggests. Let’s look at what MetaClaw actually does.
What MetaClaw Is
MetaClaw is an OpenAI-compatible proxy wrapper for OpenClaw agents, built by the aiming-lab team (Peng Xia, Jianwen Chen, Xinyu Yang, Haoqin Tu, Siwei Han et al., from a research group that also produced SkillRL).
It sits between OpenClaw and your LLM API and does two things:
- Skill injection — retrieves relevant skill files and injects them into every system prompt
- Skill evolution — analyzes conversations after the fact and generates new skills automatically
A third, optional mode adds cloud RL fine-tuning. More on that below.
Setup, after a one-line pip install, is genuinely two commands:
pip install metaclaw
metaclaw setup # one-time config wizard
metaclaw start # proxy up, skills injected, OpenClaw wired
What Works Out of the Box (No GPU, No Cloud Account)
The default skills_only mode requires nothing beyond a working LLM API key. Here’s what you actually get:
Skill injection at every turn. MetaClaw maintains a library of SKILL.md files in ~/.metaclaw/skills/. At each conversation turn it retrieves the top-k most relevant files (by template matching or embedding similarity) and injects them into the system prompt before the LLM sees the message. The agent gets better at tasks it has skills for — immediately, no retraining required.
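To make the mechanics concrete, here is a minimal sketch of retrieve-and-inject, assuming a directory of SKILL.md files and using naive keyword overlap as a stand-in for MetaClaw's actual template/embedding matching (function names and scoring are illustrative, not the project's API):

```python
from pathlib import Path

def retrieve_skills(query: str, skills_dir: str, k: int = 3) -> list[str]:
    """Score each SKILL.md by keyword overlap with the query and return the
    top-k bodies. A crude stand-in for template/embedding similarity."""
    query_words = set(query.lower().split())
    scored = []
    for path in Path(skills_dir).glob("**/SKILL.md"):
        body = path.read_text()
        overlap = len(query_words & set(body.lower().split()))
        scored.append((overlap, body))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [body for score, body in scored[:k] if score > 0]

def inject(system_prompt: str, skills: list[str]) -> str:
    """Prepend retrieved skills to the system prompt before the LLM sees it."""
    if not skills:
        return system_prompt
    return "\n\n".join(["# Relevant skills"] + skills + [system_prompt])
```

The key property is that this happens per turn at the proxy layer: the agent code and the base model are untouched, and swapping skill files in or out changes behavior on the very next message.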
Auto-summarization after sessions. After each conversation, MetaClaw calls your configured LLM to analyze what happened and generate new skill summaries. If the agent solved something novel, that solution gets codified as a skill. The library grows from real usage. This part is automatic and does work without external dependencies.
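A sketch of that post-session loop, with the LLM passed in as a plain callable (the prompt wording, function name, and file layout here are assumptions for illustration, not MetaClaw internals):

```python
from pathlib import Path
from typing import Callable

def summarize_session(transcript: str, llm: Callable[[str], str],
                      skills_dir: str, skill_name: str) -> Path:
    """Ask the configured LLM to distill a session into a reusable skill,
    then save it as a SKILL.md in the library."""
    prompt = (
        "Summarize any reusable technique demonstrated in this "
        "conversation as a short, imperative skill description:\n\n"
        + transcript
    )
    skill_text = llm(prompt)
    out = Path(skills_dir) / skill_name / "SKILL.md"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(skill_text)
    return out
```

Because the output is just another Markdown file in the library, the next call to skill retrieval picks it up with no restart, which is what makes the knowledge base compound across sessions.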
40+ starter skills. The repo ships with a built-in skill bank covering coding, security, and agentic tasks. Pre-load them with:
cp -r memory_data/skills/* ~/.metaclaw/skills/
It also explicitly builds on the awesome-openclaw-skills library — the same 5,400-skill community library we covered last week — so you can pull from that ecosystem too.
What the LinkedIn Post Got Wrong (or Oversimplified)
“Auto-generates new skills every time the agent fails”
Half true. Skill auto-summarization runs after every session, not just failures. The failure-specific skill extraction is a feature of RL mode only — where a dedicated “evolver LLM” specifically mines failed episodes for new skill patterns. In skills-only mode, the LLM summarizes the whole conversation without a pass/fail signal.
“Trains via cloud LoRA — no GPU cluster, no infra headache”
This is accurate but incomplete. The RL training mode uses Tinker — a cloud RL service from Thinking Machines AI — running specifically on Kimi-K2.5 (Moonshot AI’s 1T MoE model). It’s not a general “bring your own model” cloud LoRA system. You need a Tinker API key, and your fine-tuned model is Kimi-K2.5, not whatever model you’re currently running.
The config makes this clear:
rl:
  enabled: false                  # off by default
  model: moonshotai/Kimi-K2.5
  tinker_api_key: ""              # requires a separate account
  prm_model: gpt-5.2              # judge LLM, also an external dependency
  evolver_model: gpt-5.2          # skill evolver, same
“Self-evolving agent”
The skill library evolves automatically — that’s real and it works. The model weights only evolve in RL mode, which is opt-in and third-party-dependent. If you use skills-only mode, your base LLM weights never change. You’re getting a smarter prompt, not a smarter model.
This is a meaningful distinction. Prompt-level improvement via skill injection is valuable and practical. Weight-level improvement via RL is more powerful but has real dependencies.
The RL Mode in Detail
For completeness, here’s what RL mode actually does when you enable it:
- Each conversation turn gets tokenized and submitted as a training sample
- A Process Reward Model (PRM) — configured as gpt-5.2 by default — scores each response asynchronously
- Tinker runs LoRA fine-tuning on Kimi-K2.5 in the cloud
- Updated weights are hot-swapped without interrupting the running service
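The shape of that pipeline can be sketched as a producer/consumer loop, with the judge and the training submission passed in as hypothetical callables standing in for the gpt-5.2 PRM and the Tinker API (neither name below is MetaClaw's real interface):

```python
import queue
import threading
from typing import Callable

def rl_pipeline(turns: list[tuple[str, str]],
                judge: Callable[[str], float],
                submit: Callable[[dict], None]) -> None:
    """Turn each (prompt, response) pair into a training sample, score it
    asynchronously with a judge (the PRM), and hand the scored sample off
    for cloud LoRA fine-tuning."""
    pending: queue.Queue = queue.Queue()

    def scorer() -> None:
        while True:
            sample = pending.get()
            if sample is None:          # sentinel: no more turns
                break
            sample["reward"] = judge(sample["response"])  # PRM score
            submit(sample)              # off to cloud fine-tuning

    worker = threading.Thread(target=scorer)
    worker.start()
    for prompt, response in turns:
        pending.put({"prompt": prompt, "response": response})
    pending.put(None)
    worker.join()
```

The asynchrony matters: scoring with a judge LLM is slow, so it must not block the serving path, which is also why the weight updates are hot-swapped rather than applied inline.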
There’s also an OPD (On-Policy Distillation) mode that lets you distill a larger teacher model (e.g., Qwen3-32B served via vLLM) into the student as it trains — a useful technique if you have access to a strong local model.
Both RL and OPD modes are disabled by default. The “just talk and it learns” headline is skills-only mode, not RL.
How It Relates to the OpenClaw Skills Ecosystem
MetaClaw explicitly builds on three projects:
- OpenClaw — the agent framework it wraps
- SkillRL — the same lab’s skill-augmented RL framework
- awesome-openclaw-skills — provides the foundation for the starter skill bank
This is a natural evolution of the skills ecosystem. The awesome-openclaw-skills library gives you static, community-curated skills. MetaClaw makes the skill set dynamic — growing and adapting to your specific agent’s usage patterns.
The combination is genuinely interesting: start with community skills, let MetaClaw evolve them for your specific use case. The skill files remain human-readable Markdown throughout — you can inspect, edit, or delete any of them at any time.
The Honest Assessment
MetaClaw’s skills-only mode is immediately useful and has no dependencies beyond your existing LLM API. Dynamic skill injection is a practical improvement over static system prompts, and auto-summarization means your agent’s knowledge base compounds over time.
The RL mode is promising but opinionated — it locks you into Kimi-K2.5 and Tinker. That’s not a dealbreaker if those fit your stack, but it’s not the model-agnostic continuous learning the marketing implies.
Worth trying if you run OpenClaw agents. The two-command setup is real, the skills injection works, and the code is clean.
Links:
- GitHub: github.com/aiming-lab/MetaClaw
- SkillRL (companion paper): github.com/aiming-lab/SkillRL
- Tinker (cloud RL): thinkingmachines.ai/tinker
- awesome-openclaw-skills: github.com/VoltAgent/awesome-openclaw-skills