OpenMythos: The Open-Source Reverse-Engineering of Claude's Recurrent-Depth Architecture

By Prahlad Menon

Last week Anthropic shipped Claude Mythos. No architecture paper. No technical report. Just a new model that feels qualitatively different — especially on complex multi-step reasoning — with no public explanation of why.

OpenMythos is the community’s attempt to figure it out.

Built by Kye Gomez, it’s a theoretical open-source reconstruction of the suspected Claude Mythos architecture: a Recurrent-Depth Transformer with looped computation, switchable MLA/GQA attention, and sparse Mixture-of-Experts feed-forward layers. Not affiliated with Anthropic. Not actual weights. But a serious implementation of the architecture that the available research suggests.

pip install open-mythos

The Core Hypothesis: Looped Transformers

Standard transformers run each layer exactly once, sequentially. If you have 96 layers, every token goes through all 96, once, and that’s your forward pass.

A Recurrent-Depth Transformer (RDT) — also called a Looped Transformer — does something different. It divides its computation into three stages:

Input

[Prelude] — standard transformer layers, run once

[Recurrent Block] — looped T times (same weights, T iterations)
  ↑___________↓  (hidden state h updated each loop)

[Coda] — standard transformer layers, run once

Output

The Recurrent Block runs the same weights T times, updating the hidden state each loop according to:

h_{t+1} = A·h_t + B·e + Transformer(h_t, e)

Where h_t is the current hidden state, e is the encoded input injected at every step, and A and B are learned injection parameters. The continuous injection of e is crucial — it keeps the original signal alive and prevents the model from drifting during deep recurrence.

The key insight: running the same parameters for more loops buys more reasoning depth at zero parameter cost. And the loop count can be varied at inference time: dial it up for hard problems, dial it down for simple ones.
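The update rule can be sketched in PyTorch. Everything below is illustrative, not the actual OpenMythos internals: A and B are modeled as learned diagonal gates (they could equally be full matrices or scalars), the hidden state starts at zero, and a small MLP stands in for the shared-weight transformer block.

```python
import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):
    """Illustrative sketch of the looped update h_{t+1} = A·h_t + B·e + f(h_t, e)."""
    def __init__(self, dim: int):
        super().__init__()
        # A and B as learned diagonal gates; A starts inside the unit circle
        # so the recurrence is stable (one plausible choice, not the repo's)
        self.A = nn.Parameter(torch.full((dim,), 0.9))
        self.B = nn.Parameter(torch.full((dim,), 0.1))
        # stand-in for the shared-weight transformer block f(h, e)
        self.f = nn.Sequential(nn.Linear(2 * dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, e: torch.Tensor, n_loops: int) -> torch.Tensor:
        h = torch.zeros_like(e)              # initial hidden state
        for _ in range(n_loops):             # same weights reused every iteration
            h = self.A * h + self.B * e + self.f(torch.cat([h, e], dim=-1))
        return h

e = torch.randn(2, 16, 256)                  # (batch, seq, dim) from the prelude
h = RecurrentBlock(256)(e, n_loops=4)
print(h.shape)  # torch.Size([2, 16, 256])
```

Note that `n_loops` is just a Python loop bound here, which is exactly why it can be changed freely at inference time: no weights depend on it.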

Why This Architecture Produces Different Behavior

OpenMythos documents something called the three-stage grokking process that emerges from looped transformers:

  1. Memorization — model fits the training distribution
  2. In-distribution generalization — model handles known compositions
  3. Systematic generalization — model abruptly handles novel compositions OOD

That third stage, systematic generalization, is what vanilla transformers typically fail at. Train on 5-hop reasoning chains and test on 10-hop, and a standard transformer breaks down; a looped transformer, given enough iterations, can still succeed. The capability doesn't emerge gradually; it phase-transitions in.

This would explain why Mythos feels qualitatively different on novel, complex questions — not just quantitatively better on benchmarks, but structurally more capable of combining ideas it hasn’t seen combined before.

Importantly: this is not chain-of-thought. There’s no intermediate token output. All reasoning happens silently inside a single forward pass, in continuous latent space.

The Architecture Components

Prelude — Standard transformer layers that encode the input once. Output becomes e, the injection signal.

Recurrent Block — The looped computation core. Attention is switchable:

  • mla — Multi-head Latent Attention (DeepSeek-style, low-rank KV decomposition, smaller KV cache)
  • gqa — Grouped-Query Attention (standard multi-group sharing)
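The KV-sharing idea behind GQA is easy to see in a shapes-only sketch (no projections, caching, or masking; assumes PyTorch ≥ 2.0 for `scaled_dot_product_attention`): eight query heads share two KV heads, so the KV cache is 4× smaller.

```python
import torch
import torch.nn.functional as F

# Grouped-Query Attention sketch: 8 query heads share 2 KV heads (group size 4)
B, T, n_heads, n_kv_heads, head_dim = 2, 16, 8, 2, 32

q = torch.randn(B, n_heads, T, head_dim)
k = torch.randn(B, n_kv_heads, T, head_dim)   # only 2 KV heads are cached...
v = torch.randn(B, n_kv_heads, T, head_dim)

# ...then each KV head is expanded to serve its group of query heads
group = n_heads // n_kv_heads
k = k.repeat_interleave(group, dim=1)          # (B, 8, T, head_dim)
v = v.repeat_interleave(group, dim=1)

out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 16, 32])
```

MLA goes further by caching a single low-rank latent per token and reconstructing K and V from it, which is harder to show in a few lines but shrinks the cache even more.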

Feed-forward uses sparse MoE:

  • n_experts — total expert pool
  • n_shared_experts — always-active experts (DeepSeek MoE pattern)
  • n_experts_per_tok — experts activated per token by the router

Coda — Standard transformer layers that project from the final hidden state to logits.
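The shared-plus-routed expert pattern can be sketched as follows. All names here are illustrative, not OpenMythos internals, and for clarity this version computes every expert densely before selecting; a real implementation only runs the experts each token is routed to.

```python
import torch
import torch.nn as nn

class SparseMoE(nn.Module):
    """Sketch of shared + routed experts (DeepSeek MoE pattern); illustrative only."""
    def __init__(self, dim=256, expert_dim=64, n_experts=8,
                 n_shared_experts=1, n_experts_per_tok=2):
        super().__init__()
        mlp = lambda: nn.Sequential(nn.Linear(dim, expert_dim), nn.GELU(),
                                    nn.Linear(expert_dim, dim))
        self.shared = nn.ModuleList(mlp() for _ in range(n_shared_experts))   # always active
        self.experts = nn.ModuleList(mlp() for _ in range(n_experts))         # routed pool
        self.router = nn.Linear(dim, n_experts)
        self.top_k = n_experts_per_tok

    def forward(self, x):                                  # x: (B, T, dim)
        out = sum(e(x) for e in self.shared)               # shared experts, every token
        weights = self.router(x).softmax(dim=-1)           # (B, T, n_experts)
        topw, topi = weights.topk(self.top_k, dim=-1)      # per-token top-k choice
        all_out = torch.stack([e(x) for e in self.experts], dim=-2)   # (B, T, E, dim)
        idx = topi.unsqueeze(-1).expand(*topi.shape, x.size(-1))      # (B, T, k, dim)
        routed = torch.gather(all_out, -2, idx)            # the k selected experts
        return out + (topw.unsqueeze(-1) * routed).sum(dim=-2)

y = SparseMoE()(torch.randn(2, 16, 256))
print(y.shape)  # torch.Size([2, 16, 256])
```

With the article's config (n_experts=8, n_shared_experts=1, n_experts_per_tok=2), each token activates 3 of 9 expert MLPs per layer.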

Running It

import torch
from open_mythos.main import OpenMythos, MythosConfig

cfg = MythosConfig(
    vocab_size=1000,
    dim=256,
    n_heads=8,
    max_seq_len=128,
    max_loop_iters=4,
    prelude_layers=1,
    coda_layers=1,
    n_experts=8,
    n_shared_experts=1,
    n_experts_per_tok=2,
    expert_dim=64,
    lora_rank=8,
    attn_type="mla",   # or "gqa"
    n_kv_heads=8,
    kv_lora_rank=32,
    q_lora_rank=64,
    qk_rope_head_dim=16,
    qk_nope_head_dim=16,
    v_head_dim=16,
)

model = OpenMythos(cfg)

ids = torch.randint(0, cfg.vocab_size, (2, 16))

# Vary loops at inference time — more loops = more compute = deeper reasoning
logits = model(ids, n_loops=4)
out = model.generate(ids, max_new_tokens=8, n_loops=8)

# Verify stability: the spectral radius of A (largest eigenvalue magnitude,
# not the largest entry) must be < 1 or the recurrence blows up over many loops
A = model.recurrent.injection.get_A()
rho = torch.linalg.eigvals(A).abs().max().item() if A.dim() == 2 else A.abs().max().item()
print(f"Spectral radius ρ(A): {rho:.4f}")  # must be < 1 for stability

The n_loops parameter at inference time is the compute knob. Routine queries: low loops. Hard math: high loops. Same model weights, variable depth.

What This Is (and Isn’t)

This is a research implementation — a codebase for studying Recurrent-Depth Transformer architectures. It’s not actual Claude Mythos weights. Anthropic hasn’t released those. OpenMythos is informed speculation based on:

  • The Recurrent-Depth Transformer literature (Geiping et al., 2025)
  • DeepSeek MLA attention papers
  • MoE architecture research
  • Observed behavioral characteristics of Claude Mythos

The value is in the implementation itself — a clean, runnable, inspectable codebase that lets researchers experiment with looped computation, compute-adaptive depth, and sparse MoE at any scale.

If you’re interested in what frontier architectures might look like under the hood, or want to experiment with depth-variable reasoning without training a 100B parameter model, this is worth pulling.

Resources