RuVector: The Vector Database That Gets Smarter Every Time You Use It
Your vector database is lying to you.
Not maliciously. But every time you run a similarity search, it returns the exact same results for the exact same query. It doesn’t learn that users who search for “authentication” usually want the OAuth docs, not the password reset guide. It doesn’t notice that your click-through patterns suggest the 5th result should actually be 1st.
It just… sits there. Static. Frozen in time.
RuVector is fundamentally different. It’s a vector database that watches how you use it and gets smarter — search results improve automatically, the system tunes itself to your workload, and it runs AI models right on your hardware with no cloud API bills.
Let’s dig into how it actually works.
The Problem With Static Vector Search
Here’s the typical vector database workflow:
1. Embed your documents → vectors
2. Index them (HNSW, IVF, etc.)
3. Query with a vector
4. Get results based on cosine/L2 distance
5. Repeat forever with identical results
This works. But it’s fundamentally limited.
The index doesn’t learn. If users consistently ignore certain results and click on others, that signal is lost. If query patterns shift over time, the index doesn’t adapt. If your domain has specific relationships (like “authentication” being semantically close to “OAuth” in your codebase but not in general English), the embedding model doesn’t know.
You’re stuck with whatever the embedding model decided during its original training — usually on generic web text, not your specific domain.
RuVector’s Architecture: Self-Learning at Every Layer
RuVector approaches this differently. Instead of treating the index as a static lookup table, it treats it as a learning system with three core innovations:
1. GNN Layer on HNSW
The first innovation is a Graph Neural Network layer that sits on top of the standard HNSW index.
Standard HNSW:
Query → Find approximate nearest neighbors → Return results
RuVector:
Query → HNSW neighbors → GNN re-ranking → Return results
↓
Learn from feedback
The GNN operates directly on the HNSW graph structure. Every node (vector) has learned attention weights that determine how much it should “listen to” its neighbors. These weights update based on user behavior.
use ruvector_gnn::{GATLayer, AttentionConfig};
// Configure multi-head attention over HNSW neighbors
let config = AttentionConfig {
input_dim: 128,
output_dim: 64,
num_heads: 8,
concat_heads: true,
dropout: 0.1,
leaky_relu_slope: 0.2,
};
let gat = GATLayer::new(config)?;
// Forward pass with interpretable attention weights
let (output, attention_weights) = gat.forward_with_attention(&features, &adjacency)?;
// See exactly which neighbors influenced each result
for (node_id, weights) in attention_weights.iter().enumerate() {
println!("Node {}: attention weights = {:?}", node_id, weights);
}
The GNN supports three architectures:
- GCN (Graph Convolutional Network) — learns structural patterns without manual feature engineering
- GAT (Graph Attention Network) — multi-head attention that discovers which neighbors matter most
- GraphSAGE — inductive learning that handles new, unseen vectors without retraining
All of this runs with SIMD acceleration, keeping latency under 15ms even for 100K+ node graphs.
2. SONA: Self-Optimizing Neural Architecture
The second innovation is SONA — a runtime-adaptive learning engine that tunes itself to your workload without expensive retraining.
User Query → [SONA Engine] → Model Response → User Feedback
↑ │
└─────── Learning Signal ─────────┘
(< 1ms adaptation)
Here’s what makes SONA different from traditional fine-tuning:
| SONA | Fine-Tuning | Prompt Tuning | |
|---|---|---|---|
| Adaptation speed | <1 ms | Days to weeks | Hours |
| Cost per update | $0 (local) | $1,000-$100,000+ | Engineering time |
| Downtime required | None | Yes | No |
| Learns from feedback | Automatic | Manual pipeline | Manual |
| Prevents forgetting | EWC++ built in | Risk of regression | N/A |
| Runs in browser | Yes (WASM) | No | No |
SONA uses a two-tier LoRA architecture:
┌─────────────────────────────────────────────────────────┐
│ SONA Learning │
├─────────────────────────────────────────────────────────┤
│ MicroLoRA (Rank 2) │ BaseLoRA (Rank 8-16) │
│ ───────────────────── │ ────────────────────────── │
│ Speed: ~45μs │ Speed: ~1ms │
│ Purpose: Instant fixes │ Purpose: Deep patterns │
│ When: Every request │ When: Background (hourly) │
└─────────────────────────────────────────────────────────┘
MicroLoRA handles immediate adjustments — if a user gives negative feedback, that signal propagates in microseconds. BaseLoRA consolidates patterns over time, building long-term memory of what works.
The secret sauce is EWC++ (Elastic Weight Consolidation). Traditional learning systems suffer from catastrophic forgetting — when you learn new patterns, old ones degrade. EWC++ tracks which parameters are important for each “task” and protects them when learning new tasks.
use ruvector_sona::{SonaEngine, SonaConfig};
// Create SONA engine
let engine = SonaEngine::builder()
.hidden_dim(256)
.build();
// Record a user interaction
let query_embedding = vec![0.1f32; 256];
let traj_id = engine.begin_trajectory(query_embedding);
// Record what happened (model selection, confidence, latency)
engine.add_step(traj_id, vec![0.5; 256], vec![0.8; 64], 0.9);
// Record outcome quality (0.0 = bad, 1.0 = perfect)
engine.end_trajectory(traj_id, 0.85);
// Apply learned optimizations to future queries
let new_query = vec![0.2f32; 256];
let optimized = engine.apply_micro_lora(&new_query);
Every interaction becomes a trajectory — a complete record of query → model selection → response → outcome. SONA clusters these trajectories to identify patterns (the ReasoningBank) and uses them to inform future decisions.
3. 46 Attention Mechanisms
RuVector doesn’t just support one attention pattern — it supports 46 different mechanisms optimized for different use cases:
| Mechanism | Use Case | Performance |
|---|---|---|
| Flash Attention | Long sequences | O(n) memory |
| Linear Attention | Streaming queries | Constant memory |
| Graph Attention | Relationship-heavy data | HNSW-native |
| Hyperbolic Attention | Hierarchical data | Better for taxonomies |
| Mincut-Gated | High coherence | 50% compute reduction |
| Sparse Attention | Large contexts | 10-100x faster |
The mincut-gated attention is particularly interesting. It uses graph mincut algorithms to determine when the system has “high coherence” (confident results) and can exit early:
-- In PostgreSQL: gated attention with early exit
SELECT ruvector_mincut_gated_attention(
query_embedding,
key_embeddings,
coherence_threshold := 0.85, -- Exit early if coherent
max_iterations := 10
) FROM documents;
PostgreSQL Integration: 143 SQL Functions
One of RuVector’s most practical innovations is its PostgreSQL extension. Instead of running a separate vector service, you get 143 SQL functions that bring all of this directly into your existing database.
Drop-in pgvector Replacement
-- Create the extension
CREATE EXTENSION ruvector;
-- Create a table with vector column
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
content TEXT,
embedding ruvector(1536)
);
-- Create an HNSW index (same syntax as pgvector)
CREATE INDEX ON documents USING ruhnsw (embedding ruvector_l2_ops);
-- Find similar documents
SELECT content, embedding <-> '[0.15, 0.25, ...]'::ruvector AS distance
FROM documents
ORDER BY distance
LIMIT 10;
But unlike pgvector, the index learns from your queries over time.
Hyperbolic Geometry for Hierarchical Data
If your data has inherent hierarchy (taxonomies, org charts, knowledge graphs), Euclidean distance is the wrong metric. Hyperbolic space naturally represents hierarchical structures with less distortion.
-- Poincare ball distance (better for trees)
SELECT ruvector_poincare_distance(a.embedding, b.embedding, -1.0)
FROM categories a, categories b
WHERE a.parent_id = b.id;
-- Mobius addition (hyperbolic translation)
SELECT ruvector_mobius_add(a, b, -1.0);
-- Convert between models
SELECT ruvector_poincare_to_lorentz(embedding, -1.0);
Full Cypher Graph Queries
RuVector includes a full Cypher query engine — the same language Neo4j uses:
-- Graph query in PostgreSQL
SELECT * FROM ruvector_cypher($$
MATCH (doc:Document)-[:SIMILAR_TO]->(related:Document)
WHERE doc.topic = 'machine-learning'
RETURN related.title, related.score
ORDER BY related.score DESC
LIMIT 10
$$);
Advanced Math in SQL
-- Optimal transport distance
SELECT ruvector_wasserstein_distance(dist_a, dist_b);
-- Spectral clustering
SELECT ruvector_spectral_cluster(embeddings, k := 5);
-- Persistent homology (topological data analysis)
SELECT ruvector_persistent_homology(point_cloud, max_dim := 2);
-- Sublinear PageRank
SELECT ruvector_pagerank_approx(graph, epsilon := 0.001);
Cognitive Containers: One File Deploys Everywhere
The RVF (RuVector Format) is perhaps the most ambitious part of the project. A single .rvf file contains:
- Vector embeddings
- HNSW index
- LoRA adapter weights
- GNN graph state
- A bootable Linux microkernel
- eBPF programs for kernel-level acceleration
- Cryptographic witness chain for tamper-evident audit
- WASM runtime (5.5 KB) for browser execution
┌─────────────────────────────────────────────────────────────┐
│ .rvf file │
├──────────────────────────┬──────────────────────────────────┤
│ 📋 Core Data │ 🧠 AI & Models │
│ MANIFEST (4 KB root) │ OVERLAY (LoRA deltas) │
│ VEC_SEG (embeddings) │ GRAPH (GNN state) │
│ INDEX_SEG (HNSW graph) │ SKETCH (quantum / VQE) │
│ QUANT (codebooks) │ META (key-value) │
├──────────────────────────┼──────────────────────────────────┤
│ 🌿 COW Branching │ 🔐 Security & Trust │
│ COW_MAP (ownership) │ WITNESS (audit chain) │
│ REFCOUNT (ref counts) │ CRYPTO (signatures) │
│ MEMBERSHIP (visibility) │ KERNEL (Linux + binding) │
│ DELTA (sparse patch) │ EBPF (XDP / TC / socket) │
│ │ WASM (5.5 KB runtime) │
└──────────────────────────┴──────────────────────────────────┘
What does this mean practically?
-
Self-boot as a microservice: Drop an
.rvffile on a VM and it boots as a running service in 125ms. No install, no dependencies. -
Run in browsers: The same file can serve queries in a browser tab with zero backend — the 5.5 KB WASM runtime handles everything.
-
eBPF acceleration: Hot vectors are served directly in the Linux kernel data path, bypassing userspace entirely.
-
Git-like branching: Create a child file that shares all parent data. A 1M-vector parent with 100 edits produces a ~2.5 MB child instead of a 512 MB copy.
-
Tamper-evident audit: Every operation is recorded in a SHAKE-256 hash-linked chain. Change one byte and the entire chain fails verification.
-
Post-quantum signatures: ML-DSA-65 alongside Ed25519 — files stay trustworthy even after quantum computers break classical crypto.
Quick Start
One-Line Install
npx ruvector install
This gives you an interactive installer. Or install directly:
npm install ruvector
Rust
[dependencies]
ruvector-core = "0.1"
ruvector-sona = "0.1"
ruvector-gnn = "0.1"
PostgreSQL
# Docker (recommended)
docker run -d --name ruvector-pg \
-e POSTGRES_PASSWORD=secret \
-p 5432:5432 \
ruvnet/ruvector-postgres:latest
# Connect
PGPASSWORD=secret psql -h localhost -p 5432 -U postgres
# Enable extension
CREATE EXTENSION ruvector;
Node.js
const { VectorDB, SonaEngine } = require('ruvector');
// Create vector database with self-learning
const db = new VectorDB({ dimensions: 384 });
const sona = new SonaEngine(384);
// Add vectors
db.add(embedding, { id: 'doc-1', content: 'Hello world' });
// Search (results improve over time)
const results = db.search(queryVector, { k: 10 });
// Record feedback for learning
sona.recordFeedback(queryVector, results[0].id, 'positive');
The Pedigree
RuVector is built by rUv and powers Cognitum, which won a CES 2026 Innovation Award for the world’s first agentic chip. The chip runs tens of thousands of AI agents at near-zero power, learning from every signal.
The project is open source (MIT license), written in Rust, and consists of 22 crates totaling 64K lines of code.
When Should You Use RuVector?
Use RuVector when:
- Your search quality matters and you want it to improve over time
- You need graph relationships alongside vector similarity
- You’re already on PostgreSQL and don’t want another service
- You want to run models locally without API costs
- You need deployment flexibility (server, browser, edge, embedded)
- Audit trails and cryptographic verification matter
Stick with simpler options when:
- You just need basic nearest-neighbor search with no learning
- Your scale is small enough that optimization doesn’t matter
- You’re prototyping and don’t need production features yet
Links
- GitHub: github.com/ruvnet/ruvector
- npm:
npm install ruvector - Crates.io:
cargo add ruvector-core - PostgreSQL Docker:
docker pull ruvnet/ruvector-postgres - Cognitum (hardware): cognitum.one
Your vector database shouldn’t be frozen in time. It should learn.