QMD Memory Setup for OpenClaw
Give your AI agent real memory recall. QMD replaces OpenClaw's default search with a local hybrid engine — BM25 keyword search, vector semantic search, and LLM re-ranking — all running on your machine with zero API keys or cloud dependencies.
Download
⬇ Download qmd-memory-setup.skill
Install with OpenClaw:
openclaw skill install qmd-memory-setup.skill
The Problem
Your agent writes daily logs, saves preferences, and accumulates weeks of context. But ask it to recall a specific decision from three weeks ago and it either misses it or pulls up something tangentially related.
The problem isn't that the memory is gone — it's that the default search can't find it. OpenClaw's built-in SQLite search struggles when the query words don't exactly match the stored text.
What QMD Does
QMD (by Tobi Lütke) combines three search layers:
- BM25 keyword search — fast exact matching for error messages, IDs, code symbols
- Vector semantic search — finds conceptually similar content even when wording differs
- LLM re-ranking — a small local model sorts results by actual relevance
Example: your agent saved "running the gateway on the Mac Mini, port 18789." Search for "gateway server setup" — the default misses it (no "server" or "setup" in the note). QMD finds it because vector search catches the conceptual match and the reranker confirms relevance.
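To make the hybrid idea concrete, here is a toy sketch of how multiple ranked result lists can be merged, using reciprocal rank fusion — a common fusion technique, shown here for illustration only (the function and note names are hypothetical, not QMD's actual code):

```python
# Illustrative only: fuse ranked lists from different search layers.
# Reciprocal rank fusion: score(doc) = sum over lists of 1 / (k + rank).
def rrf(rankings, k=60):
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy ranked lists, as a keyword layer and a vector layer might return them:
bm25_hits   = ["note-ports", "note-errors"]    # exact keyword matches
vector_hits = ["note-gateway", "note-ports"]   # semantic matches
fused = rrf([bm25_hits, vector_hits])
print(fused[0])  # "note-ports" wins: it appears in both lists
```

A document that scores well in both layers rises to the top even if neither layer alone ranked it first; a re-ranking model can then sort the fused list by actual relevance.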
Quick Setup
# 1. Install QMD
npm install -g @tobilu/qmd
# 2. Add to openclaw.json
# memory.backend: "qmd"
# memory.citations: "auto"
# 3. Restart
openclaw gateway restart
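Written out, step 2 would look something like this in openclaw.json — a sketch based on the keys shown above; the exact nesting in your existing config may differ:

```json
{
  "memory": {
    "backend": "qmd",
    "citations": "auto"
  }
}
```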
Three small GGUF models (~2GB total) auto-download on first search. Everything runs locally — no API keys, no cloud.
Backend Comparison
| Feature | SQLite Default | QMD |
|---|---|---|
| Search types | Semantic only | Semantic + BM25 + LLM re-ranking |
| Query expansion | No | Yes |
| API keys needed | Optional | None (fully local) |
| Disk overhead | ~600MB | ~2GB |
| Speed | Fast (~100ms) | Moderate (~2-4s hybrid) |
What's in the Package
qmd-memory-setup/
├── SKILL.md # Full setup guide, CLI usage, SDK, troubleshooting
└── references/
└── backend-comparison.md # Detailed comparison of all three backends
Requirements
- OpenClaw — any recent version
- Node.js 22+ or Bun 1.0+
- ~2GB disk for GGUF models (auto-downloaded)
- macOS or Linux (Windows via WSL2)
- GPU optional — auto-detects Metal (Mac), CUDA (NVIDIA), Vulkan
Freedom Tech Perspective
Memory recall is where AI agents become genuinely useful — an agent that forgets is just a chatbot with extra steps. QMD solves this entirely on-device: no cloud embeddings service, no vector database subscription, no API keys leaking your notes to third parties.
- Fully local — three small GGUF models run on your machine's GPU. Your notes never leave.
- MIT licensed — open source, no restrictions.
- No accounts — install, configure, done. No signup, no telemetry.
- Built by Tobi Lütke — Shopify's CEO, who built it for his own agent workflow. Production-grade despite being a personal project.
- Privacy by architecture — not "we promise not to read your data," but "the data never leaves your machine in the first place."