
What I stole from gbrain (and ported to OpenClaw in one afternoon).

Contents
  • What gbrain is actually for
  • 1. The durable job queue
  • 2. `openclaw doctor --resolve-all`
  • 3. Storage tiering
  • 4. Hot memory: `openclaw recall`
  • 5. Compiled truth + timeline
  • What I didn't port
  • What changed

I run Clawrence and Claudia at Voltade. They've been live in production for months. Most days they just work. The days they don't are almost always the same shape: a cron silently broke, an edit to a memory file didn't take effect until I restarted the gateway, or a script got renamed and three things downstream were left pointing at the old path.

These aren't model problems. They're harness problems. So when Garry Tan open-sourced gbrain, the memory and orchestration layer behind his own OpenClaw setup, I wanted to know what he'd solved that I hadn't.

I spent an afternoon reading the repo, picked five patterns that mapped to actual pain I'd written down in memory files, and ported each one. Total time: about 5 hours. Net result: my agents have a layer of reliability they didn't have yesterday.

#What gbrain is actually for

Gbrain is a knowledge layer for AI agents. Postgres + pgvector underneath, with a typed knowledge graph, a hybrid search pipeline, and a job queue Garry calls "Minions" for durable background work. It's serious infrastructure: 3,700 unit tests, a benchmark suite, 17,888 pages running in his production instance.

I don't need most of it. My agents don't ingest meetings or run dream cycles. But several of gbrain's design moves solve problems I'd been working around. Five of them ported cleanly:

  1. A durable SQLite job queue
  2. A doctor --resolve-all config auditor
  3. Storage tiering between git-mirrored knowledge and raw firehose
  4. A hot-memory recall CLI
  5. A "compiled truth + timeline" memory file structure

#1. The durable job queue

In gbrain it's called Minions. The pitch: every "thing that should happen even if the machine reboots" is a row in a Postgres table, not a process. A worker claims, dispatches, marks done, or retries with backoff.

My version is the same shape but with SQLite instead of Postgres (one user, one machine, no need for the heavier dep). About 200 lines of Python. Schema:

CREATE TABLE jobs (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  kind TEXT, payload TEXT, run_at INTEGER,
  status TEXT, attempts INTEGER, max_attempts INTEGER,
  last_error TEXT, enqueued_at INTEGER,
  started_at INTEGER, finished_at INTEGER,
  worker_pid INTEGER, idempotency_key TEXT UNIQUE
);

The worker is a single LaunchAgent that polls every 5 seconds, claims one job atomically, dispatches to a handler in ~/.openclaw/scripts/queue-handlers/<kind>.sh, and either marks done or retries with exponential backoff. Stuck in_flight rows older than 10 minutes get reclaimed on worker startup, on the assumption the previous worker died mid-job.
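Here's the claim-and-dispatch core as a minimal sketch. It assumes the schema above and a SQLite new enough for RETURNING (3.35+); the DB path and function names are illustrative, not the actual 200 lines:

import os
import sqlite3
import subprocess
import time

DB = os.path.expanduser("~/.openclaw/queue.db")  # illustrative path
HANDLERS = os.path.expanduser("~/.openclaw/scripts/queue-handlers")

def claim_one(conn):
    # One atomic UPDATE so two workers can never grab the same row.
    now = int(time.time())
    row = conn.execute(
        """UPDATE jobs
           SET status = 'in_flight', started_at = ?, worker_pid = ?,
               attempts = attempts + 1
           WHERE id = (SELECT id FROM jobs
                       WHERE status = 'queued' AND run_at <= ?
                       ORDER BY run_at LIMIT 1)
           RETURNING id, kind, payload, attempts, max_attempts""",
        (now, os.getpid(), now),
    ).fetchone()
    conn.commit()
    return row

def run_once(conn):
    row = claim_one(conn)
    if row is None:
        return
    job_id, kind, payload, attempts, max_attempts = row
    handler = os.path.join(HANDLERS, f"{kind}.sh")
    # The handler receives the payload as $1.
    proc = subprocess.run([handler, payload], capture_output=True, text=True)
    now = int(time.time())
    if proc.returncode == 0:
        conn.execute("UPDATE jobs SET status = 'done', finished_at = ? WHERE id = ?",
                     (now, job_id))
    elif attempts >= max_attempts:
        conn.execute("UPDATE jobs SET status = 'failed', last_error = ?, finished_at = ? WHERE id = ?",
                     (proc.stderr[-500:], now, job_id))
    else:
        # Exponential backoff: requeue 1, 2, 4, ... minutes out.
        delay = 60 * 2 ** (attempts - 1)
        conn.execute("UPDATE jobs SET status = 'queued', last_error = ?, run_at = ? WHERE id = ?",
                     (proc.stderr[-500:], now + delay, job_id))
    conn.commit()

def main():
    conn = sqlite3.connect(DB)
    while True:
        run_once(conn)
        time.sleep(5)  # the LaunchAgent worker polls every 5 seconds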

I wired the WhatsApp chaser (Claudia's nag-when-customers-are-waiting cron) as the first user. Previously when a Telegram send failed transiently, the chase just got logged and dropped. Now it gets enqueued with an idempotency key derived from group JID + timestamp, and the worker retries it for up to four attempts before giving up.
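The enqueue side leans on that UNIQUE idempotency_key column: with INSERT OR IGNORE, re-enqueueing the same chase is a no-op rather than a duplicate send. A sketch, with the function name and defaults mine:

import json
import time

def enqueue(conn, kind, payload, idempotency_key, max_attempts=4, delay=0):
    # Duplicate keys are silently ignored, so retry-happy callers are safe.
    now = int(time.time())
    conn.execute(
        """INSERT OR IGNORE INTO jobs
           (kind, payload, run_at, status, attempts, max_attempts,
            enqueued_at, idempotency_key)
           VALUES (?, ?, ?, 'queued', 0, ?, ?, ?)""",
        (kind, json.dumps(payload), now + delay, max_attempts, now,
         idempotency_key),
    )
    conn.commit()

# e.g. enqueue(conn, "chaser-send", {"jid": jid, "text": text}, f"{jid}:{ts}")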

Tests went green. Then I wrote my own smoke test: enqueue a fake job, watch the worker pick it up, verify the handler runs. I picked chaser-send.sh as the handler and passed INVALID_FAKE_JID_FOR_TEST@g.us as the JID, assuming the script would reject it.

It didn't. chaser-send.sh posts to a hardcoded Telegram thread regardless of JID. My "queue smoke test, safe to ignore" message landed in the Voltade Team Ops thread.

I posted a follow-up explaining what happened and added a note to myself: handlers that hit external systems need a DRY_RUN env var before any smoke test goes near them. The queue itself worked exactly as designed. I just hadn't read what the handler actually did.

#2. `openclaw doctor --resolve-all`

Gbrain has gbrain check-resolvable. The idea is that broken cross-references shouldn't wait until runtime to bite you. The pre-flight should walk every ref and tell you what's missing.

My OpenClaw setup is full of cross-references that have silently broken on me before:

  • LaunchAgent plists point to scripts. I rename a script, plist points at nothing, cron silently no-ops.
  • The cron jobs.json payloads reference bash <path> commands. Same problem.
  • The agent system prompts mention script paths. Same problem.
  • bootstrap-extra-files in openclaw.json only accepts 8 hardcoded basenames; anything else is silently dropped at runtime. I learned this the hard way and wrote it into memory as feedback_openclaw_bootstrap_limits.
  • Workspace MEMORY.md files have a 12KB cap (or whatever bootstrapMaxChars is set to). Anything past that is end-truncated, silently.

The doctor walks all of these. Exits non-zero on any error. Runs on a 9:15am daily cron and DMs me via Clawrence if anything's broken.
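Most of the checks reduce to "does this reference resolve" plus "is this file under its cap". A sketch of two of them, with paths, thresholds, and function names as illustrative assumptions:

import os
import plistlib

def check_launchagents(d=os.path.expanduser("~/Library/LaunchAgents")):
    # Every absolute path in ProgramArguments should exist on disk.
    problems = []
    for name in os.listdir(d):
        if not name.endswith(".plist"):
            continue
        with open(os.path.join(d, name), "rb") as f:
            plist = plistlib.load(f)
        for arg in plist.get("ProgramArguments", []):
            if arg.startswith("/") and not os.path.exists(arg):
                problems.append(("error", f"{name} points at missing {arg}"))
    return problems

def check_memory_cap(path, cap=32000):
    # Warn before the bootstrap cap truncates the prompt; error once over it.
    size = os.path.getsize(path)
    if size > cap:
        return [("error", f"{path} is {size} bytes, over cap {cap}")]
    if size > 0.9 * cap:
        return [("warning", f"{path} is {size} bytes, "
                            f"{size * 100 // cap}% of cap {cap}")]
    return []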

First run on production: 0 errors, 1 warning. The warning was workspace-main/MEMORY.md is 31517 bytes, 98% of cap 32000. I would not have noticed this until the day a new memory entry pushed me over and Claudia's prompt started getting truncated mid-rule. That's exactly the kind of regression I want to catch in the morning, not at 2am from a customer.

#3. Storage tiering

In gbrain (v0.22.11) the move is: bulk content (transcripts, tweets, articles) stays in the database. Curated knowledge goes in git. Don't mix them.

My equivalent: a backup repo at ~/openclaw that mirrors my live .openclaw setup via an explicit allow-list in sync.sh. Already an allow-list, so the discipline was mostly in place. But two things had crept in:

  • The local auto-sync.log had grown to 29MB with no rotation.
  • One .log.archive-20260506 file from a manual archive had been committed back in May and never untracked.

Tiny stuff individually. Together they were the entire reason du -sh ~/openclaw reported 60MB when the actual mirrored content is 2MB.

Fix: rotate the log when it crosses 5MB. Untrack the committed archive. And add a defense-in-depth deny-list to the copy_glob helper so any .log, .lock, *-state.json, or *credentials* file gets skipped even if a pattern matches it.

case "$(basename "$f")" in
  *.log|*.log.*|*.lock|*-state.json|*-credentials*|*.token|*.key|*.pem)
    skipped=$((skipped + 1))
    continue
    ;;
esac

Belt and suspenders. The .gitignore already covers most of this. But the .log.archive-20260506 slipping past it taught me to fail at the copy step too.

#4. Hot memory: `openclaw recall`

Gbrain's gbrain recall lets the agent ask its own memory mid-conversation. This solves a specific pain in my setup: OpenClaw loads MEMORY.md into the agent's system prompt at session start, then caches it. If I edit the file, the agent doesn't see the change until I restart the gateway. I've been bitten by this enough times that I have a feedback memory called feedback_system_md_restart.

My port is a 100-line Python CLI:

$ openclaw recall "voltade working hours"
β†’ .claude/projects/-Users-yash/memory/project_voltade_working_hours.md:7:
  Voltade working hours: Monday to Friday, 10:00 AM to 6:00 PM SGT.

It greps across every workspace MEMORY.md, agent system.md, and my user-level memory directory. Case-insensitive substring match with surrounding context. Has a --json mode for when an agent calls it as a tool (instead of restarting the gateway, the agent just calls recall and gets fresh data).
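The core is a file walk plus a case-insensitive substring match. A sketch of the --json path, with the search roots and names as stand-ins for my actual layout:

import json
import sys
from pathlib import Path

ROOTS = [Path.home() / ".openclaw"]  # illustrative; the real tool adds each workspace

def recall(query, roots, context=1):
    q = query.lower()
    hits = []
    for root in roots:
        # Catches workspace MEMORY.md, agent system.md, and memory-dir files.
        for path in Path(root).rglob("*.md"):
            lines = path.read_text(errors="replace").splitlines()
            for i, line in enumerate(lines):
                if q in line.lower():
                    hits.append({
                        "file": str(path),
                        "line": i + 1,
                        "context": lines[max(0, i - context):i + 1 + context],
                    })
    return hits

def main():
    hits = recall(sys.argv[1], ROOTS)
    print(json.dumps(hits, indent=2))  # --json mode; human mode formats path:line
    sys.exit(0 if hits else 1)         # non-zero on no hits, so agents can branch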

Eleven unit tests cover edge cases like the workspace filter, the user memory dir being included, and JSON mode exiting non-zero on no hits. The whole thing took 20 minutes.

#5. Compiled truth + timeline

This is the design move that took the longest to get right. In gbrain, every knowledge page has two sections separated by a divider: above is the current best understanding (rewritable, kept tight), below is an append-only event log.

My workspace-clawrence/MEMORY.md had grown to 211 lines of mixed-mode content. Rules and incidents and dates and protocols, all interleaved. Editing a rule meant reading the whole file to make sure I didn't accidentally contradict a note from three months ago.

I restructured it. Top half is now Clawrence's current state: who he is, what he does, what his guardrails are, what crons he owns. Bottom half (after the --- divider) is a dated timeline of incidents that explain why the rules above exist. The 2026-05-12 incident where someone wrote "lawrence will handle X" and Clawrence misread it as "clawrence" and auto-acked? That story now lives in the timeline. The rule it produced ("never assume a handoff to another agent based on chat text") lives at the top.
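The shape, roughly (headings and wording are illustrative, not the real file):

# Clawrence

## Identity and scope
(who he is, what he does; rewritable, kept tight)

## Guardrails
- Never assume a handoff to another agent based on chat text.

## Crons owned
(...)

---

## Timeline (append-only)
2026-05-12: "lawrence will handle X" was misread as a handoff to
Clawrence, who auto-acked. Produced the handoff guardrail above.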

Two wins from the restructure:

  1. File shrank from 19.6KB to 15.6KB. Twenty percent leaner with no information loss. That's headroom against the bootstrap cap.
  2. The "why" stays available. I can edit a rule without losing the incident that justified it. Future-me reading the timeline can decide whether the rule still applies or whether the underlying reason has changed.

There was a tax. After I wrote the new file, I ran the recall tool against it and noticed an em-dash on line 18. Then realised the file is full of em-dashes. Then realised Clawrence's own voice rule, sitting on line 40 of the same file, says "NEVER use em-dashes".

I'd violated the agent's own rule in its own memory. I ran a sed script that preserved the rule line itself and replaced the other 26 em-dashes with commas. Funny mistake. Exactly the kind of thing the daily doctor cron is supposed to catch eventually.

#What I didn't port

A few gbrain patterns I deliberately skipped:

  • Vector + RRF hybrid search. Overkill for my data volume. My memory files fit in one ripgrep call.
  • Typed knowledge graph. Customer ↔ deal ↔ invoice already lives in Notion and Xero. Building a parallel graph competes with the source of truth.
  • Dream cycle (overnight LLM synthesis). Tempting, but I've made a deliberate choice for OpenClaw to run deterministic over LLM-as-judge. A nightly LLM consolidation is the opposite of that.
  • Voice ingestion. WhatsApp is the channel. Phone is out of scope.

#What changed

After 5 hours of work:

| Before | After |
| --- | --- |
| Cron failure = silent drop | Cron failure = enqueued, retried, DM'd on final-fail |
| Bootstrap silent truncation only detected by missing behavior | Daily 9:15am cron warns me before it happens |
| MEMORY.md edits required daemon restart to take effect | Agents can recall mid-conversation |
| MEMORY.md was a 20KB pile | MEMORY.md is 15KB compiled state + a 4KB event log |
| du -sh ~/openclaw = 60MB | du -sh ~/openclaw = 32MB |
| 1691 tests | 1702 tests (queue + doctor + recall added) |

Plus one new feedback memory: handlers that hit external systems need a DRY_RUN flag before any smoke test goes near them.

The agents themselves didn't change. They still answer the same way, run the same crons, talk to the same customers. What changed is the floor underneath them got harder. The next time a cron silently drops, a memory edit doesn't take, or a script gets renamed without its callers being updated, I'll find out about it from a DM. Not from a customer.

If you're running your own OpenClaw setup, gbrain is worth a read even if you don't use any of it directly. Garry's solved a lot of the harness problems that are easy to dismiss as "engineering chores" until they bite you in production.

The full diff of my changes is in yash-gadodia/openclaw (private, but happy to share if you ping me).