Blog | Yash Gadodia

Pinned · June 19, 2026 · 6 min read

Model safety ends where the customer's WhatsApp begins

Model-layer safety is the lab's job. Deployment-layer safety is mine: pointing a safe model at a real customer's WhatsApp without getting burned.

Pinned · June 12, 2026 · 5 min read

The system behind 230K interactions a day

The two numbers on my CV, unpacked: what counts as an interaction, which layer the 99.65% is measured at, and the eval, guardrail, and cost subsystems that keep both honest.

Pinned · June 10, 2026 · 4 min read

Creating a tenant is an INSERT

Voltade's third swing at the same problem. Studio asked SMEs to build, Vobase had us building for them, and Volty bets they'll configure. What changed each time, and why.

Pinned · May 16, 2026 · 11 min read

How I keep production agents on the rails

Production agent safety is not about jailbreaks. It is about an agent confidently doing the wrong thing on a customer's WhatsApp for six hours before anyone notices. Here are the failure modes that actually happen, the guardrails that work, and how I prove the guardrails are working.

Pinned · March 15, 2026 · 12 min read

How I Evaluate AI Agents (and Why Most Teams Get It Wrong)

Most agent evals measure the wrong things. After running two agents in production for six months, here's the framework I actually use, with real metrics, LLM-as-judge calibration data, and the $300 lesson that started it all.

July 14, 2026 · 3 min read

Selling Trust

Three things I've learnt closing enterprise AI deals at Voltade: you're selling trust, likability means being yourself, and there comes a moment when the customer starts selling to you.

July 7, 2026 · 5 min read

Handing a build to the Mac Mini over SSH and tmux

I asked the Claude session on my MacBook to hand an app build to my Mac Mini. It set up the machine over SSH, launched a second Claude in tmux, and the two now coordinate through git.

June 28, 2026 · 6 min read

The Trust Budget

Autonomy isn't a property of the agent. It's a budget you allocate per action, priced by how reversible the action is and how big the blast radius gets. Here's the framework I use across every agent I run.

June 22, 2026 · 5 min read

How an agent earns the right to change itself

An agent that can rewrite its own behaviour can rewrite it wrongly. This is the machinery that decides which of its proposed changes land on their own, and which have to wait for a human.

June 16, 2026 · 8 min read

Agents that file their own bug reports

An 8am cron reads 24 hours of my AI agents' logs through four lenses and DMs me a prioritised list of what to fix. Plus the open-source self-healing harness underneath it.

June 12, 2026 · 4 min read

The 24-hour window versus the approval queue

Volty agents draft, humans approve. WhatsApp gives you 24 hours to reply. What happens when the approval is slower than the window, and the keep-alive sweep we built for the collision.

June 1, 2026 · 4 min read

Why Studio didn't work

We built a no-code agent builder so SMEs could build their own agents. Almost nobody did. Why Studio failed, and what we built instead.

May 24, 2026 · 4 min read

How I pick what to ship between Envoy and Studio

The mechanic I actually use to prioritise across two 0-to-1 AI products as the only PM at Voltade. The Studio pivot, the Thursday rule, and the call I got wrong.

May 20, 2026 · 5 min read

What an applied AI lab in Singapore should build first

Eighteen months building applied AI for SEA SMEs at Voltade. Here's what I'd build first if I were starting Singapore's new applied AI lab.

May 19, 2026 · 10 min read

What I stole from Hermes (and ported to OpenClaw in an afternoon)

Nous Research open-sourced Hermes Agent, a harness in the same shape as OpenClaw. I read it for an afternoon and ported four patterns: a usage tracker, trajectory compression, a stdio MCP server, and an error classifier in the claude-cli shim. Two more I deferred. Two I rejected outright.

May 17, 2026 · 8 min read

What I stole from gbrain (and ported to OpenClaw in one afternoon)

Garry Tan's gbrain has a few design moves that solve real OpenClaw pain. I ported five of them: a durable job queue, a config doctor, storage tiering, a hot-memory recall CLI, and a compiled-truth memory structure. Here's what worked, what didn't, and what almost leaked to a customer channel.

May 15, 2026 · 13 min read

How Vobase agents learn (and why SME owners notice)

A walk through the self-learning loop inside Vobase: wake events, staff signals, change proposals, applied skills. The technical mechanism that turns adaptive software from a slide into a thing SME owners actually trust.

May 15, 2026 · 12 min read

Seven agents on a Mac Mini: four months of breaking my OpenClaw harness

I run seven personal AI agents on a Mac Mini in my flat. They book my gym classes, manage my inbox, handle outreach for two academies, and watch Voltade customer groups. The harness took four months to stabilise. The two things that fixed it were scope discipline and deterministic flows. Here is what each agent does, what broke, and the patterns that finally stuck.

May 11, 2026 · 10 min read

OpenClaw, six months later: what production actually demanded

I did a full stability audit of my OpenClaw stack today. Here's what changed once it stopped being a personal toy and started serving real customers and a real team.

May 10, 2026 · 7 min read

An agent is staff, not magic

Notes on writing a model behaviour spec for an agent that works for someone. What it does, what it refuses, who it serves when those conflict, and why most system prompts are a lazy substitute for a job description.

May 10, 2026 · 6 min read

Cost is a product feature

Notes on when to use Haiku, when Opus is worth it, and why model choice is a product decision wearing engineering clothes.

May 9, 2026 · 4 min read

I had Claude audit my Claude Code use

82 sessions, 12 weeks, 4,888 Bash calls. I pointed an audit tool at my own Claude Code transcripts. Three things came back that I didn't want to hear.

April 24, 2026 · 7 min read

The 20% That Is the Business

Templates handle the common 80%. The remaining 20% is every customer's actual business. Here's what that 20% looked like for one bakery, one day, five bugs.

April 20, 2026 · 6 min read

Adaptive Software

Fixed SaaS loses to models. Blank-canvas AI builders lose to the cold-start problem. What wins is 80%-ready software that the business shapes by talking to it.

March 17, 2026 · 11 min read

claude-init: Make Any Repo AI-Native in One Command

Every time you open Claude Code on a new repo, you start from zero. claude-init fixes that by analysing your codebase and generating a complete .claude/ configuration.

March 12, 2026 · 4 min read

Teaching Claude How to Write Like Me

I built a Claude Code skill that writes blog posts in my voice from any session. The hard part wasn't the workflow, it was the voice.

March 11, 2026 · 8 min read

Envoy CRM: What I Learnt Getting 20 SMEs to Actually Use a CRM

How we built Envoy from zero to 20+ paying SME customers in Singapore. Dogfooding, grant-driven GTM, killing features, and the product decisions behind it all.

March 11, 2026 · 14 min read

How This Site Was Built With Claude Code

This entire website was built, redesigned, and maintained by Claude Code. Here's every prompt, tool, and workflow behind it.

February 22, 2026 · 4 min read

WIMAUT: Because Your Agents Won't Tell You They're Burning $300

I built an agent observability dashboard after an OpenClaw cron job silently burned $300. Here's the problem and what WIMAUT does.

January 18, 2026 · 4 min read

Self-Hosting an LLM on a Mac Mini

How I set up a local LLM on a base-model Mac Mini, put it behind a public URL with Cloudflare Tunnel, and locked it down with Cloudflare Access. No port forwarding required.

December 8, 2025 · 8 min read

Clawrence and Claudia: Building AI Agents That Actually Do Things

I built two OpenClaw agents for Voltade. One manages our customers. The other runs our internal ops. Here's what I learnt about deployment, models, and what agents are actually good for.

October 15, 2025 · 5 min read

The Death of SaaS, and What Comes After

Software's marginal cost is approaching zero. SaaS as a business model is dying. The winners will be AI-native services firms, not tool makers.