I'm Yash.
One prompt should be enough.
I ship AI products where success is measured in agent reliability, not feature count. Founding PM at Voltade, where I drive 0-to-1 agent platforms (Studio, Envoy) and the evaluation frameworks behind them. Five years of engineering before that.
Featured Projects
No-code AI agent builder. Non-technical teams describe what they need in plain English and get working agents deployed across WhatsApp, Telegram, and Web.
Conversational CRM for SMEs. WhatsApp-first inbox where a per-tenant agent triages, drafts, and replies; humans approve. 100+ active SME deployments, 230K+ AI interactions/day.
App framework for AI coding agents. Bun + TypeScript + Drizzle with auth, Postgres, jobs, and an agent runtime baked in, so Claude Code gets working code on the first try.
Make any repo AI-native in one command. Detects stack, scaffolds CLAUDE.md + agents + hooks + skills tailored to the codebase.
Start here
Most agent evals measure the wrong things. After running two agents in production for six months, here's the framework I actually use, with real metrics, LLM-as-judge calibration data, and the $300 lesson that started it all.
Notes on when to use Haiku, when Opus is worth it, and why model choice is a product decision wearing engineering clothes.