Make AI agents ship code you’d actually merge.
They reimplement primitives, nest ternaries past readability, and mark screens “done” with half the tabs throwing not implemented. These are the rules, gates, and prompts, earned over a year of agent-driven production, that stop it.
Part of the AgentsKit ecosystem: build with the framework, grab ready-made agents from the Registry, ship by the Playbook, and run them in production on AKOS.
# CLAUDE.md

## Non-negotiables

1. No `any`. Schema parse every boundary.
2. Named exports only.
3. Typed AppError + stable codes.
4. createLogger(tag); no console.log.
5. ADR before architecture. RFC before breaking contracts.
6. Ship complete or don't ship.
7. Merges sum work, never subtract.
8. UI via shared primitives only.
9. Every visible string is intl.
10. Tokens; no raw color literals.

## Before you ship

```
pnpm check:quality-gates
pnpm check:all
```
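Rules 1 and 3 are easier to picture with a concrete boundary. The sketch below is one possible reading of them, not the playbook's own implementation: `ErrorCode`, `CreateUserInput`, and `parseCreateUserInput` are hypothetical names, and the hand-written checks stand in for whatever schema library a project actually uses.

```typescript
// Hypothetical stable error codes; the playbook's real code list is not shown here.
type ErrorCode = "INVALID_INPUT" | "NOT_FOUND";

// A typed error that carries a stable, machine-readable code.
class AppError extends Error {
  readonly code: ErrorCode;

  constructor(code: ErrorCode, message: string) {
    super(message);
    this.name = "AppError";
    this.code = code;
  }
}

// Hypothetical payload shape, used only for illustration.
interface CreateUserInput {
  email: string;
  age: number;
}

// Narrow an `unknown` boundary value into a typed object, or throw a typed error.
// Nothing here is typed as `any`.
function parseCreateUserInput(raw: unknown): CreateUserInput {
  if (typeof raw !== "object" || raw === null) {
    throw new AppError("INVALID_INPUT", "expected an object");
  }
  const { email, age } = raw as Record<string, unknown>;
  if (typeof email !== "string" || !email.includes("@")) {
    throw new AppError("INVALID_INPUT", "email must be a string containing @");
  }
  if (typeof age !== "number" || !Number.isInteger(age) || age < 0) {
    throw new AppError("INVALID_INPUT", "age must be a non-negative integer");
  }
  return { email, age };
}
```

In a real module these would be named exports (rule 2). Callers can branch on `error.code` rather than matching message text, which is what makes the codes worth keeping stable.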
Whatever agent you run, it ships through the same playbook.
Drops in as the bootstrap doc for any coding agent — rules, gates, and prompts in, reviewable and shippable code out.
Sized files, named exports, no nested-ternary soup — diffs a human can actually read, understand, and approve without rubber-stamping.
RBAC, vault, signed audit ledger, and egress allowlists on by default — security designed into the contract, not bolted on after a leak.
Tests in the diff, gates green, complete-or-it-doesn't-ship — no half-built screens marked done, no stale branches grinding against main.
You’ve shipped — or caught — every one of these.
Each rule in the playbook is the fix for a specific, reproducible way agents break production code. Not theory — failure modes paid for in real repos.
Reinvented primitives
Agent rewrites a helper that already exists upstream — instead of depending on it.
Unreviewable diffs
Ternaries nested until no human can read the file, and the PR rubber-stamps through.
Deleted peer work
A merge resolved with checkout --theirs silently wipes another author's code.
Fake “done”
Screen marked shipped while half its tabs still throw not implemented.
Session amnesia
Agent repeats a mistake you already fixed last week — context lost between runs.
Stale-branch grind
Agent keeps building on a branch while main has already gone red.
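To make the "unreviewable diffs" failure mode concrete, here is a minimal sketch of the same logic written as a nested ternary and as a lookup table. The `Status` type and both function names are hypothetical, chosen only for illustration.

```typescript
// Hypothetical status type, used only for illustration.
type Status = "draft" | "review" | "merged";

// Nested form, shown for contrast: every added status deepens the chain.
function labelNested(status: Status): string {
  return status === "draft" ? "Draft" : status === "review" ? "In review" : "Merged";
}

// Flat form: one entry per status, and the compiler flags any missing key.
const STATUS_LABELS: Record<Status, string> = {
  draft: "Draft",
  review: "In review",
  merged: "Merged",
};

function labelFor(status: Status): string {
  return STATUS_LABELS[status];
}
```

Both functions return the same labels; the second keeps the diff to a single line when a status is added.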
Built for the kind of code agents actually ship.
Production-earned
Every rule traces to a reproducible failure mode. No theory; only patterns paid for in shipped repos.
Dual-mode docs
Each page has a Human TL;DR and a For-Agents section. Optimised for both linear reading and RAG retrieval.
Gates included
13 reference gate scripts (Node 22, zero deps) you can drop into any repo. Pure copy-paste.
Drop-in templates
ADR, RFC, PR-intent, CLAUDE.md, AGENTS.md, MEMORY.md — ready for any project.
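The reference gate scripts themselves are not reproduced on this page. As a rough idea of what a zero-dependency gate could look like, the sketch below scans source text line by line for a few patterns drawn from the rules above. The rule names, the regular expressions, and `findViolations` are all assumptions made for illustration, not the playbook's actual gates.

```typescript
interface Violation {
  file: string;
  line: number; // 1-based line number
  rule: string;
}

// Hypothetical rule set; the patterns are illustrative and deliberately simple.
const RULES: ReadonlyArray<{ rule: string; pattern: RegExp }> = [
  { rule: "no-explicit-any", pattern: /:\s*any\b/ },
  { rule: "no-console-log", pattern: /\bconsole\.log\s*\(/ },
  { rule: "no-not-implemented", pattern: /not implemented/i },
];

// Check every line of one file's source against every rule.
function findViolations(file: string, source: string): Violation[] {
  const violations: Violation[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { rule, pattern } of RULES) {
      if (pattern.test(text)) {
        violations.push({ file, line: index + 1, rule });
      }
    }
  });
  return violations;
}
```

A real gate would read files from disk, print each violation, and exit non-zero when the list is non-empty so CI fails the PR. Line-based regex matching will produce false positives (for example inside comments), so a production gate would likely parse the code instead.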
Six pillars. Six SDLC phases. One matrix.
Each pillar carries a universal (stack-agnostic) layer plus concrete recipes for a TypeScript stack. Together they form a checklist for any agent-augmented team.
Architecture
Modular boundaries, typed contracts, ADRs / RFCs, file-size budgets, anti-overengineering, distributed data, event streaming, multi-region.
Security
RBAC, vault, audit ledger, egress allowlist, vulnerability mgmt, secrets deep, multi-tenant isolation, compliance (SOC 2 / GDPR), AI/LLM safety.
UI / UX
Design tokens, primitives catalog, intl + ICU, empty states, a11y (WCAG-AA deep), motion + reduced-motion, whitelabel, design-system governance.
Quality
Test pyramid, gates, sanity, mutation, observability + SLOs, performance budgets, chaos engineering, CI/CD pipeline, FinOps, contract testing.
Governance
PR intent manifest, merge rules, tombstones, phased PRs — keeps multi-author work additive instead of subtractive.
AI Collaboration
Bootstrap docs (CLAUDE.md / AGENTS.md), persistent memory, sub-agent recipes, slash commands, concurrent-agent survival.
Built so your agents can find what they need.
1. /llms.txt at root: the convention for an LLM-readable site map.
2. Each doc serves a raw .md alongside the HTML, so agents fetch the source.
3. Structured data (JSON-LD) and OpenGraph on every page.
4. Single-file llms-full.txt bundles every doc for one-shot RAG indexing.
5. Zip bundle (/playbook-bundle.zip) for download + local indexing.
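An agent or script can use the raw endpoints listed above directly. The sketch below assumes the `/raw/<path>.md` layout seen in the playbook's own URLs and a runtime with a global `fetch` (Node 18 or later); `rawDocUrl` and `fetchRawDoc` are hypothetical helper names.

```typescript
// Base URL taken from the playbook's own links; helper names are hypothetical.
const PLAYBOOK_BASE = "https://playbook.agentskit.io";

// Map a doc path such as "pillars/security/rbac-pattern" to its raw Markdown URL.
// Leading/trailing slashes and an existing ".md" suffix are tolerated.
function rawDocUrl(docPath: string): string {
  const trimmed = docPath.replace(/^\/+|\/+$/g, "").replace(/\.md$/, "");
  return `${PLAYBOOK_BASE}/raw/${trimmed}.md`;
}

// Fetch one doc as text, failing loudly on a non-2xx response.
async function fetchRawDoc(docPath: string): Promise<string> {
  const response = await fetch(rawDocUrl(docPath));
  if (!response.ok) {
    throw new Error(`fetch failed: ${response.status} for ${docPath}`);
  }
  return response.text();
}
```

For example, `await fetchRawDoc("pillars/security/rbac-pattern")` would request the same document as the first curl command.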
curl playbook.agentskit.io/raw/pillars/security/rbac-pattern.md
curl playbook.agentskit.io/llms-full.txt
curl -O playbook.agentskit.io/playbook-bundle.zip

Train your agent on the whole playbook.
Paste this into Claude Code, Cursor, Windsurf, Codex — any agent. It pulls the entire playbook, audits your repo against it, and proposes a prioritized adoption plan before touching a line of code.
You are onboarding to a shared engineering playbook for shipping production software with AI coding agents.

1. Fetch and read the full playbook bundle: https://playbook.agentskit.io/llms-full.txt
   (Site map: https://playbook.agentskit.io/llms.txt. Fetch individual docs from the /raw/ paths if you can't load the bundle at once.)
2. Then audit THIS repository against it:
   - Which playbook practices already hold here?
   - Which are missing or violated, ranked by risk (security > correctness > quality > governance > DX)?
   - Which are not applicable to this stack, and why?
3. Propose a short, prioritized adoption plan: the 5 highest-leverage changes for this repo, each with the playbook doc it comes from and a concrete first step.
4. Draft (or update) the repo's bootstrap doc (CLAUDE.md, AGENTS.md, .cursor/rules, .windsurfrules, or .github/copilot-instructions.md, as appropriate for the agent in use), using the playbook's template as the starting point.

Do not change code yet. Output the audit and the plan first, then wait for my go-ahead.
One workflow. Four parts that fit together.
AgentsKit builds it, the Registry gives you a head start, this Playbook keeps it shippable, and AKOS runs it in production. Same standards end to end.
AgentsKit
Build the agent, skip the plumbing. Chat UI, runtime, tools, memory, and RAG in one JavaScript toolkit.
Registry
The shadcn for agents. Copy production-ready agents straight into your project — no boilerplate.
Agents Playbook
You're here. The engineering discipline that keeps agent-built code reviewable, safe, and shippable.
AgentsKit OS
Orchestrate and govern agents in production — identity, audit, permissions, and cost control.
Start with the eight non-negotiables.
The kernel of the playbook. If an agent breaks one, fail the PR. Everything else flows from these.
Free and open (CC-BY-4.0). If the playbook saved you a code review, drop a star ↗ — it helps other teams find it.
Want the platform these practices run on? Explore AgentsKit ↗