Agentic Coding in a Production Monorepo: The Harness Matters More Than the Model
How I actually use coding agents like Claude Code and Codex day to day - and why guardrails, not prompts, are what make them safe in a real codebase.
This portfolio's repository has more coding-agent configurations in it than human contributors. That's not an accident - it's the result of a year of treating agentic coding tools as part of the toolchain instead of a novelty, and figuring out what it actually takes to let them touch a codebase you're responsible for.
Here's where I've landed.
The Mental Model: A Junior Engineer with Infinite Stamina
A coding agent is not a search engine and it's not an autocomplete. The mental model that works for me: it's a junior engineer with unlimited energy, a decent memory of every API ever written, and zero institutional knowledge of your system.
You wouldn't let a new hire push to main on day one. You also wouldn't waste them on nothing but boilerplate. The whole game is building the environment where their output is verifiable.
The Harness Matters More Than the Model
The single biggest lesson: the quality of agent output tracks the quality of your guardrails, not your prompts.
In a production monorepo - mine is TypeScript end to end, with tRPC routers, Drizzle schemas, and a lot of React - the agent operates inside the same fences the humans do:
- Type-checking is the first line of defense. A strict tsconfig catches a huge class of agent mistakes before a human ever reads the diff. End-to-end type safety through tRPC means an agent can't quietly change an API contract without the compiler complaining somewhere.
- Dangerous operations stay gated. Database migrations are the classic example - schema changes need senior review no matter who (or what) wrote them. CI enforces this; the agent doesn't get a special lane.
- PR pre-flight checks run on everything. Lint, tests, build, dependency checks. Agent-authored code goes through the exact same pipeline as human-authored code. No exceptions, because exceptions are where incidents come from.
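As a rough sketch of what "CI enforces this" could look like for migrations: the function below flags changed files that fall under migration paths and fails the check unless a senior approval is recorded. The path patterns, the approvedBySenior flag, and the function name are assumptions for illustration, not this repo's actual setup. In a real pipeline both inputs would come from the CI provider.

```typescript
// Hypothetical path patterns; adjust to wherever your migrations actually live.
const GATED_PATTERNS: RegExp[] = [
  /(^|\/)drizzle\/.*\.sql$/, // assumed location of generated SQL migrations
  /(^|\/)migrations\//,      // assumed generic migrations directory
];

interface GateResult {
  allowed: boolean;      // whether the PR may proceed
  gatedFiles: string[];  // changed files that triggered the gate
}

// Decide whether a change set may merge. Note that authorship is not an input:
// agent-written and human-written changes are judged the same way.
function checkMigrationGate(
  changedFiles: string[],
  approvedBySenior: boolean,
): GateResult {
  const gatedFiles = changedFiles.filter((file) =>
    GATED_PATTERNS.some((pattern) => pattern.test(file)),
  );
  const allowed = gatedFiles.length === 0 || approvedBySenior;
  return { allowed, gatedFiles };
}
```

The point of the design is what the signature leaves out: there is no "author" parameter, so there is no way to give the agent a special lane.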
If your repo already has good bones - types, tests, CI - agents slot in shockingly well. If it doesn't, agents will happily generate plausible-looking code that fails in ways you won't notice until production. The agent amplifies whatever engineering culture already exists.
What I Actually Delegate
Where agents earn their keep for me:
- Mechanical refactors with a verifiable endpoint - renames across a large surface, migrating a pattern, converting components to a new API. The type-checker proves completion.
- First drafts of well-specified features - when I can describe the behavior and point at an existing pattern in the codebase to imitate.
- Test scaffolding - agents are tireless about edge cases humans get bored writing.
- Codebase archaeology - "where is this actually used, and what breaks if I change it?" is something an agent answers faster than grep-and-guess.
Where I don't: anything architecturally load-bearing, anything security-sensitive, and anything where the requirements live in my head instead of in writing. Writing the requirement down first isn't overhead - it's the same discipline that makes human delegation work.
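The "type-checker proves completion" idea from the refactor bullet can be sketched with an exhaustive switch. This is a minimal illustration with hypothetical types (DeliveryTarget, describeTarget), not code from this repo: if a new variant is added to the union, every switch that forgets to handle it stops compiling, so a green type-check means the migration reached every call site.

```typescript
// Hypothetical union for illustration only.
type DeliveryTarget =
  | { kind: "email"; address: string }
  | { kind: "webhook"; url: string };

// Accepts only `never`, so calling it with an unhandled variant is a compile error.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

function describeTarget(target: DeliveryTarget): string {
  switch (target.kind) {
    case "email":
      return `email to ${target.address}`;
    case "webhook":
      return `POST ${target.url}`;
    default:
      // If a third variant is added to DeliveryTarget and not handled above,
      // `target` is no longer `never` here and the type-checker rejects this line.
      return assertNever(target);
  }
}
```

Handing an agent "add a variant and fix every compile error" is a task with a verifiable endpoint; "make notifications better" is not.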
Context Is the Real Bottleneck
The frustrating failures are rarely capability - they're context. The agent didn't know about the internal convention, the deprecated module, the reason that weird workaround exists.
Two things help:
- A CLAUDE.md / agent-instructions file in the repo. Conventions, commands, footguns. Every agent that lands in the repo reads it first. It's onboarding documentation, and just like with humans, writing it once pays off every session after.
- MCP (Model Context Protocol) servers for the systems the agent can't see - issue trackers, databases, deployment platforms. Instead of pasting context into a prompt, the agent queries it. This is the piece I expect to matter most over the next few years: the tools are converging, but the context plumbing is where teams differentiate.
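To make "the agent queries it" concrete, here is the shape of the idea in plain TypeScript. This deliberately does not use the MCP SDK; it only mimics the general pattern of a named, described tool with a handler. The ContextTool interface, the get_issue tool, and the in-memory issue data are all hypothetical stand-ins for a real tracker.

```typescript
// Hypothetical issue record and in-memory store standing in for a real tracker.
interface Issue {
  id: string;
  title: string;
  status: "open" | "closed";
}

const ISSUES = new Map<string, Issue>([
  ["ENG-101", { id: "ENG-101", title: "Rename legacy billing module", status: "open" }],
  ["ENG-102", { id: "ENG-102", title: "Remove deprecated auth helper", status: "closed" }],
]);

// A tool is a name, a description the agent can read, and a handler it can call.
interface ContextTool<In, Out> {
  name: string;
  description: string;
  handler: (input: In) => Out;
}

// Illustrative tool: the agent asks for one issue by id instead of having
// the whole tracker pasted into its prompt.
const getIssue: ContextTool<{ id: string }, Issue | { error: string }> = {
  name: "get_issue",
  description: "Look up a single issue by its id",
  handler: ({ id }) => ISSUES.get(id) ?? { error: `No issue with id ${id}` },
};
```

A real MCP server adds transport, schemas, and auth on top, but the trade is the same: the agent pulls the specific context it needs at the moment it needs it.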
The Automation Layer Around the Code
Not everything is code generation. The other half of "agentic" for me is workflow automation - and running n8n self-hosted (it fits naturally into a homelab that already runs Coolify) covers the glue: notifications routed by rules, scheduled jobs, webhooks between services that would otherwise need a cron job and a prayer.
The dividing line I use: if it needs judgment, it goes through an agent I review. If it's deterministic, it goes in n8n and I stop thinking about it.
Honest Verdict
Agentic coding hasn't replaced anything I'd call engineering. It has replaced a lot of what I'd call typing.
The engineers getting the most out of these tools aren't the ones writing the cleverest prompts - they're the ones with the cleanest repos: typed, tested, documented, CI-gated. That was always good engineering. It just pays double now.