
For the last few days I have been building a project I am not going to describe.
That is the point of this post, actually. The what is mine. The how is the part worth sharing, because somewhere in the middle of a long week I stopped writing code and started supervising a team that writes it for me — and the whole team runs on machines sitting in my house.
No cloud dev environment. No SaaS issue tracker. No external CI service holding my source. A self-hosted git platform, a couple of AI subscriptions I already pay for, a mac mini that never sleeps, and a GPU box down the hall. That is the entire shop.
The shape of the thing
Here is the loop, stripped of the project it happens to be building.
I open an issue. I describe what I want in a sentence or two, the way you would drop a task on a teammate.
A planning agent wakes up, reads the issue, and goes spelunking through the codebase to figure out what the change actually requires. It writes back a scoped plan — which files, which steps, the risks, the open questions. Sometimes it asks me something genuinely good before it commits to an approach.
I read the plan. If I like it, I add a label. That label is my approval, and it is the moment the work begins.
A coding agent picks up the approved plan, makes the change on its own branch, and opens a pull request. Then — and this is the part I keep thinking about — a reviewer from a different AI family reads the diff and critiques it. Not the same model that wrote the code. A different lineage entirely. It catches the things a same-model review would wave through, because it does not share the author’s blind spots.
A testing agent runs the build and the type checks and reports pass or fail. And then the whole thing lands back on my desk with one decision left: merge, or send it back.
Plan, code, review, test. A team. Each step shows up in the project’s activity feed under its own name, so the history reads like a group of colleagues did the work. Which, in a way, they did.

Why it has to be local
I could have rented all of this. There are services that will host your repository, run your pipelines, and rent you the models. They are good. I use plenty of them elsewhere.
But this project is mine in a way that made me want the whole loop on hardware I can put my hand on. The source lives on a drive in my house. The agents authenticate against subscriptions in my name. When the GPU is grinding overnight, it is my GPU, drawing power from my wall, and the work it produces does not pass through anyone else’s logs on the way to the database.
There is a quieter reason too. When you own the whole stack, you stop asking permission. I changed a trigger at one in the morning. I rebuilt an index that was making search slow. I gave one of the agents a different model because I felt like it. None of that required a billing conversation or a rate limit or a support ticket. The friction of building dropped to almost nothing, and the only thing standing between an idea and a shipped change was my own judgment.
That turns out to be the feeling I was chasing.

The two ideas that made it click
Most of the build was plumbing — the unglamorous work of getting tokens and tunnels and triggers to behave. But two ideas did most of the heavy lifting, and they are the ones I would steal if I were you.
Make the AIs check each other. The single best decision was having one model family write the code and a different one review it. The first time the reviewer ran, it found a real bug and a bit of scope creep that the author had quietly introduced — exactly the kind of thing I would have missed at midnight. Two models from different houses disagree in useful ways. Pointing that disagreement at your own pull requests is nearly free and absurdly valuable.
Give the team a memory it maintains. The agents do not re-learn the system from raw code every time. They read a living wiki — the architecture, the conventions, the rules that must never be broken. And when the reviewer catches a recurring mistake, that lesson gets written into the wiki, so the next coder avoids it and the next reviewer enforces it. The system’s mistakes become its institutional knowledge. It gets a little smarter every time it gets something wrong.
That second one is the part that feels different from “AI helps me code.” It is closer to running a team that learns.
What it actually feels like
I expected to feel like I had more help. What I did not expect was the shift in role.
I am not the person who writes the change anymore. I am the person who decides what is worth changing, and the person who says yes. The agents are faster than me and more careful than me at midnight, and they never get bored porting the tedious thing. But they do not know what matters. That is still mine, and it turns out that is the part I actually wanted to keep.
The other thing I keep noticing: the work continues without me. I closed my laptop for the weekend and left the GPU embedding a few million records and rebuilding an index, on its own schedule, on its own hardware. Monday it will simply be done. There is something deeply satisfying about a workshop that keeps working after you turn off the lights.
I am not going to tell you what we are building. But I will tell you that for the first time, “we” is not a figure of speech.