I Let a Client Vibe Code Our Production App. Here's What Happened.
It started as a pleasant surprise. Our client, Marcus, a logistics company founder with zero programming background, sent a pull request. A real one. He had used an AI coding assistant to vibe code a shipment tracking dashboard into the app we were building for him. The feature worked. It looked decent. He was thrilled. "I built this in two hours," he said on our next call. "Do you know how long I've been waiting for this?"
I did know. It had been sitting in our backlog for three sprints. We had been focused on the data pipeline and API layer, the invisible but critical infrastructure his business needed. Marcus did not care about infrastructure. He cared about the dashboard. And now he had it.
So I said something I would come to regret: "That's great, Marcus. Keep going."
The Honeymoon
For two weeks, Marcus was the most productive member of the team. He shipped a notification system. He added CSV export to three different reports. He built an internal admin panel we hadn't even scoped yet. Every morning I would wake up to a new pull request with a cheerful message: "Added the thing we talked about last month."
The features worked. They did what they were supposed to do. My team was impressed, if a little uneasy. Our junior developer said it best: "It feels like someone moved into our house while we were sleeping and rearranged the furniture. Everything is technically in a room, but nothing is where it should be."
The Cracks
The problems did not arrive all at once. They accumulated.
First it was performance. Pages that loaded in 200 milliseconds started taking four seconds. Marcus's dashboard was making seventeen separate API calls on mount, several of them redundant, each fetching entire tables when it needed a handful of fields. He did not know about our GraphQL layer. The AI had generated REST calls to endpoints that existed but were never meant to be used that way.
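To make the failure mode concrete, here is a minimal sketch of the kind of fix involved, with a stubbed backend. The endpoint and field names are invented for illustration, and our real app does this through its GraphQL layer rather than a hand-rolled cache; the point is only that many widgets can share one field-limited request instead of each re-fetching whole tables.

```python
CALLS = {"count": 0}

def api_get(endpoint: str, fields: tuple[str, ...]) -> list[dict]:
    """Stand-in for a real HTTP client; counts backend hits."""
    CALLS["count"] += 1
    table = [{"id": 1, "status": "in_transit", "carrier": "ACME", "weight": 12.0}]
    # Return only the requested columns, not the whole row.
    return [{k: row[k] for k in fields} for row in table]

_cache: dict[tuple, list] = {}

def fetch_once(endpoint: str, fields: tuple[str, ...]) -> list[dict]:
    """Coalesce identical requests so N widgets cost one backend call."""
    key = (endpoint, fields)
    if key not in _cache:
        _cache[key] = api_get(endpoint, fields)
    return _cache[key]

# Seventeen widgets asking for the same data now hit the backend once,
# and pull only the two fields they actually render.
for _ in range(17):
    rows = fetch_once("/shipments", ("id", "status"))

print(CALLS["count"])  # 1
print(rows)            # [{'id': 1, 'status': 'in_transit'}]
```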
Then it was the duplication. Marcus needed date formatting in several places, so the AI gave him date formatting in several places -- six different utility functions spread across nine files, each with slightly different behavior. One of them had a timezone bug that only surfaced on the last day of the month. We found it in production.
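The bug itself is a classic. Here are two of the six helpers, reconstructed in spirit rather than verbatim (the function names and the UTC-5 zone are illustrative): one formats the raw UTC timestamp, the other converts to local time first. They agree for most of the month and disagree near the boundary.

```python
from datetime import datetime, timezone, timedelta

LOCAL = timezone(timedelta(hours=-5))  # hypothetical company timezone

def format_date_v1(ts: datetime) -> str:
    """Formats in whatever zone the timestamp carries -- the buggy variant."""
    return ts.strftime("%Y-%m-%d")

def format_date_v2(ts: datetime) -> str:
    """Converts to local time before formatting."""
    return ts.astimezone(LOCAL).strftime("%Y-%m-%d")

# A shipment logged at 02:00 UTC on the 1st is still the 31st locally:
ts = datetime(2024, 2, 1, 2, 0, tzinfo=timezone.utc)
print(format_date_v1(ts))  # 2024-02-01
print(format_date_v2(ts))  # 2024-01-31
```

With six independently generated variants, some converted and some did not, so reports silently disagreed on which day month-end shipments belonged to.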
The admin panel was the real problem. It bypassed our permission system entirely. Not maliciously: Marcus did not know we had a permission system, and the AI did not ask. It simply built the most direct path from intent to implementation. That path happened to skip the authentication middleware and talk straight to the database through an ORM wrapper it generated from scratch, rather than through the one we already had.
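For readers who have not seen this pattern, here is a minimal sketch of the kind of check the generated panel skipped. The names (`require_role`, `current_user`, `delete_shipment`) are hypothetical, and our real app enforces this in request middleware rather than a decorator, but the shape is the same: privileged operations refuse to run unless the caller's role allows it.

```python
from functools import wraps

class Forbidden(Exception):
    pass

# Stand-in for the session's authenticated user.
current_user = {"name": "marcus", "roles": {"viewer"}}

def require_role(role: str):
    """Reject the call unless the current user holds the given role."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            if role not in current_user["roles"]:
                raise Forbidden(f"{fn.__name__} requires role {role!r}")
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@require_role("admin")
def delete_shipment(shipment_id: int) -> str:
    return f"deleted {shipment_id}"

# Going through the guarded function, a viewer is stopped:
try:
    delete_shipment(42)
except Forbidden as e:
    print(e)  # delete_shipment requires role 'admin'
```

The generated panel never went through anything like `require_role`; it called its own ORM wrapper directly, so no check ever ran.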
By the end of week three, Marcus had contributed roughly ten thousand lines of code. None of it had tests. All of it was in production.
The Conversation
How do you tell a client that the code they are proud of, the code that genuinely works and delivers real business value, is also slowly poisoning the codebase?
I rehearsed the conversation three times. I did not want to be dismissive. Marcus had shown real initiative, and the features themselves reflected a sharp understanding of his business needs. The problem was not his intent, or even his output in isolation. The problem was that code does not exist in isolation: his features had landed on the wrong side of the same prototype-to-production gap that trips up even experienced developers.
I ended up using an analogy. "Imagine you hired a contractor to build a house," I said. "And your neighbor, who is very enthusiastic and owns a nail gun, starts adding rooms. The rooms are fine. They have walls and doors. But they're not connected to the plumbing. They don't meet code. And the contractor now has to figure out how to integrate five rooms they didn't plan for, or tear them down and rebuild."
Marcus was quiet for a moment. "So my rooms are bad."
"Your rooms are useful. But they need to go through the same process as every other room."
What We Changed
We did not ban Marcus from coding. That felt wrong and probably futile. Instead, we set up a structure.
First, we created a sandbox environment. Marcus could build whatever he wanted there. It was a full copy of the app with a separate database, no connection to production. This gave him the freedom to experiment without the risk of breaking things for actual users.
Second, we required code review for anything moving to production. Not as a gatekeeping exercise, but as a genuine collaboration. Our developers would take Marcus's AI-generated features and work with him to integrate them properly: connecting to existing services, adding tests, removing duplication, respecting the architecture. Marcus learned a lot from these reviews. He started asking the AI better questions because he understood more about the system.
Third, we wrote a short architecture guide — essentially a CLAUDE.md file. Not for us. For the AI. Marcus would paste it into his context window before starting a session. It described our permission model, our data access patterns, our existing utility libraries. The quality of the AI output improved meaningfully.
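A guide like that might open with something along these lines. This is a fabricated miniature, not our actual document; the specifics (role names, layer names) are invented to show the shape.

```markdown
# CLAUDE.md -- architecture notes for AI-assisted sessions

- **Permissions:** every route goes through the auth middleware; never
  query the database from UI code. Roles: viewer, ops, admin.
- **Data access:** use the existing GraphQL layer and request only the
  fields you need; do not add new REST endpoints or ORM wrappers.
- **Utilities:** date formatting lives in one shared helper -- import it,
  do not regenerate it.
- **Tests:** every feature PR includes at least one test.
```

A page of this, pasted at the start of a session, did more for output quality than any amount of after-the-fact review.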
The Bigger Question
Marcus is not unique. The barrier to generating functional code has effectively collapsed. Clients, product managers, designers, founders -- anyone who can describe what they want in plain language can now produce something that runs. This is genuinely powerful and I do not think the genie is going back in the bottle.
But "it runs" has never been the hard part of software. The hard part is making it run alongside everything else. The hard part is making it run six months from now when the requirements change. The hard part is making it run at three in the morning when the on-call engineer, who did not write it, needs to figure out why it stopped.
The lesson from Marcus is not that non-technical people should stay away from code. It is that code generation without architectural context produces technical debt at an unprecedented rate. The AI will give you exactly what you ask for. It will not tell you what you forgot to ask about.
We need new workflows for this reality. Code review is no longer just a developer-to-developer practice. Architecture documents are no longer just for the team. The separation between "who can write code" and "who understands the system" is now a gap that tooling and process have to bridge.
Marcus still submits pull requests. They are better now. Last week he even added tests. He told me the AI suggested it.