
The real AI coding skill isn’t prompting. It’s architecture.

By Pier-Jean Malandrino · Published April 10, 2026 · 8 min read · Source: Level Up Coding

How architecture-driven interaction beats vibe coding in production.

There is now a clear split in the way developers use AI.

On one side, there is vibe coding: fast, seductive, and often impressive in the first ten minutes. On the other, there is a more disciplined approach in which AI is used inside explicit architectural boundaries. The difference matters because production software is not judged by how quickly code appears on screen, but by whether that code can survive integration, change, review, and operation.

One of the main practical limits of AI coding is not raw code generation itself. It is the model’s ability to operate with the right context at the right level. Feeding an entire codebase to a model is infeasible in practice. Useful generation depends on recovering the relevant structure, dependencies, and neighboring files, not on a local prompt alone. Current research in repository-level code generation confirms this: results improve significantly when the model receives structural context from the repository and relevant cross-file information, rather than treating the task as an isolated prompt. This is what some now call context engineering: the discipline of selecting, organizing, and exposing the right subset of system knowledge to the model.
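To make the idea concrete, here is a minimal sketch of one context-engineering strategy: starting from the file being edited, walk the repository's dependency graph breadth-first and include the closest files until a token budget runs out. All file names, sizes, and the budget below are illustrative, not a real repository or a standard algorithm.

```python
from collections import deque

def select_context(target, deps, sizes, budget):
    """Breadth-first walk of the dependency graph from the target file,
    adding the nearest files first until the token budget is spent."""
    chosen, queue, seen = [], deque([target]), {target}
    remaining = budget
    while queue:
        f = queue.popleft()
        cost = sizes.get(f, 0)
        if cost <= remaining:
            chosen.append(f)
            remaining -= cost
            for dep in deps.get(f, []):
                if dep not in seen:
                    seen.add(dep)
                    queue.append(dep)
    return chosen

# Toy repository: deps are import edges, sizes are rough token estimates.
deps = {
    "orders/service.py": ["orders/repo.py", "billing/port.py"],
    "orders/repo.py": ["db/session.py"],
}
sizes = {"orders/service.py": 800, "orders/repo.py": 600,
         "billing/port.py": 300, "db/session.py": 2000}

print(select_context("orders/service.py", deps, sizes, budget=2000))
# -> ['orders/service.py', 'orders/repo.py', 'billing/port.py']
```

The point is not this particular heuristic; it is that *some* deliberate selection policy exists, instead of pasting whatever happens to be open in the editor.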

This is where the current discourse often goes wrong. The question is not: “Can AI write code?” Of course it can. The real question is: “How should a developer interact with AI so that the resulting code remains governable inside a real system?”

My answer is simple: through architecture.

Related reading: “Effective context engineering for AI agents”

Vibe coding is abdication

I have audited systems where AI-generated code had already reached production. The recurring pattern was not just poor code quality. It was loss of control. Responsibilities were misplaced. Boundaries were blurry. Similar logic appeared in multiple places. The code sometimes worked locally, but it did not belong anywhere clearly in the system. Nobody had formalized which component owned which responsibility, so when things broke, blame traveled to whichever layer was most visible, not whichever layer was actually failing.


During a discussion with a senior engineer, he used the right expression:

vibe coding is abdication

I think that is exactly right.

When a developer vibe codes, they progressively hand over structural decisions to a system that cannot truly hold the whole software reality in its working memory. That is not a moral failure. It is a category error. We ask a probabilistic generator to perform a role that belongs to engineering judgment.


And this is precisely why vibe coding collapses in production. Production code is not just syntax plus business intent. It is code placed in the correct module, behind the correct contract, with the correct dependencies, lifecycle, and operational implications. AI can generate fragments quickly. It does not guarantee by itself the right distribution of responsibilities across the system.

Addy Osmani (Director, Google Cloud AI): “Agentic Engineering”

The real question is task decomposition

Once we accept that AI cannot safely absorb a large software system as a whole, the central problem becomes obvious: how do we cut the work?

This is not a new problem. Developers have always had to decide how to organize code. Architects have always had to decide where responsibilities belong, what depends on what, and which boxes are allowed to know each other. In other words, decomposition was already part of software engineering before AI. AI simply turns it into the decisive skill.

I saw this concretely when designing an automated log analyzer for our engineering leadership. The initial proposal from the team was a monolithic agent: one LLM call analyzing logs, generating code corrections, and creating merge requests in a single flow. It was ambitious and completely unshippable. We decomposed it into two distinct services: an orchestrator responsible for intent analysis and task routing, and a local agent responsible for code execution via a CLI tool wrapped in a subprocess. Each service had a single responsibility, a clear contract, and could be developed and tested independently. The architecture was not elegant by academic standards. It was boring. It shipped.
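The shape of that decomposition can be sketched in a few lines. This is a simplified stand-in, not the production system: the class names are hypothetical, and `echo` substitutes for the real CLI tool that the local agent wrapped in a subprocess.

```python
import subprocess

class LocalAgent:
    """Executes one narrowly scoped task by shelling out to a CLI tool.
    The command here is a stand-in for whatever tool your team wraps."""
    def __init__(self, command):
        self.command = command

    def run(self, task: str) -> str:
        result = subprocess.run(
            self.command + [task], capture_output=True, text=True, check=True)
        return result.stdout.strip()

class Orchestrator:
    """Analyzes intent and routes each task to the agent that owns it.
    It never executes code itself: single responsibility, clear contract."""
    def __init__(self, agents):
        self.agents = agents  # maps intent name -> LocalAgent

    def handle(self, intent: str, task: str) -> str:
        if intent not in self.agents:
            raise ValueError(f"no agent owns intent {intent!r}")
        return self.agents[intent].run(task)

# Demo with `echo` as a placeholder CLI so the sketch runs anywhere.
orchestrator = Orchestrator({"fix": LocalAgent(["echo", "patched:"])})
print(orchestrator.handle("fix", "timeout in the payment flow"))
```

Because the contract between the two services is a plain function call with a string in and a string out, each side can be developed, tested, and replaced independently.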

So the alternative to vibe coding is not “better prompting” in the narrow sense. It is disciplined decomposition.


What developers must provide to AI

If architecture is the answer, then what exactly must the developer contribute?

First, boundaries. The model must know the module it is working in, what that module owns, and what it must not own.

Second, responsibility. The model must know whether it is changing orchestration, domain logic, persistence, UI composition, or integration code.

Third, allowed patterns. A system that follows package-by-feature, hexagonal architecture, feature-first frontend modules, or event-driven boundaries should expose those rules explicitly. Otherwise the model will fill the gaps with generic defaults.

I have seen this play out concretely in a modulith architecture organized by feature. Each feature package contained its own three-tier structure: controller, service, persistence. When a new AI coding tool was introduced, developers who pointed it at a specific feature package got coherent, well-placed code, because the boundaries were already encoded in the folder structure. Developers who asked the model to “add a feature” without specifying the target module got code scattered across packages, with dependencies pointing in the wrong direction. Same model, same prompt quality, radically different outcomes. The difference was whether the architecture was exposed to the model or not.

Fourth, invariants. Some things are not style; they are constraints. A session store may be global. A feature store may not. A domain service may call a port, but not a controller. If those invariants are absent from the context, the model will happily violate them.
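Invariants like these can be made machine-checkable, in the spirit of tools such as ArchUnit. Here is a minimal sketch, assuming a hypothetical four-layer policy; the layer names and the allowed-edge table are illustrative, and a real implementation would extract the import edges from source files rather than take them as input.

```python
# Which layers each layer is allowed to depend on. This table is the
# architectural invariant, written down instead of assumed.
ALLOWED = {
    "controller": {"service"},
    "service": {"port"},       # domain logic talks to ports...
    "port": set(),             # ...and ports depend on nothing here
    "persistence": {"port"},
}

def violations(import_edges):
    """import_edges: list of (from_layer, to_layer) pairs found in the code.
    Returns every edge that breaks the dependency policy."""
    return [(src, dst) for src, dst in import_edges
            if dst not in ALLOWED.get(src, set())]

edges = [("controller", "service"),
         ("service", "port"),
         ("service", "controller")]  # the last edge is forbidden
print(violations(edges))  # -> [('service', 'controller')]
```

A check like this can run in CI, which matters doubly with AI-generated code: the model gets the invariants in its context, and the pipeline verifies them anyway.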

This is why I increasingly think the key AI coding skill is not prompt cleverness. It is the ability to expose architectural intent in a form the model can execute against.

Addy Osmani (Director, Google Cloud AI): “My LLM coding workflow going into 2026”

Where AI actually helps, and where it does not

The weak zone is the deceptively medium-sized task: not small enough to be local, not large enough to trigger explicit architecture work, but broad enough to require hidden knowledge of the system. This is exactly where teams are tempted to “just ask the AI,” and exactly where structural debt starts accumulating silently.

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

LLMs are genuinely useful when the task is tightly scoped and success conditions are explicit, because the context can be kept small and the target is clear. They are also useful in ideation and exploration, where the goal is to expand the option space rather than commit directly to implementation: comparing designs, challenging assumptions, surfacing options the developer had not considered.

I experienced this firsthand when a client proposed an over-engineered SaaS architecture with separate billing, IAM, and payment flows for every customer segment. Using AI to explore alternatives, I quickly converged on a simplified three-pillar stack: one IAM provider, one billing engine, one payment processor. That covered all segments with a fraction of the complexity. The AI was not making the architectural decision. It was accelerating the exploration of the option space. The judgment of what to keep and what to discard remained mine.

But between those two poles, the small scoped task and the strategic exploration, there is a dangerous middle ground. A feature that touches three modules. A refactoring that looks local but has hidden coupling. A bug fix that requires understanding an implicit invariant. In that zone, AI without architectural guidance produces code that compiles, passes local tests, and quietly degrades the system.

https://metr.org/blog/2026-02-24-uplift-update/

Architecture-driven interaction

So what does architecture-driven interaction actually look like?

It means the developer does not ask the model to “build a feature” in the abstract. They provide a target module, a role, a dependency policy, a naming convention, a placement rule, and the specific files or contracts that define the local architectural truth.
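That bundle of information can be made explicit as a small structured brief that is rendered into the model's context. A minimal sketch follows; the field names and example values are illustrative, not a standard schema.

```python
from dataclasses import dataclass, field

@dataclass
class TaskBrief:
    """The architectural context a model needs before generating code."""
    target_module: str
    role: str                  # orchestration, domain, persistence, UI, ...
    dependency_policy: str
    naming_convention: str
    placement_rule: str
    contract_files: list = field(default_factory=list)

    def to_prompt(self) -> str:
        files = "\n".join(f"  - {f}" for f in self.contract_files)
        return (
            f"Target module: {self.target_module}\n"
            f"Role of this change: {self.role}\n"
            f"Dependency policy: {self.dependency_policy}\n"
            f"Naming convention: {self.naming_convention}\n"
            f"Placement rule: {self.placement_rule}\n"
            f"Contracts that define local truth:\n{files}"
        )

brief = TaskBrief(
    target_module="orders",
    role="domain logic",
    dependency_policy="may import ports, never controllers",
    naming_convention="OrderXxxService",
    placement_rule="new classes go in orders/domain/",
    contract_files=["orders/port.py", "orders/events.py"],
)
print(brief.to_prompt())
```

Whether this lives in a dataclass, a YAML file, or a repository convention document matters less than the fact that it exists and is handed to the model on every task.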

It also means the developer uses AI differently at different levels: scoped generation inside a single module, exploratory reasoning at the design level, and review against the system’s invariants.

In that model, architecture is not documentation after the fact. It becomes the control surface through which human judgment governs machine generation.

To be clear: architecture-driven interaction does not replace tests, code review, or deep knowledge of your frameworks and languages. It does not excuse sloppy engineering. What it changes is the order of priorities: it puts structural clarity before generation speed, and treats the model as a constrained executor rather than an autonomous designer.

Conclusion

The problem with vibe coding is not that AI writes bad code. The problem is that vibe coding removes the discipline that makes code belong to a system.

If your team uses AI without having formalized its boundaries, it is accumulating structural debt faster than at any point in the history of software. Every unscoped prompt is a small abdication. Every generated file placed in the wrong module is a future incident.

If we want AI-generated code to survive production, we need to stop treating prompting as the core skill. The real skill is deciding how the system is cut, where responsibilities live, which constraints are non-negotiable, and how to expose that structure to the model.

LLMs generate code. Architecture makes that code survivable.

Stop prompting, start thinking.

I am CTO at SCUB, a French IT services company, and AI Ambassador for the French Ministry of Economy (“Osez l’IA”). I design production AI systems, contribute to open-source tooling, and write about the intersection of architecture and AI-assisted development.

My latest project is Docling Studio, a visual inspection layer for document parsing pipelines built on IBM’s Docling ecosystem.

Thank you for reading! If you found this article useful, please feel free to 👏 and help others find it. I welcome your thoughts in the comments section below.


