
When an AI coding agent writes backend code, it works in a loop: it writes something, runs whatever feedback is available, reads the result, makes a change, and repeats until nothing complains. Once the feedback goes quiet it treats the task as finished and stops, and whatever problems hadn't surfaced by then are problems it has shipped.
This means the quality of what an agent produces depends more on the feedback it gets while it's still working than on how capable the underlying model is. A better model writes a better first attempt, but it still stops when the signals it can see have gone green, and a mistake that produces no signal the agent can read is one it has no way to find and no reason to fix.
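The dynamic can be modeled in a few lines. This is a toy sketch rather than how any real agent is built: the `Defect` shape, the `runAgentLoop` function, and the three example defects are all made up for illustration. It shows only the stopping rule, where anything without a signal inside the loop gets shipped.

```typescript
// Hypothetical model of the feedback loop; every name here is illustrative.
type Signal = { source: "typecheck" | "test"; message: string };

interface Defect {
  name: string;
  // The signal this defect produces inside the loop, or null if it stays silent.
  signal: Signal | null;
}

// Fix whatever produces a signal, and stop as soon as nothing complains.
function runAgentLoop(defects: Defect[]): { fixed: string[]; shipped: string[] } {
  const fixed: string[] = [];
  let remaining = [...defects];
  for (;;) {
    const visible = remaining.find((d) => d.signal !== null);
    if (visible === undefined) break; // feedback is quiet, so the agent stops
    fixed.push(visible.name);
    remaining = remaining.filter((d) => d !== visible);
  }
  // Whatever is left never produced a signal, so it ships.
  return { fixed, shipped: remaining.map((d) => d.name) };
}

const loopResult = runAgentLoop([
  { name: "wrong return type", signal: { source: "typecheck", message: "type mismatch" } },
  { name: "off-by-one in total", signal: { source: "test", message: "expected 30, got 29" } },
  { name: "cron runs on every instance", signal: null },
]);

console.log(loopResult.fixed);   // the two defects that produced a signal
console.log(loopResult.shipped); // the silent one
```

Swapping in a more capable model changes how many defects enter the list, not which of them the loop is able to see.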
Type errors are the strongest signal a backend can give an agent in that loop. They arrive at compile time, they point at a specific line, and they say what was expected and what was found instead. They're also the cheapest signal to produce: a type check is one fast, deterministic command with no database to seed and no deploy behind it, so it's the one form of feedback an agent can afford to run after every change. The difficulty is that most backend mistakes never produce one.
A backend can tell you something is wrong in several ways, and they differ mostly in when they fire and how precise they are:
A type error fires at compile time and a failing test fires when the suite runs, both inside the loop where the agent can read the error and fix it. A staging error or a production incident fires after the agent has stopped, so closing those takes a human starting a new run. Moving a mistake earlier is the whole point: give a topic or a database a type and the same bug that used to surface in production becomes a compile error, pulled back into the only window the agent can act in.
Each of these signals is slower and less precise than the one before it. For a person working on the code that's mostly a matter of convenience, but for an agent it decides whether a problem is one it can fix or one it never sees, because the agent is only present for the first two. It isn't watching staging and it isn't on call. It finishes when the feedback in front of it goes quiet, so anything that only shows up after that point might as well not exist as far as the agent is concerned.
A type error is also more useful than a failing test even when both fire, because it localizes the problem for you. A failing test might tell you the orders endpoint returned the wrong total, while a type error tells you that on line 42 a value is string | undefined where a number was expected. The agent doesn't have to spend turns narrowing down where the bug lives, which matters over a long run where that narrowing is most of the work.
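Here is a small illustration of that localization. The `OrderLine` type and `addToTotal` function are invented for the example, and the `@ts-expect-error` comment is there only so the deliberately wrong line can sit in a file that still compiles.

```typescript
// Hypothetical types for illustration only.
interface OrderLine {
  sku: string;
  // Optional on purpose: some lines arrive without a price attached.
  price?: string;
}

function addToTotal(total: number, amount: number): number {
  return total + amount;
}

const orderLine: OrderLine = { sku: "A-1", price: "20" };

// The compiler flags this exact call: `string | undefined` is not assignable to `number`.
// @ts-expect-error deliberately wrong, kept to show where the error lands
const brokenTotal = addToTotal(10, orderLine.price);

// The fix happens at the same spot: handle the missing value and parse the string.
const fixedTotal = addToTotal(10, orderLine.price === undefined ? 0 : Number(orderLine.price));

console.log(brokenTotal); // "1020" if the type check is skipped: string concatenation
console.log(fixedTotal);  // 30
```

Without the type check, the only evidence is a wrong total somewhere downstream, which is the failing-test version of the same bug.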
The hardest case to catch is the one where the code compiles, runs, and passes every test, and is still wrong. Nothing in the loop fires, so the agent finishes and moves on.
This is also the most common case in practice. Give an agent a realistic backend task and it tends to produce the simplest implementation that makes the tests pass, which often isn't one you'd want in production. A durable message queue becomes a Postgres table polled on a setInterval. A daily job becomes a setTimeout scheduled inside a single application process, which runs twice as soon as you scale to two instances. The database schema comes from CREATE TABLE IF NOT EXISTS statements at startup, with no migration history at all.
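The scheduling case is easy to reproduce in miniature. The sketch below is a simulation under stated assumptions: `FakeClock` stands in for a process-local timer such as setTimeout, and `startInstance` stands in for an application process booting. Neither is a real API.

```typescript
// Hypothetical stand-in for process-local timers. One clock models the
// wall-clock time that all running instances share.
class FakeClock {
  private callbacks: Array<() => void> = [];

  schedule(callback: () => void): void {
    this.callbacks.push(callback);
  }

  // Fire everything that was scheduled, as if the day rolled over.
  advanceOneDay(): void {
    for (const callback of this.callbacks) callback();
  }
}

const sentReports: string[] = [];

// Each application instance schedules the "daily" job for itself at startup.
function startInstance(instanceName: string, clock: FakeClock): void {
  clock.schedule(() => {
    sentReports.push(`daily report sent by ${instanceName}`);
  });
}

const sharedClock = new FakeClock();
startInstance("instance-1", sharedClock);
startInstance("instance-2", sharedClock); // scaling to two replicas
sharedClock.advanceOneDay();

console.log(sentReports.length); // 2: the job ran once per instance
```

A test suite that boots a single instance sees one report and passes, which is why nothing in the loop objects.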
In each case the tests are green and nothing in the agent's loop flags a problem, so from where it's sitting the work is done. The common thread is that none of those mistakes involve anything the type system could see. The queue is a string of SQL, the schedule is a number passed to setTimeout, and the schema is a template string the compiler never looks inside. We've written before about how stringly-typed infrastructure hides this class of bug until production, where cached values outlive a deploy and a renamed field quietly comes back as undefined. A compiler can't object to a mistake that lives entirely inside a string.
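The renamed-field case looks roughly like this. It is a minimal sketch with an in-memory `Map` standing in for a real cache, and the `CachedOrder` type and `readOrder` helper are hypothetical.

```typescript
// Hypothetical in-memory stand-in for a cache that stores JSON strings.
const orderCache = new Map<string, string>();

// Before the deploy, the writer used the field name `total`.
orderCache.set("order:42", JSON.stringify({ orderId: "42", total: 30 }));

// After the deploy, the field was renamed to `amount`.
interface CachedOrder {
  orderId: string;
  amount: number;
}

function readOrder(key: string): CachedOrder | undefined {
  const raw = orderCache.get(key);
  // The cast asks the compiler to trust the string; nothing checks the shape.
  return raw === undefined ? undefined : (JSON.parse(raw) as CachedOrder);
}

const staleOrder = readOrder("order:42");

console.log(staleOrder?.amount); // undefined at runtime, although the type says number
```

The file type-checks cleanly, because the only place the old field name survives is inside a string the compiler never parses.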
This is where the line from our home page does real work: the structure your team and your AI agents need to ship safely to production. If the infrastructure an agent works with is typed, then using it incorrectly produces a type error, which is exactly the kind of signal the agent reliably notices and fixes on its own.
Encore types infrastructure by giving you primitives instead of strings. A topic is a Topic<OrderEvent> rather than a queue name you pass around, a database has a schema the framework tracks instead of a CREATE TABLE that happens to run at boot, and a call into another service goes through a generated, typed client rather than a URL and a JSON cast you assemble by hand.
```typescript
import { api } from "encore.dev/api";
import { Topic } from "encore.dev/pubsub";

// CreateOrderRequest, Order, and placeOrder are assumed to be defined
// elsewhere in this service.

interface OrderEvent {
  orderId: string;
  total: number;
}

const orders = new Topic<OrderEvent>("orders", {
  deliveryGuarantee: "at-least-once",
});

export const create = api(
  { expose: true, method: "POST", path: "/orders" },
  async (req: CreateOrderRequest): Promise<Order> => {
    const order = await placeOrder(req);
    // The wrong shape here is a compile error, not a malformed message
    // discovered by a subscriber three services away in production.
    await orders.publish({ orderId: order.id, total: order.total });
    return order;
  },
);
```
If you publish the wrong shape to that topic, it's a compile error on that line before anything runs. If you call another service with a field its request type doesn't have, the generated client's types reject it at compile time instead of the call returning a 400 that only surfaces later at runtime. Typed cache keyspaces behave the same way: renaming a field breaks the build instead of silently returning stale data.
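To see the mechanism without the framework, here is a stripped-down stand-in. `TypedTopic` is a local class written for this example, not Encore's `Topic` implementation: it is synchronous and in-memory, and it exists only to show that a generic message type is enough to turn a wrong payload into a compile error.

```typescript
// Local stand-in, NOT Encore's implementation: just enough to show the type check.
class TypedTopic<T> {
  private readonly delivered: T[] = [];
  readonly topicName: string;

  constructor(topicName: string) {
    this.topicName = topicName;
  }

  // Only values that match T are accepted by the compiler.
  publish(message: T): void {
    this.delivered.push(message);
  }

  messages(): readonly T[] {
    return this.delivered;
  }
}

interface OrderEvent {
  orderId: string;
  total: number;
}

const orderEvents = new TypedTopic<OrderEvent>("orders");

orderEvents.publish({ orderId: "42", total: 30 }); // matches OrderEvent

// @ts-expect-error `total` is a string here, so the compiler rejects this line
orderEvents.publish({ orderId: "43", total: "30" });

console.log(orderEvents.messages().length); // 2 if the type check is skipped
```

If the type check is skipped, the second message is delivered anyway, which is exactly the stringly-typed behavior: the bad payload travels until a subscriber trips over it.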
None of this forces the agent's hand. It could still ignore Topic and hand-roll a polling loop inside an Encore service, the same way it does on a bare framework. What tips the odds is that the typed primitive is also the least-effort path: it's less code than a queue table plus a poller, and it's what the framework's docs and the agent's context files point at. The agent reaches for it for the same lazy reason it reached for setInterval before, and this time the lazy choice is the one that holds up in production.
Underneath this, a Rust static analyzer reads these declarations at compile time and builds a full graph of the application, which works as a second compiler pass that understands infrastructure rather than syntax. It's the same graph that powers automatic distributed tracing and the MCP server that gives an agent the live service topology. For the agent, the effect is that a category of mistakes that used to surface in staging or production now surfaces as a build error while it's still working, where it has both the information and the chance to fix it.
| Mistake | Stringly-typed infra | Typed primitives |
|---|---|---|
| Wrong event payload published | passes tests, ships silently | compile error · Topic<T> |
| Cross-service call, missing field | 400 at runtime, in staging | compile error · typed client |
| Cron scheduled on every instance | double-fires at 2 replicas | runs once · cron primitive |
| Schema created at boot, no history | drifts silently per deploy | tracked migrations |
The top two are caught by the compiler as type errors, on the line, before anything runs. The bottom two aren't type errors at all; the framework primitive removes the failure mode, so the agent can't ship the broken version in the first place. Either way the silent green run on the left becomes something the agent has to deal with before it stops.
There's an obvious escape hatch, and agents use it. Faced with a type error, an agent can reach for as any or a @ts-ignore to make it go quiet instead of fixing what's underneath. The difference is that suppressing a type error is an explicit line in the diff: it's greppable, it's reviewable, and you can lint it to zero in CI. A silent runtime bug asks for none of that; it only asks the agent to do nothing. Turning a mistake into a type error doesn't make it impossible to ship, but it does make shipping it a visible decision rather than an accident.
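"Greppable" can be taken literally. The sketch below is a hypothetical checker, not a replacement for a real linter: the pattern list, the `Finding` shape, and `findSuppressions` are assumptions made for the example, and a line-based regex will also match inside strings and comments.

```typescript
// Patterns that silence the type checker. This list is an assumption; extend it as needed.
const SUPPRESSIONS: ReadonlyArray<{ label: string; pattern: RegExp }> = [
  { label: "as any", pattern: /\bas\s+any\b/ },
  { label: "@ts-ignore", pattern: /@ts-ignore\b/ },
];

interface Finding {
  line: number; // 1-based line number in the scanned source
  label: string;
  text: string;
}

// Report every line of the source that contains a suppression pattern.
function findSuppressions(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { label, pattern } of SUPPRESSIONS) {
      if (pattern.test(text)) {
        findings.push({ line: index + 1, label, text: text.trim() });
      }
    }
  });
  return findings;
}

const sampleSource = [
  "const total = (order as any).total;",
  "// @ts-ignore",
  "publish(payload);",
  "const ok: number = 1;",
].join("\n");

const suppressionFindings = findSuppressions(sampleSource);

console.log(suppressionFindings.length); // 2
```

In a real pipeline the same job is usually handed to lint rules such as typescript-eslint's `no-explicit-any` and `ban-ts-comment`, with CI failing on any finding.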
This still doesn't catch everything. A type system has nothing to say about whether your business logic is correct, and there's plenty it can't reach on its own: graceful shutdown, connection-pool sizing, how you handle secrets. But every mistake you can move from a silent runtime failure to a compile error is one more thing the agent handles before it ever reaches you.
When an agent ships weak backend code, the natural reaction is to reach for a better prompt or a larger model. That improves the first draft, but it doesn't change the underlying dynamic, which is that the agent stops when its feedback goes quiet and most infrastructure mistakes are quiet by default.
What changes the outcome is making more of those mistakes visible inside the loop. When the boundaries the agent works across are typed, misusing them turns into the one kind of signal it's almost guaranteed to act on. The agent ends up shipping better code, you end up trusting it with more, and the type errors it hits along the way are worth having, because each one is a problem caught in an editor instead of in production.


