05/26/26

NestJS vs Encore for AI Coding Agents

Claude Code on NestJS cost about twice as much in token spend as the same agent on Encore, and passed fewer production-readiness checks.

How does an AI coding agent perform on NestJS compared with Encore when you give it the same realistic backend tasks?

We took Claude Code, pointed it at the same project (an HTTP API with persistence, a pub/sub event, a daily cron, and distributed tracing), and ran it on both frameworks using the same prompts, the same model, the same Postgres setup, and the same VM. This article focuses on the NestJS side of our wider AI-readiness benchmark across five TypeScript frameworks. The headline numbers: across three runs NestJS cost $12.69 in token spend versus $6.29 on Encore, and on the production-readiness rubric in Run 3 NestJS landed 30 of 36 checks while Encore landed 36 of 36.

Full repo, prompts, starters, and transcripts at github.com/encoredev/ai-backend-benchmark.

How we tested it

Each framework gets its own VM with the same Postgres setup and the same claude-sonnet-4-6 model running through Claude Code. The agent works through three linked tasks: t1 (HTTP API and persistence), then t2 (extend t1 with pub/sub and cron), then t3 (extend t2 with tracing and production-readiness). The tests are plain black-box HTTP probes run with vitest, and they are identical for every framework. The NestJS starter is whatever @nestjs/cli generates; the Encore starter is what encore app create generates, which also includes Encore's CLAUDE.md and MCP server, because the comparison we wanted is between the two frameworks as they ship today.

What Claude wrote on NestJS

In Run 1 the agent passed 31/31 tests on NestJS at a median cost of $2.61, with one repeat running to $4.45 (the highest of any framework on the baseline run). The diffs converged on the same shape we saw across every non-Encore framework. For durable pub/sub the agent created a Postgres queue table and polled it from a service using setInterval. For the daily aggregation cron the agent either used setInterval inside a NestJS service or used the @Cron decorator from @nestjs/schedule (the latter still runs in-process and fires once per replica). For schema the agent ran CREATE TABLE IF NOT EXISTS on app startup. None of these patterns provides retries with a dead-letter destination, multi-instance-safe scheduling, or versioned migrations, but all of them pass Run 1's test suite.
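
To make the queue pattern concrete, here is a minimal in-memory model of it. This is a sketch, not the agent's code: QueueRow, pollOnce, and the handler are hypothetical names, the array stands in for the Postgres queue table, and the loop stands in for the setInterval timer. It shows why the lack of a dead-letter destination matters even though the tests pass.

```typescript
// In-memory stand-in for the Postgres queue table (hypothetical shape).
interface QueueRow {
  id: number;
  payload: string;
  status: "pending" | "done";
  attempts: number;
}

type Handler = (payload: string) => void;

// One polling tick: take the oldest pending row and try to handle it.
// There is no attempt limit and no dead-letter table, so a row whose
// handler always throws stays at the head of the queue indefinitely.
function pollOnce(queue: QueueRow[], handle: Handler): void {
  const row = queue.find((r) => r.status === "pending");
  if (!row) return;
  row.attempts += 1;
  try {
    handle(row.payload);
    row.status = "done";
  } catch {
    // Error swallowed: the row stays pending and is retried on the next tick.
  }
}

const queue: QueueRow[] = [
  { id: 1, payload: "poison", status: "pending", attempts: 0 },
  { id: 2, payload: "ok", status: "pending", attempts: 0 },
];

const handle: Handler = (payload) => {
  if (payload === "poison") throw new Error("cannot process");
};

// In a real service this would be setInterval(() => pollOnce(queue, handle), 1000).
for (let tick = 0; tick < 5; tick++) pollOnce(queue, handle);

console.log(queue.map((r) => `${r.id}:${r.status}:${r.attempts}`).join(" "));
// prints "1:pending:5 2:pending:0"
```

After five ticks the poison row has been attempted five times and the healthy row behind it has never been touched. A test suite that only publishes well-formed events would not notice.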

What Claude wrote on Encore

The same agent given the same prompts declared the async work using Encore's primitives. Pub/sub was a Topic with a typed Subscription and deliveryGuarantee: "at-least-once". The cron was a CronJob. Schema migrations went into numbered SQL files which Encore tracks and applies in order on every deploy. For the tracing task the agent did not thread request_id through function arguments because Encore's runtime propagates the correlation id automatically across the typed cross-service call, the subscription handler, and the cron invocation.
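
The difference between threading request_id through arguments and relying on ambient propagation can be shown in plain Node. The sketch below uses Node's AsyncLocalStorage purely as an analogy. It is not Encore's mechanism and makes no claim about how Encore's runtime works, and the function names and the req-123 id are hypothetical.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface RequestContext {
  requestId: string;
}

// Ambient per-request context (an analogy only, not Encore's implementation).
const context = new AsyncLocalStorage<RequestContext>();

// Style 1: the id is threaded through every function signature by hand.
function saveOrderExplicit(orderId: string, requestId: string): string {
  return `[${requestId}] saved ${orderId}`;
}

// Style 2: the function reads the id from ambient context, so callers
// never have to pass it along.
function saveOrderAmbient(orderId: string): string {
  const requestId = context.getStore()?.requestId ?? "unknown";
  return `[${requestId}] saved ${orderId}`;
}

const explicitLog = saveOrderExplicit("order-1", "req-123");
const ambientLog = context.run({ requestId: "req-123" }, () =>
  saveOrderAmbient("order-1"),
);
// Outside any request scope there is no store to read from.
const outsideLog = saveOrderAmbient("order-1");

console.log(explicitLog); // prints "[req-123] saved order-1"
console.log(ambientLog);  // prints "[req-123] saved order-1"
console.log(outsideLog);  // prints "[unknown] saved order-1"
```

In the first style every new hop (a queue handler, a cron entry point) is another signature the agent has to remember to update. In the second the hop either inherits the context or it does not, and that is decided in one place.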

Where NestJS lost turns

NestJS's cost ramp was driven by two failure modes that other frameworks did not share. The first showed up in Run 2, where we pre-installed pg-boss, drizzle-kit, and pino in the starter with a README explaining what each library was for. The NestJS agent imported pg-boss into a NotificationsService but did not register the wrapping PgBossService in the module's providers array, so the dependency-injection container could not resolve it at boot and the service crashed. This is the kind of bug a NestJS developer would catch in seconds, but the test suite the agent was iterating against did not surface it cleanly, and Claude spent the rest of the turn budget chasing symptoms downstream of the wiring error.
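
The shape of that wiring bug can be reproduced with a toy container. The sketch below is deliberately not NestJS: the resolve function and the hand-written dependency map are assumptions made for illustration (NestJS reads constructor dependencies from decorator metadata), and only the two class names come from the run described above.

```typescript
// Toy dependency-injection container (NOT NestJS's real implementation).
type Ctor<T = unknown> = new (...args: any[]) => T;

interface ModuleDef {
  providers: Ctor[];
  // Constructor dependencies, declared by hand for this sketch.
  deps: Map<Ctor, Ctor[]>;
}

// Build an instance, recursively resolving its constructor dependencies.
// Anything not listed in providers cannot be resolved.
function resolve<T>(mod: ModuleDef, token: Ctor<T>): T {
  if (!mod.providers.includes(token)) {
    throw new Error(`Cannot resolve ${token.name}: not in providers`);
  }
  const args = (mod.deps.get(token) ?? []).map((dep) => resolve(mod, dep));
  return new token(...args);
}

class PgBossService {
  send(queueName: string): string {
    return `queued on ${queueName}`;
  }
}

class NotificationsService {
  private readonly boss: PgBossService;
  constructor(boss: PgBossService) {
    this.boss = boss;
  }
  notify(): string {
    return this.boss.send("notifications");
  }
}

const deps = new Map<Ctor, Ctor[]>([[NotificationsService, [PgBossService]]]);

// Broken wiring: PgBossService is depended on but missing from providers.
const broken: ModuleDef = { providers: [NotificationsService], deps };
// Fixed wiring: one extra entry in the providers array.
const fixed: ModuleDef = { providers: [NotificationsService, PgBossService], deps };

let bootError = "";
try {
  resolve(broken, NotificationsService);
} catch (err) {
  bootError = (err as Error).message;
}

console.log(bootError); // prints "Cannot resolve PgBossService: not in providers"
console.log(resolve(fixed, NotificationsService).notify()); // prints "queued on notifications"
```

The fix is a single array entry, but the failure happens at boot, before any HTTP probe can reach the code that was actually changed.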

The second showed up in Run 3, where the agent introduced a TypeScript error in what should have been the unmodified portion of the NestJS starter, breaking the typecheck probe. Combined with a failed tracing implementation, NestJS finished Run 3 at 30 of 36 checks and a per-run cost of $5.95, which is more than double Encore's $2.58 for the same task with all 36 checks passing.
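
For readers who want the ratios behind "about twice" and "more than double", the arithmetic on the figures quoted in this article is below. The cost-per-passing-check line is a derived number, not one the benchmark reports.

```typescript
// Figures quoted in this article (US dollars and rubric checks).
const nestTotal = 12.69;
const encoreTotal = 6.29;
const nestRun3 = 5.95;
const encoreRun3 = 2.58;
const nestChecks = 30;
const encoreChecks = 36;
const totalChecks = 36;

// Ratio of NestJS spend to Encore spend.
const totalRatio = (nestTotal / encoreTotal).toFixed(2);
const run3Ratio = (nestRun3 / encoreRun3).toFixed(2);

// Share of Run 3 rubric checks NestJS passed, as a percentage.
const nestPassRate = ((nestChecks / totalChecks) * 100).toFixed(1);

// Derived: Run 3 dollars spent per passing check.
const nestPerCheck = (nestRun3 / nestChecks).toFixed(3);
const encorePerCheck = (encoreRun3 / encoreChecks).toFixed(3);

console.log(totalRatio);     // prints "2.02"
console.log(run3Ratio);      // prints "2.31"
console.log(nestPassRate);   // prints "83.3"
console.log(nestPerCheck);   // prints "0.198"
console.log(encorePerCheck); // prints "0.072"
```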

Why these implementations aren't equivalent

Both frameworks pass the same test suite in Run 1, but they behave very differently the moment you deploy.

| What the agent built on NestJS | Production weakness | What the agent built on Encore |
| --- | --- | --- |
| Postgres queue polled by a NestJS service on setInterval | The application database doubles as the event bus, and there is no dead-letter destination, so a poison message retries forever and can block everything behind it. | Topic with at-least-once delivery, retries and a platform-managed DLQ configured at the framework level. |
| setInterval (or @nestjs/schedule decorator) inside the app process | Fires once per replica. The agent's idempotency saved this app, but a non-idempotent cron would run three times per tick on three replicas. | CronJob declared at the framework level, invoked once per tick across the fleet by an external scheduler. |
| CREATE TABLE IF NOT EXISTS at boot | No migration history. Column renames and backfills have no version to roll back to. | Numbered SQL migrations tracked in a _migrations table, applied in order on every deploy. |
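
The cron row is easy to model. In the sketch below, fireTick stands in for three replicas whose in-process timers all fire on the same tick. The job names, the sample date, and the placeholder aggregate value are hypothetical, and the Map plays the role of a table with a unique key on the day.

```typescript
// Every replica runs its own in-process timer, so a single tick
// invokes the job once per replica.
function fireTick(replicas: number, job: (day: string) => void, day: string): void {
  for (let i = 0; i < replicas; i++) job(day);
}

// Non-idempotent job: every invocation appends a new row.
const emails: string[] = [];
const sendDigest = (day: string): void => {
  emails.push(`digest ${day}`);
};

// Idempotent job: keyed by day, so repeat invocations overwrite the same
// entry (the in-memory analogue of an upsert on a unique day column).
const aggregates = new Map<string, number>();
const aggregateDay = (day: string): void => {
  aggregates.set(day, 42); // 42 is a placeholder for the computed total
};

fireTick(3, sendDigest, "2026-05-26");
fireTick(3, aggregateDay, "2026-05-26");

console.log(emails.length);   // prints 3
console.log(aggregates.size); // prints 1
```

The aggregation survives three invocations because it is keyed by day. The digest does not, and nothing in a single-instance test run would reveal that.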

Why NestJS cost more

The pattern across Runs 2 and 3 was the same: NestJS asks the agent to maintain wiring (modules, providers, decorators) on top of the application logic, and the wiring is where the agent gets things wrong. Every cross-cutting change is a chance to forget a providers entry or to import a type that no longer exists. Higher turn counts and longer transcripts compound into a higher token bill. For a team running this kind of agent-driven workflow daily, the per-run difference shows up in the monthly bill.

When NestJS is still the right choice

NestJS remains a reasonable pick if your team has deep Angular or NestJS muscle memory, if you have an existing NestJS monolith you do not want to migrate away from, or if the structure of decorators and dependency injection is something humans on your team value enough to absorb the higher AI iteration cost.

When Encore is the right choice

If you are starting a new TypeScript backend in 2026 and an AI agent is writing a meaningful share of the code, Encore's primitives let the agent reach for the right thing on the first pass and let production-readiness checks land with one-line changes against existing declarations.

Reproduce the benchmark

Clone the repo, point it at your own framework, or rewrite the rubric to match your own definition of production-ready: github.com/encoredev/ai-backend-benchmark.
