
Best Backend Framework for Claude Code (2026)

We benchmarked Claude Code on five TypeScript backend frameworks. Only one shipped production-ready code on the first pass, and only one had a CLAUDE.md and MCP server in the box.

05/26/26
6 Min Read
Ivan Cernja

Which TypeScript backend framework is best to use with Claude Code?

To answer that, we pointed Claude Code at the same realistic backend project (an HTTP API with persistence, a pub/sub event, a daily cron, and distributed tracing) on five frameworks (Encore, Express, Fastify, Hono, and NestJS), using the same prompts, the same model, the same Postgres setup, and the same VM. We graded the output against a 36-check production-readiness rubric. The short answer: Encore is the only framework in the benchmark that both ships materials Claude Code reads (a CLAUDE.md, an MCP server, llms.txt plus llms-full.txt, and a dedicated AI-integration docs page) and provides framework primitives the agent reaches for by default. It is also the only framework where Claude's first draft was production-ready.

The full benchmark, prompts, starters, and transcripts are at github.com/encoredev/ai-backend-benchmark.

What Claude Code asks of a framework

There are three things that decide how well Claude Code does on a given backend framework: the agent-readiness materials the framework ships, the framework primitives the agent reaches for by default, and the cost-per-iteration of getting the work done. Agent-readiness materials (a CLAUDE.md, an MCP server, an llms.txt) are what Claude reads before it writes a line of code, and they decide whether the agent has a calibrated starting point or has to rederive the framework's conventions from source. The framework's primitives decide whether the shortest path to a green test suite goes through new Topic() and new CronJob() or through a hand-rolled Postgres queue polled by setInterval. And the cost-per-iteration decides what a team's monthly bill looks like when this kind of workflow is the default rather than the exception.

Agent-readiness materials, per framework

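| Framework | CLAUDE.md | MCP server | llms.txt | AI integration docs |
|---|---|---|---|---|
| Encore | yes | yes | yes (llms.txt + llms-full.txt) | yes |
| Hono | no | no | yes | no |
| Express | no | no | no | no |
| Fastify | no | no | no | no |
| NestJS | no | no | no | no |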

Of the five frameworks in the benchmark, only Encore ships all four agent-readiness surfaces. Hono ships one (an llms.txt), and the other three ship none.

What Claude shipped on each framework

On the baseline run every framework hit 31 of 31 tests, but the diffs underneath those green test suites diverged sharply.

Encore

On Encore Claude reached for the framework's primitives by default. Pub/sub was a typed Topic with a Subscription:

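    import { Topic, Subscription } from "encore.dev/pubsub";

    // OrderCreatedEvent is the app's own event type, defined elsewhere.
    export const orderCreated = new Topic<OrderCreatedEvent>("order-created", {
      deliveryGuarantee: "at-least-once",
    });

    // The handler runs for each event published to the topic.
    new Subscription(orderCreated, "send-notification", {
      handler: async (event) => { /* ... */ },
    });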

The cron was a CronJob:

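    import { CronJob } from "encore.dev/cron";

    // runDailyAggregation is an API endpoint defined elsewhere in the service.
    const _ = new CronJob("daily-aggregation", {
      every: "24h",
      endpoint: runDailyAggregation,
    });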

Schema migrations went into numbered SQL files which Encore tracks and applies in order on every deploy. The Run 3 rubric checks landed as small changes against the existing declarations: a retryPolicy: { maxRetries: 3 } on the existing subscription, encore.dev/log for structured logging, and Encore's service migrations for the schema-versioning check.
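As a rough illustration of what "tracked and applied in order" means for numbered migration files, the sketch below sorts files by their numeric prefix and returns the ones not yet applied. The file names and both helper functions are hypothetical. This is not Encore's implementation, only the general idea behind versioned migrations.

```typescript
// Hypothetical helper: extract the numeric prefix from a migration file name
// such as "2_add_status.up.sql".
function migrationVersion(file: string): number {
  const match = /^(\d+)_/.exec(file);
  if (!match) throw new Error(`migration file has no numeric prefix: ${file}`);
  return Number(match[1]);
}

// Return the files newer than the last applied version, in ascending order.
function pendingMigrations(files: string[], lastApplied: number): string[] {
  return files
    .map((file) => ({ file, version: migrationVersion(file) }))
    .filter((m) => m.version > lastApplied)
    .sort((a, b) => a.version - b.version) // numeric sort, so 10 comes after 2
    .map((m) => m.file);
}

const migrationFiles = [
  "10_add_index.up.sql",
  "1_create_orders.up.sql",
  "2_add_status.up.sql",
];
console.log(pendingMigrations(migrationFiles, 1).join(", "));
// prints "2_add_status.up.sql, 10_add_index.up.sql"
```

The point of the version number is that a fresh database and a long-lived one converge on the same schema, which CREATE TABLE IF NOT EXISTS at boot cannot guarantee once a table needs to change.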

Express, Fastify, Hono, NestJS

On the four non-Encore frameworks Claude converged on the same three anti-patterns. Pub/sub was a Postgres queue table polled by setInterval:

    await pool.query(
      `INSERT INTO event_queue (event_type, payload, status) VALUES ($1, $2, 'pending')`,
      ['order-created', { order_id: id }]
    );

    setInterval(async () => {
      // SELECT ... FOR UPDATE SKIP LOCKED, then process or bump retry counter
    }, 500);

The daily cron was a setTimeout chain scheduled at startup, which fires once per replica, so a deployment with several replicas runs the job several times. The schema was CREATE TABLE IF NOT EXISTS at boot with no migration history. All of these pass the test suite. None of them is what you want in production.
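To make the cron problem concrete, here is a minimal sketch of the setTimeout-chain pattern. The function names and the injectable `timer` and `now` parameters are illustrative assumptions, not code from the benchmark transcripts.

```typescript
// Illustrative only: a self-rescheduling daily job, armed once per process.
type Timer = (fn: () => void, ms: number) => unknown;

// Milliseconds from `now` until the next 00:00 UTC.
function msUntilNextUtcMidnight(now: Date): number {
  const next = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1, // Date.UTC rolls over month and year boundaries
  );
  return next - now.getTime();
}

// Arm a timer for the next midnight, run the job, then re-arm for the day after.
function startDailyChain(
  job: () => void,
  timer: Timer = (fn, ms) => setTimeout(fn, ms),
  now: () => Date = () => new Date(),
): void {
  timer(() => {
    job();
    startDailyChain(job, timer, now);
  }, msUntilNextUtcMidnight(now()));
}
```

Every process that calls `startDailyChain` at boot arms its own timer, so three replicas run the job three times a day unless the job takes a lock itself. Nothing in the chain knows the other replicas exist.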

Numbers per framework

| Framework | Run 1 cost (median) | Run 3 cost | Run 3 rubric (out of 36) | Total cost across three runs |
|---|---|---|---|---|
| Encore | $1.96 | $2.58 | 36/36 | $6.29 |
| Hono | $1.55 | not reported | 29/36 | ~$8 |
| Fastify | similar to Encore | $4.60 | 36/36 | ~$10 |
| Express | similar to Encore | not reported | 35/36 | ~$9 |
| NestJS | $2.61 (one $4.45 outlier) | $5.95 | 30/36 | $12.69 |
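For a quick sense of scale, the totals in the last column can be expressed relative to Encore's. The sketch below uses the table's figures as given. The Hono, Express, and Fastify totals are approximate in the source, so their ratios are approximate too.

```typescript
// Total cost across three runs, in USD, copied from the table.
// Entries marked "~" are approximate in the source.
const totals = {
  Encore: 6.29,
  Hono: 8, // ~
  Express: 9, // ~
  Fastify: 10, // ~
  NestJS: 12.69,
};
type Framework = keyof typeof totals;

// Ratio of a framework's total to Encore's, rounded to two decimals.
function relativeToEncore(framework: Framework): string {
  return (totals[framework] / totals.Encore).toFixed(2);
}

for (const name of Object.keys(totals) as Framework[]) {
  console.log(`${name}: ${relativeToEncore(name)}x`);
}
// prints Encore: 1.00x, Hono: 1.27x, Express: 1.43x, Fastify: 1.59x, NestJS: 2.02x (one per line)
```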

What Run 2 and Run 3 added

In Run 2 we pre-installed pg-boss, drizzle-kit, and pino into each non-Encore starter with a README explaining what each library was for. Every non-Encore framework regressed. None of them landed a first-try-green run across three repeats. The most common failure was Claude registering a pg-boss scheduled job without first creating the queue (pg-boss v10 requires boss.createQueue('name') before sending or scheduling, and Claude did not know). On NestJS the failure was a module-wiring bug: Claude imported pg-boss into a service but did not register the wrapping provider in the module.
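The queue-creation failure comes down to call order. The sketch below shows that order against a minimal interface, `QueueClient`, which is a stand-in defined here so the example runs without a database. Its two methods are named after pg-boss's `createQueue` and `schedule`. The queue name, the cron expression, and the in-memory fake are assumptions for illustration.

```typescript
// Stand-in for the part of a pg-boss v10 client this example touches.
interface QueueClient {
  createQueue(name: string): Promise<unknown>;
  schedule(name: string, cron: string): Promise<unknown>;
}

// Create the queue first, then register the schedule on it.
async function scheduleDailyJob(boss: QueueClient, queue: string): Promise<void> {
  await boss.createQueue(queue); // must exist before sending or scheduling
  await boss.schedule(queue, "0 0 * * *"); // every day at 00:00
}

// In-memory fake that rejects a schedule on a missing queue, loosely
// modelling the failure described above. It records calls in `calls`.
function fakeClient(calls: string[]): QueueClient {
  const queues = new Set<string>();
  return {
    async createQueue(name: string) {
      queues.add(name);
      calls.push(`createQueue ${name}`);
    },
    async schedule(name: string, cron: string) {
      if (!queues.has(name)) throw new Error(`queue ${name} does not exist`);
      calls.push(`schedule ${name} ${cron}`);
    },
  };
}
```

Swapping the two awaits in `scheduleDailyJob` makes the fake reject, which is the shape of bug a README note about the library's purpose does not prevent.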

In Run 3 we wrote five production-readiness tests into the suite (multi-instance-safe cron, retry plus DLQ, a failed-message endpoint, versioned migrations, structured logging) and gave Claude a higher per-task turn budget. Encore reached 36 of 36 with the one-line changes above. Fastify was the cleanest non-Encore result, hitting every check at $4.60 by composing pg-boss + drizzle-kit + pino. Express came one test short on the migrations check. Hono finished 29 of 36 with tracing broken on unknown order ids. NestJS finished 30 of 36 at $5.95 after shipping a TypeScript error on its own starter.
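As a sketch of the bookkeeping behind the retry-plus-DLQ and failed-message checks when the queue is hand-rolled, assume a hypothetical row shape with an `attempts` counter and a `dead` status for dead-lettered messages. None of these names come from the benchmark code.

```typescript
// Hypothetical shape of a row in a hand-rolled queue table.
type QueueRow = {
  id: number;
  attempts: number; // failed delivery attempts so far
  status: "pending" | "done" | "dead";
};

// Record one failed attempt. A message gets its first attempt plus
// `maxRetries` retries; after that it is dead-lettered instead of re-queued.
function recordFailure(row: QueueRow, maxRetries: number): QueueRow {
  const attempts = row.attempts + 1;
  return { ...row, attempts, status: attempts > maxRetries ? "dead" : "pending" };
}

// What a failed-message endpoint would return: the dead-lettered rows.
function failedMessages(rows: QueueRow[]): QueueRow[] {
  return rows.filter((row) => row.status === "dead");
}
```

With `maxRetries` set to 3, a message stays pending through its third failure and is marked dead on the fourth. On a framework with a declared retry policy, this logic lives in the framework rather than in application code.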

Why Encore wins for Claude Code

Three reasons. First, Encore ships materials Claude Code reads: encore llm-rules init writes a CLAUDE.md calibrated to the framework's conventions, encore mcp start exposes the live app structure (services, endpoints, database schemas) over MCP, and the llms.txt plus the AI-integration docs page give the model the rest of the context it needs. Second, Encore's framework primitives encode the production-readiness guarantees that an agent would otherwise have to assemble from library compositions, so the shortest path to a green test suite is also the shortest path to code that is safe to deploy. Third, the combined effect is the lowest total cost across the three runs of any framework in the benchmark ($6.29), and Encore was the only framework whose first draft needed just one-line changes to land all 36 rubric checks.

How to set up Claude Code with Encore

    encore app create my-app
    cd my-app
    encore llm-rules init   # writes CLAUDE.md
    encore mcp start        # starts the MCP server
    claude                  # opens Claude Code in this project

From there Claude reads the framework's conventions out of CLAUDE.md, queries the live app state through MCP, and reaches for the right primitives when extending the codebase.

When to choose a different framework

If you have an existing Fastify, Express, NestJS, or Hono codebase and you are not planning to let Claude Code drive a meaningful share of your work, the additional cost the benchmark measured does not apply to you. If you are letting Claude Code drive but your application is small enough or simple enough that the production-readiness checks the rubric grades are not relevant (a static-ish edge API, a throwaway prototype, a tool that runs on a single replica), the cheaper baseline runs on Hono or Express may be a better trade.

For anything else, Encore is the framework that finished with the top rubric score and the lowest total cost across the three runs.

Reproduce the benchmark

Clone the repo, point it at your own framework, or rewrite the rubric to match your own definition of production-ready: github.com/encoredev/ai-backend-benchmark.

Related

  • Best TypeScript Backend Framework for AI Agents (2026)
  • Best Backend Framework for Cursor (2026)
  • Cheapest TypeScript Backend Framework for AI Coding (2026)
  • Express vs Encore for AI Coding Agents
  • NestJS vs Encore for AI Coding Agents