Best Backend Framework for Claude Code (2026)

Which TypeScript backend framework is best to use with Claude Code?

To answer that, we pointed Claude Code at the same realistic backend project (an HTTP API with persistence, a pub/sub event, a daily cron, and distributed tracing) on five frameworks (Encore, Express, Fastify, Hono, and NestJS), using the same prompts, the same model, the same Postgres setup, and the same VM. We graded the output against a 36-check production-readiness rubric. The short answer: Encore is the only framework in the benchmark that both ships materials Claude Code reads (a CLAUDE.md, an MCP server, llms.txt plus llms-full.txt, and a dedicated AI-integration docs page) and provides framework primitives the agent reaches for by default. It is also the only framework where Claude's first draft was production-ready.

Full benchmark, prompts, starters, and transcripts at github.com/encoredev/ai-backend-benchmark.

What Claude Code asks of a framework

Three things decide how well Claude Code does on a given backend framework: the agent-readiness materials the framework ships, the framework primitives the agent reaches for by default, and the cost per iteration of getting the work done. Agent-readiness materials (a CLAUDE.md, an MCP server, an llms.txt) are what Claude reads before it writes a line of code, and they decide whether the agent has a calibrated starting point or has to rederive the framework's conventions from source. The framework's primitives decide whether the shortest path to a green test suite goes through new Topic() and new CronJob() or through a hand-rolled Postgres queue polled by setInterval. And the cost per iteration decides what a team's monthly bill looks like when this kind of workflow is the default rather than the exception.

Agent-readiness materials, per framework

Framework	`CLAUDE.md`	MCP server	`llms.txt`	AI integration docs
Encore	yes	yes	yes (`llms.txt` + `llms-full.txt`)	yes
Hono	no	no	yes	no
Express	no	no	no	no
Fastify	no	no	no	no
NestJS	no	no	no	no

Of the five frameworks in the benchmark, only Encore ships all four agent-readiness surfaces. Hono is the only other framework that ships any of them (an llms.txt), and the remaining three ship none.

What Claude shipped on each framework

On the baseline run every framework hit 31 of 31 tests, but the diffs underneath those green test suites diverged sharply.

Encore

On Encore Claude reached for the framework's primitives by default. Pub/sub was a typed Topic with a Subscription:

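import { Subscription, Topic } from "encore.dev/pubsub";

// Typed topic: publishers and subscribers share the OrderCreatedEvent shape.
export const orderCreated = new Topic<OrderCreatedEvent>("order-created", {
  deliveryGuarantee: "at-least-once",
});

new Subscription(orderCreated, "send-notification", {
  handler: async (event) => { /* ... */ },
});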

The cron was a CronJob:

import { CronJob } from "encore.dev/cron";

const _ = new CronJob("daily-aggregation", {
  every: "24h",
  endpoint: runDailyAggregation,
});

Schema migrations went into numbered SQL files, which Encore tracks and applies in order on every deploy. The Run 3 rubric checks (the production-readiness tests added in the third run) landed as small changes against the existing declarations: a retryPolicy: { maxRetries: 3 } on the existing subscription, encore.dev/log for structured logging, and Encore's service migrations for the schema-versioning check.

Express, Fastify, Hono, NestJS

On the four non-Encore frameworks Claude converged on the same three anti-patterns. Pub/sub was a Postgres queue table polled by setInterval:

await pool.query(
  `INSERT INTO event_queue (event_type, payload, status) VALUES ($1, $2, 'pending')`,
  ['order-created', { order_id: id }]
);

setInterval(async () => {
  // SELECT ... FOR UPDATE SKIP LOCKED, then process or bump retry counter
}, 500);

The daily cron was a setTimeout chain scheduled at startup, so with more than one replica the job runs once on each of them. The schema was CREATE TABLE IF NOT EXISTS at boot, with no migration history. All of these pass the test suite. None of them is what you want in production.
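To make the per-replica problem concrete, here is a minimal in-process simulation. It is not code from the benchmark: SharedLockStore, runNaive, and runCoordinated are hypothetical names, and the in-memory lock only stands in for whatever shared coordination (a database row, an advisory lock, a managed scheduler) a real deployment would use.

```typescript
// Hypothetical simulation; none of these names come from the benchmark.

// Stands in for state that every replica can see (e.g. a database table).
class SharedLockStore {
  private claimed = new Set<string>();

  // Returns true only for the first caller that claims a given key.
  tryClaim(key: string): boolean {
    if (this.claimed.has(key)) return false;
    this.claimed.add(key);
    return true;
  }
}

// Naive version: every replica's own timer fires and does the work.
function runNaive(replicaCount: number): number {
  let executions = 0;
  for (let i = 0; i < replicaCount; i++) {
    executions += 1; // each replica runs the aggregation independently
  }
  return executions;
}

// Coordinated version: a replica runs the job only if it claims the day's key.
function runCoordinated(
  replicaCount: number,
  day: string,
  locks: SharedLockStore,
): number {
  let executions = 0;
  for (let i = 0; i < replicaCount; i++) {
    if (locks.tryClaim(`daily-aggregation:${day}`)) {
      executions += 1;
    }
  }
  return executions;
}

const naiveRuns = runNaive(3);
const coordinatedRuns = runCoordinated(3, "2026-01-01", new SharedLockStore());
console.log({ naiveRuns, coordinatedRuns });
```

With three simulated replicas, the naive version counts three executions and the coordinated version counts one. A multi-instance-safe cron has to provide that second behavior somewhere, whether in the framework or in a library the agent wires up.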

Numbers per framework

Framework	Run 1 cost (median)	Run 3 cost	Run 3 rubric (out of 36)	Total cost across three runs
Encore	$1.96	$2.58	36/36	$6.29
Hono	$1.55	not reported	29/36	~$8
Fastify	similar to Encore	$4.60	36/36	~$10
Express	similar to Encore	not reported	35/36	~$9
NestJS	$2.61 (one $4.45 outlier)	$5.95	30/36	$12.69
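One way to read the Run 3 columns together is cost per passed rubric check. The sketch below is a derived illustration rather than a metric the benchmark reports: it uses only the three rows with a reported Run 3 cost, and the Run3Row type and costPerCheck helper are names made up for this example.

```typescript
// Figures copied from the table above; only rows with a reported Run 3 cost.
interface Run3Row {
  framework: string;
  costUsd: number;
  checksPassed: number;
}

const run3Rows: Run3Row[] = [
  { framework: "Encore", costUsd: 2.58, checksPassed: 36 },
  { framework: "Fastify", costUsd: 4.6, checksPassed: 36 },
  { framework: "NestJS", costUsd: 5.95, checksPassed: 30 },
];

// Cost per passed rubric check, rounded to the nearest tenth of a cent.
function costPerCheck(row: Run3Row): number {
  return Math.round((row.costUsd / row.checksPassed) * 1000) / 1000;
}

for (const row of run3Rows) {
  console.log(`${row.framework}: $${costPerCheck(row).toFixed(3)} per passed check`);
}
```

This works out to roughly $0.072 per check on Encore, $0.128 on Fastify, and $0.198 on NestJS.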

What Run 2 and Run 3 added

In Run 2 we pre-installed pg-boss, drizzle-kit, and pino into each non-Encore starter with a README explaining what each library was for. Every non-Encore framework regressed. None of them landed a first-try-green run across three repeats. The most common failure was Claude registering a pg-boss scheduled job without first creating the queue (pg-boss v10 requires boss.createQueue('name') before sending or scheduling, and Claude did not know). On NestJS the failure was a module-wiring bug: Claude imported pg-boss into a service but did not register the wrapping provider in the module.
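The queue-ordering failure is easy to show in isolation. The sketch below does not import pg-boss; it defines a hypothetical QueueClient interface with just the two method names mentioned above (createQueue and schedule) and an in-memory fake that rejects scheduling against a queue that was never created. The cron expression and the registerDailyJob helper are assumptions for illustration.

```typescript
// Minimal structural subset of a queue client. The method names follow the
// description of pg-boss v10 above; the interface itself is hypothetical.
interface QueueClient {
  createQueue(name: string): Promise<void>;
  schedule(name: string, cron: string): Promise<void>;
}

// Create the queue first, then register the schedule against it.
async function registerDailyJob(boss: QueueClient, name: string): Promise<void> {
  await boss.createQueue(name);
  await boss.schedule(name, "0 0 * * *"); // midnight every day
}

// In-memory fake that rejects scheduling on a queue that was never created,
// mimicking the failure mode described above.
class FakeQueueClient implements QueueClient {
  readonly queues = new Set<string>();
  readonly schedules = new Map<string, string>();

  async createQueue(name: string): Promise<void> {
    this.queues.add(name);
  }

  async schedule(name: string, cron: string): Promise<void> {
    if (!this.queues.has(name)) {
      throw new Error(`Queue ${name} does not exist`);
    }
    this.schedules.set(name, cron);
  }
}
```

Calling registerDailyJob(new FakeQueueClient(), "daily-aggregation") succeeds, while calling schedule on its own against a fresh client rejects. That matches the shape of the failure seen in the Run 2 transcripts.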

In Run 3 we wrote five production-readiness tests into the suite (multi-instance-safe cron, retry plus DLQ, a failed-message endpoint, versioned migrations, structured logging) and gave Claude a higher per-task turn budget. Encore reached 36 of 36 with the one-line changes above. Fastify was the cleanest non-Encore result, hitting every check at $4.60 by composing pg-boss + drizzle-kit + pino. Express came one test short on the migrations check. Hono finished 29 of 36 with tracing broken on unknown order ids. NestJS finished 30 of 36 at $5.95 after shipping a TypeScript error on its own starter.
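For readers unfamiliar with the "retry plus DLQ" check, the following is a minimal synchronous sketch of the behavior it looks for: a bounded number of retries, after which the message is parked in a dead-letter list that a failed-message endpoint could expose. It is an illustration under assumed names (QueuedMessage, processWithRetry), not the benchmark's test or any framework's implementation.

```typescript
// Hypothetical sketch; message shape and function names are illustrative only.
interface QueuedMessage<T> {
  payload: T;
  attempts: number;
}

interface ProcessResult<T> {
  delivered: T[];
  deadLetters: QueuedMessage<T>[];
}

// Tries each message up to maxRetries + 1 times. Messages that still fail
// move to a dead-letter list instead of being dropped or retried forever.
function processWithRetry<T>(
  payloads: T[],
  handler: (payload: T) => void,
  maxRetries: number,
): ProcessResult<T> {
  const delivered: T[] = [];
  const deadLetters: QueuedMessage<T>[] = [];

  for (const payload of payloads) {
    const message: QueuedMessage<T> = { payload, attempts: 0 };
    let ok = false;
    while (!ok && message.attempts <= maxRetries) {
      message.attempts += 1;
      try {
        handler(payload);
        ok = true;
      } catch (_err) {
        // Swallow the error and retry; a real system would back off here.
      }
    }
    if (ok) delivered.push(payload);
    else deadLetters.push(message);
  }
  return { delivered, deadLetters };
}

// Usage: order 2 always fails, so it is dead-lettered after 4 attempts
// (1 initial try plus 3 retries).
const retryDemo = processWithRetry(
  [1, 2, 3],
  (orderId) => {
    if (orderId === 2) throw new Error("notification failed");
  },
  3,
);
console.log(retryDemo.delivered, retryDemo.deadLetters);
```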

Why Encore wins for Claude Code

Three reasons. First, Encore ships materials Claude Code reads: encore llm-rules init writes a CLAUDE.md calibrated to the framework's conventions, encore mcp start exposes the live app structure (services, endpoints, database schemas) over MCP, and the llms.txt plus the AI-integration docs page give the model the rest of the context it needs. Second, Encore's framework primitives encode the production-readiness guarantees that an agent would otherwise have to assemble from library compositions, so the shortest path to a green test suite is also the shortest path to code that is safe to deploy. Third, the combined effect is the lowest total cost across the three runs ($6.29) and the lowest Run 3 cost ($2.58) in the benchmark. Fastify also reached 36 of 36, but at $4.60 for that run, and Encore was the only framework whose first draft was already production-ready.

How to set up Claude Code with Encore

encore app create my-app
cd my-app
encore llm-rules init     # writes CLAUDE.md
encore mcp start          # starts the MCP server
claude                    # opens Claude Code in this project

From there Claude reads the framework's conventions out of CLAUDE.md, queries the live app state through MCP, and reaches for the right primitives when extending the codebase.

When to choose a different framework

If you have an existing Fastify, Express, NestJS, or Hono codebase and you are not planning to let Claude Code drive a meaningful share of your work, the additional cost the benchmark measured does not apply to you. If you are letting Claude Code drive but your application is small enough or simple enough that the production-readiness checks the rubric grades are not relevant (a static-ish edge API, a throwaway prototype, a tool that runs on a single replica), the cheaper baseline runs on Hono or Express may be a better trade.

For anything else, Encore is the framework to pick: it finished with the lowest total cost in the benchmark and a full 36 of 36 on the rubric.

Reproduce the benchmark

Clone the repo, point it at your own framework, or rewrite the rubric to match your own definition of production-ready: github.com/encoredev/ai-backend-benchmark.
