03/09/26

pgvector vs Qdrant in 2026

One database with a vector extension, or two databases each doing one thing well?

9 Min Read

You're adding vector search to your backend. You already run PostgreSQL for application data. The question is whether to extend it with pgvector or add Qdrant as a dedicated vector database alongside it.

Both are open-source. Both handle similarity search with HNSW indexing and various distance metrics. The core difference is operational: pgvector adds vector search to the database you already have, keeping documents and embeddings in the same table. Qdrant is a standalone service built from the ground up for vector workloads, written in Rust, with a richer feature set for advanced filtering and quantization.

The tradeoff is simplicity versus specialization. pgvector means one database. Qdrant means two databases and sync logic between them, but more headroom for pure vector workloads at scale.

Quick Comparison

| Aspect | pgvector | Qdrant |
| --- | --- | --- |
| What it is | PostgreSQL extension | Dedicated vector database |
| Language | C (Postgres extension) | Rust |
| Open source | Yes (PostgreSQL license) | Yes (Apache 2.0) |
| Setup | CREATE EXTENSION vector | Docker container or Qdrant Cloud |
| Infrastructure | Your existing Postgres | Separate service |
| Index types | HNSW, IVFFlat | HNSW (with quantization options) |
| Max practical scale | Millions (single instance) | Hundreds of millions (distributed) |
| Query latency | 5-50ms | 5-30ms |
| Consistency | Transactional (same DB as app data) | Eventually consistent |
| Filtering | SQL WHERE clauses, joins, subqueries | Payload filtering (indexed fields) |
| Quantization | Half-vector support | Scalar + product quantization |
| Cost model | Part of your Postgres bill | Self-hosted or Qdrant Cloud pricing |

Setup

pgvector

If you run Postgres, pgvector is a migration:

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  title TEXT NOT NULL,
  content TEXT NOT NULL,
  category TEXT,
  embedding vector(1536)
);

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

That's it. Your embeddings live alongside your application data. Every tool, ORM, and monitoring system that works with Postgres works with pgvector.

Qdrant

Qdrant runs as a separate service. The fastest local setup is Docker:

docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

Then create a collection and insert data through the REST API or client SDK:

import { QdrantClient } from "@qdrant/js-client-rest";

const client = new QdrantClient({ url: "http://localhost:6333" });

await client.createCollection("documents", {
  vectors: { size: 1536, distance: "Cosine" },
});

// "embedding" is a number[] (1536 dimensions here) from your embedding model
await client.upsert("documents", {
  points: [{
    id: 1,
    vector: embedding,
    payload: { title: "Getting started", category: "docs" },
  }],
});

Qdrant also offers Qdrant Cloud as a managed option if you don't want to run the Docker container yourself.

The setup difference matters in the long run. pgvector uses your existing Postgres infrastructure: same backups, same monitoring, same credentials. Qdrant is a second service with its own deployment, its own storage, and its own failure modes.

Query Performance

Both use HNSW indexes for approximate nearest-neighbor search, so baseline query performance is similar for typical workloads.
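To ground the numbers, the underlying comparison is simple: pgvector's <=> operator computes cosine distance (Qdrant's Cosine metric is the complementary similarity). A minimal TypeScript sketch of that per-pair calculation, which the HNSW index exists to avoid running against every vector:

```typescript
// Cosine distance between two vectors: 1 - (a · b) / (|a| * |b|).
// This is the cost of a single comparison; HNSW's job is to avoid
// paying it N times per query.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Identical vectors score 0, orthogonal vectors score 1, and a brute-force scan over a million 1536-dimension vectors means a million of these loops, which is why both systems lean on approximate indexes.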

pgvector returns results in 5-50ms depending on dataset size, dimensions, and HNSW parameters. For 1M vectors with 1536 dimensions on a db.r6g.xlarge RDS instance, expect 10-20ms per query at 95%+ recall. Performance depends on how much of the HNSW index fits in shared_buffers.
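A back-of-envelope check for whether the index fits in memory: an HNSW index holds each full-precision vector plus its graph links. The sketch below uses an approximate formula (4 bytes per dimension, ~8 bytes per link, ~2×m links per vector); treat those constants as ballpark assumptions, not pgvector's exact on-disk layout:

```typescript
// Ballpark HNSW index memory: full-precision vectors plus graph links.
// Assumed formula (not pgvector's exact layout): per vector,
// 4 bytes per dimension + ~8 bytes per link, with ~2*m links at layer 0.
function estimateHnswBytes(vectors: number, dims: number, m = 16): number {
  const vectorBytes = 4 * dims;
  const linkBytes = 8 * 2 * m;
  return vectors * (vectorBytes + linkBytes);
}

const gb = estimateHnswBytes(1_000_000, 1536) / 1024 ** 3;
console.log(`~${gb.toFixed(1)} GB for 1M x 1536-dim vectors`); // roughly 6 GB
```

If that figure is well above your shared_buffers setting, expect queries to touch disk and latency to drift toward the high end of the range.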

Qdrant returns results in 5-30ms for similar workloads. Because Qdrant is purpose-built for vector operations, its Rust implementation includes optimizations Postgres can't match: SIMD-accelerated distance calculations, memory-mapped storage, and a query engine designed specifically for nearest-neighbor search.

Where Qdrant pulls ahead: at higher vector counts (10M+) and higher query concurrency. Qdrant's architecture is optimized for these workloads. pgvector shares resources with your other Postgres queries, meaning heavy relational workloads and heavy vector workloads compete for the same instance resources.

For datasets under a few million vectors with moderate query volume, both perform well enough that other factors (consistency, filtering, ops overhead) should drive the decision.

Filtering

Filtering is where the architecture difference matters most.

pgvector uses SQL. Your vectors live in the same table as your application data, so filtering is a WHERE clause:

SELECT id, title, content,
       1 - (embedding <=> $1) AS similarity
FROM documents
WHERE category = 'engineering'
  AND created_at > NOW() - INTERVAL '30 days'
  AND author_id IN (SELECT id FROM authors WHERE team = 'backend')
ORDER BY embedding <=> $1
LIMIT 10;

You can join against other tables, use subqueries, apply any Postgres function, and combine vector similarity with relational filters in a single query. The Postgres query planner handles the execution strategy.

Qdrant has its own payload filtering system with indexed fields:

const results = await client.search("documents", {
  vector: queryEmbedding,
  limit: 10,
  filter: {
    must: [
      { key: "category", match: { value: "engineering" } },
      { key: "created_at", range: { gt: "2026-02-01T00:00:00Z" } },
    ],
  },
});

Qdrant's payload filtering is efficient. It creates indexes on specified payload fields and integrates filtering with the HNSW search. For straightforward filters (equals, range, geo), it works well. But you can't join against other data, run subqueries, or use the full expressiveness of SQL.
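Because Qdrant filters are plain JSON, they compose well in application code. A hypothetical helper (buildTenantFilter and the tenant_id/category field names are illustrative, not part of Qdrant's SDK) that produces the filter shape shown above:

```typescript
// Build a Qdrant filter scoping a search to one tenant, with an
// optional category condition. Field names are illustrative; create
// payload indexes on them in Qdrant for efficient filtering.
type Condition = { key: string; match: { value: string } };
type QdrantFilter = { must: Condition[] };

function buildTenantFilter(tenantId: string, category?: string): QdrantFilter {
  const must: Condition[] = [{ key: "tenant_id", match: { value: tenantId } }];
  if (category) {
    must.push({ key: "category", match: { value: category } });
  }
  return { must };
}
```

Pass the result as the filter option to a search call. This covers the common multi-tenant case cleanly; what it cannot express is anything that needs data outside the collection, which is where SQL pulls ahead.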

If your filtering needs are simple (filter by tenant, category, date range), both work fine. If you need complex relational queries combined with vector search, pgvector is significantly more capable.

Quantization and Memory Efficiency

Qdrant has a clear advantage for memory-constrained deployments. It supports scalar quantization (reducing float32 to int8, cutting memory by 4x) and product quantization (compressing vectors further at the cost of some recall). These are configurable per collection:

await client.createCollection("documents", {
  vectors: { size: 1536, distance: "Cosine" },
  quantization_config: {
    scalar: { type: "int8", quantile: 0.99, always_ram: true },
  },
});

Quantization lets you store more vectors on the same hardware. For a dataset that would require a 32GB instance with full float32 vectors, scalar quantization might fit it on an 8GB instance with minimal recall loss.
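The memory arithmetic is straightforward. A sketch comparing raw vector storage at the three relevant precisions (vector data only; HNSW graph overhead and internal bookkeeping are excluded):

```typescript
// Raw vector storage for N vectors of d dimensions at a given precision.
// float32 = 4 bytes/dim, float16 (pgvector's halfvec) = 2,
// int8 (Qdrant scalar quantization) = 1. Excludes index graph overhead.
function vectorStorageGB(vectors: number, dims: number, bytesPerDim: number): number {
  return (vectors * dims * bytesPerDim) / 1024 ** 3;
}

const n = 5_000_000, d = 1536;
console.log(`float32: ${vectorStorageGB(n, d, 4).toFixed(1)} GB`); // ~28.6 GB
console.log(`float16: ${vectorStorageGB(n, d, 2).toFixed(1)} GB`); // ~14.3 GB
console.log(`int8:    ${vectorStorageGB(n, d, 1).toFixed(1)} GB`); // ~7.2 GB
```

The same 5M-vector dataset drops from roughly 29GB of vector data to roughly 7GB with int8 scalar quantization, which is the difference between instance tiers.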

pgvector supports halfvec, a half-precision (16-bit) vector type that cuts storage and index size by 2x, and binary quantization via expression indexes over bit vectors. This is less flexible than Qdrant's quantization options but covers the most common use cases. Product quantization isn't available natively in pgvector.

If you're optimizing for memory usage and need to fit the maximum number of vectors on available hardware, Qdrant gives you more knobs to turn.

Distributed Scaling

Qdrant supports distributed mode with sharding and replication. You can spread your collection across multiple nodes, replicate shards for availability, and scale reads by adding replicas. This is real horizontal scaling for vector workloads.

pgvector scales with your Postgres instance, vertically. A larger instance means more RAM for the HNSW index. You can partition tables to keep individual indexes smaller, and you can use read replicas for query scaling, but you can't shard a vector index across multiple Postgres instances without significant application-level logic.
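To make that "application-level logic" concrete: sharding vectors across Postgres instances yourself means routing each document to a shard, fanning every search out to all shards, and merging results. A minimal sketch of just the routing step (hash-modulo; real systems also need rebalancing, which this ignores):

```typescript
// Route a document id to one of N Postgres shards via hash-modulo.
// Searches must still fan out to every shard and merge top-k results
// client-side; this sketch covers only the routing step.
function shardFor(id: string, shardCount: number): number {
  let hash = 0;
  for (const ch of id) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // unsigned 32-bit rolling hash
  }
  return hash % shardCount;
}
```

Every query then runs the same ORDER BY embedding <=> $1 LIMIT k against all shards, and the application merges k × shards candidates. That fan-out-and-merge is exactly the work Qdrant's distributed mode does for you.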

For workloads under 5 million vectors, vertical scaling with Postgres is fine. Beyond that, Qdrant's distributed mode gives you a path to hundreds of millions of vectors across a cluster.

Consistency

This is pgvector's strongest advantage.

pgvector: Documents and embeddings live in the same Postgres table. An INSERT writes both atomically. An UPDATE or DELETE affects both in the same transaction. Your search results are always consistent with your application state. There is no sync problem because there is nothing to sync.

Qdrant: Your documents live in Postgres and your vectors live in Qdrant. When you create a document, you write to both. When you delete one, you delete from both. If one write fails, you have either a document without an embedding (invisible to search) or an orphaned vector (search returns a result that no longer exists). You need retry logic, a dead-letter queue, or a background reconciliation job.

For many applications, the sync overhead is manageable. But it's engineering effort that doesn't exist with pgvector. If your application requires strict consistency between application state and search results, pgvector eliminates an entire class of bugs.
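A sketch of what that dual-write path tends to look like (writeDocument, upsertVector, and enqueueRepair are stubs standing in for your Postgres write, Qdrant upsert, and dead-letter queue; this is one reasonable pattern, not the only one):

```typescript
// Dual-write: Postgres first (source of truth), then Qdrant.
// If the vector write fails, park the id for background reconciliation
// instead of failing the whole request.
async function createDocument(
  doc: { id: string; content: string; embedding: number[] },
  deps: {
    writeDocument: (d: { id: string; content: string }) => Promise<void>;
    upsertVector: (id: string, v: number[]) => Promise<void>;
    enqueueRepair: (id: string) => Promise<void>;
  }
): Promise<void> {
  await deps.writeDocument({ id: doc.id, content: doc.content }); // must succeed
  try {
    await deps.upsertVector(doc.id, doc.embedding);
  } catch {
    // Document exists but is invisible to search until a repair job re-syncs it.
    await deps.enqueueRepair(doc.id);
  }
}
```

Deletes need the mirror-image handling (remove the vector, then the row, or tolerate orphans until reconciliation), and the repair job itself is code you write, monitor, and debug.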

Operational Overhead

pgvector adds zero new infrastructure. Your embeddings are Postgres columns. Backups, monitoring, failover, access control: everything uses your existing Postgres setup. If you run managed Postgres (RDS, Cloud SQL, Neon), pgvector is supported and the management story doesn't change.

Qdrant is a separate service. Self-hosted, that means a Docker container (or Kubernetes deployment), its own storage volume, its own monitoring, and its own upgrade path. Qdrant Cloud removes the hosting burden but adds another vendor, another API key, and another service in your architecture.

For a team of five shipping AI features alongside everything else, the operational difference between maintaining one database and maintaining two is significant. For a larger team with dedicated infrastructure, adding Qdrant is standard practice.

If you use a framework that handles database provisioning, the Postgres side becomes simpler. With Encore, for example, databases are declared in code and provisioned automatically, locally and in production:

import { SQLDatabase } from "encore.dev/storage/sqldb";

const db = new SQLDatabase("search", {
  migrations: "./migrations",
});

The migration enables pgvector, and the database is available with no connection strings or environment variables to configure. Qdrant still needs its own deployment regardless of how you manage Postgres.

When to Choose pgvector

pgvector is the better choice when:

  • You already run Postgres. Adding vector search is one migration, not a new service.
  • Your dataset is under 5 million vectors. Performance stays well within an acceptable range.
  • Transactional consistency matters. Writing documents and embeddings in the same transaction eliminates sync bugs.
  • You need complex filtering. SQL joins and subqueries beat any proprietary filter syntax.
  • You want minimal operational overhead. One database to manage, not two.
  • Your team is small. Every additional service you run has a cost in attention and maintenance.

If you want to try it, see Getting Started with pgvector below for a working setup in a few lines of code.

When to Choose Qdrant

Qdrant is the better choice when:

  • You need to scale beyond what a single Postgres instance handles. Qdrant's distributed mode supports hundreds of millions of vectors across a cluster.
  • You need advanced quantization. Scalar and product quantization let you fit more vectors in less memory.
  • Pure vector workload performance matters. Qdrant's Rust engine with SIMD acceleration outperforms pgvector on raw vector throughput at scale.
  • You prefer open-source and self-hosted. Qdrant is Apache 2.0 licensed and runs anywhere.
  • You're already running Qdrant. If your infrastructure includes Qdrant, the operational cost is already paid.

For a practical tutorial building semantic search with Qdrant and Encore, see Building Semantic Search with Qdrant. For a broader comparison of all vector database options, see Best Vector Databases in 2026.

Getting Started with pgvector

For most teams, pgvector is the simpler starting point. You avoid a second service, get transactional consistency, and use the SQL you already know.

If you're building a TypeScript backend, Encore provisions PostgreSQL with pgvector automatically. Declare a database, write a migration, and vector search works locally and in production:

import { api } from "encore.dev/api";
import { SQLDatabase } from "encore.dev/storage/sqldb";

const db = new SQLDatabase("search", {
  migrations: "./migrations",
});

export const search = api(
  { expose: true, method: "POST", path: "/search" },
  async (req: { query: string; limit?: number }): Promise<{ results: SearchResult[] }> => {
    const embedding = await generateEmbedding(req.query);

    const rows = await db.query<SearchResult>`
      SELECT id, title, content,
             1 - (embedding <=> ${embedding}::vector) AS similarity
      FROM documents
      ORDER BY embedding <=> ${embedding}::vector
      LIMIT ${req.limit ?? 5}
    `;

    const results: SearchResult[] = [];
    for await (const row of rows) {
      results.push(row);
    }

    return { results };
  }
);

For a complete tutorial building a RAG pipeline with pgvector, see How to Build a RAG Pipeline with TypeScript.

If your workload outgrows pgvector (tens of millions of vectors, need for distributed search, or advanced quantization requirements), Qdrant is a natural next step and a strong open-source option.


Have questions about vector search architecture? Join our Discord community where developers discuss infrastructure decisions daily.

Ready to build your next backend?

Encore is the Open Source framework for building robust type-safe distributed systems with declarative infrastructure.