You're adding vector search to your backend. You already run PostgreSQL for application data. The question is whether to extend it with pgvector or add Qdrant as a dedicated vector database alongside it.
Both are open-source. Both handle similarity search with HNSW indexing and various distance metrics. The core difference is operational: pgvector adds vector search to the database you already have, keeping documents and embeddings in the same table. Qdrant is a standalone service built from the ground up for vector workloads, written in Rust, with a richer feature set for advanced filtering and quantization.
The tradeoff is simplicity versus specialization. pgvector means one database. Qdrant means two databases and sync logic between them, but more headroom for pure vector workloads at scale.
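Both engines ultimately rank candidates by a distance metric, with cosine distance being the most common choice for text embeddings. As a minimal sketch of what the databases compute under the hood (plain TypeScript, not either engine's actual implementation):

```typescript
// Cosine distance between two embedding vectors: 1 - cos(theta).
// Both pgvector's `<=>` operator and Qdrant's "Cosine" metric rank
// neighbors by this value (lower = more similar).
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

An HNSW index avoids computing this against every stored vector by walking a graph of likely neighbors, which is why both systems return approximate rather than exact results.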
| Aspect | pgvector | Qdrant |
|---|---|---|
| What it is | PostgreSQL extension | Dedicated vector database |
| Language | C (Postgres extension) | Rust |
| Open source | Yes (PostgreSQL license) | Yes (Apache 2.0) |
| Setup | CREATE EXTENSION vector | Docker container or Qdrant Cloud |
| Infrastructure | Your existing Postgres | Separate service |
| Index types | HNSW, IVFFlat | HNSW (with quantization options) |
| Max practical scale | Millions (single instance) | Hundreds of millions (distributed) |
| Query latency | 5-50ms | 5-30ms |
| Consistency | Transactional (same DB as app data) | Separate store; app must keep data in sync |
| Filtering | SQL WHERE clauses, joins, subqueries | Payload filtering (indexed fields) |
| Quantization | Half-vector support | Scalar + product quantization |
| Cost model | Part of your Postgres bill | Self-hosted or Qdrant Cloud pricing |
If you run Postgres, pgvector is a migration:
```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  title TEXT NOT NULL,
  content TEXT NOT NULL,
  category TEXT,
  embedding vector(1536)
);

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
```
That's it. Your embeddings live alongside your application data. Every tool, ORM, and monitoring system that works with Postgres works with pgvector.
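From application code, inserting a row is an ordinary parameterized INSERT; the only pgvector-specific detail is serializing the embedding as a vector literal. A sketch, assuming the `documents` table above and an embedding you've already generated (the `pool.query` usage assumes node-postgres, but any Postgres driver works):

```typescript
// pgvector accepts vectors as text literals like '[0.1,0.2,0.3]'.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

// Usage with a Postgres client (hypothetical `pool`):
// await pool.query(
//   "INSERT INTO documents (title, content, category, embedding) VALUES ($1, $2, $3, $4::vector)",
//   [title, content, category, toVectorLiteral(embedding)]
// );
```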
Qdrant runs as a separate service. The fastest local setup is Docker:
```bash
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
```
Then create a collection and insert data through the REST API or client SDK:
```typescript
import { QdrantClient } from "@qdrant/js-client-rest";

const client = new QdrantClient({ url: "http://localhost:6333" });

await client.createCollection("documents", {
  vectors: { size: 1536, distance: "Cosine" },
});

// `embedding` is a number[] produced by your embedding model
await client.upsert("documents", {
  points: [{
    id: 1,
    vector: embedding,
    payload: { title: "Getting started", category: "docs" },
  }],
});
```
Qdrant also offers Qdrant Cloud as a managed option if you don't want to run the Docker container yourself.
The setup difference matters in the long run. pgvector uses your existing Postgres infrastructure: same backups, same monitoring, same credentials. Qdrant is a second service with its own deployment, its own storage, and its own failure modes.
Both use HNSW indexes for approximate nearest-neighbor search, so baseline query performance is similar for typical workloads.
pgvector returns results in 5-50ms depending on dataset size, dimensions, and HNSW parameters. For 1M vectors with 1536 dimensions on a db.r6g.xlarge RDS instance, expect 10-20ms per query at 95%+ recall. Performance depends on how much of the HNSW index fits in shared_buffers.
Qdrant returns results in 5-30ms for similar workloads. Being purpose-built for vector operations, Qdrant's Rust implementation has optimizations that Postgres can't match: SIMD-accelerated distance calculations, memory-mapped storage, and a query engine designed specifically for nearest-neighbor search.
Where Qdrant pulls ahead: at higher vector counts (10M+) and higher query concurrency. Qdrant's architecture is optimized for these workloads. pgvector shares resources with your other Postgres queries, meaning heavy relational workloads and heavy vector workloads compete for the same instance resources.
For datasets under a few million vectors with moderate query volume, both perform well enough that other factors (consistency, filtering, ops overhead) should drive the decision.
Filtering is where the architecture difference matters most.
pgvector uses SQL. Your vectors live in the same table as your application data, so filtering is a WHERE clause:
```sql
SELECT id, title, content,
       1 - (embedding <=> $1) AS similarity
FROM documents
WHERE category = 'engineering'
  AND created_at > NOW() - INTERVAL '30 days'
  AND author_id IN (SELECT id FROM authors WHERE team = 'backend')
ORDER BY embedding <=> $1
LIMIT 10;
```
You can join against other tables, use subqueries, apply any Postgres function, and combine vector similarity with relational filters in a single query. The Postgres query planner handles the execution strategy.
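From application code, queries like this are usually assembled with numbered placeholders. A sketch of a small helper that builds the WHERE clause and parameter list from simple equality filters (a hypothetical helper, not part of pgvector; column names come from trusted code, never user input, since identifiers can't be parameterized):

```typescript
// Build a filtered vector-search query with numbered placeholders.
// $1 is reserved for the query embedding; filter values become $2, $3, ...
function buildSearchQuery(
  filters: Record<string, string>,
  limit: number
): { sql: string; params: string[] } {
  const params: string[] = [];
  const clauses = Object.entries(filters).map(([col, val]) => {
    params.push(val);
    return `${col} = $${params.length + 1}`;
  });
  const where = clauses.length ? `WHERE ${clauses.join(" AND ")}` : "";
  const sql =
    `SELECT id, title, 1 - (embedding <=> $1) AS similarity ` +
    `FROM documents ${where} ` +
    `ORDER BY embedding <=> $1 LIMIT ${limit}`;
  return { sql, params };
}
```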
Qdrant has its own payload filtering system with indexed fields:
```typescript
const results = await client.search("documents", {
  vector: queryEmbedding,
  limit: 10,
  filter: {
    must: [
      { key: "category", match: { value: "engineering" } },
      { key: "created_at", range: { gt: "2026-02-01T00:00:00Z" } },
    ],
  },
});
```
Qdrant's payload filtering is efficient. It creates indexes on specified payload fields and integrates filtering with the HNSW search. For straightforward filters (equals, range, geo), it works well. But you can't join against other data, run subqueries, or use the full expressiveness of SQL.
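When filters are built dynamically (per-tenant, per-category), it helps to assemble the `must` array programmatically. A sketch of a hypothetical helper that turns simple equality conditions into the AND-semantics filter shape shown above:

```typescript
type MatchCondition = { key: string; match: { value: string | number | boolean } };

// Build a Qdrant filter where every condition must hold (AND semantics).
function mustMatch(
  conditions: Record<string, string | number | boolean>
): { must: MatchCondition[] } {
  return {
    must: Object.entries(conditions).map(([key, value]) => ({
      key,
      match: { value },
    })),
  };
}

// Usage: client.search("documents", { vector, limit: 10, filter: mustMatch({ category: "docs" }) })
```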
If your filtering needs are simple (filter by tenant, category, date range), both work fine. If you need complex relational queries combined with vector search, pgvector is significantly more capable.
Qdrant has a clear advantage for memory-constrained deployments. It supports scalar quantization (reducing float32 to int8, cutting memory by 4x) and product quantization (compressing vectors further at the cost of some recall). These are configurable per collection:
```typescript
await client.createCollection("documents", {
  vectors: { size: 1536, distance: "Cosine" },
  quantization_config: {
    scalar: { type: "int8", quantile: 0.99, always_ram: true },
  },
});
```
Quantization lets you store more vectors on the same hardware. For a dataset that would require a 32GB instance with full float32 vectors, scalar quantization might fit it on an 8GB instance with minimal recall loss.
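The arithmetic behind that claim is straightforward. A sketch computing raw vector memory for float32 versus int8 (this ignores HNSW graph overhead, which adds more on top):

```typescript
// Raw vector storage in GiB for n vectors of dim dimensions.
function vectorMemoryGiB(n: number, dim: number, bytesPerComponent: number): number {
  return (n * dim * bytesPerComponent) / 1024 ** 3;
}

// 5M vectors at 1536 dimensions:
const float32 = vectorMemoryGiB(5_000_000, 1536, 4); // ~28.6 GiB
const int8 = vectorMemoryGiB(5_000_000, 1536, 1);    // ~7.2 GiB
```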
pgvector supports halfvec (half-precision vectors), which halves storage. This is less flexible than Qdrant's quantization options but covers the most common use case. Full product quantization isn't available natively in pgvector.
If you're optimizing for memory usage and need to fit the maximum number of vectors on available hardware, Qdrant gives you more knobs to turn.
Qdrant supports distributed mode with sharding and replication. You can spread your collection across multiple nodes, replicate shards for availability, and scale reads by adding replicas. This is real horizontal scaling for vector workloads.
pgvector scales with your Postgres instance, vertically. A larger instance means more RAM for the HNSW index. You can partition tables to keep individual indexes smaller, and you can use read replicas for query scaling, but you can't shard a vector index across multiple Postgres instances without significant application-level logic.
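The "application-level logic" is essentially routing plus scatter-gather. A minimal sketch of the routing half, under the assumption of modulo-based shard assignment (real systems often use consistent hashing to ease resharding):

```typescript
// Route a document to one of nShards Postgres instances by its id.
function shardFor(id: number, nShards: number): number {
  return id % nShards;
}

// Querying then requires fanning out to every shard and merging results:
// const perShard = await Promise.all(shards.map((s) => s.query(sql, [vec])));
// const merged = perShard.flat()
//   .sort((a, b) => b.similarity - a.similarity)
//   .slice(0, k);
```

Qdrant's distributed mode handles this routing, replication, and merging internally, which is the point of the comparison.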
For workloads under 5 million vectors, vertical scaling with Postgres is fine. Beyond that, Qdrant's distributed mode gives you a path to hundreds of millions of vectors across a cluster.
Consistency is pgvector's strongest advantage.
pgvector: Documents and embeddings live in the same Postgres table. An INSERT writes both atomically. An UPDATE or DELETE affects both in the same transaction. Your search results are always consistent with your application state. There is no sync problem because there is nothing to sync.
Qdrant: Your documents live in Postgres and your vectors live in Qdrant. When you create a document, you write to both. When you delete one, you delete from both. If one write fails, you have either a document without an embedding (invisible to search) or an orphaned vector (search returns a result that no longer exists). You need retry logic, a dead-letter queue, or a background reconciliation job.
For many applications, the sync overhead is manageable. But it's engineering effort that doesn't exist with pgvector. If your application requires strict consistency between application state and search results, pgvector eliminates an entire class of bugs.
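The core of a background reconciliation job reduces to a set difference between the two stores. A sketch, assuming you can list document ids from Postgres and point ids from Qdrant (the helper name is hypothetical):

```typescript
// Compare ids on both sides and report what the reconciler must fix.
function findDrift(postgresIds: number[], qdrantIds: number[]): {
  missingVectors: number[];  // documents with no embedding: re-embed and upsert
  orphanedVectors: number[]; // vectors with no document: delete from Qdrant
} {
  const pg = new Set(postgresIds);
  const qd = new Set(qdrantIds);
  return {
    missingVectors: postgresIds.filter((id) => !qd.has(id)),
    orphanedVectors: qdrantIds.filter((id) => !pg.has(id)),
  };
}
```

Run on a schedule, this catches the failure modes described above without requiring every dual write to succeed atomically.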
pgvector adds zero new infrastructure. Your embeddings are Postgres columns. Backups, monitoring, failover, access control: everything uses your existing Postgres setup. If you run managed Postgres (RDS, Cloud SQL, Neon), pgvector is supported and the management story doesn't change.
Qdrant is a separate service. Self-hosted, that means a Docker container (or Kubernetes deployment), its own storage volume, its own monitoring, and its own upgrade path. Qdrant Cloud removes the hosting burden but adds another vendor, another API key, and another service in your architecture.
For a team of five shipping AI features alongside everything else, the operational difference between maintaining one database and maintaining two is significant. For a larger team with dedicated infrastructure, adding Qdrant is standard practice.
If you use a framework that handles database provisioning, the Postgres side becomes simpler. With Encore, for example, databases are declared in code and provisioned automatically, locally and in production:
```typescript
import { SQLDatabase } from "encore.dev/storage/sqldb";

const db = new SQLDatabase("search", {
  migrations: "./migrations",
});
```
The migration enables pgvector, and the database is available with no connection strings or environment variables to configure. Qdrant still needs its own deployment regardless of how you manage Postgres.
pgvector is the better choice when:

- You already run PostgreSQL and want to avoid operating a second service.
- You need transactional consistency between documents and embeddings.
- Your filtering involves joins, subqueries, or other relational logic.
- Your dataset is under a few million vectors with moderate query volume.
If you want to try it, see Getting Started with pgvector below for a working setup in a few lines of code.
Qdrant is the better choice when:

- Your dataset is in the tens of millions of vectors or beyond.
- You need horizontal scaling with sharding and replication.
- You want scalar or product quantization to cut memory usage.
- Vector search is a dedicated, high-concurrency workload you don't want competing with relational queries.
For a practical tutorial building semantic search with Qdrant and Encore, see Building Semantic Search with Qdrant. For a broader comparison of all vector database options, see Best Vector Databases in 2026.
For most teams, pgvector is the simpler starting point. You avoid a second service, get transactional consistency, and use the SQL you already know.
If you're building a TypeScript backend, Encore provisions PostgreSQL with pgvector automatically. Declare a database, write a migration, and vector search works locally and in production:
```typescript
import { api } from "encore.dev/api";
import { SQLDatabase } from "encore.dev/storage/sqldb";

interface SearchResult {
  id: number;
  title: string;
  content: string;
  similarity: number;
}

const db = new SQLDatabase("search", {
  migrations: "./migrations",
});

export const search = api(
  { expose: true, method: "POST", path: "/search" },
  async (req: { query: string; limit?: number }): Promise<{ results: SearchResult[] }> => {
    // generateEmbedding calls your embedding provider (e.g. OpenAI)
    const embedding = await generateEmbedding(req.query);
    const rows = await db.query<SearchResult>`
      SELECT id, title, content,
             1 - (embedding <=> ${embedding}::vector) AS similarity
      FROM documents
      ORDER BY embedding <=> ${embedding}::vector
      LIMIT ${req.limit ?? 5}
    `;
    const results: SearchResult[] = [];
    for await (const row of rows) {
      results.push(row);
    }
    return { results };
  }
);
```
For a complete tutorial building a RAG pipeline with pgvector, see How to Build a RAG Pipeline with TypeScript.
If your workload outgrows pgvector (tens of millions of vectors, need for distributed search, or advanced quantization requirements), Qdrant is a natural next step and a strong open-source option.
Have questions about vector search architecture? Join our Discord community where developers discuss infrastructure decisions daily.