03/09/26

Best Vector Databases in 2026

A practical comparison of pgvector, Pinecone, Qdrant, Weaviate, Milvus, Chroma, and LanceDB

12 Min Read

Every AI feature that works with your own data (semantic search, RAG pipelines, recommendation engines, document classifiers) needs somewhere to store and query vector embeddings. The vector database market has grown from a handful of options to dozens, each with different tradeoffs around performance, operational complexity, cost, and scale.

This guide compares seven vector databases that cover the spectrum: from extending PostgreSQL with an extension to fully managed cloud services to embedded databases that run in-process. The right choice depends on your existing infrastructure, the scale of your workload, and how much operational overhead you're willing to take on.
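Whatever engine you pick, the core operation is the same: rank stored vectors by distance to a query vector. Cosine similarity is the most common metric, and it's simple enough to compute directly. A minimal TypeScript sketch:

```typescript
// Cosine similarity between two embedding vectors:
// dot(a, b) / (|a| * |b|), ranging from -1 (opposite) to 1 (identical direction).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1 (same direction)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (orthogonal)
```

Every database below implements an approximate, indexed version of this ranking so it stays fast at millions of vectors.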

Quick Comparison

| Database | Type | Hosting | Open Source | Best Scale | Standout Feature |
|---|---|---|---|---|---|
| pgvector | Postgres extension | Self-hosted / managed Postgres | Yes | Millions | Same DB as your app data |
| Pinecone | Managed SaaS | Pinecone cloud | No | Billions | Zero-ops serverless |
| Qdrant | Dedicated vector DB | Self-hosted / Qdrant Cloud | Yes (Apache 2.0) | Hundreds of millions | Payload filtering + Rust perf |
| Weaviate | Dedicated vector DB | Self-hosted / Weaviate Cloud | Yes (BSD-3) | Hundreds of millions | Built-in vectorization modules |
| Milvus | Dedicated vector DB | Self-hosted / Zilliz Cloud | Yes (Apache 2.0) | Billions | GPU-accelerated, enterprise scale |
| Chroma | Embedded / client-server | In-process or self-hosted | Yes (Apache 2.0) | Hundreds of thousands | Developer experience, prototyping |
| LanceDB | Embedded | In-process (serverless cloud in beta) | Yes (Apache 2.0) | Millions | Zero-copy, columnar storage |

pgvector: Best for Teams Already Running Postgres

pgvector is a PostgreSQL extension that adds a vector column type with support for cosine similarity, L2 distance, and inner product operations. It supports both HNSW and IVFFlat indexing.

Key features:

  • Vectors and application data in the same table, same transaction
  • Standard SQL for queries, filtering, and joins
  • HNSW and IVFFlat index types
  • Works with every managed Postgres provider (RDS, Cloud SQL, Supabase, Neon)
  • No additional service to deploy, monitor, or pay for

Best for:

  • Teams that already run Postgres and want to add vector search without adding infrastructure
  • Workloads under 5 million vectors where query latency is acceptable at 5-50ms
  • Applications where transactional consistency between documents and embeddings matters
  • Complex filtering that benefits from SQL joins and subqueries

Limitations:

  • Scales vertically (bigger instance = more vectors in memory). No built-in horizontal scaling for vector indexes.
  • Performance tuning at scale requires Postgres expertise (shared_buffers, work_mem, HNSW parameters).
  • Not designed for workloads above tens of millions of vectors on a single instance.
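The vertical-scaling limitation is mostly memory arithmetic: float32 vectors cost 4 bytes per dimension, and the HNSW index adds overhead on top. A rough estimate, using the 5 million vector guideline above:

```typescript
// Rough memory estimate for raw float32 vector storage.
// HNSW index overhead is NOT included and comes on top of this.
function rawVectorBytes(count: number, dims: number): number {
  return count * dims * 4; // 4 bytes per float32 component
}

const gib = rawVectorBytes(5_000_000, 1536) / 1024 ** 3;
console.log(`${gib.toFixed(1)} GiB`); // ≈ 28.6 GiB before index overhead
```

At 5M OpenAI-sized (1536-dim) vectors you're already sizing instances around tens of GiB of RAM, which is why workloads far beyond that point toward a horizontally scalable option.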

Example:

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  content TEXT NOT NULL,
  category TEXT,
  embedding vector(1536)
);

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Similarity search with SQL filtering
SELECT id, content, 1 - (embedding <=> $1) AS similarity
FROM documents
WHERE category = 'engineering'
ORDER BY embedding <=> $1
LIMIT 10;
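The HNSW defaults are a sensible starting point, but the index exposes build-time and query-time knobs that trade recall against speed. These are real pgvector options; the values shown are illustrative, not recommendations:

```sql
-- Build-time: m controls graph connectivity (default 16),
-- ef_construction controls build quality (default 64).
CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

-- Query-time: higher ef_search means better recall but slower
-- queries (default 40). Can be set per-session or per-transaction.
SET hnsw.ef_search = 100;
```

This is the "Postgres expertise" referenced above: the defaults work for most workloads, but at scale you benchmark recall and latency and adjust.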

pgvector is the default recommendation for teams that already have Postgres in their stack. It avoids the operational complexity of a second data store and gives you transactional consistency that dedicated vector databases can't match. If you want to try it out, see the getting started section below for a working setup in a few lines of code. For a detailed head-to-head comparison, see pgvector vs Pinecone.

Pinecone: Best Managed Vector Database

Pinecone is a fully managed vector database offered as a cloud service. You get an API key, create an index, and start querying. No instances to size, no indexes to tune, no infrastructure to manage.

Key features:

  • Serverless tier that auto-scales with query volume and data size
  • Namespace isolation for multi-tenant applications
  • Metadata filtering on vector results
  • Supports sparse-dense hybrid search
  • SOC 2 Type II compliance

Best for:

  • Teams that want zero operational overhead for vector search
  • Multi-tenant applications at scale where per-tenant isolation matters
  • Workloads at hundreds of millions or billions of vectors
  • Organizations that need managed compliance and SLAs

Limitations:

  • Proprietary, closed-source. Your data and indexes are in Pinecone's cloud.
  • Eventually consistent. Writes take time to become searchable.
  • Metadata filtering is limited compared to SQL (no joins, no subqueries).
  • Cost can scale quickly at high query volumes.
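The eventual-consistency caveat mainly bites in tests and ingest-then-query flows. One way to cope is a small polling helper; this is a hypothetical sketch, not part of the Pinecone SDK:

```typescript
// Hypothetical helper: retry a check until a freshly upserted record
// becomes searchable, since writes are eventually consistent.
async function waitUntilVisible(
  check: () => Promise<boolean>, // e.g. query the index for the new record's id
  { attempts = 10, delayMs = 500 } = {},
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    if (await check()) return true;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false; // gave up; record still not visible
}
```

In tests, wrap a query for the new record's id in `check`; in production paths it's usually better to design around the lag than to poll.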

Example:

import { Pinecone } from "@pinecone-database/pinecone";

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
const index = pc.index("documents");

// Upsert
await index.upsert([{
  id: "doc-1",
  values: embedding,
  metadata: { category: "engineering" },
}]);

// Query
const results = await index.query({
  vector: queryEmbedding,
  topK: 10,
  filter: { category: { $eq: "engineering" } },
  includeMetadata: true,
});

Pinecone is the right choice when you need scale and managed infrastructure above everything else. For a detailed comparison with pgvector, see pgvector vs Pinecone.

Qdrant: Best Open-Source Dedicated Vector Database

Qdrant is an open-source vector database written in Rust. It's designed from the ground up for vector search, with a focus on performance, payload filtering, and a rich query API.

Key features:

  • Written in Rust with strong single-node performance
  • Rich payload filtering with indexed fields (numeric, keyword, geo, datetime)
  • Supports quantization (scalar and product) for memory efficiency
  • Distributed mode with sharding and replication
  • Available as self-hosted (Docker) or managed cloud (Qdrant Cloud)
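To make the quantization feature concrete: scalar quantization maps each float32 component to an int8, cutting vector memory roughly 4x for a small recall cost. A simplified sketch of the idea, not Qdrant's actual implementation:

```typescript
// Map each float32 component into an int8 bucket over the observed
// value range [min, max]. 255 buckets, offset to the int8 range.
function quantize(vector: number[], min: number, max: number): Int8Array {
  const out = new Int8Array(vector.length);
  const scale = 255 / (max - min);
  for (let i = 0; i < vector.length; i++) {
    out[i] = Math.round((vector[i] - min) * scale) - 128;
  }
  return out;
}

// Recover approximate float values from the int8 representation.
function dequantize(q: Int8Array, min: number, max: number): number[] {
  const scale = (max - min) / 255;
  return Array.from(q, (v) => (v + 128) * scale + min);
}
```

Distances computed on the compressed vectors are approximate, which is why engines that quantize typically offer rescoring against the original vectors for the top candidates.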

Best for:

  • Teams that want a dedicated vector database but prefer open-source
  • Workloads that need complex filtering alongside vector similarity
  • Applications that benefit from Qdrant's quantization for large datasets on limited memory
  • Self-hosted deployments where you control the infrastructure

Limitations:

  • Another service to deploy, monitor, and maintain (unless using Qdrant Cloud)
  • Data lives separately from your application database, requires sync logic
  • Distributed mode adds operational complexity
  • Smaller ecosystem and community than Postgres-based solutions

Example:

import { QdrantClient } from "@qdrant/js-client-rest";

const client = new QdrantClient({ url: "http://localhost:6333" });

// Create collection
await client.createCollection("documents", {
  vectors: { size: 1536, distance: "Cosine" },
});

// Upsert
await client.upsert("documents", {
  points: [{
    id: 1,
    vector: embedding,
    payload: { category: "engineering", date: "2026-03-01" },
  }],
});

// Search with filtering
const results = await client.search("documents", {
  vector: queryEmbedding,
  limit: 10,
  filter: {
    must: [{ key: "category", match: { value: "engineering" } }],
  },
});

Qdrant is a strong choice if you need a dedicated vector database and want to stay open-source. If you're using Encore, we have a tutorial for building semantic search with Qdrant. For a comparison with pgvector, see pgvector vs Qdrant.

Weaviate: Best for Built-in Vectorization

Weaviate is an open-source vector database that goes beyond storage and search. It includes built-in modules for generating embeddings, so you can insert raw text and Weaviate handles the vectorization.

Key features:

  • Vectorization modules for OpenAI, Cohere, Hugging Face, and more (insert text, get vectors automatically)
  • GraphQL and REST APIs
  • Hybrid search combining vector similarity with BM25 keyword search
  • Multi-tenancy support
  • Available self-hosted or as Weaviate Cloud
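Hybrid search fuses two ranked lists, one from BM25 keyword matching and one from vector similarity. A common fusion method is reciprocal rank fusion (RRF); this standalone sketch shows the idea, independent of Weaviate's own implementation:

```typescript
// Reciprocal rank fusion: each result earns 1 / (k + rank) per list,
// summed across lists. k (commonly 60) damps the weight of top ranks.
function reciprocalRankFusion(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  // Return results sorted by fused score, best first.
  return new Map([...scores.entries()].sort((a, b) => b[1] - a[1]));
}

const fused = reciprocalRankFusion([
  ["doc-a", "doc-c", "doc-b"], // hypothetical BM25 keyword ranking
  ["doc-a", "doc-b", "doc-d"], // hypothetical vector similarity ranking
]);
console.log([...fused.keys()][0]); // "doc-a" ranks first in both lists, so it wins
```

A document that ranks well in both lists beats one that dominates a single list, which is the behavior you want from hybrid search.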

Best for:

  • Teams that want the database to handle embedding generation
  • Applications that need hybrid search (semantic + keyword) built in
  • Use cases where the GraphQL query API is preferred
  • Rapid prototyping where managing embedding pipelines is overhead

Limitations:

  • Vectorization modules add latency and API costs (they call the same embedding APIs you'd call yourself)
  • GraphQL API adds a learning curve compared to SQL or simple REST
  • Resource-heavy for self-hosting (in-memory HNSW indexes demand significant RAM at scale)
  • Operational complexity increases with modules and multi-tenancy

Example:

import weaviate from "weaviate-client";

const client = await weaviate.connectToLocal();

// Insert raw text, Weaviate vectorizes it
await client.collections.get("Document").data.insert({
  title: "Vector Search Guide",
  content: "pgvector adds vector search to PostgreSQL...",
});

// Semantic search
const results = await client.collections.get("Document")
  .query.nearText("how does vector search work", { limit: 10 });

Weaviate is interesting if you want embedding generation baked into the database. But the vectorization modules call the same APIs you'd call in your application code, so the convenience trades away control over the embedding pipeline.

Milvus: Best for Enterprise Scale

Milvus is an open-source vector database designed for large-scale deployments. Its managed version, Zilliz Cloud, offers enterprise features and GPU-accelerated search.

Key features:

  • GPU-accelerated indexing and search (via Zilliz Cloud)
  • Supports billions of vectors across distributed clusters
  • Multiple index types (IVF, HNSW, DiskANN, GPU indexes)
  • Schema enforcement with typed fields
  • Partition keys for multi-tenant data isolation

Best for:

  • Enterprise workloads at billions of vectors
  • Teams that need GPU-accelerated search for low-latency at massive scale
  • Applications requiring multiple index types tuned for different query patterns
  • Organizations with dedicated infrastructure teams

Limitations:

  • Complex to self-host. Requires etcd, MinIO (or S3), and message queues in distributed mode
  • Steep learning curve compared to simpler options
  • Overkill for most workloads under millions of vectors
  • Resource-intensive even in standalone mode

Example:

import { MilvusClient, DataType } from "@zilliz/milvus2-sdk-node";

const client = new MilvusClient({ address: "localhost:19530" });

await client.createCollection({
  collection_name: "documents",
  fields: [
    { name: "id", data_type: DataType.Int64, is_primary_key: true, auto_id: true },
    { name: "embedding", data_type: DataType.FloatVector, type_params: { dim: 1536 } },
    { name: "category", data_type: DataType.VarChar, max_length: 256 },
  ],
});

const results = await client.search({
  collection_name: "documents",
  vector: queryEmbedding,
  limit: 10,
});

Milvus is the heavy-duty option. If you're processing billions of vectors and need GPU acceleration or advanced indexing, it's built for that. For most teams, it's more infrastructure than the workload requires.

Chroma: Best for Prototyping and Local Development

Chroma is an open-source embedding database focused on developer experience. It runs in-process (embedded) or as a client-server setup, making it the fastest path from zero to a working vector search setup.

Key features:

  • Embeds directly in your Python or JavaScript process, no server to run
  • Automatic embedding generation with pluggable models
  • Simple API: add, query, update, delete
  • Persistent storage to disk
  • Also runs as a standalone server for production

Best for:

  • Prototyping and experimentation
  • Local development and testing of AI features
  • Small datasets (thousands to hundreds of thousands of vectors)
  • Educational projects and proof of concepts

Limitations:

  • Performance degrades above hundreds of thousands of vectors
  • Limited production deployment story (no managed cloud offering yet)
  • No built-in replication or horizontal scaling
  • JavaScript/TypeScript client is less mature than the Python client

Example:

import { ChromaClient } from "chromadb";

const client = new ChromaClient();
const collection = await client.createCollection({ name: "documents" });

// Add with auto-generated embeddings
await collection.add({
  ids: ["doc-1"],
  documents: ["pgvector adds vector search to PostgreSQL..."],
  metadatas: [{ category: "database" }],
});

// Query
const results = await collection.query({
  queryTexts: ["how does vector search work"],
  nResults: 10,
});

Chroma is great for getting started and testing ideas. For production workloads, you'll likely graduate to pgvector (if you want simplicity) or a dedicated vector database (if you need scale).

LanceDB: Best Embedded Vector Database

LanceDB is an open-source embedded vector database built on the Lance columnar format. It runs in-process with zero-copy access to data, making it fast for local workloads without a running server.

Key features:

  • Zero-copy, columnar storage based on the Lance format
  • In-process operation, no separate server required
  • Supports disk-based indexing (IVF-PQ) for datasets larger than RAM
  • Automatic versioning of data
  • Serverless cloud offering in beta

Best for:

  • Applications where an embedded database is preferred (edge, desktop, local-first)
  • Data science workflows that need fast iteration without a server
  • Workloads where disk-based indexing enables larger-than-memory datasets
  • Teams already using the Arrow/Lance ecosystem

Limitations:

  • Relatively new, with a smaller community and fewer production deployments
  • Cloud offering is still in beta
  • Multi-process concurrent access has limitations
  • Ecosystem and integrations are still maturing

Example:

import * as lancedb from "lancedb";

const db = await lancedb.connect("data/lancedb");

const table = await db.createTable("documents", [
  { id: 1, text: "pgvector adds vector search...", vector: embedding },
]);

const results = await table.search(queryEmbedding).limit(10).toArray();

LanceDB is worth watching if you're building local-first applications or working in data science workflows where running a server is overhead. For backend services, pgvector or a managed option is usually a better fit.

How to Choose

Start with pgvector if:

You already run PostgreSQL (most backend teams do). pgvector adds vector search without adding infrastructure. Documents and embeddings live in the same table, in the same transaction. You use SQL for filtering. There's no sync pipeline, no extra credentials, no new service to monitor. For workloads under 5 million vectors, performance is more than adequate.

Add a dedicated vector database if:

Your workload exceeds what a single Postgres instance handles comfortably (hundreds of millions of vectors, high concurrent query throughput, or requirements for auto-scaling and per-tenant isolation). Choose between:

  • Pinecone if you want fully managed with zero ops
  • Qdrant if you want open-source with strong performance and payload filtering
  • Milvus if you need enterprise scale with GPU acceleration
  • Weaviate if you want built-in vectorization and hybrid search

Use an embedded database if:

You're prototyping, building local-first, or don't want to run a server. Chroma for the simplest API and getting started fast. LanceDB for larger-than-memory datasets with disk-based indexing.

Decision Matrix

| Consideration | Recommended Option |
|---|---|
| Already running Postgres, <5M vectors | pgvector |
| Zero operational overhead, any scale | Pinecone |
| Open-source, dedicated, self-hosted | Qdrant |
| Built-in vectorization, hybrid search | Weaviate |
| Billions of vectors, enterprise | Milvus |
| Prototyping, local dev, learning | Chroma |
| Embedded, local-first, edge | LanceDB |

Getting Started with pgvector

For most teams adding AI features to an existing backend, pgvector is the simplest path. You avoid a separate service, get transactional consistency, and keep the SQL tooling you already know.

If you're building a TypeScript backend, Encore provisions PostgreSQL with pgvector automatically. Databases are declared in code, migrations run on startup, and the same setup works locally and in production:

import { api } from "encore.dev/api";
import { SQLDatabase } from "encore.dev/storage/sqldb";

const db = new SQLDatabase("search", {
  migrations: "./migrations",
});

// Shape of each row returned by the similarity query below.
interface SearchResult {
  id: number;
  title: string;
  similarity: number;
}

export const search = api(
  { expose: true, method: "POST", path: "/search" },
  async (req: { query: string }): Promise<{ results: SearchResult[] }> => {
    const embedding = await generateEmbedding(req.query);

    const rows = await db.query<SearchResult>`
      SELECT id, title, 1 - (embedding <=> ${embedding}::vector) AS similarity
      FROM documents
      ORDER BY embedding <=> ${embedding}::vector
      LIMIT 5
    `;

    const results: SearchResult[] = [];
    for await (const row of rows) {
      results.push(row);
    }

    return { results };
  }
);

For a step-by-step tutorial building a complete RAG pipeline, see How to Build a RAG Pipeline with TypeScript. For a head-to-head comparison with the most popular managed alternative, see pgvector vs Pinecone.


Have questions about choosing a vector database? Join our Discord community where developers discuss architecture decisions daily.

Ready to build your next backend?

Encore is the Open Source framework for building robust type-safe distributed systems with declarative infrastructure.