03/09/26

pgvector vs Pinecone in 2026

When your Postgres can already do vector search, do you need a dedicated service?

10 Min Read

You're adding semantic search, a recommendation engine, or a RAG pipeline to your backend. You need somewhere to store and query vector embeddings. The two most common choices: extend the PostgreSQL you already run with pgvector, or add Pinecone as a managed vector database alongside it.

Both store vectors and return nearest neighbors. The difference is operational. pgvector keeps everything in one database. Pinecone is a separate service optimized for vector workloads at massive scale. The right choice depends on how many vectors you're working with, how much operational overhead you're willing to take on, and whether you need features that only a dedicated system provides.

Quick Comparison

| Aspect | pgvector | Pinecone |
| --- | --- | --- |
| What it is | PostgreSQL extension | Managed vector database (SaaS) |
| Setup | CREATE EXTENSION vector | API key + SDK |
| Infrastructure | Your existing Postgres | Pinecone's cloud |
| Index types | HNSW, IVFFlat | Proprietary (graph-based, built on FreshDiskANN) |
| Max vectors | Millions (limited by instance RAM) | Billions (serverless scales automatically) |
| Query latency | 5-50ms (depends on dataset + index) | 10-50ms (p50 single-digit at scale) |
| Consistency | Transactional (same DB as your data) | Eventually consistent |
| Filtering | SQL WHERE clauses | Metadata filtering (proprietary syntax) |
| Cost model | Part of your Postgres bill | Per-query + storage + dimensions |
| Open source | Yes (PostgreSQL license) | No |

Setup and Getting Started

pgvector

pgvector is an extension you enable on an existing PostgreSQL instance. If you're already running Postgres, it's one migration:

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  title TEXT NOT NULL,
  content TEXT NOT NULL,
  embedding vector(1536)
);

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

Your embeddings live in the same table as your application data. You query them with SQL. Any tool that talks to Postgres (ORMs, migration frameworks, monitoring) works without changes.
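One small gotcha when querying from application code: pgvector parses vector input from a bracketed text literal, so a JavaScript array needs to be serialized before binding. A minimal sketch (the helper name is ours, not part of pgvector):

```typescript
// pgvector accepts vector input as a text literal like '[0.1,0.2,0.3]'.
// Hypothetical helper for binding a JS number[] as a query parameter.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

// Usage with any Postgres client (node-postgres shown as an example):
// client.query(
//   "SELECT id, title FROM documents ORDER BY embedding <=> $1::vector LIMIT 5",
//   [toVectorLiteral(queryEmbedding)]
// );
```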

Most managed Postgres providers (AWS RDS, Google Cloud SQL, Supabase, Neon) support pgvector out of the box. If you self-host, you install the extension and run the migration.

Pinecone

Pinecone is a standalone service. You sign up, create an index, and interact through their SDK:

import { Pinecone } from "@pinecone-database/pinecone";

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });

const index = pc.index("documents");

// Upsert vectors
await index.upsert([
  {
    id: "doc-1",
    values: embedding, // float array
    metadata: { title: "Getting started", category: "docs" },
  },
]);

// Query
const results = await index.query({
  vector: queryEmbedding,
  topK: 5,
  includeMetadata: true,
});

The index is managed for you. No Postgres, no extensions, no instance sizing. Pinecone handles replication, scaling, and index optimization.

The tradeoff: your documents live in Postgres and your embeddings live in Pinecone. You maintain two data stores for what is logically one entity.

Query Performance

For workloads under a million vectors, both return results in single-digit to low double-digit milliseconds. The performance difference between them is smaller than the latency of the embedding API call that precedes the search, and far smaller than the LLM generation that follows it in a RAG pipeline.

pgvector with an HNSW index queries 1M vectors in 5-20ms at 95%+ recall. Performance depends on your Postgres instance size (specifically available RAM for the index), the number of dimensions, and your HNSW build parameters (m and ef_construction). Tuning these is straightforward, and the pgvector documentation covers the tradeoffs.

Pinecone reports single-digit millisecond p50 latency on their serverless tier for typical workloads. Their proprietary index is optimized for vector operations, and they handle the tuning automatically. At scale, Pinecone's advantage is that latency stays consistent as you add more vectors. The serverless architecture scales the index transparently.

Where the gap widens: above 5-10 million vectors, pgvector requires careful tuning of shared_buffers, work_mem, and HNSW parameters to maintain fast queries. Pinecone handles this automatically. At hundreds of millions or billions of vectors, pgvector hits practical limits tied to your instance's memory, while Pinecone is designed for that scale.

For the workloads most teams actually have (documentation search, support ticket classification, product recommendations, internal knowledge bases), both perform well and the choice should be driven by other factors.

Scaling

pgvector scales with your Postgres instance. More RAM means a larger index fits in memory, which means faster queries. Vertical scaling is the primary path: a larger RDS instance or Cloud SQL machine. You can also partition tables by tenant or category to keep individual indexes smaller. But pgvector doesn't offer horizontal scaling for vector search. You can't shard an HNSW index across multiple Postgres instances without significant application-level work.
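To reason about instance sizing, a back-of-envelope memory estimate helps. This sketch assumes float4 storage (4 bytes per dimension) plus a rough per-vector graph overhead proportional to the HNSW m parameter; it is an approximation for capacity planning, not pgvector's exact on-disk layout:

```typescript
// Rough HNSW memory estimate: raw vectors plus neighbor-link overhead.
// Assumptions: 4 bytes per dimension (float4), roughly 2*m links of ~8 bytes
// each per vector. Real overhead varies with pgvector version and page layout.
function estimateHnswMemoryBytes(vectors: number, dims: number, m = 16): number {
  const vectorBytes = vectors * dims * 4; // float4 per dimension
  const graphBytes = vectors * m * 2 * 8; // neighbor links (rough)
  return vectorBytes + graphBytes;
}

// 1M OpenAI-sized embeddings (1536 dims) at the default m=16 comes out to
// roughly 6 GiB, which is why instance RAM is the binding constraint:
const approxGiB = estimateHnswMemoryBytes(1_000_000, 1536) / 1024 ** 3;
```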

Pinecone scales horizontally by design. Their serverless tier auto-scales based on query volume and data size. You don't manage instances, replicas, or shards. For multi-tenant applications where each tenant's data grows independently, Pinecone's per-namespace isolation and automatic scaling simplify the architecture significantly.

If your dataset is under 5 million vectors and query volume is moderate (hundreds of QPS), pgvector on a properly sized Postgres instance handles it comfortably. Beyond that, Pinecone's managed scaling starts to justify the added complexity and cost.

Cost

pgvector adds no cost beyond your existing Postgres infrastructure. The vectors are stored as columns in regular tables. If your Postgres instance already has headroom, vector search is essentially free. If you need a larger instance to fit the HNSW index in memory, the cost increase is your Postgres provider's pricing for the next instance tier, typically $50-200/month more.

Pinecone has its own pricing model. The serverless tier charges based on read units, write units, and storage. A free Starter plan covers up to 100K vectors. Beyond that, the Standard tier starts at $50/month. For moderate workloads (1-5M vectors, thousands of queries per day), expect $100-500/month. At scale, costs can reach thousands per month depending on query volume and dimensions.

The hidden cost with Pinecone is operational, not financial. You maintain two data stores, two sets of credentials, sync logic between them, and monitoring for both. When an embedding is missing from Pinecone but the document exists in Postgres, you need a reconciliation process. When you delete a document, you delete from two places. These are solvable problems, but they add ongoing engineering time.
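The reconciliation process can be as simple as a periodic set diff between the IDs in each store. A sketch of that pass (the function and field names are ours, not a Pinecone API):

```typescript
// Hypothetical reconciliation pass: compare document IDs in Postgres with
// vector IDs in Pinecone and report what each side is missing.
function reconcile(postgresIds: string[], pineconeIds: string[]) {
  const inPinecone = new Set(pineconeIds);
  const inPostgres = new Set(postgresIds);
  return {
    // Documents with no embedding: re-embed and upsert to Pinecone.
    missingEmbeddings: postgresIds.filter((id) => !inPinecone.has(id)),
    // Vectors with no document: delete from Pinecone.
    orphanedVectors: pineconeIds.filter((id) => !inPostgres.has(id)),
  };
}
```

Run it on a schedule, and feed the two lists into your re-embedding and deletion jobs. This is exactly the kind of ongoing machinery a single-database setup never needs.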

With pgvector, a single INSERT INTO documents (title, content, embedding) VALUES (...) writes both the document and its vector atomically. There's no sync logic to maintain because there's nothing to sync.

Transactional Consistency

This is pgvector's strongest advantage.

With pgvector, your documents and their embeddings live in the same Postgres table, in the same transaction. When you insert a document with its embedding, either both are written or neither is. When you update or delete a document, the embedding changes in the same transaction. Your search results are always consistent with your application state.

With Pinecone, writes are eventually consistent. When you insert a document in Postgres and its embedding in Pinecone, there's a window where one exists without the other. If the Pinecone write fails, you have a document with no embedding. If you delete from Postgres but the Pinecone delete fails, you have an orphaned vector that returns in search results pointing to a document that no longer exists.

For many applications, eventual consistency is fine. A few seconds of delay before a new document appears in search results is acceptable. But for applications where consistency matters (financial document search, compliance workflows, real-time inventory), the transactional guarantee of pgvector is a significant advantage.

Metadata Filtering

pgvector uses standard SQL for filtering. You can filter on any column in the table, use joins, subqueries, CTEs, and anything else Postgres supports:

SELECT id, title, 1 - (embedding <=> $1) AS similarity
FROM documents
WHERE category = 'engineering'
  AND created_at > NOW() - INTERVAL '30 days'
  AND author_id IN (SELECT id FROM authors WHERE team = 'backend')
ORDER BY embedding <=> $1
LIMIT 10;

The query planner combines the vector similarity search with the relational filters. You get the full power of SQL, which you already know.
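For clarity on what that query computes: <=> is pgvector's cosine distance operator (with vector_cosine_ops), and 1 - distance recovers cosine similarity. The equivalent math in plain TypeScript, as a reference for what Postgres is doing per row:

```typescript
// Cosine distance as pgvector's <=> operator computes it:
// 1 - (a . b) / (|a| * |b|). Similarity is then 1 - distance.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```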

Pinecone supports metadata filtering with its own query syntax:

const results = await index.query({
  vector: queryEmbedding,
  topK: 10,
  filter: {
    category: { $eq: "engineering" },
    created_at: { $gt: "2026-02-01" },
  },
});

Pinecone's filtering works well for straightforward metadata filters. But you can't join against other tables, run subqueries, or use the rich filtering capabilities that SQL provides. If your filtering needs are simple (filter by category, date range, tenant ID), Pinecone handles it fine. If they're complex, pgvector's SQL integration is a clear win.
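The two filter examples above are equivalent in the simple case. To make the mapping concrete, here is an illustrative translator covering just $eq and $gt; it also shows where the mapping ends, since joins and subqueries have no filter-object equivalent. The function and types are ours, not a Pinecone or pgvector API:

```typescript
// Translate a simple Pinecone-style filter object into a parameterized
// SQL WHERE clause. Only $eq and $gt are handled, to illustrate the mapping.
type Filter = Record<string, { $eq?: string | number; $gt?: string | number }>;

function filterToWhere(filter: Filter): { sql: string; params: (string | number)[] } {
  const clauses: string[] = [];
  const params: (string | number)[] = [];
  for (const [field, ops] of Object.entries(filter)) {
    if (ops.$eq !== undefined) {
      params.push(ops.$eq);
      clauses.push(`${field} = $${params.length}`);
    }
    if (ops.$gt !== undefined) {
      params.push(ops.$gt);
      clauses.push(`${field} > $${params.length}`);
    }
  }
  return { sql: clauses.join(" AND "), params };
}
```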

Managed vs Self-Run

Pinecone is fully managed. You don't provision instances, configure indexes, tune parameters, or handle failover. Their team manages the infrastructure, and you get an API key. For teams without database expertise or teams that want to focus entirely on application code, this is a legitimate advantage.

pgvector runs inside Postgres, which means the management story depends on how you run Postgres. On a managed service like RDS or Cloud SQL, you get automated backups, failover, and patching, but you still size the instance, tune parameters for vector workloads, and monitor memory usage as the index grows. Self-hosted Postgres adds the full operational burden.

If you use a framework that provisions databases automatically, the management overhead drops significantly. With Encore, for example, databases are declared in application code and provisioned automatically, locally for development and on your own AWS or GCP account for production:

import { SQLDatabase } from "encore.dev/storage/sqldb";

const db = new SQLDatabase("search", {
  migrations: "./migrations",
});

The migration enables pgvector, creates tables and indexes, and runs automatically. No manual provisioning in production, no connection strings to manage. The database is a code declaration, not an infrastructure ticket.

When to Choose pgvector

pgvector is the better choice when:

  • You already run Postgres. Adding vector search is a migration, not a new service. No new credentials, no new monitoring, no sync logic.
  • Your dataset is under 5 million vectors. HNSW indexes on a properly sized instance handle this comfortably with queries in the 5-20ms range.
  • Transactional consistency matters. Documents and embeddings in the same transaction means your search results are always accurate.
  • You need complex filtering. SQL joins, subqueries, and the full Postgres query planner beat any proprietary filter syntax.
  • You want to keep your architecture simple. One database instead of two. One backup strategy. One set of access controls.
  • Cost is a factor. pgvector adds no cost beyond your existing Postgres infrastructure.

If you want to try it, see Getting Started with pgvector below for a working setup in a few lines of code.

When to Choose Pinecone

Pinecone is the better choice when:

  • You have hundreds of millions or billions of vectors. Pinecone's architecture is built for this scale. pgvector on a single instance isn't.
  • You need auto-scaling without tuning. Pinecone's serverless tier handles variable query volume and growing datasets without manual intervention.
  • You want fully managed vector infrastructure. No Postgres tuning, no index parameter optimization, no memory monitoring.
  • You're building a multi-tenant application at scale. Pinecone's namespace isolation and automatic per-tenant scaling simplify the architecture.
  • Your team doesn't have Postgres expertise. Pinecone's API is straightforward, and the operational burden is near zero.

Getting Started with pgvector

For most teams adding vector search to an existing backend, pgvector is the simpler path. You keep your existing database, avoid sync complexity, and get transactional consistency for free.

If you're building a TypeScript backend, Encore provisions Postgres with pgvector automatically. Define a database in code, write a migration to enable the extension, and vector search works locally and in production without infrastructure configuration:

import { api } from "encore.dev/api";
import { SQLDatabase } from "encore.dev/storage/sqldb";

const db = new SQLDatabase("search", {
  migrations: "./migrations",
});

interface SearchResult {
  id: number;
  title: string;
  content: string;
  similarity: number;
}

export const search = api(
  { expose: true, method: "POST", path: "/search" },
  async (req: { query: string; limit?: number }): Promise<{ results: SearchResult[] }> => {
    // generateEmbedding is your embedding provider call (e.g. OpenAI), not shown
    const embedding = await generateEmbedding(req.query);

    const rows = await db.query<SearchResult>`
      SELECT id, title, content,
             1 - (embedding <=> ${embedding}::vector) AS similarity
      FROM documents
      ORDER BY embedding <=> ${embedding}::vector
      LIMIT ${req.limit ?? 5}
    `;

    const results: SearchResult[] = [];
    for await (const row of rows) {
      results.push(row);
    }

    return { results };
  }
);

For a full walkthrough building a RAG pipeline with pgvector and Encore, see How to Build a RAG Pipeline with TypeScript. For a broader comparison of vector database options, see Best Vector Databases in 2026.

If your workload does require a dedicated vector database (billions of vectors, auto-scaling multi-tenant search), Pinecone is a solid choice. But start with pgvector and move to Pinecone if and when you outgrow it. You might be surprised how far Postgres takes you.


Have questions about vector search architecture? Join our Discord community where developers discuss infrastructure decisions daily.

Ready to build your next backend?

Encore is the Open Source framework for building robust type-safe distributed systems with declarative infrastructure.