The vector database space has matured. Two years ago it was Pinecone vs. Weaviate and everyone was building their own. Today there are five or six credible options, and "just use pgvector" is increasingly the right answer.
This is how we pick, with actual benchmarks and cost comparisons, based on engagements we've run over the past year.
The short answer
- Default to `pgvector` on Postgres. It's enough for 80% of RAG systems up to ~10M vectors and will reduce operational complexity.
- Reach for `Qdrant` or `Weaviate` (self-hosted) when you exceed pgvector's performance or feature ceiling.
- Reach for `Pinecone` when you want zero ops and have budget.
- Reach for `Vespa` for hybrid search at massive scale.
- Don't build your own. You have better problems to solve.
Full analysis below.
What actually matters in a vector DB
Before comparing, be clear what you're evaluating:
1. Query performance at your scale — p95 latency and sustained queries per second
2. Recall — are you getting the true nearest vectors, or approximations that drop relevant results?
3. Filter performance — how fast is "vectors near X where tenant_id = Y"?
4. Hybrid search — combining vector similarity with keyword (BM25) matching
5. Cost — fully loaded, including compute, storage, and ops
6. Operational complexity — how hard is it to run, back up, and upgrade?
7. Integration — does it fit your existing stack?
8. Multi-tenancy — per-tenant isolation, if you need it
9. Metadata filtering — pre-filter vs. post-filter, complexity of queries
Most "best vector DB" comparisons only measure #1. In practice, #5 and #6 drive the decision.
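Recall is straightforward to measure once you have exact nearest neighbors from a brute-force pass over a sample of queries. A minimal sketch (pure Python; the ID lists are hypothetical):

```python
def recall_at_k(exact_ids, approx_ids, k=10):
    """Fraction of the true top-k neighbors that the ANN index returned.

    exact_ids:  IDs ranked by brute-force (exact) distance
    approx_ids: IDs returned by the ANN index under test
    """
    exact_top = set(exact_ids[:k])
    approx_top = set(approx_ids[:k])
    return len(exact_top & approx_top) / k

# Example: the index returns 9 of the 10 true nearest neighbors.
exact = list(range(10))                      # ground truth
approx = [0, 1, 2, 3, 4, 5, 6, 7, 8, 99]    # one miss
print(recall_at_k(exact, approx))            # 0.9
```

Averaging this over a few hundred real queries gives the recall@10 figure the benchmark below targets.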
The contenders
pgvector (Postgres extension)
What it is: An extension adding vector columns and ANN indexes (HNSW and IVFFlat) to Postgres.
Strengths:
- You already run Postgres. Operationally invisible.
- Full SQL for filtering — arbitrary `WHERE` clauses on metadata work naturally.
- Transactional consistency with the rest of your data.
- Cheap at small to medium scale.
- Mature ecosystem (pgvector is now 4+ years old, in active development).
Weaknesses:
- Performance degrades with very large datasets (>20M vectors gets tricky).
- Index builds can be slow on large tables.
- Less optimized than purpose-built vector DBs for pure vector workloads.
Our take: Start here. You'll probably never leave.
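The query pattern pgvector makes natural is "nearest vectors subject to an arbitrary SQL filter" — roughly `SELECT id FROM docs WHERE tenant_id = %s ORDER BY embedding <-> %s LIMIT 10`, where `<->` is pgvector's Euclidean distance operator. A brute-force sketch of the same semantics in pure Python, for illustration (table and field names are hypothetical; a real HNSW index replaces the full scan):

```python
import math

def l2(a, b):
    """Euclidean distance, what pgvector's <-> operator computes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def filtered_knn(rows, query_vec, k=10, tenant_id=None):
    """Brute-force equivalent of:
      SELECT id FROM docs WHERE tenant_id = %s
      ORDER BY embedding <-> %s LIMIT k;
    """
    candidates = [r for r in rows if tenant_id is None or r["tenant_id"] == tenant_id]
    candidates.sort(key=lambda r: l2(r["embedding"], query_vec))
    return [r["id"] for r in candidates[:k]]

rows = [
    {"id": 1, "tenant_id": "a", "embedding": [0.0, 0.0]},
    {"id": 2, "tenant_id": "b", "embedding": [0.1, 0.0]},
    {"id": 3, "tenant_id": "a", "embedding": [1.0, 1.0]},
]
print(filtered_knn(rows, [0.0, 0.0], k=2, tenant_id="a"))  # [1, 3]
```

The point is that the filter is ordinary SQL: any predicate you can write in a `WHERE` clause composes with the vector ordering.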
Qdrant (open source + cloud)
What it is: Purpose-built vector DB written in Rust. Open-source with a managed cloud.
Strengths:
- Very fast. Among the best performance/$ ratios.
- Excellent filter performance (payload indexing separate from vector indexes).
- Good multi-tenancy primitives.
- Rich Python/TS clients.
- Self-hostable via Docker, k8s helm chart.
Weaknesses:
- Smaller ecosystem than Pinecone.
- Self-hosted requires real Kubernetes knowledge at scale.
- Managed cloud is newer than Pinecone's offering.
Our take: Best open-source option in 2026. Qdrant Cloud is a solid managed choice.
Weaviate (open source + cloud)
What it is: Go-based vector DB with a schema-first design, built-in hybrid search, and strong module ecosystem.
Strengths:
- Excellent hybrid search out of the box (BM25 + vector).
- Built-in modules for generating embeddings (no separate embedding service needed).
- Good GraphQL and REST APIs.
- Mature managed cloud (WCS).
Weaknesses:
- More opinionated schema model — feels heavier than Qdrant for simple cases.
- Performance is solid but not class-leading.
- Memory-hungry at scale.
Our take: Strong choice if you need hybrid search and like the schema-first approach. Slight edge over Qdrant for hybrid-heavy use cases.
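Hybrid search is typically implemented as rank fusion of a BM25 result list and a vector result list. Reciprocal Rank Fusion (RRF) is a common scheme — Weaviate's ranked fusion works along these lines, though this is a generic sketch, not its exact implementation:

```python
def rrf_fuse(ranked_lists, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank).

    Documents ranked highly by either list float to the top; k=60 is the
    conventional damping constant from the original RRF paper.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits   = ["d3", "d1", "d7"]   # keyword ranking
vector_hits = ["d1", "d2", "d3"]   # similarity ranking
print(rrf_fuse([bm25_hits, vector_hits]))  # ['d1', 'd3', 'd2', 'd7']
```

Because RRF only uses ranks, you never have to normalize BM25 scores against cosine similarities, which is why it is a popular default.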
Pinecone (managed only)
What it is: The original purpose-built vector DB cloud service. Fully managed, serverless option available.
Strengths:
- Zero ops. Truly set-and-forget.
- Fast, with consistent latency SLAs.
- Mature ecosystem and SDK.
- Serverless pricing model (v2) is genuinely competitive.
Weaknesses:
- Closed-source. Vendor lock-in is real.
- Historically expensive (v2 serverless helps).
- No self-hosted option.
- Schema flexibility is limited vs. open-source alternatives.
Our take: The right choice if ops time is your bottleneck and you have budget. Otherwise, one of the open-source options with a managed tier wins.
Vespa (open source)
What it is: Yahoo's production search/ranking engine, open-sourced. Handles hybrid search, ML ranking, and vector search at massive scale.
Strengths:
- Battle-tested at billions of vectors, billions of queries per day.
- Native ML ranking integration (tensor evaluation, phased ranking).
- Hybrid search is a first-class citizen.
- Flexible document model.
Weaknesses:
- Steep learning curve. Custom application packages, XML config.
- Operational complexity is real.
- Overkill for most use cases.
Our take: Only pick Vespa if you have serious search needs (complex ranking, true hybrid, >100M vectors) and a team capable of running it.
Honorable mentions
- Milvus / Zilliz — solid and widely used, especially in Asia; the managed offering is Zilliz Cloud. A good choice, but we've had more recent engagements with Qdrant and Weaviate, so less hands-on experience here.
- Elasticsearch / OpenSearch — vector support is decent now; worth considering if you already run it for lexical search.
- Chroma — popular in early-stage prototyping but not yet a production choice for us. Watch this space.
- LanceDB — embedded (SQLite-style) vector DB. Interesting for edge / local use.
A concrete benchmark
We ran this benchmark on the MS MARCO dataset (1M passages, 384-dim embeddings) on a single machine (c7g.4xlarge, 16 vCPU, 32GB RAM):
| System | Index build time | p50 query latency | p95 query latency | QPS @ 4 clients |
|---|---|---|---|---|
| pgvector (HNSW) | 48 min | 8 ms | 22 ms | 480 |
| Qdrant | 19 min | 3 ms | 9 ms | 1,200 |
| Weaviate | 26 min | 5 ms | 14 ms | 850 |
| Pinecone (p2.x1) | N/A (managed) | 12 ms | 35 ms | 600 |
| Vespa | 41 min | 4 ms | 11 ms | 1,050 |
All targeted recall@10 ≥ 0.95. Pinecone latency includes network round-trip from a same-region EC2 client.
At this scale (1M vectors), all are viable. pgvector is the slowest but fast enough for most applications.
Benchmarks are sensitive to dataset, query patterns, hardware, and tuning. Run your own on a representative workload before committing. These numbers are directional.
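When you do run your own, measure percentiles rather than averages — tail latency is what users feel. A minimal timing harness (pure Python; `query` is whatever call you're benchmarking):

```python
import statistics
import time

def latency_percentiles(query, n=1000):
    """Time n calls of `query` and return (p50, p95) in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        query()
        samples.append((time.perf_counter() - start) * 1000.0)
    cuts = statistics.quantiles(samples, n=100)  # 99 percentile cut points
    return cuts[49], cuts[94]                    # p50, p95

# Usage: replace the lambda with a real client call, e.g. a Qdrant search.
p50, p95 = latency_percentiles(lambda: sum(range(1000)), n=500)
print(f"p50={p50:.3f}ms  p95={p95:.3f}ms")
```

For throughput numbers, run the same loop from several concurrent clients, as the QPS column above does with 4.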
Cost comparison (50M vectors, 768-dim)
At 50M vectors (typical for a medium-sized RAG system over a full-text document corpus):
| System | Approximate monthly cost |
|---|---|
| pgvector on RDS (db.r6g.4xlarge + 500GB gp3) | $1,200 |
| Qdrant self-hosted (3× m6g.2xlarge + EBS) | $850 |
| Qdrant Cloud (dedicated, similar sizing) | $1,400 |
| Weaviate self-hosted | $900 |
| Weaviate Cloud (serverless) | $1,100 |
| Pinecone (serverless) | $900–1,800 depending on traffic |
| Vespa self-hosted | $950 |
Self-hosted options have lower direct costs but add ops time. Managed options have higher direct cost but save engineering hours.
A DevOps engineer costs $200k+/year loaded. If a managed service saves 4 hours/week of ops, that's $20k/year — pays for most managed offerings at medium scale.
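That break-even is worth computing explicitly for your own numbers. A back-of-envelope helper (the defaults are the article's assumptions — $200k loaded cost, ~2,000 working hours and ~50 working weeks per year — not universal constants):

```python
def managed_breakeven(ops_hours_saved_per_week,
                      loaded_cost_per_year=200_000,
                      work_hours_per_year=2_000,
                      weeks_per_year=50):
    """Annual dollar value of the ops time a managed service saves."""
    hourly = loaded_cost_per_year / work_hours_per_year   # ~$100/hr loaded
    return ops_hours_saved_per_week * weeks_per_year * hourly

print(managed_breakeven(4))  # 20000.0 — a $20k/year savings
```

If the managed premium over self-hosting is below that figure, the managed tier wins on total cost.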
Decision matrix
| If you... | Pick |
|---|---|
| Already run Postgres, have < 10M vectors | pgvector |
| Need best raw performance, self-hostable | Qdrant |
| Need strong hybrid search (vector + BM25) | Weaviate or Vespa |
| Want zero ops, have budget | Pinecone (serverless) |
| Have massive scale (100M+ vectors, complex ranking) | Vespa |
| Need SQL-level filtering flexibility | pgvector |
| Are building a proof-of-concept | pgvector or Qdrant |
| Have a multi-tenant SaaS (per-tenant isolation) | Qdrant or Pinecone (namespaces) |
Don't migrate prematurely
The most common mistake we see: teams migrate from pgvector to a "real" vector DB because it's the trendy architecture. Usually they're at 1M vectors and pgvector is fine.
Signs you should migrate off pgvector:
- p95 vector query latency > 200ms with proper HNSW tuning
- Index build times are blocking development
- You're fighting Postgres planner to get consistent performance
- You're exceeding 50M vectors and growing fast
Signs you should stay on pgvector:
- Queries are fast enough
- You value transactional consistency with the rest of your data
- Ops complexity is your bottleneck
- You're spending more time picking a vector DB than shipping features
Implementation tips regardless of choice
- **Chunk carefully.** The right chunk size matters more than the database. 256-512 tokens with 10-20% overlap is a solid default; tune based on your data.
- **Hybrid search almost always helps.** Pure vector search misses keyword matches humans expect. BM25 + vector with a re-ranker is the modern pattern.
- **Re-ranking improves quality cheaply.** A cross-encoder re-ranker on the top 50 results often gives a bigger quality boost than switching vector DBs.
- **Test on real queries.** Synthetic benchmarks lie. Build an evaluation set from real user queries and measure recall@K on it.
- **Metadata filters are where real systems live or die.** "Most similar vectors" is rarely enough. "Most similar within this user's documents from last 30 days" is typical. Evaluate filter performance seriously.
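The chunking default is easy to sketch. A sliding-window chunker over whitespace tokens — a rough stand-in for real tokenizer tokens; swap in your embedding model's tokenizer in practice:

```python
def chunk(text, size=384, overlap=64):
    """Split text into windows of `size` tokens, with `overlap` tokens
    shared between consecutive chunks (token = whitespace-separated word
    here, as an approximation)."""
    tokens = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # last window already covers the tail
    return chunks

doc = " ".join(f"w{i}" for i in range(1000))
parts = chunk(doc)
print(len(parts))  # 3
```

The 384/64 defaults sit in the 256-512 token, 10-20% overlap range suggested above; treat them as a starting point to tune against your evaluation set.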
Closing
Most teams spend too much time picking a vector DB and not enough time on retrieval quality. The infrastructure choice matters less than the chunking strategy, the embedding model, the re-ranker, and the evaluation harness.
Use pgvector for as long as you can, then Qdrant or Weaviate when you can't. Pinecone if ops is your bottleneck. Vespa if you're operating at Yahoo scale.
30-day implementation checklist
If you need to move from analysis to execution quickly:
- Baseline current query latency and recall on real traffic.
- Build a representative 100-200 query evaluation set.
- Test at least two contenders against your real filters and tenancy model.
- Compare total cost including operations overhead, not infrastructure only.
- Launch with clear migration rollback and quality gates.
The winning choice is the one your team can operate reliably while meeting product-level latency and quality targets.
Related: RAG evaluation harness, our legal tech RAG case study, and when fine-tuning is worth it.