Tighter Guide · 8 min · 5 citations

Embeddings DB Cost at 200k Vectors: pgvector vs Pinecone

Embeddings DB cost at 200k 1,536-dim vectors with 2,400 daily queries across Pinecone, pgvector, LanceDB, and Turbopuffer. Retention window swings stored vectors by about 6x.

By Orbyd Editorial · Published May 21, 2026

Education · General business information, not legal, tax, or financial advice.

TL;DR

At 200,000 1,536-dimensional vectors with 2,400 queries per day and 1,500 ingests per day on a 365-day retention window, the Embeddings DB Cost engine returns: Pinecone $0.68/mo, Postgres with pgvector $35/mo, LanceDB on Cloudflare R2 $0.55/mo, and Turbopuffer $0.24/mo. The storage footprint at this scale is roughly 1.4 GB.

The spread among the usage-priced vendors (Pinecone, LanceDB, Turbopuffer) is small at this volume because query volume is too low to matter and per-GB and per-operation rates are low across modern vendors; pgvector's $35 is a flat managed-database fee, not a usage charge. The interesting cost differences appear when retention windows or query volumes change, not at the headline 200k volume. Cutting retention from 365 to 60 days reduces stored vectors by about 6x; raising QPS to 50+ flips the answer toward Pinecone or Turbopuffer over pgvector.

At 200k vectors the cost gap is too small to decide on, so pick on operations, not price: Pinecone runs $0.68/month against pgvector's $35/month, a spread that barely registers next to your time. The honest comparison turns on your retention window, your query volume, and whether you already operate Postgres for unrelated workloads, not the headline cost most founders expect to be decisive. This article runs the calculator on a concrete 200k-vector scenario and works through the cost levers that actually matter at solo-founder scale.

1. The 200k vector scenario, priced literally

Inputs: 200,000 vectors at 1,536 dimensions (the OpenAI text-embedding-3-small output dimension)[3], 2,400 queries per day, 1,500 ingests per day, 365-day retention. Storage footprint at 4 bytes per dimension is 1.23 GB raw, plus per-vector metadata and HNSW index overhead, landing near 1.4 GB stored.
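The raw-storage figure can be checked with a few lines of arithmetic. This is a minimal sketch, assuming float32 components and decimal gigabytes (1 GB = 1e9 bytes); the function name is illustrative and not part of the engine. The overhead ratio is simply the engine's reported 1.43 GB divided by the raw figure, not a documented engine parameter.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def raw_vector_storage_gb(vector_count: int, dim: int) -> float:
    """Raw embedding storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return vector_count * dim * BYTES_PER_FLOAT32 / 1e9


raw_gb = raw_vector_storage_gb(200_000, 1_536)

# 1.43 GB is the engine's reported storageGb; the ratio below is inferred
# from that output, covering metadata plus HNSW index overhead.
stored_gb = 1.43
overhead_ratio = stored_gb / raw_gb

print(f"raw={raw_gb:.2f} GB, overhead={overhead_ratio - 1:.0%}")
# prints: raw=1.23 GB, overhead=16%
```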

RAG product: 200k vectors, 1,536 dim, 2,400 queries/day, 365-day retention
# embeddings-db-cost (computed live from /engines/embeddings-db-cost.js)
Engine input
  vector_count          = 200000
  dim                   = 1536
  queries_per_day       = 2400
  ingest_per_day        = 1500
  retention_days        = 365

Engine output
  vendors[0].vendor     = Pinecone
  vendors[0].monthlyCost= 0.68
  vendors[0].notes      = Pinecone Standard list pricing 2026-04: $0.33/M reads, $4/M writes, $0.33/GB-mo storage.
  vendors[1].vendor     = Postgres+pgvector
  vendors[1].monthlyCost= 35
  vendors[1].notes      = DigitalOcean managed Postgres baseline ($35/mo, includes 25GB; $0.20/GB-mo overage). Self-hosted equivalent.
  vendors[2].vendor     = LanceDB
  vendors[2].monthlyCost= 0.55
  vendors[2].notes      = LanceDB on Cloudflare R2 list pricing 2026-04: $0.015/GB-mo, $4.50/M ops. Self-hosted compute not included.
  vendors[3].vendor     = Turbopuffer
  vendors[3].monthlyCost= 0.24
  vendors[3].notes      = Turbopuffer list pricing 2026-04: $0.10/GB-mo, $0.04/M reads, $2/M writes.
  cheapestVendor        = Turbopuffer
  cheapestMonthlyCost   = 0.24
  storageGb             = 1.43

The engine returns four vendor prices. Pinecone Standard comes to $0.68/mo, derived from $0.33 per million reads (~$0.02/mo at 2,400 queries/day for a 30-day month), $4 per million writes (~$0.18/mo at 1,500 ingests/day), and $0.33 per GB-month for storage (~$0.47 at 1.43 GB)[1]. Postgres with pgvector comes to $35/mo, the DigitalOcean managed Postgres baseline (25 GB included, so no overage at this volume).

LanceDB on Cloudflare R2 comes to $0.55/mo, derived from R2 storage at $0.015 per GB-month and $4.50 per million operations. Turbopuffer comes to $0.24/mo, derived from $0.10 per GB-month storage, $0.04 per million reads, and $2 per million writes[4]. Embedding API calls (the actual OpenAI embedding cost) are not included in any of these; that is a separate ~$3/mo line item at this volume.
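The per-vendor figures can be reproduced from the list rates quoted in the engine notes. The sketch below is not the engine's actual code (that lives in /engines/embeddings-db-cost.js); it assumes a 30-day month, and it assumes that for LanceDB on R2 both queries and ingests are billed as operations at $4.50 per million. Class and function names are illustrative.

```python
from dataclasses import dataclass

DAYS_PER_MONTH = 30  # assumption: the article's 30-day billing month


@dataclass(frozen=True)
class UsageRates:
    storage_per_gb: float      # USD per GB-month
    read_per_million: float    # USD per million queries
    write_per_million: float   # USD per million ingests


# Rates copied from the engine notes (2026-04 list pricing).
# Assumption: R2 charges the same per-operation rate for reads and writes.
RATES = {
    "Pinecone": UsageRates(0.33, 0.33, 4.00),
    "LanceDB": UsageRates(0.015, 4.50, 4.50),
    "Turbopuffer": UsageRates(0.10, 0.04, 2.00),
}


def monthly_cost(rates: UsageRates, storage_gb: float,
                 queries_per_day: int, ingest_per_day: int) -> float:
    """Storage plus read plus write cost for one month, in USD."""
    reads_m = queries_per_day * DAYS_PER_MONTH / 1e6
    writes_m = ingest_per_day * DAYS_PER_MONTH / 1e6
    return (storage_gb * rates.storage_per_gb
            + reads_m * rates.read_per_million
            + writes_m * rates.write_per_million)


costs = {
    name: round(monthly_cost(r, 1.43, 2_400, 1_500), 2)
    for name, r in RATES.items()
}
costs["Postgres+pgvector"] = 35.00  # flat baseline, no overage under 25 GB

for name, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{name:<18} ${cost:.2f}/mo")
```

Under these assumptions the sketch lands on the same $0.68, $0.55, and $0.24 figures the engine reports, which suggests the engine is doing simple rate-times-volume arithmetic with no minimum-spend or free-tier adjustments.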

2. Storage class dominates query volume below 100 QPS

At 2,400 queries per day (roughly 0.03 QPS sustained, with realistic peak around 0.5 QPS), every modern vendor is essentially free on read pricing. The Pinecone read cost is about 2 cents per month. The Turbopuffer read cost is about 0.3 cents per month. The cost differential among the usage-priced vendors comes from storage and write rates, not reads, which is why their spread compresses so tightly at small scale.

Where this breaks down is at 50 to 100 QPS sustained. At 100 QPS sustained, monthly reads are 259 million, which on Pinecone is $86 per month in read costs alone, on Turbopuffer is $10 per month, and on pgvector is bounded only by the database CPU (which means either the existing Postgres tier handles it or it needs upgrading). The vendor cost curves diverge by an order of magnitude once read volume becomes meaningful.
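The read-cost divergence is straightforward to tabulate. This is a rough sketch, assuming a 30-day month, perfectly sustained QPS, and the per-million read rates quoted in the engine notes; pgvector is omitted because its read cost is CPU-bound rather than metered.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30  # assumption: 30-day billing month

READ_PRICE_PER_MILLION = {"Pinecone": 0.33, "Turbopuffer": 0.04}


def monthly_reads(sustained_qps: float) -> float:
    """Total queries in one month at a constant query rate."""
    return sustained_qps * SECONDS_PER_DAY * DAYS_PER_MONTH


def monthly_read_cost(sustained_qps: float, price_per_million: float) -> float:
    """Metered read cost in USD for one month."""
    return monthly_reads(sustained_qps) / 1e6 * price_per_million


for qps in (0.03, 10, 50, 100):
    row = ", ".join(
        f"{vendor} ${monthly_read_cost(qps, price):,.2f}"
        for vendor, price in READ_PRICE_PER_MILLION.items()
    )
    print(f"{qps:>6} QPS -> {row}")
```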

For solo founders building RAG-backed products, the practical implication is that current cost rankings barely matter if your product is pre-traction. The 200k-vector scenario sits at $0.24 to $35/mo across all options. Pick the one with the lowest operational burden (no servers, fastest provisioning) and revisit when query volume hits 10+ QPS sustained.

3. Retention window is the 6x cost lever

The biggest lever in the calculator is retention days. At a steady-state ingest of 1,500 per day, a 365-day retention window stores 547,500 vectors after the first year. A 60-day window stores 90,000 vectors. A 30-day window stores 45,000. Storage cost scales linearly with vector count, so the 365-day plan is roughly 6x more expensive than the 60-day plan on storage alone.
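The retention arithmetic is a single multiplication. A minimal sketch, assuming a constant ingest rate and that every vector older than the window is deleted (the function name is illustrative):

```python
def steady_state_vectors(ingest_per_day: int, retention_days: int) -> int:
    """Vectors held once the retention window is full."""
    return ingest_per_day * retention_days


INGEST_PER_DAY = 1_500

for days in (365, 60, 30):
    print(f"{days:>3}-day window: {steady_state_vectors(INGEST_PER_DAY, days):,} vectors")

ratio = steady_state_vectors(INGEST_PER_DAY, 365) / steady_state_vectors(INGEST_PER_DAY, 60)
print(f"365-day vs 60-day: {ratio:.2f}x")
# prints: 365-day vs 60-day: 6.08x
```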

Most RAG stacks ship with retention as "forever" because nobody designed the retention policy. This is the largest cost-saving opportunity in the embeddings layer, and it is almost always available without product impact. Embeddings older than 90 days are usually re-derivable from the source documents at negligible additional cost; the embedding step is cheap, the storage is what compounds. The token-cost optimization playbook covers the related compute side.

The right design pattern: retain the latest N days hot, age out older embeddings to cold object storage (R2, S3 Glacier) or delete them entirely with a re-embed-on-demand fallback. The cost savings at 12-month-out scale are usually 50% to 80% of the original embeddings bill, with no product-visible behavior change.
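The hot-window pattern can be sketched as a pure function that splits records by age. This is an illustration only: `EmbeddingRecord` and `split_by_retention` are hypothetical names, and the actual archive or delete step depends on the vector store in use, so it is left to the caller.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class EmbeddingRecord:
    doc_id: str           # key back to the source document, for re-embedding
    created_at: datetime  # when the embedding was written


def split_by_retention(records, now: datetime, hot_days: int = 60):
    """Return (hot, expired): records inside and outside the hot window."""
    cutoff = now - timedelta(days=hot_days)
    hot, expired = [], []
    for record in records:
        (hot if record.created_at >= cutoff else expired).append(record)
    return hot, expired


now = datetime(2026, 5, 21, tzinfo=timezone.utc)
records = [
    EmbeddingRecord(f"doc-{age}", now - timedelta(days=age))
    for age in (10, 60, 61, 400)
]
hot, expired = split_by_retention(records, now, hot_days=60)
print([r.doc_id for r in hot], [r.doc_id for r in expired])
```

Keeping `doc_id` on each record is what makes the re-embed-on-demand fallback possible: an expired vector can be regenerated from its source document if a query ever needs it.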

4. When pgvector wins on total cost

The $35/mo pgvector number in the calculator looks expensive in isolation. It is not expensive if you are already paying for Postgres for application data. In that case, the marginal cost of adding pgvector is approximately zero: the existing database handles vectors as additional columns and rows, and the only new costs are the storage GB and the HNSW index memory.

Three scenarios where pgvector wins:

  • You already operate Postgres at the right tier. Marginal cost of adding pgvector is ~$0 up to the storage cap. The vendor cost is sunk.
  • You need transactional consistency between vectors and source records. Same database, same transaction, no synchronization lag between application data and embeddings.
  • You need to filter by application data before vector search. Postgres lets you SQL-filter the candidate set, then run vector search on the filtered subset. Most managed vector DBs offer this via metadata filtering, but with less expressive query power than full SQL.

Three scenarios where pgvector loses:

  • You do not already run Postgres. Adding a $35/mo database just for vectors is the worst-cost option at solo scale.
  • Query volume exceeds 50 QPS sustained on a single database. Postgres + pgvector at this scale needs a dedicated read replica or a sharded architecture; the operational complexity overtakes the cost saving.
  • You need HNSW with high recall guarantees on high-dimensional vectors. Managed vendors tune the index parameters and infrastructure; pgvector requires you to do this yourself.

5. When managed vendors (Pinecone, Turbopuffer) win

Pinecone, Turbopuffer, and similar managed-vector products win on three axes: provisioning time (minutes vs hours), operational burden (zero vs ongoing), and elastic scale (transparent vs you-have-to-resize). The vendor cost at solo scale is essentially the operations-team cost they save you, expressed as a monthly bill.

The honest version of the choice: if you are spending 2 hours per month tuning a Postgres + pgvector index, that is $176 per month of opportunity cost at an $88/hour loaded rate, which already exceeds the Pinecone or Turbopuffer bill at this scale. The vendor wins on time, not on raw infrastructure cost. The build vs buy framework covers this trade-off across the broader stack.
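The same trade-off can be expressed as a break-even calculation. A small sketch, assuming the $88/hour loaded rate from the text and treating tuning time as the only operational cost; function names are illustrative.

```python
def tuning_opportunity_cost(hours_per_month: float, loaded_hourly_rate: float) -> float:
    """Monthly cost of founder time spent operating the index, in USD."""
    return hours_per_month * loaded_hourly_rate


def break_even_minutes(vendor_monthly_bill: float, loaded_hourly_rate: float) -> float:
    """Minutes of monthly tuning time that cost as much as the vendor bill."""
    return vendor_monthly_bill / loaded_hourly_rate * 60


LOADED_RATE = 88.0  # USD per hour, from the worked example

print(tuning_opportunity_cost(2, LOADED_RATE))  # prints: 176.0
for vendor, bill in (("Pinecone", 0.68), ("Turbopuffer", 0.24)):
    print(f"{vendor}: break-even at {break_even_minutes(bill, LOADED_RATE):.2f} min/month")
```

At these bills the break-even is well under a minute of founder time per month, which is why the comparison is decided by operations rather than list price.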

Turbopuffer is the lowest-cost managed option at small scale ($0.24/mo in the worked scenario) because its pricing is aggressive on cold storage. Pinecone Standard's $0.68/mo is competitive and the brand has the most mature ecosystem (integrations, SDKs, documentation). For a brand-new RAG product where the founder writes the code and does the integration once, the spread between these two is small enough that the choice can be made on developer experience rather than pricing.

6. Migration cost between vendors

The embeddings layer is the most migration-friendly part of a RAG stack. Vector dimensions are fixed by the embedding model, not by the vector DB. Switching from Pinecone to Turbopuffer is a re-ingest plus a query-code change. Estimated 1 to 3 days of engineering for a solo-founder codebase, with no model retraining and no data quality regression.

This is the reason vendor lock-in is small at this layer. The cost of switching is bounded, the embeddings themselves are portable, and the query API differences across vendors are shallow. The implication: pick the cheapest vendor that meets your operational requirements today, plan to re-run this comparison every six months as pricing evolves, and treat the choice as reversible. The vendor lock-in math article covers the related model-vendor decision, where switching costs are materially higher.

7. The decision rule for solo-founder RAG stacks

The decision rule, simplified to three branches:

  1. You already run Postgres at Pro tier or above: use pgvector. Marginal cost is near zero, operational surface is one fewer vendor, and SQL-filter-then-vector-search is a real query-expressiveness win.
  2. You do not run Postgres and query volume is under 10 QPS sustained: use Turbopuffer or Pinecone serverless. Vendor cost is under $5/mo at this scale, provisioning is minutes, and operational burden is zero.
  3. You expect 50+ QPS sustained or you need premium index tuning: use Pinecone Standard or a dedicated Turbopuffer plan. The infrastructure-managed-by-vendor saves real engineering time at this scale.
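The three branches can be written as a small function. This is a sketch under stated assumptions: the high-volume branch is checked first because section 4 notes that pgvector loses above 50 QPS on a single database, and the case the rule does not cover (no Postgres, 10 to 50 QPS) returns an explicit "no rule" answer rather than guessing. The function name and return strings are illustrative.

```python
def recommend_vector_db(runs_postgres: bool, sustained_qps: float,
                        needs_index_tuning: bool = False) -> str:
    """Apply the three-branch decision rule for solo-founder RAG stacks."""
    # Branch 3: high volume or premium tuning favors a managed vendor.
    if sustained_qps >= 50 or needs_index_tuning:
        return "Pinecone Standard or dedicated Turbopuffer plan"
    # Branch 1: an existing Postgres makes pgvector's marginal cost near zero.
    if runs_postgres:
        return "pgvector"
    # Branch 2: no Postgres and low volume favors a serverless vendor.
    if sustained_qps < 10:
        return "Turbopuffer or Pinecone serverless"
    # Not covered by the rule: no Postgres and 10 to 50 QPS sustained.
    return "no rule: re-run the cost comparison at current volume"


print(recommend_vector_db(runs_postgres=True, sustained_qps=0.03))
print(recommend_vector_db(runs_postgres=False, sustained_qps=0.03))
print(recommend_vector_db(runs_postgres=True, sustained_qps=80))
```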

One final lever: retention discipline. Whatever vendor you pick, set a retention policy on day one. Hot-storage 60 to 90 days, cold or delete beyond. The annual storage bill is the line that creeps without anyone noticing until it is a meaningful number. The methodology for the underlying cost model is documented at the Embeddings DB Cost methodology page[5].

The 2026 AI solopreneur stack article slots embeddings into the broader stack picture, and the AI Stack Cost Calculator aggregates the full monthly bill across model, embeddings, compute, and observability layers.

8. FAQ

Is Pinecone cheaper than pgvector at 200k vectors? Yes, on raw vendor cost ($0.68/mo vs $35/mo for managed Postgres). But if you already run Postgres, the pgvector marginal cost is near zero, and the comparison flips.

Why does retention window matter so much for embeddings cost? Storage scales linearly with retained vector count, and storage dominates cost at low QPS. Cutting retention from 365 to 60 days reduces stored vectors by roughly 6x.

Should a solo founder choose pgvector or a managed vector DB? If Postgres is already in the stack, pgvector. If not, Turbopuffer or Pinecone serverless at this scale. Adding Postgres just for vectors is the most expensive option.

How accurate are these cost estimates over time? Vendor pricing changes every quarter at this layer. The numbers are from April 2026 list pricing — rerun the calculator before any migration decision.

References

Sources

Primary sources only. No vendor-marketing blogs or aggregated secondary claims.

  1. Pinecone: Pricing (Standard tier list rates for reads, writes, storage). Accessed 2026-05-21.
  2. Supabase: Pricing (Pro tier with included pgvector compute and storage). Accessed 2026-05-21.
  3. OpenAI: Embeddings API pricing (text-embedding-3-small and -3-large). Accessed 2026-05-21.
  4. Turbopuffer: Pricing (per-GB and per-million-operation list rates). Accessed 2026-05-21.
  5. AI Biz Hub: Embeddings DB Cost methodology. Accessed 2026-05-21.
