1. Scope
Computes monthly cost across four vector DB options (Pinecone Standard, Postgres+pgvector managed, LanceDB on R2, Turbopuffer) using vendor public pricing snapshots as of 2026-04.
2. Inputs and outputs
Inputs
- vectorCount number
- dim number
- queriesPerDay number
- ingestPerDay number
- retentionDays number
Outputs
- vendors
Per-vendor monthly cost in USD with brief sourcing notes.
- cheapestVendor
Vendor name with the lowest monthlyCost.
- storageGb
vectors × dim × 4 bytes × 1.25 / 1 GB.
Engine source: src/lib/embeddings-db-cost/engine.ts
3. Formula / scoring logic
storage_bytes = vectors × dim × 4 × 1.25
queries_mo = queries_day × 30
ingest_mo = ingest_day × 30
then per-vendor list pricing applied 4. Assumptions
- Float32 storage. Quantized indices (int8, binary) cut storage 2–32× and aren't modeled.
- Replication / backup costs excluded.
- LanceDB self-hosted compute (CPU/RAM) excluded — only object-storage and op costs are in.
5. Data sources
- Pinecone pricing as of 2026-04
- Cloudflare R2 pricing as of 2026-04
- DigitalOcean Managed Databases pricing as of 2026-04
- Turbopuffer pricing as of 2026-04
6. Known limitations
- Vendor pricing changes; re-check if the as-of date has aged 60+ days.
- Workload-fit factors (latency, recall, namespacing) are not modeled — pure cost only.
7. Reproducibility
Input
1M vectors, dim 1536, 50k queries/day, 5k ingest/day, 365 day retention.
Expected output
storageGb ≈ 7.15. Cheapest vendor varies by inputs — see the live tool.
8. Change log
- 2026-05-08 methodology first published. Pricing snapshot 2026-04.
Worked example
Run live against the same engine this site ships
(/engines/embeddings-db-cost.js).
The inputs and outputs below are recomputed on every build and
independently re-verified in CI — they are never hand-authored.
Input
- tool
- embeddings_db_cost
- vector_count
- 1000000
- dim
- 1536
- queries_per_day
- 50000
- ingest_per_day
- 5000
- retention_days
- 365
Output
- vendors[0].vendor
- Pinecone
- vendors[0].monthlyCost
- 3.46
- vendors[0].notes
- Pinecone Standard list pricing 2026-04: $0.33/M reads, $4/M writes, $0.33/GB-mo storage.
- vendors[1].vendor
- Postgres+pgvector
- vendors[1].monthlyCost
- 35
- vendors[1].notes
- DigitalOcean managed Postgres baseline ($35/mo, includes 25GB; $0.20/GB-mo overage). Self-hosted equivalent.
- vendors[2].vendor
- LanceDB
- vendors[2].monthlyCost
- 7.53
- vendors[2].notes
- LanceDB on Cloudflare R2 list pricing 2026-04: $0.015/GB-mo, $4.50/M ops. Self-hosted compute not included.
- vendors[3].vendor
- Turbopuffer
- vendors[3].monthlyCost
- 1.08
- vendors[3].notes
- Turbopuffer list pricing 2026-04: $0.10/GB-mo, $0.04/M reads, $2/M writes.
- cheapestVendor
- Turbopuffer
- cheapestMonthlyCost
- 1.08
- storageGb
- 7.15
Frequently asked questions
- What does the Embeddings DB Cost calculate?
- Computes monthly cost across four vector DB options (Pinecone Standard, Postgres+pgvector managed, LanceDB on R2, Turbopuffer) using vendor public pricing snapshots as of 2026-04.
- What inputs does the Embeddings DB Cost need?
- It takes 5 inputs: vectorCount, dim, queriesPerDay, ingestPerDay, retentionDays. Outputs returned: vendors, cheapestVendor, storageGb.
- What formula does the Embeddings DB Cost use?
- The exact computation is: storage_bytes = vectors × dim × 4 × 1.25; queries_mo = queries_day × 30; ingest_mo = ingest_day × 30; then per-vendor list pricing applied
- Can I verify the Embeddings DB Cost with a worked example?
- Yes. With 1M vectors, dim 1536, 50k queries/day, 5k ingest/day, 365 day retention. the tool returns storageGb ≈ 7.15. Cheapest vendor varies by inputs — see the live tool.
- Where does the Embeddings DB Cost get its benchmark data?
- Reference data is sourced from: Pinecone pricing (as of 2026-04); Cloudflare R2 pricing (as of 2026-04); DigitalOcean Managed Databases pricing (as of 2026-04); Turbopuffer pricing (as of 2026-04).
- What can the Embeddings DB Cost not tell me?
- Known limitations: Vendor pricing changes; re-check if the as-of date has aged 60+ days. Workload-fit factors (latency, recall, namespacing) are not modeled — pure cost only.