Warning: Queries may be slow during periods of high write throughput or after a large bulk import.
turbopuffer can handle 10,000+ writes per second (WPS) per namespace, but indexing cannot currently keep up at that rate. This causes high query latency during bulk imports. Once write throughput drops to 100 WPS or less, the indexer catches up and queries become fast again.
Most use cases do an initial bulk import, followed by queries alongside lower write throughput (100 WPS or less). For that pattern, this is not a problem. We are actively working to remove this limitation.
There isn't a limit or performance metric that we don't think we can improve by an order of magnitude when prioritized! If you expect to brush up against a limit, or are held back by the performance of an operation, contact us. These issues can often be fixed in days.
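
If steady-state writes need to stay under the roughly 100 WPS level at which the indexer keeps up, one option is to pace them on the client. The following is a minimal sketch under stated assumptions, not a turbopuffer API: `send_batch` is a hypothetical callable standing in for whatever write call your client uses, and `max_wps`, `batch_size`, `sleep`, and `clock` are illustrative parameters.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(docs: Sequence[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive batches of at most batch_size documents."""
    for start in range(0, len(docs), batch_size):
        yield list(docs[start:start + batch_size])


def paced_write(
    docs: Sequence[dict],
    send_batch: Callable[[List[dict]], None],  # hypothetical write function
    max_wps: float = 100.0,                    # threshold taken from the text above
    batch_size: int = 100,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Send docs in batches, sleeping so average throughput stays <= max_wps."""
    if max_wps <= 0 or batch_size <= 0:
        raise ValueError("max_wps and batch_size must be positive")
    sent = 0
    start = clock()
    for batch in chunked(docs, batch_size):
        send_batch(batch)
        sent += len(batch)
        # Earliest moment at which `sent` documents fit within the budget.
        earliest = start + sent / max_wps
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)
    return sent
```

With the defaults, each batch of 100 documents is followed by a pause of up to one second, so the average rate never exceeds `max_wps`. For an initial bulk import, pacing like this is usually unnecessary, since queries recover once the import finishes.
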
turbopuffer excels at | turbopuffer may not currently be the best fit for |
---|---|
Large scale (1B+ vectors) with lots of namespaces (millions is fine) | Large namespaces (30M+; they work but are a bit slow, improvements in progress) |
Naturally sharded data (e.g. B2B where each tenant's data is isolated in its own namespace) | Extremely high query loads on a single namespace, especially larger namespaces (300 QPS+) |
Not every namespace is active concurrently | Aggregation (e.g. group by, sums, explore clusters, ...) |
Fast cold starts | Single-digit millisecond latency (turbopuffer is currently in the low double digits) |
Reliability | Extensive 1st-stage ranking (we encourage generating a candidate set with hybrid search and refining/re-ranking further in your own 2nd stage) |
Hybrid search (BM25 + vector search) | Built-in re-ranking (we encourage you to do it in your own application) |
Cost-effectiveness | Built-in embedding (ditto) |
Self-hosting | |
Low scale, free tier | |
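
The table suggests generating a candidate set with hybrid search and doing further re-ranking in your own application. As one illustration of what that application-side step could look like, the sketch below merges a BM25 candidate list and a vector candidate list with reciprocal rank fusion (RRF). RRF is a generic technique swapped in here for illustration, not a method the source prescribes; the function name, the `k = 60` constant, and the document IDs are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    ranked_lists: Sequence[Sequence[str]],
    k: int = 60,  # conventional RRF smoothing constant (assumption)
) -> List[str]:
    """Fuse several ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical candidate IDs, best first, from two first-stage queries.
bm25_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Documents that appear in both lists rise to the top, and the fused list can then be passed to a heavier second-stage model if needed.
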
Metric | Max seen in production | Production limits (current) | Production limits (soon) |
---|---|---|---|
Max documents (global) | 20B+ | | |
Max documents (per namespace) | 100M+ | | |
Number of namespaces | 7M+ | | |
Max dimensions | 10,752 | | |
Max inactive time in cache | Contact us for custom | | |
Write rate (global) | | | |
Write rate (per namespace) | Unlimited | | |
Max write batch rate (per namespace) | 1 batch/s | | |
QPS (global) | >2000 QPS | | |
Max QPS (per namespace) | ~20 QPS | 100+ QPS | 10,000 QPS |
| | ~90-95% | ~90-95% | Configurable |
Max attribute value | 64 KiB | 1 MiB | |
Max attribute name length | 128 | 128 | 128 |
Max attributes per document | 256 | 256 | 256 |
Max namespace name length | 128 | 128 | 128 |
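
The per-document limits in the table can be checked on the client before a write is attempted. The sketch below is illustrative only: the function and constant names are hypothetical, name lengths are assumed to be counted in characters (the table gives no unit), and the 64 KiB value cap is an assumption that takes the smaller of the two figures listed for attribute values.

```python
from typing import Any, Dict, List

# Figures from the limits table above.
MAX_NAMESPACE_NAME_LENGTH = 128
MAX_ATTRIBUTE_NAME_LENGTH = 128
MAX_ATTRIBUTES_PER_DOCUMENT = 256
# Assumption: use the smaller of the two attribute value figures in the table.
MAX_ATTRIBUTE_VALUE_BYTES = 64 * 1024


def check_document(namespace: str, attributes: Dict[str, Any]) -> List[str]:
    """Return human-readable limit violations; an empty list means none found."""
    problems: List[str] = []
    if len(namespace) > MAX_NAMESPACE_NAME_LENGTH:
        problems.append(f"namespace name is {len(namespace)} characters long")
    if len(attributes) > MAX_ATTRIBUTES_PER_DOCUMENT:
        problems.append(f"document has {len(attributes)} attributes")
    for name, value in attributes.items():
        if len(name) > MAX_ATTRIBUTE_NAME_LENGTH:
            problems.append(f"attribute name {name[:16]!r}... is too long")
        # Only string and bytes values are sized in this sketch.
        if isinstance(value, str):
            size = len(value.encode("utf-8"))
        elif isinstance(value, bytes):
            size = len(value)
        else:
            continue
        if size > MAX_ATTRIBUTE_VALUE_BYTES:
            problems.append(f"attribute {name[:16]!r} value is {size} bytes")
    return problems
```

Running a check like this before upload surfaces oversized documents early instead of relying on a rejected write.
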