Every technology has tradeoffs. This document outlines turbopuffer's key design
choices to help inform your evaluation:
- High latency, high throughput writes. turbopuffer prioritizes simplicity, durability, and scalability by using object storage as a write-ahead log, keeping nodes stateless. While this means writes take up to 200ms to commit, the system supports thousands of writes per second per namespace; batching many documents per write amortizes the commit latency (a short sketch follows this list). Despite this latency, our consistent read model makes documents visible to queries faster than eventually consistent search engines. This architectural choice enables cost-effective scaling and is particularly well-suited for search workloads.
- Consistent reads have a ~20ms latency floor. turbopuffer's reads are consistent by default, checking object storage for the latest updates even for cached namespaces. This ~20ms baseline latency matches our object storage's `GET IF-NOT-MATCH` p50 and should improve as object storage technology advances. For workloads requiring sub-10ms latency, you can enable eventual consistency.
- Optimized for accuracy. turbopuffer delivers high recall out of the box,
maintaining this quality even with complex filters. We prioritize consistent,
accurate results over configurable performance optimizations.
- Scales to millions of namespaces. turbopuffer scales to trillions of
documents across hundreds of millions of namespaces. While you can create an unlimited number of namespaces,
individual namespaces have size limits, and those limits continue to expand.
Namespacing your data lets you benefit from natural data partitioning (e.g. one namespace per tenant),
which improves both performance and cost.
- Focused on first-stage retrieval. turbopuffer focuses on efficient first-stage retrieval, providing a simple API to filter millions of documents down to a manageable set. You can then refine and rerank results using familiar programming languages like Python or TypeScript, making your search logic easier to develop and maintain (a minimal reranking sketch follows this list). Learn more about this approach in our Hybrid Search guide. We've found that search applications built on mountains of idiosyncratic query language are difficult to maintain.
- Focused on paid customers. We've chosen a commercial-only model to
maintain high-quality support and rapid development. While we don't offer a
free tier or an open-source version, you can run turbopuffer in your own cloud; contact us for details.
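
To make the write tradeoff concrete, here is a minimal sketch of batching documents before committing them. The upsert_batch helper below is a hypothetical stand-in for whatever client call you use to write to a namespace, not a specific turbopuffer API.

```python
def upsert_batch(namespace: str, docs: list[dict]) -> None:
    """Placeholder for your write call (hypothetical, not a real turbopuffer API).
    Each call is one commit to object storage, so it costs roughly the same
    ~200ms whether it carries one document or thousands."""
    ...

def write_documents(namespace: str, docs: list[dict], batch_size: int = 1000) -> None:
    # Group documents into large batches: the fixed per-commit latency is
    # amortized across every document in the batch, which is how a single
    # namespace sustains thousands of document writes per second despite
    # each commit taking up to ~200ms.
    for i in range(0, len(docs), batch_size):
        upsert_batch(namespace, docs[i : i + batch_size])
```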
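And here is a minimal sketch of second-stage reranking in plain Python. The candidate rows and field names are illustrative only; the assumption is simply that a first-stage query has already narrowed the namespace down to a small list of rows.

```python
# Candidate rows as they might come back from a first-stage query that
# already filtered millions of documents down to a few hundred. The field
# names ("dist", "published_at") are illustrative, not a fixed schema.
candidates = [
    {"id": "doc-1", "dist": 0.12, "published_at": 1717000000},
    {"id": "doc-2", "dist": 0.08, "published_at": 1514000000},
    {"id": "doc-3", "dist": 0.10, "published_at": 1716000000},
]

def rerank(rows: list[dict], top_n: int = 2) -> list[dict]:
    # Second stage: plain application code. Blend vector distance with a
    # recency boost; this kind of logic is far easier to read, test, and
    # evolve in Python or TypeScript than in a server-side query language.
    newest = max(r["published_at"] for r in rows)
    def score(r: dict) -> float:
        recency = r["published_at"] / newest  # 0..1, newer is higher
        return -r["dist"] + 0.1 * recency     # smaller distance is better
    return sorted(rows, key=score, reverse=True)[:top_n]

print([r["id"] for r in rerank(candidates)])
```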
For more details, see the Guarantees, Limits,
and Architecture pages.