Atlassian finds its multi-cloud BYOC search engine in turbopuffer
Atlassian chose turbopuffer for its operational and architectural simplicity, cloud-agnostic BYOC deployment model, low cost, and virtually unlimited scalability. It now underpins search for millions of Atlassian customers on Jira, Confluence, and more.
- 19% search quality increase
- 60ms p90 latency
- billions of documents
- 96% recall@10
turbopuffer gives us the foundation to build profitable AI. We believe in the architecture and we believe in the team.
Fei Teng, Senior Engineering Manager, Search Platform
Fei Teng is the Senior Engineering Manager of the Search Platform team behind Rovo, Atlassian's cross-product AI platform for Search, Chat, and Agents.
Rovo unifies search across all Atlassian products, providing primitives for users and agents to retrieve relevant documents across Jira, Confluence, and other Atlassian products and integrations.
Before Rovo, search at Atlassian was fragmented across products. The flagship products, Jira and Confluence, each implemented basic keyword search on an open source search engine with limited ranking capabilities.
With Rovo, Atlassian aimed to build a universal retrieval layer to search for relevant context across all of Atlassian's product surfaces and third-party data sources. To power this, they looked for a modern, cloud-agnostic semantic search engine that could scale to support their 350,000+ customers.
Fei's team evaluated many semantic search engines, including self-hosted open source and commercial databases. An initial performance benchmark narrowed the options to the two databases that hit latency and recall targets: turbopuffer and one other commercial database.
Why turbopuffer?
Fei and his team chose turbopuffer for its operational simplicity, cloud-agnostic Bring Your Own Cloud (BYOC) deployment model, low cost, and virtually unlimited scalability. turbopuffer gave Fei the confidence that his relatively small team of engineers could profitably scale Rovo's retrieval layer to support millions of Atlassian customers.
Simplicity
turbopuffer's architecture is simple: Fei finds the API easy to understand and cluster management more intuitive than with other commercial providers. His team deploys turbopuffer via BYOC in Atlassian's AWS and GCP environments without worrying much about capacity planning, because turbopuffer lets them build search indexes as needed for their massive multi-tenant use cases.
They primarily use a namespace-per-tenant architecture leveraging turbopuffer's hot/cold model, where active tenant indexes are loaded into an SSD + RAM cache and inactive indexes are kept in object storage. On top of great $/TB economics, the design provides virtually infinite scaleout, as each tenant's namespace is just another prefix on object storage.
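To make the hot/cold idea concrete, here is a toy sketch of a namespace-per-tenant store. It is not turbopuffer's implementation or API: the class `TieredNamespaceStore`, the `tenant-<id>` naming scheme, and the LRU eviction policy are all assumptions chosen for illustration. A plain dict stands in for object storage and a bounded `OrderedDict` stands in for the SSD + RAM cache.

```python
from collections import OrderedDict


class TieredNamespaceStore:
    """Toy model of a hot/cold, namespace-per-tenant layout (illustrative only)."""

    def __init__(self, hot_capacity: int):
        self.hot_capacity = hot_capacity
        self.cold = {}            # namespace -> index; stand-in for object storage
        self.hot = OrderedDict()  # namespace -> index; stand-in for the SSD + RAM cache

    @staticmethod
    def namespace_for(tenant_id: str) -> str:
        # Hypothetical naming scheme: each tenant maps to its own namespace (prefix).
        return f"tenant-{tenant_id}"

    def write(self, tenant_id: str, doc_id: str, text: str) -> None:
        # Writes always land in the cold tier. A cached namespace shares the
        # same dict object, so the hot copy sees the write as well.
        ns = self.namespace_for(tenant_id)
        self.cold.setdefault(ns, {})[doc_id] = text

    def load(self, tenant_id: str):
        """Return (index, was_hot). Raises KeyError for an unknown tenant."""
        ns = self.namespace_for(tenant_id)
        if ns in self.hot:
            self.hot.move_to_end(ns)      # mark as most recently used
            return self.hot[ns], True
        index = self.cold[ns]             # simulated fetch from object storage
        self.hot[ns] = index
        if len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)  # evict the least recently used namespace
        return index, False


store = TieredNamespaceStore(hot_capacity=2)
for tenant in ("acme", "globex", "initech"):
    store.write(tenant, "doc-1", f"hello from {tenant}")

_, first_hit_hot = store.load("acme")       # cold: first query for this tenant
_, second_hit_hot = store.load("acme")      # hot: served from the cache
store.load("globex")
store.load("initech")                       # cache is full, so "acme" is evicted
_, after_eviction_hot = store.load("acme")  # cold again
```

In this model, adding a tenant only adds a key to the cold tier, which mirrors the point above that each namespace is just another prefix on object storage. Only recently queried tenants occupy cache space.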
Fundamentally, this is an architecture we love. It scales out easily, it isolates customer data by default, and performance is very predictable.
Fei Teng, Senior Engineering Manager, Search Platform
BYOC on multi-cloud
In Fei's words, Atlassian "runs on trust." Atlassian products like Jira and Confluence hold sensitive enterprise data – so security, privacy, and compliance are closely scrutinized during procurement.
turbopuffer's BYOC deployment option, with customer-managed encryption keys (CMEK) and SOC 2 and HIPAA certifications, quickly satisfied Atlassian's core compliance requirements. The architecture minimizes stateful dependencies and keeps the data subprocessor list very small (just Google and AWS!), so Atlassian's security audit followed a familiar, routine path.
Support
turbopuffer's engineers have worked closely with Atlassian engineers on two fronts:
- Consulting on cluster configuration and search parameter tuning to help Atlassian get the most out of their deployment.
- Shipping algorithm-level changes and cost optimizations on turbopuffer's side to boost Atlassian's recall and reduce their latency.
Since turbopuffer went into production, the combined effort has cut Atlassian's p90 semantic search latency from 125ms to 60ms while raising average recall from 90% to 96%+.
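For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. It assumes recall@10 is measured against an exact (brute-force) top 10 and that p90 uses the nearest-rank definition. The source does not say how Atlassian or turbopuffer measure either, and the function names and sample data are made up for illustration.

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k results that the approximate top-k also returned."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    return len(exact_top & set(approx_ids[:k])) / len(exact_top)


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Ceiling of pct * n / 100 using integer arithmetic to avoid float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical data: the approximate search misses one of the ten true neighbours.
exact = list(range(10))
approx = [0, 1, 2, 3, 4, 5, 6, 7, 8, 42]
recall = recall_at_k(approx, exact, k=10)       # 0.9

# Hypothetical per-query latencies in milliseconds.
latencies_ms = [12, 15, 18, 20, 22, 25, 30, 45, 60, 140]
p90 = percentile_nearest_rank(latencies_ms, 90)  # 60
```

A p90 of 60ms means nine in ten queries finish in 60ms or less; the slowest tenth, such as the 140ms outlier here, does not affect the figure.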
Results
With turbopuffer, Fei's team has improved historically hard-to-move search quality metrics, such as Jira long-click rate (the rate at which the first clicked result satisfies the search intent), while improving recall and performance searching billions of documents.
- 19% increase in Jira long-click rate
- Flexible BYOC deployments
- 60ms p90 latency
- 96% average recall@10
- billions of documents indexed
turbopuffer in Atlassian
Atlassian indexes Jira tickets, Confluence wikis, Google Drive documents, and associated metadata – billions of documents across millions of namespaces. They build search APIs on top of turbopuffer that are used in both user-facing search (e.g. Cmd+K in Confluence) and as tools for agents to retrieve context during Rovo chat sessions.
Each customer's data is indexed in its own turbopuffer namespace. Actively searched namespaces are cached for sub-20ms queries; inactive namespaces keep their indexes on low-cost object storage without accruing idle compute costs. Atlassian's p90 latency across all queries, hot and cold, is ~60ms.
What's next
- Unified hybrid retrieval: Atlassian is experimenting with using turbopuffer to combine semantic and full-text search into a single hybrid retrieval layer that can handle more complex searches across different use cases.
- Third-party connectors: Fei's team wants to index data from more Atlassian products and a growing ecosystem of third-party connectors, delivering on Rovo's promise as a unified context layer for search within Atlassian.
- Enterprise scaleout: More isolated, multi-cloud BYOC deployments with CMEK for Atlassian's largest enterprise customers.
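As an illustration of the hybrid retrieval item above, one widely used way to merge a semantic ranking with a full-text ranking is reciprocal rank fusion (RRF). The source does not say which fusion method Atlassian is evaluating, so treat this as a generic sketch: the function name, the document IDs, and the constant `k=60` (a conventional default) are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of document IDs; each hit contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers, best match first.
semantic_hits = ["doc-7", "doc-2", "doc-9"]
full_text_hits = ["doc-2", "doc-4", "doc-7"]

fused = reciprocal_rank_fusion([semantic_hits, full_text_hits])
# fused == ["doc-2", "doc-7", "doc-4", "doc-9"]
```

Documents that appear in both lists rise to the top, which is why `doc-2` and `doc-7` outrank results found by only one retriever. RRF uses ranks only, so the two retrievers' raw scores never need to be put on a common scale.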
We will continue to update this log as Atlassian's turbopuffer usage evolves.