Atlassian finds its multi-cloud BYOC search engine in turbopuffer
Atlassian chose turbopuffer for its operational and architectural simplicity, cloud-agnostic BYOC deployment model, low cost, and virtually unlimited scalability. It now underpins search for millions of Atlassian customers on Jira, Confluence, and more.
19%
search quality increase
60ms
p90 latency
5B+
documents
96%
recall@10
turbopuffer gives us the foundation to build profitable AI. We believe in the architecture and we believe in the team.
Fei Teng, Senior Engineering Manager, Search Platform
Fei Teng is the Engineering Manager of the Search Platform team behind Rovo, Atlassian's cross-product AI platform for Search, Chat, and Agents.
Rovo unifies search across all Atlassian products, providing primitives for users and agents to retrieve relevant documents across Jira, Confluence, and other Atlassian products and integrations.
Prior to Rovo, search in Atlassian was fragmented across different products. Their flagships, Jira and Confluence, had a basic keyword search implemented in OpenSearch, but the lack of semantic retrieval left relevant documents out of results.
The team first built a semantic search engine by indexing embeddings in OpenSearch, but OpenSearch underperformed on the combination of recall, latency, and cost. They also disliked its convoluted config syntax, which made it hard to debug performance issues and tune quickly.
The AWS OpenSearch service also failed to provide the cloud-agnostic model that Atlassian needed to run isolated deployments for enterprise accounts on both AWS and GCP.
Fei's team evaluated many modern semantic search engines, including self-hosted open source and commercial databases. An initial performance benchmark narrowed the options to the two databases that hit latency and recall targets: turbopuffer and one other commercial database.
Why turbopuffer?
Fei and his team chose turbopuffer for its operational simplicity, cloud-agnostic Bring Your Own Cloud (BYOC) deployment model, low cost, and virtually unlimited scalability. turbopuffer gave Fei the confidence that his relatively small team of engineers could profitably scale Rovo's retrieval layer to support millions of Atlassian customers.
Simplicity
turbopuffer's architecture is simple; Fei finds the API much easier to understand than OpenSearch and cluster management much more intuitive than other commercial providers. They deploy turbopuffer via BYOC in their AWS and GCP environments without capacity planning or reasoning about how to shard indexes across nodes.
They use a namespace-per-tenant architecture leveraging turbopuffer's hot/cold model, where active tenant indexes are loaded into an SSD + RAM cache and inactive indexes are kept in object storage. On top of great $/TB economics, the design provides virtually infinite scaleout, as each tenant's namespace is just another prefix on object storage.
Fundamentally, this is an architecture we love. It scales out easily, it isolates customer data by default, and performance is very predictable.
Fei Teng, Senior Engineering Manager, Search Platform
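To make the namespace-per-tenant, hot/cold idea concrete, here is a minimal toy sketch in Python. It is not turbopuffer's implementation or API: the `TenantIndexCache` class, the `tenant-<id>` naming scheme, and the plain dict standing in for object storage are all assumptions made for illustration. It only shows the shape of the design: every tenant is a prefix in object storage, and a bounded cache holds the recently active ones.

```python
from collections import OrderedDict


class TenantIndexCache:
    """Toy hot/cold model: a bounded LRU cache in front of an object store."""

    def __init__(self, object_store: dict, max_hot: int):
        self.object_store = object_store  # stands in for S3/GCS: prefix -> index data
        self.max_hot = max_hot            # how many namespaces fit in the SSD/RAM cache
        self.hot = OrderedDict()          # namespace -> index, least recently used first

    @staticmethod
    def namespace_for(tenant_id: str) -> str:
        # Hypothetical naming scheme: one namespace (object storage prefix) per tenant.
        return f"tenant-{tenant_id}"

    def load(self, tenant_id: str):
        """Return (index, was_hot). A cold read fetches from storage and warms the cache."""
        ns = self.namespace_for(tenant_id)
        if ns in self.hot:
            self.hot.move_to_end(ns)      # mark as most recently used
            return self.hot[ns], True
        index = self.object_store[ns]     # cold path: fetch from object storage
        self.hot[ns] = index
        if len(self.hot) > self.max_hot:
            self.hot.popitem(last=False)  # evict the least recently used namespace
        return index, False


# Three tenants in "object storage", room for two in the hot cache.
store = {f"tenant-{t}": [f"doc-{t}-1", f"doc-{t}-2"] for t in ("a", "b", "c")}
cache = TenantIndexCache(store, max_hot=2)

_, hot_first = cache.load("a")        # cold: first access
_, hot_second = cache.load("a")       # hot: served from the cache
cache.load("b")
cache.load("c")                       # evicts "a", the least recently used
_, hot_after_evict = cache.load("a")  # cold again after eviction
```

Adding a tenant in this model means adding one more key to the store; nothing about the cache or the other tenants has to be resized or resharded, which is the property the paragraph above describes.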
BYOC on multi-cloud
In Fei's words, Atlassian "runs on trust." Atlassian products like Jira and Confluence hold sensitive enterprise data – so security, privacy, and compliance are closely scrutinized during procurement.
turbopuffer's BYOC deployment option, with customer-managed encryption keys (CMEK) and SOC2 and HIPAA certifications, quickly satisfied Atlassian's core compliance requirements. The architecture not only minimizes stateful dependencies but also keeps the data subprocessor list very small (just Google and AWS!), so Atlassian's security audit followed a familiar, routine path.
Support
turbopuffer's engineers have worked closely with Atlassian engineers on two fronts:
- Consulting on cluster configuration and search parameter tuning to help Atlassian get the most out of their deployment.
- Shipping algorithm-level changes and cost optimizations on turbopuffer's side to boost Atlassian's recall and reduce their latency.
Since moving turbopuffer into production, the combined effort has cut Atlassian's p90 semantic search latency from 125ms to 60ms while raising average recall from 90% to 96%+.
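For readers unfamiliar with the recall metric, the sketch below shows how recall@10 is commonly computed, assuming the usual definition: the fraction of the exact (exhaustive search) top 10 results that the approximate index also returns in its top 10, averaged over queries. The function names and document IDs are made up for illustration, and this is not necessarily the exact procedure Atlassian or turbopuffer uses.

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k neighbors that appear in the approximate top-k."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    found = len(set(approx_ids[:k]) & exact_top)
    return found / len(exact_top)


def average_recall(pairs, k=10):
    """Mean recall@k over (approx_ids, exact_ids) pairs, one pair per query."""
    scores = [recall_at_k(approx, exact, k) for approx, exact in pairs]
    return sum(scores) / len(scores)


# Made-up document IDs for illustration only.
exact = list(range(10))                  # ground-truth top 10: IDs 0..9
approx_perfect = list(range(10))         # all 10 correct
approx_one_miss = list(range(9)) + [42]  # 9 of 10 correct

r_perfect = recall_at_k(approx_perfect, exact)
r_miss = recall_at_k(approx_one_miss, exact)
avg = average_recall([(approx_perfect, exact), (approx_one_miss, exact)])
```

Under this definition, an average recall@10 of 96% means that, on average, fewer than one of the ten true nearest results is missing from each query's response.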
Results
With turbopuffer, Fei's team has improved historically hard-to-move search quality metrics, such as Jira long-click rate (the rate at which the first clicked result satisfies the search intent), while improving recall and performance searching billions of documents.
- 19% increase in Jira long-click rate
- Flexible BYOC deployments
- 60ms p90 latency
- 96% average recall@10
- 5B+ documents indexed
turbopuffer in Atlassian
Atlassian indexes Jira tickets, Confluence wikis, Google Drive documents, and associated metadata – over 5 billion documents across millions of namespaces. They build search APIs on top of turbopuffer that are used in both user-facing search (e.g. Cmd+K in Confluence) and as tools for agents to retrieve context during Rovo chat sessions.
Each customer's data is indexed on a separate turbopuffer namespace. Actively searched namespaces are cached for sub-20ms queries; inactive namespaces maintain their indexes on low-cost object storage without accruing idle compute capacity costs. Atlassian's p90 latency across all queries, hot and cold, is ~60ms.
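The sketch below illustrates how a blended p90 can sit well above the hot-path latency when a minority of queries hit cold namespaces. The latency samples are invented for illustration and are not Atlassian's measurements; the percentile uses the simple nearest-rank method, which may differ from whatever method produced the figures above.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up latencies in milliseconds, for illustration only.
hot_ms = [12, 14, 15, 16, 18, 19, 17, 13]  # queries against cached namespaces
cold_ms = [55, 140]                        # queries against uncached namespaces
all_ms = hot_ms + cold_ms

p50 = percentile(all_ms, 50)  # typical query, dominated by the hot path
p90 = percentile(all_ms, 90)  # tail, pulled up by the cold queries
```

With these sample values the median stays on the hot path while the p90 lands on a cold query, so a single blended percentile reflects both populations.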
What's next
- Lexical search consolidation: Atlassian is engaged in an active engineering spike to consolidate both lexical and semantic search on turbopuffer and eliminate their OpenSearch dependency
- Third-party connectors: Fei's team wants to index data from more Atlassian products and a growing ecosystem of third-party connectors, providing a unified context layer for search within Atlassian
- Enterprise scaleout: More isolated, multi-cloud BYOC deployments with CMEK for Atlassian's largest enterprise customers
We will continue to update this log as Atlassian's turbopuffer usage evolves.