
Cursor scales code retrieval to 1T+ vectors with turbopuffer

Cursor switched from a vector database with a traditional storage architecture to turbopuffer, cutting costs 20x and unlocking effortless scaling as they grow.

  • 95% cost reduction
  • 1T+ documents
  • 10GB/s write peaks
  • 80M+ namespaces

turbopuffer is one of the few pieces of infrastructure we haven’t had to worry about as we’ve scaled. Since day one, turbopuffer has felt like they were part of our team.

Sualeh Asif, Co-founder and CTO

Why turbopuffer?

turbopuffer’s serverless architecture with unlimited namespaces was a natural fit for Cursor’s namespace-per-codebase use case. Active codebase namespaces are loaded into the memory/NVMe cache, and inactive codebase indexes fade into object storage.

This warm/cold tradeoff was a perfect fit for Cursor, delivering dramatic cost reduction without degrading performance. Unlimited namespaces also mean the Cursor infrastructure team no longer has to manually balance codebase indexes across servers.
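The tiering described above behaves like a bounded warm set (memory/NVMe) in front of a durable cold store (object storage). A toy Python model of that behavior, assuming nothing about turbopuffer's actual implementation (class and field names here are invented for illustration):

```python
from collections import OrderedDict

class TieredNamespaceCache:
    """Toy model of warm/cold tiering: a bounded LRU warm set backed by
    an unbounded cold store. Illustrative only, not turbopuffer's code."""

    def __init__(self, warm_capacity: int):
        self.warm = OrderedDict()  # namespace -> index data (memory/NVMe tier)
        self.cold = {}             # namespace -> index data (object storage tier)
        self.warm_capacity = warm_capacity

    def put(self, namespace: str, index_data: bytes) -> None:
        self.cold[namespace] = index_data  # durable copy always lands in cold storage
        self._warm_up(namespace, index_data)

    def get(self, namespace: str) -> bytes:
        if namespace in self.warm:         # warm hit: served from cache
            self.warm.move_to_end(namespace)
            return self.warm[namespace]
        data = self.cold[namespace]        # cold hit: fetched, then warmed
        self._warm_up(namespace, data)
        return data

    def _warm_up(self, namespace: str, data: bytes) -> None:
        self.warm[namespace] = data
        self.warm.move_to_end(namespace)
        while len(self.warm) > self.warm_capacity:
            self.warm.popitem(last=False)  # inactive namespaces fade out of the warm set
```

An inactive namespace is evicted from the warm set but never lost: the next query simply pays a one-time cold read, which is the cost profile that makes namespace-per-codebase economical.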

turbopuffer in Cursor

turbopuffer powers code retrieval features in Cursor, populating the context window with relevant code when appropriate: Cursor draws on turbopuffer to semantically search across the codebase.

Codebases opened in Cursor are chunked and embedded with Cursor’s own embedding model. Each codebase instance is a separate namespace in turbopuffer, and Cursor employs various mechanisms like copy_from_namespace to recycle embedding vectors between namespaces.
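As a rough illustration of that indexing flow (the chunking scheme and naming convention below are invented for this sketch, not Cursor's actual ones, and the embedding and turbopuffer upsert steps are omitted): each file is split into chunks, each chunk gets a content-addressed id, and each codebase instance maps to its own namespace. Content-addressed ids are what make it possible to recognize identical chunks across checkouts and reuse their vectors.

```python
import hashlib

def chunk_file(path: str, text: str, max_lines: int = 40) -> list[dict]:
    """Split a source file into fixed-size line chunks.
    Cursor uses its own chunker and embedding model; this is a stand-in."""
    lines = text.splitlines()
    chunks = []
    for start in range(0, len(lines), max_lines):
        body = "\n".join(lines[start:start + max_lines])
        # Content-addressed id: the same (path, content) pair always hashes
        # to the same id, so identical chunks in two checkouts are detectable.
        chunk_id = hashlib.sha256(f"{path}:{body}".encode()).hexdigest()
        chunks.append({"id": chunk_id, "path": path, "text": body})
    return chunks

def namespace_for(user_id: str, repo_fingerprint: str) -> str:
    """One namespace per codebase instance (hypothetical naming scheme)."""
    return f"codebase-{user_id}-{repo_fingerprint[:16]}"
```

Each chunk would then be embedded and upserted into its codebase's namespace; where an existing near-identical index is found, vectors are recycled via copy_from_namespace instead of re-embedding.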

Log

Cursor evaluated the impact of semantic search on agent code retrieval performance. Given semantic search in addition to grep, the same agent answered codebase questions with up to 23.5% better accuracy than with grep alone.

Relative improvement by model
(Cursor Context Bench)

  Model            Improvement
  Composer            23.5%
  Gemini 2.5 Pro       8.7%
  GPT-5                6.5%
  Grok Code           11.9%
  Sonnet 4.5          14.7%

source:
cursor.com/blog/semsearch

Cursor ran A/B tests to measure how much agent-generated code was retained by end users when the agent used semantic search versus just grep. They also measured how much code required follow-ups or corrections. They found that semantic search on turbopuffer increased code retention by 2.6% on large codebases and decreased user dissatisfaction by 2.2%.

Semantic search improves code retention
& reduces dissatisfied user requests

  Metric                              Change
  Code Retention                      +0.3%
  Code Retention (large codebases)    +2.6%
  Dissatisfied User Requests          -2.2%

source:
cursor.com/blog/semsearch

Cursor has published how it securely reuses codebase indexes. Most checkouts of the same repo are nearly identical: another engineer's clone, or a branch that hasn't diverged much from the original, so re-embedding every chunk from scratch is slow and redundant. Instead, Cursor fingerprints the tree, finds an existing index that's similar enough, and seeds the new namespace with turbopuffer's copy_from_namespace at a 50% write discount. Merkle-tree proofs ensure users only get hits for files their machine can actually read.
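The fingerprinting step can be sketched with a Merkle tree over the checkout: each file hashes to a content hash, each directory hashes its sorted children, and an unchanged subtree keeps the same hash across checkouts, which is what makes near-matches cheap to detect. A minimal sketch (Cursor's real scheme, including the access proofs, is more involved):

```python
import hashlib

def file_hash(content: bytes) -> str:
    """Content hash of a single file."""
    return hashlib.sha256(content).hexdigest()

def merkle_root(tree: dict) -> str:
    """Merkle hash of a directory tree.

    `tree` maps names to either bytes (a file) or a dict (a subdirectory).
    A directory's hash covers its sorted (name, child-hash) pairs, so two
    checkouts that share a subtree produce identical hashes for it."""
    h = hashlib.sha256()
    for name in sorted(tree):
        child = tree[name]
        child_hash = merkle_root(child) if isinstance(child, dict) else file_hash(child)
        h.update(name.encode())
        h.update(child_hash.encode())
    return h.hexdigest()
```

Comparing roots (and, on mismatch, descending only into differing subtrees) quickly locates an existing index close enough to seed from.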

Time-to-first-query for users after implementing namespace copies improved drastically:

  • Median repo: 7.87s → 525ms
  • 90th percentile: 2.82 min → 1.87s
  • 99th percentile: 4.03 hours → 21s

We will continue to update this log as Cursor's usage of turbopuffer evolves.