Cursor evaluated the impact of semantic search on agent code retrieval
performance. With semantic search available in addition to grep, the same agent
answered questions with up to 23.5% better accuracy than an agent that used
grep alone.
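A retriever of this kind can be sketched as the union of exact grep hits and the top-k files ranked by embedding similarity. The sketch below is illustrative only: it stands in a toy bag-of-words "embedding" and cosine similarity for a real embedding model, and the names `grep_search`, `semantic_search`, and `hybrid_search` are hypothetical — Cursor's actual retriever is not described in this excerpt.

```python
import math
import re
from collections import Counter

def grep_search(files, pattern):
    """Exact lexical match across files, like `grep -l`."""
    return {path for path, text in files.items() if re.search(pattern, text)}

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words vector."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_search(files, query, k=2):
    """Top-k files by similarity to the query, even without exact matches."""
    q = embed(query)
    ranked = sorted(files, key=lambda p: cosine(q, embed(files[p])), reverse=True)
    return set(ranked[:k])

def hybrid_search(files, pattern, query, k=2):
    """Union of exact grep hits and the top-k semantic hits."""
    return grep_search(files, pattern) | semantic_search(files, query, k)
```

The point of the union is that grep finds files only when the agent guesses the right identifier, while the semantic side can surface files that describe the concept in different words.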
All models improve with semantic search
Model Relative Improvement (Cursor Context Bench)
────────────── ──────────────────────────────────────────────────────
Composer │▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 23.5%
│
Gemini 2.5 Pro │▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 8.7%
│
GPT-5 │▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 6.5%
│
Grok Code │▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 11.9%
│
Sonnet 4.5 │▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 14.7%
source: cursor.com/blog/semsearch
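The excerpt does not define "relative improvement"; assuming the standard definition — the accuracy gain divided by the grep-only baseline — the arithmetic looks like this. The accuracy scores below are hypothetical, chosen only to illustrate the formula; the blog reports the percentages, not the underlying scores.

```python
def relative_improvement(baseline, treatment):
    """Percent improvement of the treatment score over the baseline score."""
    return (treatment - baseline) / baseline * 100

# Hypothetical accuracies: a 0.600 -> 0.741 gain is a 23.5% relative improvement.
print(round(relative_improvement(0.600, 0.741), 1))  # 23.5
```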
Cursor ran A/B tests to measure how much agent-generated code end users
retained when the agent used semantic search versus grep alone, and how often
the code required follow-ups or corrections. They found that semantic search on
turbopuffer increased code retention by 2.6% on large codebases and decreased
dissatisfied user requests by 2.2%.
Semantic search improves code retention
and reduces dissatisfied user requests
Code Retention │▓ +0.3%
│
Code Retention (large codebases) │▓▓▓▓▓▓▓▓▓ +2.6%
│
Dissatisfied User Requests -2.2% ▓▓▓▓▓▓▓▓│
source: cursor.com/blog/semsearch
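Retention here is a rate compared across the two experiment arms. A minimal sketch of how such a delta might be computed, under the assumption that retention means retained code divided by generated code and that the reported figure is a relative lift (the blog excerpt does not spell out either definition); all counts are hypothetical.

```python
# Hypothetical per-arm counts; the blog reports only the resulting deltas.
control_retained, control_total = 700, 1000      # grep only
treatment_retained, treatment_total = 718, 1000  # grep + semantic search

control_rate = control_retained / control_total        # 0.700
treatment_rate = treatment_retained / treatment_total  # 0.718

# Relative lift of the treatment arm over the control arm.
lift = (treatment_rate - control_rate) / control_rate * 100
print(f"{lift:.1f}%")  # 2.6%
```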