turbopuffer is designed to be performant by default, but there are ways to optimize performance further. These suggestions aren't requirements for good performance; rather, they highlight opportunities for improvement when you have the flexibility to choose.
For example, while a single namespace with 100M documents works fine, splitting it into 10 namespaces of 10M documents each may yield better query performance if there's a natural way to group the documents.
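As a rough illustration of splitting by a natural grouping, the sketch below derives a namespace name from a grouping key and buckets documents accordingly. The tenant_id attribute and the "docs-" naming scheme are hypothetical choices made for this example, not something turbopuffer prescribes, and the code only groups documents in memory without calling any API.

```python
from collections import defaultdict


def namespace_for(doc, prefix="docs"):
    """Return the namespace name a document should live in."""
    # "tenant_id" is a hypothetical grouping attribute chosen for this sketch.
    return f"{prefix}-{doc['tenant_id']}"


def group_by_namespace(docs):
    """Bucket documents by target namespace so each batch can be written separately."""
    batches = defaultdict(list)
    for doc in docs:
        batches[namespace_for(doc)].append(doc)
    return dict(batches)


docs = [
    {"id": 1, "tenant_id": "acme", "text": "first"},
    {"id": 2, "tenant_id": "globex", "text": "second"},
    {"id": 3, "tenant_id": "acme", "text": "third"},
]
batches = group_by_namespace(docs)
```

A query for one group then only needs to touch that group's namespace. This only helps when queries rarely need to span groups.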
Reuse a single Turbopuffer client instance for as
many requests as possible. This uses a connection pool behind the scenes to
avoid the overhead of a TCP and TLS handshake on every request.

int8 output
matches f32 precision (benchmarks), so you can pass int8
values directly as JSON integers to an f16 namespace for f16 speed with no
precision loss.

Glob tpuf* is compiled down to an optimized prefix
scan, whereas Glob *tpuf* or IGlob will potentially scan every document
in the namespace. Contact us if you're seeing performance issues
for your workload; we can likely suggest alternatives (e.g. using full-text
search or a different filter). This is not a fundamental limitation, and we
plan to introduce indexes for these types of queries soon.

file_id). At query time, do a
vector search on chunks, then look up the metadata using the unique IDs from
your results. This way, patches to chunk-specific attributes never touch the
large metadata.
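
The query-time flow above can be sketched with plain in-memory stand-ins. This is a minimal illustration, assuming each chunk carries a file_id that points at a separately stored metadata record. The search_chunks, attach_metadata, and patch_chunk helpers are hypothetical names invented for this example and make no turbopuffer calls. The "vector search" is faked by sorting on a precomputed distance.

```python
# In-memory stand-ins for the two stores; a real setup would keep these
# in separate places so the large metadata is written once per file.
chunks = [
    {"id": "c1", "file_id": "f1", "distance": 0.12, "page": 1},
    {"id": "c2", "file_id": "f2", "distance": 0.20, "page": 4},
    {"id": "c3", "file_id": "f1", "distance": 0.31, "page": 2},
]
file_metadata = {
    "f1": {"title": "Report", "body": "large text blob"},
    "f2": {"title": "Manual", "body": "another large blob"},
}


def search_chunks(top_k):
    """Stand-in for a vector search: return the top_k closest chunks."""
    return sorted(chunks, key=lambda c: c["distance"])[:top_k]


def attach_metadata(results):
    """Look up metadata once per unique file_id and join it onto the results."""
    unique_ids = list(dict.fromkeys(r["file_id"] for r in results))
    fetched = {fid: file_metadata[fid] for fid in unique_ids}
    joined = [{**r, "file": fetched[r["file_id"]]} for r in results]
    return joined, unique_ids


def patch_chunk(chunk_id, **attrs):
    """Update chunk-specific attributes without touching file_metadata."""
    for chunk in chunks:
        if chunk["id"] == chunk_id:
            chunk.update(attrs)


results, looked_up = attach_metadata(search_chunks(top_k=3))
patch_chunk("c1", page=9)
```

Deduplicating the IDs before the lookup keeps the second step proportional to the number of distinct files in the results, not the number of chunks.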