turbopuffer is configurable by modifying a Kubernetes ConfigMap in the turbopuffer namespace of your deployment.
The turbopuffer team works with you to manage your deployment, for example by proposing ConfigMap changes for your cluster such as tuning cache sizes, LSM settings, or recall.
To update the ConfigMap, use the Helm chart with the values.yaml you maintain for the cluster. Change values.yaml in the byoc-kit directory and run the following:
GCP:
helm upgrade -n default turbopuffer oci://us-central1-docker.pkg.dev/turbopuffer-onprem/charts/tpuf --values=values.yaml --values=values.secret.yaml --values=metrics-keys.yaml
Registry login:
helm registry login us-central1-docker.pkg.dev

AWS:
helm upgrade -n default turbopuffer oci://961341552108.dkr.ecr.us-west-2.amazonaws.com/turbopuffer/turbopuffer/charts/tpuf --values=values.yaml --values=values.secret.yaml --values=metrics-keys.yaml
Registry login:
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 961341552108.dkr.ecr.us-west-2.amazonaws.com

Azure:
helm upgrade -n default turbopuffer oci://turbopuffer.azurecr.io/turbopuffer/turbopuffer/charts/tpuf --values=values.yaml --values=values.secret.yaml --values=metrics-keys.yaml
Registry login (after you log in to the turbopuffer tenant by running az login --tenant 398cc17e-41b3-44de-929a-dc4048da9592):
az acr login --name turbopuffer
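
The upgrade commands above differ only in the chart URL, so they are easy to script. The following is a minimal sketch that builds the same helm upgrade argument list in Python. The function name and the idea of wrapping helm in a script are illustrative assumptions, not part of the BYOC Kit.

```python
import shlex
import subprocess
from typing import List

# Values files passed to every upgrade, in the order used in this document.
VALUES_FILES = ["values.yaml", "values.secret.yaml", "metrics-keys.yaml"]


def build_helm_upgrade_args(chart_url: str) -> List[str]:
    """Build the argv for the helm upgrade command shown in this document.

    Hypothetical helper: only chart_url varies between cloud providers.
    """
    args = ["helm", "upgrade", "-n", "default", "turbopuffer", chart_url]
    args += [f"--values={path}" for path in VALUES_FILES]
    return args


def run_helm_upgrade(chart_url: str) -> None:
    """Run the upgrade from the byoc-kit directory; raises if helm fails."""
    subprocess.run(build_helm_upgrade_args(chart_url), check=True)


gcp_chart = "oci://us-central1-docker.pkg.dev/turbopuffer-onprem/charts/tpuf"
# Print the command for review instead of running it.
print(shlex.join(build_helm_upgrade_args(gcp_chart)))
```

Printing the command before running it makes it easy to confirm that the chart URL and values files match the cluster you intend to update.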
The following are configuration settings under kubernetes:
An array of the different availability zones to be used in AWS for the query and index nodes.
Note: Setting this to multiple availability zones will incur an additional networking charge from AWS.
ec2_preferred_zone:
  query: ["us-west-2a", "us-west-2b"]
  index: ["us-west-2a", "us-west-2b"]
Tags to be added to the EC2 instances in the cluster.
ec2_instance_tags:
  my-tag: my-value
The following are configuration settings under ingress:
If true, the turbopuffer ingress will be exposed on an internal IP.
ingress:
  internal: false
Configures how certificates will be handled in the cluster.
manual - if using, you must also set manual.secretName: to the name of the secret containing the TLS cert
disabled - needed if using Google Managed Certificates or if you wish to not use TLS
letsencrypt
aws - use an AWS managed certificate. You must also set aws.certificate_arn:
selfsigned - use a self-signed certificate
certificates:
  mode: 'letsencrypt'
The following are configuration settings under tpuf_config:
A mapping of org ids to API keys. Each API key is expected to be a 44-character Base64-encoded SHA-256 key.
Your BYOC Kit includes an apikey.py script which can generate valid org id and API key pairs.
Note: Currently all BYOC keys are generated as admin keys for their organization. To partition your data securely we recommend creating multiple organizations.
authentication:
  allowed_api_keys_sha256:
    "5X8OlKguH1l2jvTJrPgnvlcM": # Org ID
      - "IaG0JUcIiCXKwqhIWH8Qr0incF2xsbRZRRJJxznl0GM=" # SHA-256 + Base64 API key
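
To show why the configured value is 44 characters long, here is a minimal sketch of hashing a raw API key with SHA-256 and Base64-encoding the digest. It assumes the hash is taken over the UTF-8 bytes of the raw key, and the random key below is hypothetical. The apikey.py script in your BYOC Kit remains the supported way to generate keys and may work differently.

```python
import base64
import hashlib
import secrets


def hash_api_key(raw_key: str) -> str:
    """Return the Base64-encoded SHA-256 digest of raw_key.

    Assumption: the digest is computed over the UTF-8 bytes of the key.
    """
    digest = hashlib.sha256(raw_key.encode("utf-8")).digest()  # always 32 bytes
    # 32 bytes encode to 44 Base64 characters, including one '=' of padding.
    return base64.b64encode(digest).decode("ascii")


# Hypothetical raw key for illustration only.
raw_key = secrets.token_urlsafe(32)
hashed = hash_api_key(raw_key)
print(len(hashed))
```

Only the hashed form appears in the ConfigMap under this scheme, so the raw key does not need to be stored in values.yaml.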
Maximum number of concurrent queries allowed against a single namespace. This protects a node from being overloaded by a single namespace. Queries receive a 429 response if there is not enough capacity to handle them.
fairness:
  query_concurrency_per_namespace: 16 # default
Maximum milliseconds to wait if the query concurrency limit is reached.
fairness:
  query_bulkhead_wait_ms: 800 # default
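
The two fairness settings can be read together as a bulkhead: a query waits up to query_bulkhead_wait_ms for one of query_concurrency_per_namespace slots and is rejected with a 429 if none frees up in time. The sketch below illustrates that behavior with a per-namespace semaphore. It is an assumption about how the settings interact, not turbopuffer's implementation, and the class and method names are hypothetical.

```python
import threading
from typing import Any, Callable, Dict, Tuple


class NamespaceBulkhead:
    """Illustrative per-namespace concurrency limiter (hypothetical)."""

    def __init__(self, limit: int = 16, wait_ms: int = 800) -> None:
        self._limit = limit              # fairness.query_concurrency_per_namespace
        self._wait_s = wait_ms / 1000.0  # fairness.query_bulkhead_wait_ms
        self._lock = threading.Lock()
        self._slots: Dict[str, threading.BoundedSemaphore] = {}

    def _slot(self, namespace: str) -> threading.BoundedSemaphore:
        # Create each namespace's semaphore lazily, guarded by a lock.
        with self._lock:
            if namespace not in self._slots:
                self._slots[namespace] = threading.BoundedSemaphore(self._limit)
            return self._slots[namespace]

    def run(self, namespace: str, query_fn: Callable[[], Any]) -> Tuple[int, Any]:
        """Run query_fn if a slot frees up within the wait budget, else 429."""
        slot = self._slot(namespace)
        if not slot.acquire(timeout=self._wait_s):
            return 429, None
        try:
            return 200, query_fn()
        finally:
            slot.release()


bulkhead = NamespaceBulkhead(limit=16, wait_ms=800)
status, result = bulkhead.run("my-namespace", lambda: "rows")
```

Clients that see a 429 under this model can retry after a short backoff, since capacity returns as soon as in-flight queries to that namespace finish.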
Maximum number of documents that can be requested in a single query via the top_k parameter.
search:
  max_topk: 1200 # default
A set of org_ids to keep warm in cache. On node startup, machines will prewarm namespaces for these orgs to ensure their cache is hot.
Not recommended for most users.
cache:
  prewarm:
    keep_warm_orgs:
      - '<premium-users-org>'
      - '<no-cold-starts-pls-org>'
The absolute number of bytes or percentage of local SSD capacity to use as a cache.
Changing this is not recommended for most users.
cache:
  disk_budget_bytes: 0.95 # default, leaves headroom for the filesystem
Number of cache fills to allow concurrently in the background per node. These are fired after a cold query.
We prioritize cache fills for more important files (i.e. to get faster queries sooner), e.g. centroids.
indexing:
  cache_fill_concurrency: 2 # default
The maximum number of unindexed bytes allowed in the WAL before a reindex is triggered.
indexing:
  reindex_unindexed_bytes_max: 64000000 # default
The maximum number of rows we'll allow to remain unindexed. If the namespace has at least this many unindexed rows, a /index call will always trigger an index operation.
indexing:
  reindex_unindexed_rows_max: 50000 # default
The maximum number of unindexed WAL entries allowed before a reindex is triggered.
indexing:
  reindex_unindexed_wal_entries: 512 # default
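
The three reindex thresholds above can be read as independent triggers. The sketch below models that reading with the documented defaults. It assumes that reaching any single threshold is enough to trigger a reindex and that the comparison is inclusive. Both points are assumptions for illustration, and the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReindexLimits:
    """Defaults taken from the indexing settings in this document."""
    unindexed_bytes_max: int = 64_000_000  # reindex_unindexed_bytes_max
    unindexed_rows_max: int = 50_000       # reindex_unindexed_rows_max
    unindexed_wal_entries: int = 512       # reindex_unindexed_wal_entries


def should_reindex(
    unindexed_bytes: int,
    unindexed_rows: int,
    unindexed_wal_entries: int,
    limits: ReindexLimits = ReindexLimits(),
) -> bool:
    """Return True if any one threshold is reached (assumed OR semantics)."""
    return (
        unindexed_bytes >= limits.unindexed_bytes_max
        or unindexed_rows >= limits.unindexed_rows_max
        or unindexed_wal_entries >= limits.unindexed_wal_entries
    )


# Many small writes: few bytes and rows, but the WAL entry count is reached.
print(should_reindex(unindexed_bytes=1_000_000, unindexed_rows=600, unindexed_wal_entries=512))
```

Under this reading, workloads with many small writes tend to hit the WAL entry limit first, while bulk loads tend to hit the byte or row limits first.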
During indexing, the number of document bytes to process at a time before flushing. An indexing run can be composed of multiple batches, where we flush our progress incrementally after each batch.
indexing:
  batch_size_bytes: 1250000000 # 1.25 GB, default
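
As a rough illustration of byte-bounded batching, the sketch below groups documents into batches whose total size stays within batch_size_bytes, so that progress could be flushed after each batch. It is a generic technique under stated assumptions, not turbopuffer's indexer. The function name is hypothetical, and a single document larger than the limit is placed in a batch of its own.

```python
from typing import Iterable, Iterator, List


def batches_by_bytes(
    docs: Iterable[bytes], batch_size_bytes: int = 1_250_000_000
) -> Iterator[List[bytes]]:
    """Yield lists of documents whose combined size fits batch_size_bytes."""
    batch: List[bytes] = []
    size = 0
    for doc in docs:
        # Close the current batch if adding this document would exceed the limit.
        if batch and size + len(doc) > batch_size_bytes:
            yield batch
            batch, size = [], 0
        batch.append(doc)
        size += len(doc)
    if batch:
        yield batch  # final, possibly partial, batch


# Small limit so the batching is visible with tiny documents.
for batch in batches_by_bytes([b"aaaa", b"bb", b"cccc", b"d"], batch_size_bytes=6):
    print(sum(len(d) for d in batch))  # a flush would happen here
```

A smaller batch size means more frequent flushes and less work lost if a run is interrupted, at the cost of more flush overhead.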
The OTLP endpoint to emit traces to, if any. Should end with /v1/traces. If empty, traces won't be emitted.
tracing:
  otlp_endpoint: "http://localhost:4318/v1/traces"
A statsd endpoint to emit metrics to.
If present, all three subfields are required.
stats_export:
  prefix: "foocorp.turbopuffer" # do not include trailing dot
  host: "foocorp-statsd"
  port: "8125"
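
To check that a statsd endpoint is reachable before pointing turbopuffer at it, you can send a test counter yourself. The sketch below formats a line in the standard statsd text format and sends it over UDP. It also shows why the prefix should not end in a dot, since the separator is added when the metric name is appended. The metric name example_counter is hypothetical, and the metric names turbopuffer actually emits are not listed here.

```python
import socket


def statsd_line(prefix: str, name: str, value: int, metric_type: str = "c") -> str:
    """Format one metric in the statsd text format: <name>:<value>|<type>."""
    # The dot separator is added here, so the prefix must not end with one.
    return f"{prefix}.{name}:{value}|{metric_type}"


def send_metric(host: str, port: str, line: str) -> None:
    """Send one statsd line over UDP. The port is a string, as in the config."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, int(port)))


line = statsd_line("foocorp.turbopuffer", "example_counter", 1)
print(line)
# To send it, uncomment the next line with your own host and port:
# send_metric("foocorp-statsd", "8125", line)
```

Because UDP is fire-and-forget, a successful send does not prove delivery, so confirm the counter shows up on the statsd side.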
The maximum number of concurrent requests in flight to object storage at any given time.
tpuf_config:
  blob:
    max_concurrent_requests: 2000 # default
The amount of time data can live in the LSM tree before being force-compacted.
This setting serves two purposes:
storage:
  lsm_ttl_seconds: 1728000 # 20 days, default