# BYOC Deployment Runlist

For each cluster in your turbopuffer [BYOC](/docs/byoc) deployment you will be provided with a 'BYOC kit' containing all the files required to configure your cluster. This document provides guidance to successfully deploy a new turbopuffer BYOC cluster.

## Your kit contents

```txt
byoc-kit
├── README.md
├── aws
│   ├── main.tf
│   └── turbopuffer.tfvars
├── cosign.pub
├── gcp
│   ├── main.tf
│   └── turbopuffer.tfvars
├── azure
│   ├── main.tf
│   └── turbopuffer.tfvars
├── scripts
│   ├── generate_secrets.py
│   └── sanity.sh
├── values.yaml (generated) # configuration file generated by terraform
├── values.secret.yaml (generated) # sensitive configuration file generated by terraform
├── compute_classes.yaml (generated, GCP only) # GKE ComputeClass manifest generated by terraform
└── metrics-keys.yaml # configuration file provided by turbopuffer
```
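
Before starting the runlist, it can help to confirm that the kit is complete. The following is a minimal sketch, not part of the kit: the function name `missing_kit_files` is hypothetical, and the file list is taken from the tree above. It skips the generated files, which do not exist until later steps.

```shell
# Sketch: print every expected kit file that is missing for one provider.
# File names come from the tree above; the function name is hypothetical.
missing_kit_files() (
  kit_dir=$1
  provider=$2   # one of: aws, gcp, azure
  for f in \
    README.md \
    cosign.pub \
    metrics-keys.yaml \
    scripts/generate_secrets.py \
    scripts/sanity.sh \
    "$provider/main.tf" \
    "$provider/turbopuffer.tfvars"
  do
    # Report the relative path of anything that does not exist.
    [ -e "$kit_dir/$f" ] || printf '%s\n' "$f"
  done
)

# Example (not run here): missing_kit_files ./byoc-kit gcp
```

Empty output means every expected file for that provider is present.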

## Runlist

0. [ ] **Mise en place:** Check you have all prerequisites:
    - [ ] Verify you have `terraform`, `kubectl`, and `helm` installed.
    - [ ] Provision a **fresh sub-account / project** for your new cluster.
    - [ ] Enable pulling images from our registries: **GCP:** Provide the turbopuffer team with the **service account email** used for pulling images. This is either the default compute service account for the project or a custom service account, e.g. one used for replicating images into your own registry or [configured in K8s](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#use-multiple-service-accounts).
          **AWS:** Provide us with your **AWS account ID**.
          **Azure:** Provide the **multi-tenant application (client) ID** that [will be used for your Azure clusters](https://learn.microsoft.com/en-us/azure/container-registry/authenticate-aks-cross-tenant#pull-images-from-a-container-registry-to-an-aks-cluster-in-a-different-microsoft-entra-tenant).
3. [ ] **Cluster configuration:** Apply the terraform configuration to set up the Kubernetes cluster and bucket.
    - [ ] **GCP:** `cd gcp`
          **AWS:** `cd aws`
          **Azure:** `cd azure`
    - [ ] Run `terraform init` to set up the required providers.
    - [ ] Fill in the required values in `turbopuffer.tfvars`. **GCP:** The `query_skus`, `index_skus`, and `maintenance_machine_type` variables control which GCP machine types back the query, index, and maintenance node pools. The defaults work for most deployments.
    - [ ] Apply terraform configuration: `terraform apply -var-file=turbopuffer.tfvars`
4. [ ] **`kubectl`.** Add your new cluster context to kubectl.
    - [ ] Run **GCP:** `gcloud container clusters get-credentials CLUSTER_NAME --project PROJECT_ID --region REGION`
          **AWS:** `aws eks update-kubeconfig --region REGION --name CLUSTER_NAME`
          **Azure:** `az aks get-credentials --name=CLUSTER_NAME --resource-group=RESOURCE_GROUP`
    - [ ] Run `kubectl config get-contexts` and confirm the cluster is correct.
    - [ ] Run `kubectl get pods` and confirm the command succeeds (no pods are listed yet).
    - [ ] **GCP:** **Apply ComputeClass manifests:** Terraform writes a `compute_classes.yaml` at the kit root that defines GKE `ComputeClass` priorities across the query and index node pools. Apply it before installing Helm so the autoscaler has the priority order in place when the first pods schedule: `kubectl apply -f compute_classes.yaml`
5. [ ] **Configure Helm:** The terraform command will have output a `values.yaml` file in the `byoc-kit` directory, which contains values for Helm. Edit this file and set any other necessary values. Refer to `values.schema.json` for a description of valid configurations.
    - To use provider managed TLS certificates, see [using cloud provider managed TLS certificates](#using-cloud-provider-managed-tls-certificates).
    - `tpuf_config` contains the configuration values suggested by turbopuffer for your BYOC deployment. You can find information about these settings and more in our [BYOC configuration](/docs/byoc/configuration) documentation.
6. [ ] **Generate API keys:** Run `./scripts/generate_secrets.py` to generate `values.secret.yaml`. The script generates an org ID and an API key, along with a token for intra-cluster communication.
7. [ ] **Deploy turbopuffer:**
    - [ ] **Log in to Helm registry:** Run **GCP:** `helm registry login us-central1-docker.pkg.dev`
          **AWS:** `aws ecr get-login-password --region us-west-2 | helm registry login --username AWS --password-stdin 961341552108.dkr.ecr.us-west-2.amazonaws.com`
          **Azure:** `az login --tenant 398cc17e-41b3-44de-929a-dc4048da9592 && az acr login --name turbopuffer`
    - [ ] **Install the Helm chart:** Run **GCP:** `helm install -n default turbopuffer oci://us-central1-docker.pkg.dev/turbopuffer-onprem/charts/tpuf --values=values.yaml --values=values.secret.yaml --values=metrics-keys.yaml`
          **AWS:** `helm install -n default turbopuffer oci://961341552108.dkr.ecr.us-west-2.amazonaws.com/turbopuffer/turbopuffer/charts/tpuf --values=values.yaml --values=values.secret.yaml --values=metrics-keys.yaml`
          **Azure:** `helm install -n default turbopuffer oci://turbopuffer.azurecr.io/turbopuffer/charts/tpuf --values=values.yaml --values=values.secret.yaml --values=metrics-keys.yaml`
    - [ ] **For subsequent updates**, run **GCP:** `helm upgrade -n default turbopuffer oci://us-central1-docker.pkg.dev/turbopuffer-onprem/charts/tpuf --values=values.yaml --values=values.secret.yaml --values=metrics-keys.yaml`
          **AWS:** `helm upgrade -n default turbopuffer oci://961341552108.dkr.ecr.us-west-2.amazonaws.com/turbopuffer/turbopuffer/charts/tpuf --values=values.yaml --values=values.secret.yaml --values=metrics-keys.yaml`
          **Azure:** `helm upgrade -n default turbopuffer oci://turbopuffer.azurecr.io/turbopuffer/charts/tpuf --values=values.yaml --values=values.secret.yaml --values=metrics-keys.yaml`
8. [ ] Run post-deployment sanity checks
    - [ ] `TURBOPUFFER_API_KEY=<your_api_key> scripts/sanity.sh` will query your turbopuffer cluster directly, verifying that core operations function. It will not verify certificates, and may encounter a 500 error if the nodes aren't routable yet.
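
The prerequisite check in step 0 and the per-provider Helm invocation in step 7 can be sketched as small shell helpers. This is an illustrative sketch, not part of the kit: the function names are hypothetical, the chart references are copied from step 7, and `helm_command` only prints the command rather than running it.

```shell
# Print each required tool that is not on PATH (tool list from step 0).
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Map a provider name to the chart reference used in step 7.
chart_ref() {
  case "$1" in
    gcp)   printf '%s\n' 'oci://us-central1-docker.pkg.dev/turbopuffer-onprem/charts/tpuf' ;;
    aws)   printf '%s\n' 'oci://961341552108.dkr.ecr.us-west-2.amazonaws.com/turbopuffer/turbopuffer/charts/tpuf' ;;
    azure) printf '%s\n' 'oci://turbopuffer.azurecr.io/turbopuffer/charts/tpuf' ;;
    *)     printf 'unknown provider: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Print (do not run) the Helm command for "install" or "upgrade".
helm_command() {
  ref=$(chart_ref "$2") || return 1
  printf 'helm %s -n default turbopuffer %s --values=values.yaml --values=values.secret.yaml --values=metrics-keys.yaml\n' "$1" "$ref"
}

# Examples (not run here):
#   missing_tools terraform kubectl helm
#   helm_command install gcp
```

Printing the command first makes it easy to review the chart reference and values files before running it by hand.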

## Using a custom registry for your turbopuffer cluster

By default turbopuffer will pull from one of several turbopuffer managed image registries, as configured in our included terraform. 
However, there are many reasons you may want to host our images in a registry you control. 
Our Helm chart fully supports this through the following settings:

```yaml
image.registry: YOUR_REGISTRY_URL
control_plane.image.registry: YOUR_REGISTRY_URL
```

We expect to find two repositories there, one called `turbopuffer` and one called `tpuf-ctl-cluster`, holding the images for turbopuffer and our control plane agent respectively.

For customers on AWS, we can configure ECR Replication to automatically push the latest images into your registry. 
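
If you prefer to pass the registry override on the command line rather than editing `values.yaml`, Helm's `--set` flag accepts dotted keys. The sketch below assumes the two dotted settings shown earlier in this section denote nested chart values; the function name `registry_flags` is hypothetical.

```shell
# Print Helm --set flags that point both images at a custom registry.
# The two keys are the settings named in this section; the function name
# is hypothetical.
registry_flags() {
  printf '%s\n' "--set image.registry=$1 --set control_plane.image.registry=$1"
}

# Example (not run here): append the printed flags to the
# `helm upgrade ...` command from step 7.
#   registry_flags YOUR_REGISTRY_URL
```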

## Using cloud provider managed TLS certificates 

Our Helm chart can manage TLS termination inside your cluster using either `cert-manager` or the native Kubernetes APIs.
If your organization already manages its certificates through your cloud provider's managed certificate offering, you will need to handle termination yourselves.

Regardless of your cloud provider, you will want to deploy turbopuffer internally, by setting:

```yaml
ingress.internal: true
```

### GCP

To get started, set the following in `values.yaml` and re-run `helm upgrade ...` as described in step 7.

```yaml
ingress.certificates.mode: disabled
```

Adding [Google Managed Certificates](https://cloud.google.com/kubernetes-engine/docs/how-to/managed-certs) to your GKE cluster is as simple as deploying the following Kubernetes manifest alongside your turbopuffer Helm deployment.
All that is required is to insert the correct value for `YOUR_DOMAIN`.

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: ingress-nginx-svc-config
  namespace: ingress-nginx
spec:
  healthCheck:
    checkIntervalSec: 10
    timeoutSec: 10
    port: 80
    type: HTTP
    requestPath: /healthz

---
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-svc
  namespace: ingress-nginx
  annotations:
    cloud.google.com/backend-config: '{"default": "ingress-nginx-svc-config"}'
spec:
  ports:
  - appProtocol: http
    name: http
    port: 80
    protocol: TCP
    targetPort: http
  - appProtocol: https
    name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
  type: ClusterIP
---
apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: managed-cert
  namespace: ingress-nginx
spec:
  domains:
    - YOUR_DOMAIN
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-nginx-ing
  namespace: ingress-nginx
  annotations:
    networking.gke.io/managed-certificates: managed-cert
spec:
  ingressClassName: "gce"
  defaultBackend:
    service:
      name: ingress-nginx-svc
      port:
        number: 80
```
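
One way to fill in `YOUR_DOMAIN` without editing the manifest by hand is to substitute it at apply time. This is a minimal sketch under stated assumptions: the function name `render_domain` and the file name `managed-cert.yaml` are hypothetical, and the domain is assumed to contain only letters, digits, dots, and hyphens.

```shell
# Replace the YOUR_DOMAIN placeholder on stdin with the given domain.
# Rejects anything other than letters, digits, dots, and hyphens, so the
# value needs no escaping in the sed replacement.
render_domain() {
  case "$1" in
    ''|*[!A-Za-z0-9.-]*) printf 'invalid domain: %s\n' "$1" >&2; return 1 ;;
  esac
  sed "s/YOUR_DOMAIN/$1/g"
}

# Example (not run here; managed-cert.yaml is a hypothetical file holding
# the manifest above):
#   render_domain tpuf.example.com < managed-cert.yaml | kubectl apply -f -
```
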

### AWS

You will need to provision your certificate externally to your cluster using the AWS console or CLI.

Then update your Helm `values.yaml` with the following configuration values and
re-run `helm upgrade ...` as described in step 7.

```yaml
ingress:
  certificates:
    mode: aws
    aws:
      certificate_arn: YOUR_CERTIFICATE_ARN
```
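
Before pasting the ARN into `values.yaml`, a quick format check can catch a leftover placeholder or an ARN for the wrong service. The sketch below relies on the general ACM ARN layout (`arn:PARTITION:acm:REGION:ACCOUNT_ID:certificate/CERTIFICATE_ID`); the function name `is_acm_certificate_arn` is hypothetical, and the check validates shape only, not that the certificate exists.

```shell
# Return 0 if the argument looks like an ACM certificate ARN:
#   arn:PARTITION:acm:REGION:ACCOUNT_ID:certificate/CERTIFICATE_ID
is_acm_certificate_arn() {
  printf '%s\n' "$1" |
    grep -Eq '^arn:aws[a-z-]*:acm:[a-z0-9-]+:[0-9]{12}:certificate/[0-9a-f-]+$'
}

# Example (not run here): request a DNS-validated certificate with the
# AWS CLI, then check the ARN it returns.
#   aws acm request-certificate --domain-name YOUR_DOMAIN --validation-method DNS
#   is_acm_certificate_arn "$arn" || echo "not an ACM certificate ARN" >&2
```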

## Networking

If you want to block outgoing connections from the cluster by default, allowlist
the following destinations:

+ Polar Signals (CPU and Heap profiling)
    - `35.234.93.182` (`api.polarsignals.com`)
+ Control Plane (Cluster Heartbeats)
    - `76.76.21.0/24`
+ Datadog (Telemetry)
    - `curl -s https://ip-ranges.datadoghq.com/ | jq -r '(.apm.prefixes_ipv4 + .global.prefixes_ipv4 + .logs.prefixes_ipv4 + .agents.prefixes_ipv4) | unique[]'`
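
Firewall rules usually expect CIDR notation, so the entries above can be collected into one list. The following is a sketch, assuming your firewall tooling accepts one CIDR per line; the function names are hypothetical, and the Datadog helper simply wraps the command from the list above.

```shell
# Normalize an IPv4 address or CIDR to CIDR form (a bare address becomes /32).
to_cidr() {
  case "$1" in
    */*) printf '%s\n' "$1" ;;
    *)   printf '%s/32\n' "$1" ;;
  esac
}

# Static entries from the list above, one CIDR per line.
static_allowlist() {
  to_cidr 35.234.93.182   # Polar Signals (api.polarsignals.com)
  to_cidr 76.76.21.0/24   # control plane heartbeats
}

# Datadog ranges, fetched with the command from the list above
# (needs curl, jq, and network access; not run here).
datadog_allowlist() {
  curl -s https://ip-ranges.datadoghq.com/ |
    jq -r '(.apm.prefixes_ipv4 + .global.prefixes_ipv4 + .logs.prefixes_ipv4 + .agents.prefixes_ipv4) | unique[]'
}

# Example (not run here): { static_allowlist; datadog_allowlist; } > allowlist.txt
```

Because the Datadog ranges are fetched at run time, the combined list reflects whatever that endpoint returns when the helper is invoked.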

## Upgrading turbopuffer versions

If you have manual approvals enabled, the turbopuffer team will provide you with a command to upgrade the cluster when a new version is available. 
Otherwise, upgrades will happen automatically through the control plane.


---

This page: [/docs/byoc/deployment.md](https://turbopuffer.com/docs/byoc/deployment.md)