Cross-Region Backups
┌─aws-us-east-1 (source)──────┐
│ ┌───────────────────────┐ │░
│ │ my-namespace │ │░
│ └───────────────────────┘ │░
└─────────────────────────────┘░
░░░░░░░░░░░░░│░░░░░░░░░░░░░░░░░
▼
┌─aws-us-west-2 (dest)────────┐
│ ┌───────────────────────┐ │░
│ │ my-namespace-copy │ │░
│ └───────────────────────┘ │░
└─────────────┬───────────────┘░
░░░░░░░░░░░░░│░░░░░░░░░░░░░░░░░
│
──copy_from_namespaceturbopuffer supports efficient namespace copies across regions via
copy_from_namespace for geo-redundancy,
disaster recovery, and accidental deletion protection. We don't currently offer automated backups. Historically,
customers have rebuilt from their primary data source when needed, but
cross-region copies are now often a better option.
Branching provides constant-time namespace snapshots, but
shares underlying storage with the source namespace. Use copy_from_namespace
for full data isolation.
Copies are performed entirely server-side, so there's no data transfer through your infrastructure. They're billed at up to a 75% write discount and create fully writable namespaces you can use however you like. Cross-region copies also bill returned bytes for the logical size copied. Storage is billed at standard rates, but since you're not querying backup namespaces, they're cheap to keep around, making daily or weekly snapshots practical. Copies work across regions and across cloud providers (e.g., AWS to GCP).
CMEK encryption
To encrypt the backup with a customer managed encryption key (CMEK), specify an
encryption key in the encryption parameter.
The key must be available in the destination region.
Specifying an encryption key is mandatory if the source namespace has CMEK encryption enabled.
Running Backups on Schedule
To maintain up-to-date backups, run cross-region copies on a regular schedule. Here's an example script (run via cron or any scheduler) that backs up all namespaces matching a prefix. It appends the date to each backup namespace name and automatically cleans up backups older than 7 days:
# /// script
# requires-python = ">=3.10"
# dependencies = ["turbopuffer"]
# ///
import os
import time
import turbopuffer
# Configuration
SOURCE_REGION = "gcp-us-central1"
BACKUP_REGION = "gcp-us-west1"
SOURCE_PREFIX = "fts-" # Back up all namespaces starting with "fts-"
BACKUP_PREFIX = "backup-" # Backup namespaces will be "backup-{name}-{date}"
RETENTION_DAYS = 7
source_client = turbopuffer.Turbopuffer(
api_key=os.getenv("TURBOPUFFER_API_KEY"), region=SOURCE_REGION
)
backup_client = turbopuffer.Turbopuffer(
api_key=os.getenv("TURBOPUFFER_API_KEY"), region=BACKUP_REGION
)
timestamp = int(time.time()) # Unix epoch seconds
start_time = time.time()
# Step 1: Back up each namespace matching the source prefix
print("Starting backups...")
namespaces = list(source_client.namespaces(prefix=SOURCE_PREFIX))
for ns in namespaces:
backup_name = f"{BACKUP_PREFIX}{ns.id}-{timestamp:010d}"
print(f" Backing up: {ns.id}")
backup_ns = backup_client.namespace(backup_name)
backup_ns.copy_from(
source_namespace=ns.id,
source_region=SOURCE_REGION,
# if backing up to a different organization, include source_api_key:
# source_api_key="<source-org-api-key>",
)
# Step 2: Delete old backups beyond the retention period (after successful backup)
print("Cleaning up old backups...")
cutoff = int(time.time()) - RETENTION_DAYS * 86400
deleted = 0
for ns in backup_client.namespaces(prefix=BACKUP_PREFIX):
# Safety check: only delete namespaces that match our backup prefix
assert len(BACKUP_PREFIX) > 0 and ns.id.startswith(
BACKUP_PREFIX
), f"Refusing to delete namespace that doesn't match backup prefix: {ns.id}"
# Extract timestamp from backup namespace name (e.g., "backup-prod-users-1234567890")
if len(ns.id) >= 10:
try:
backup_time = int(ns.id[-10:])
if backup_time < cutoff:
print(f" Deleting: {ns.id}")
backup_client.namespace(ns.id).delete_all()
deleted += 1
except ValueError:
print(
f" Skipping {ns.id}: invalid timestamp format",
file=__import__("sys").stderr,
)
print(
f"Done: backed up {len(namespaces)} namespaces, deleted {deleted} old backups in {time.time() - start_time:.1f}s"
)See Limits for copy throughput estimates.
Recovering a Namespace
Backup namespaces are fully functional. You can either point your application to the namespace in the backup region directly, or copy it to a new namespace in your preferred region as shown below:
# /// script
# requires-python = ">=3.10"
# dependencies = ["turbopuffer"]
# ///
import os
import time
import turbopuffer
# Configuration
SOURCE_REGION = "gcp-us-central1"
BACKUP_REGION = "gcp-us-west1"
BACKUP_PREFIX = "backup-"
source_client = turbopuffer.Turbopuffer(
api_key=os.getenv("TURBOPUFFER_API_KEY"), region=SOURCE_REGION
)
backup_client = turbopuffer.Turbopuffer(
api_key=os.getenv("TURBOPUFFER_API_KEY"), region=BACKUP_REGION
)
# Find latest backup timestamp (last 10 chars = Unix epoch seconds)
backups = list(backup_client.namespaces(prefix=BACKUP_PREFIX))
timestamps: set[int] = set()
for ns in backups:
if len(ns.id) >= 10:
try:
timestamps.add(int(ns.id[-10:]))
except ValueError:
pass
latest = max(timestamps)
print(f"Recovering from backup: {latest}")
start_time = time.time()
recovered = 0
latest_suffix = f"{latest:010d}"
for ns in backups:
if not ns.id.endswith(latest_suffix):
continue
original_name = ns.id[len(BACKUP_PREFIX) : -11] # -11 for "-" + 10 digits
recovered_name = f"recovered-py-{original_name}"
print(f" {ns.id} -> {recovered_name}")
source_client.namespace(recovered_name).copy_from(
source_namespace=ns.id, source_region=BACKUP_REGION
)
recovered += 1
print(f"Done: recovered {recovered} namespaces in {time.time() - start_time:.1f}s")For more details on copy_from_namespace, see the write documentation.