Last Updated: 3/19/2026


Checkpoint Sync

Checkpoint synchronization enables disaster recovery and cross-environment migration by syncing pipeline checkpoints to object storage (S3, GCS, Azure Blob Storage).

Configuration

Configure checkpoint sync in the storage backend:

    {
      "storage_config": {
        "path": "/data/pipeline-state"
      },
      "storage": {
        "backend": {
          "name": "file",
          "config": {
            "sync": {
              "bucket": "my-bucket/checkpoints",
              "region": "us-east-1",
              "provider": "AWS",
              "access_key": "AKIAIOSFODNN7EXAMPLE",
              "secret_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
              "push_interval": 300,
              "retention_min_count": 10,
              "retention_min_age": 30
            }
          }
        }
      }
    }

Sync Options

Bucket configuration:

  • bucket: S3 bucket name (may include a path prefix)
  • region: AWS region (optional for MinIO)
  • provider: Cloud provider, such as "AWS" or "Minio"
  • endpoint: Custom endpoint for S3-compatible storage

Authentication:

  • access_key / secret_key: Credentials (optional if using IAM roles)
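Because the credential fields are optional, the sync block can be generated without embedding keys in the file. The following is a minimal Python sketch, not part of the product: the helper name build_sync_config and the environment variable names CHECKPOINT_SYNC_ACCESS_KEY and CHECKPOINT_SYNC_SECRET_KEY are hypothetical. It emits the same JSON shape as the configuration example and leaves both keys out when they are not set, which is the case where IAM roles are expected to supply access.

```python
import json
import os


def build_sync_config(bucket, region=None, provider="AWS",
                      push_interval=None, env=os.environ):
    """Return a storage config dict shaped like the JSON example above."""
    sync = {"bucket": bucket, "provider": provider}
    if region is not None:
        sync["region"] = region
    if push_interval is not None:
        sync["push_interval"] = push_interval

    # Hypothetical variable names. When either one is missing, both keys
    # are omitted so that IAM roles (or similar) can provide credentials.
    access_key = env.get("CHECKPOINT_SYNC_ACCESS_KEY")
    secret_key = env.get("CHECKPOINT_SYNC_SECRET_KEY")
    if access_key and secret_key:
        sync["access_key"] = access_key
        sync["secret_key"] = secret_key

    return {"storage": {"backend": {"name": "file", "config": {"sync": sync}}}}


# An empty env mapping simulates a host that relies on IAM roles.
config = build_sync_config("my-bucket/checkpoints", region="us-east-1",
                           push_interval=300, env={})
print(json.dumps(config, indent=2))
```

Passing the environment as a parameter keeps the helper easy to test and avoids reading secrets at import time.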

Automatic sync:

  • push_interval: Interval in seconds between automatic pushes (optional)

Retention:

  • retention_min_count: Minimum number of checkpoints to retain (default: 10)
  • retention_min_age: Minimum age in days before a checkpoint can be deleted (default: 30)
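How the two retention settings combine is not spelled out here. The sketch below shows one plausible reading, stated as an assumption: a checkpoint becomes eligible for deletion only when it is both outside the newest retention_min_count checkpoints and older than retention_min_age days. The function deletable_checkpoints and the checkpoint names are illustrative only and do not describe the actual implementation.

```python
from datetime import datetime, timedelta, timezone


def deletable_checkpoints(checkpoints, now, min_count=10, min_age_days=30):
    """Return ids eligible for deletion under the assumed retention rule.

    checkpoints: list of (checkpoint_id, created_at) tuples.
    """
    newest_first = sorted(checkpoints, key=lambda c: c[1], reverse=True)
    cutoff = now - timedelta(days=min_age_days)
    # Assumption: the newest `min_count` checkpoints are always kept, and
    # the remainder are deleted only once they are older than the cutoff.
    return [cid for cid, created in newest_first[min_count:] if created < cutoff]


now = datetime(2026, 3, 19, tzinfo=timezone.utc)
checkpoints = [(f"ckpt-{d}d", now - timedelta(days=d)) for d in (1, 5, 20, 40, 60)]
print(deletable_checkpoints(checkpoints, now, min_count=2, min_age_days=30))
# → ['ckpt-40d', 'ckpt-60d']
```

With the defaults (10 checkpoints, 30 days), the same five checkpoints would all be kept, because fewer than ten exist.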

Manual Sync

Manually sync a checkpoint:

    # Sync latest checkpoint
    uuid = pipeline.sync_checkpoint()

    # Sync and wait for completion
    uuid = pipeline.sync_checkpoint(wait=True, timeout_s=600)

Check sync status:

    from feldera import CheckpointStatus

    status = pipeline.sync_checkpoint_status(uuid)
    if status == CheckpointStatus.Success:
        print(f"Checkpoint {uuid} synced successfully")
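When a sync is started without wait=True, the status has to be polled. The following is a rough sketch of such a loop that runs without a live pipeline. SyncStatus is a local stand-in enum: only Success is named in this document, so the InProgress and Failure members, and the helper wait_for_sync, are assumptions made for illustration.

```python
import time
from enum import Enum


class SyncStatus(Enum):
    """Stand-in for the client's status enum (members other than Success are assumed)."""
    Success = "success"
    InProgress = "in_progress"
    Failure = "failure"


def wait_for_sync(get_status, timeout_s=600.0, poll_s=1.0):
    """Poll get_status() until it stops reporting InProgress or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status is not SyncStatus.InProgress:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"checkpoint sync still in progress after {timeout_s}s")
        time.sleep(poll_s)


# Simulated status source: two in-progress polls, then success.
responses = iter([SyncStatus.InProgress, SyncStatus.InProgress, SyncStatus.Success])
print(wait_for_sync(lambda: next(responses), timeout_s=5, poll_s=0.01))
```

Against a real pipeline, get_status would wrap pipeline.sync_checkpoint_status(uuid), with the in-progress check adapted to whatever values the client actually reports.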

Starting from Synced Checkpoint

Configure the pipeline to start from a checkpoint in object storage:

    {
      "storage": {
        "backend": {
          "name": "file",
          "config": {
            "sync": {
              "bucket": "my-bucket/checkpoints",
              "start_from_checkpoint": "latest",
              "fail_if_no_checkpoint": false,
              "standby": true,
              "pull_interval": 10
            }
          }
        }
      }
    }

Options:

  • start_from_checkpoint: "latest" or a specific checkpoint UUID
  • fail_if_no_checkpoint: Fail if the checkpoint cannot be fetched (default: false)
  • standby: Start in standby mode (default: false)
  • pull_interval: Interval in seconds between fetches of the latest checkpoint in standby mode (default: 10)

Standby Mode

When standby is enabled:

With start_from_checkpoint: "latest":

  • Pipeline continuously fetches the latest checkpoint until activated
  • Useful for hot standby scenarios

With specific UUID:

  • Pipeline fetches that checkpoint once and waits for activation
  • Useful for controlled failover

Activate the pipeline:

pipeline.activate(wait=True, timeout_s=300)

Read-Only Bucket

Use read_bucket to specify a fallback source for checkpoints:

    {
      "storage": {
        "backend": {
          "name": "file",
          "config": {
            "sync": {
              "bucket": "primary-bucket/checkpoints",
              "read_bucket": "backup-bucket/checkpoints"
            }
          }
        }
      }
    }

The pipeline never writes to read_bucket.

Use Cases

Disaster Recovery

Automatically sync checkpoints for recovery:

    {
      "storage": {
        "backend": {
          "name": "file",
          "config": {
            "sync": {
              "bucket": "backup-bucket/checkpoints",
              "push_interval": 300
            }
          }
        }
      }
    }

Cross-Environment Migration

Sync a checkpoint in production, then load it in staging:

    # In production
    uuid = prod_pipeline.sync_checkpoint(wait=True)
    print(f"Synced checkpoint: {uuid}")

    # In staging (configure to load this checkpoint)
    staging_pipeline.start()
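The staging side needs a sync configuration that names the checkpoint synced from production. The following is a minimal sketch that builds that configuration as JSON. The helper staging_sync_config is hypothetical, the UUID is a placeholder, and setting fail_if_no_checkpoint to true is a suggested choice so that a missing checkpoint stops the migration instead of starting from empty state. How the resulting JSON is applied to the staging pipeline is outside the scope of this sketch.

```python
import json


def staging_sync_config(bucket, checkpoint_uuid):
    """Build a sync config that starts from one specific synced checkpoint."""
    return {
        "storage": {
            "backend": {
                "name": "file",
                "config": {
                    "sync": {
                        "bucket": bucket,
                        "start_from_checkpoint": checkpoint_uuid,
                        # Suggested for migrations: fail rather than start empty.
                        "fail_if_no_checkpoint": True,
                    }
                },
            }
        }
    }


# Placeholder UUID standing in for the value printed in production.
synced_uuid = "123e4567-e89b-12d3-a456-426614174000"
print(json.dumps(staging_sync_config("my-bucket/checkpoints", synced_uuid), indent=2))
```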

Hot Standby

Keep a standby pipeline ready:

    {
      "storage": {
        "backend": {
          "name": "file",
          "config": {
            "sync": {
              "bucket": "checkpoints",
              "start_from_checkpoint": "latest",
              "standby": true,
              "pull_interval": 10
            }
          }
        }
      }
    }
