Last Updated: 3/19/2026
Checkpoint Sync
Checkpoint synchronization enables disaster recovery and cross-environment migration by syncing pipeline checkpoints to object storage (S3, GCS, Azure Blob Storage).
Configuration
Configure checkpoint sync in the storage backend:
{
  "storage_config": {
    "path": "/data/pipeline-state"
  },
  "storage": {
    "backend": {
      "name": "file",
      "config": {
        "sync": {
          "bucket": "my-bucket/checkpoints",
          "region": "us-east-1",
          "provider": "AWS",
          "access_key": "AKIAIOSFODNN7EXAMPLE",
          "secret_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
          "push_interval": 300,
          "retention_min_count": 10,
          "retention_min_age": 30
        }
      }
    }
  }
}
Sync Options
Bucket configuration:
- bucket: S3 bucket name (may include a path prefix)
- region: AWS region (optional for MinIO)
- provider: Cloud provider ("AWS", "Minio", etc.)
- endpoint: Custom endpoint for S3-compatible storage
Authentication:
- access_key / secret_key: Credentials (optional if using IAM roles)
Automatic sync:
- push_interval: Interval in seconds between automatic pushes (optional)
Retention:
- retention_min_count: Minimum number of checkpoints to retain (default: 10)
- retention_min_age: Minimum age in days before deletion (default: 30)
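The options above can be assembled programmatically before being pasted into the pipeline configuration. The following is a minimal sketch, not part of the Feldera SDK: `build_sync_config` is a hypothetical helper, the default of "AWS" for provider is an assumption, and the check that both credentials are supplied together is a convenience added here rather than documented behavior.

```python
import json


def build_sync_config(bucket, provider="AWS", region=None, endpoint=None,
                      access_key=None, secret_key=None, push_interval=None,
                      retention_min_count=10, retention_min_age=30):
    """Hypothetical helper: build the nested "storage" section with a "sync" block."""
    if not bucket:
        raise ValueError("bucket is required")
    # Credentials are optional (e.g. with IAM roles), but a lone key is likely a mistake.
    if (access_key is None) != (secret_key is None):
        raise ValueError("access_key and secret_key must be set together")
    if push_interval is not None and push_interval <= 0:
        raise ValueError("push_interval must be a positive number of seconds")

    sync = {
        "bucket": bucket,
        "provider": provider,
        "retention_min_count": retention_min_count,  # documented default: 10
        "retention_min_age": retention_min_age,      # documented default: 30 (days)
    }
    optional = {
        "region": region,
        "endpoint": endpoint,
        "access_key": access_key,
        "secret_key": secret_key,
        "push_interval": push_interval,
    }
    # Leave out options that were not provided instead of emitting nulls.
    sync.update({key: value for key, value in optional.items() if value is not None})
    return {"storage": {"backend": {"name": "file", "config": {"sync": sync}}}}


config = build_sync_config("my-bucket/checkpoints", region="us-east-1", push_interval=300)
print(json.dumps(config, indent=2))
```

Omitting unset options keeps the generated JSON close to the hand-written examples on this page and lets the documented defaults apply.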
Manual Sync
Manually sync a checkpoint:
# Sync latest checkpoint
uuid = pipeline.sync_checkpoint()
# Sync and wait for completion
uuid = pipeline.sync_checkpoint(wait=True, timeout_s=600)
Check sync status:
from feldera import CheckpointStatus
status = pipeline.sync_checkpoint_status(uuid)
if status == CheckpointStatus.Success:
    print(f"Checkpoint {uuid} synced successfully")
Starting from Synced Checkpoint
Configure the pipeline to start from a checkpoint in object storage:
{
  "storage": {
    "backend": {
      "name": "file",
      "config": {
        "sync": {
          "bucket": "my-bucket/checkpoints",
          "start_from_checkpoint": "latest",
          "fail_if_no_checkpoint": false,
          "standby": true,
          "pull_interval": 10
        }
      }
    }
  }
}
Options:
- start_from_checkpoint: "latest" or a specific checkpoint UUID
- fail_if_no_checkpoint: Fail if the checkpoint cannot be fetched (default: false)
- standby: Start in standby mode (default: false)
- pull_interval: Interval to fetch the latest checkpoint in standby (default: 10)
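Because start_from_checkpoint accepts either "latest" or a UUID, a typo in the UUID is easy to miss until the pipeline starts. The sketch below shows one way to validate the value while building the config. `start_from_sync_config` is a hypothetical helper, not a Feldera API, and the UUID in the usage line is a placeholder rather than a real checkpoint.

```python
import json
import uuid


def start_from_sync_config(bucket, checkpoint="latest", fail_if_no_checkpoint=False,
                           standby=False, pull_interval=10):
    """Hypothetical helper: build a sync config that starts from a synced checkpoint."""
    if checkpoint != "latest":
        # uuid.UUID raises ValueError on malformed input; str() normalizes the form.
        checkpoint = str(uuid.UUID(checkpoint))
    sync = {
        "bucket": bucket,
        "start_from_checkpoint": checkpoint,
        "fail_if_no_checkpoint": fail_if_no_checkpoint,
        "standby": standby,
        "pull_interval": pull_interval,
    }
    return {"storage": {"backend": {"name": "file", "config": {"sync": sync}}}}


# Placeholder UUID for illustration; use the value returned by sync_checkpoint().
config = start_from_sync_config(
    "my-bucket/checkpoints",
    checkpoint="01234567-89ab-cdef-0123-456789abcdef",
    fail_if_no_checkpoint=True,
)
print(json.dumps(config, indent=2))
```

Setting fail_if_no_checkpoint to true is a reasonable choice when a specific checkpoint is expected, since a missing checkpoint then surfaces as an error instead of going unnoticed.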
Standby Mode
When standby is enabled:
With start_from_checkpoint: "latest":
- Pipeline continuously fetches the latest checkpoint until activated
- Useful for hot standby scenarios
With specific UUID:
- Pipeline fetches that checkpoint once and waits for activation
- Useful for controlled failover
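The two standby behaviors can be summarized as a small decision function. This is an illustrative model of the rules listed above, not Feldera code: `StandbyBehavior` and `standby_behavior` are hypothetical names, and the UUID is a placeholder.

```python
from enum import Enum


class StandbyBehavior(Enum):
    NOT_STANDBY = "standby disabled"
    FOLLOW_LATEST = "keep fetching the latest checkpoint until activated"
    FETCH_ONCE = "fetch the given checkpoint once, then wait for activation"


def standby_behavior(standby, start_from_checkpoint):
    """Map the two sync options to the behavior described in this section."""
    if not standby:
        return StandbyBehavior.NOT_STANDBY
    if start_from_checkpoint == "latest":
        return StandbyBehavior.FOLLOW_LATEST
    # Any other value is treated as a specific checkpoint UUID.
    return StandbyBehavior.FETCH_ONCE


cases = [
    (True, "latest"),
    (True, "01234567-89ab-cdef-0123-456789abcdef"),  # placeholder UUID
    (False, "latest"),
]
for standby, start_from in cases:
    print(standby, start_from, "->", standby_behavior(standby, start_from).value)
```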
Activate the pipeline:
pipeline.activate(wait=True, timeout_s=300)
Read-Only Bucket
Specify a fallback checkpoint source:
{
  "storage": {
    "backend": {
      "name": "file",
      "config": {
        "sync": {
          "bucket": "primary-bucket/checkpoints",
          "read_bucket": "backup-bucket/checkpoints"
        }
      }
    }
  }
}
The pipeline never writes to read_bucket.
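One way to picture the fallback is as an ordered lookup: check bucket first, then read_bucket. The sketch below models that with in-memory sets standing in for bucket contents. The lookup order is an assumption drawn from the word "fallback", and `locate_checkpoint` is a hypothetical function, not a Feldera API.

```python
def locate_checkpoint(checkpoint_id, bucket_contents, bucket, read_bucket=None):
    """Return the name of the first bucket holding the checkpoint, or None.

    bucket_contents maps a bucket name to the set of checkpoint IDs it holds
    (an in-memory stand-in for real object storage).
    """
    # Assumed order: the primary bucket is consulted before the read-only fallback.
    for name in (bucket, read_bucket):
        if name is not None and checkpoint_id in bucket_contents.get(name, set()):
            return name
    return None


contents = {
    "primary-bucket/checkpoints": {"cp-2"},
    "backup-bucket/checkpoints": {"cp-1", "cp-2"},
}
for checkpoint_id in ("cp-1", "cp-2", "cp-3"):
    source = locate_checkpoint(
        checkpoint_id,
        contents,
        bucket="primary-bucket/checkpoints",
        read_bucket="backup-bucket/checkpoints",
    )
    print(checkpoint_id, "->", source)
```

The function only reads from the simulated buckets, mirroring the guarantee that read_bucket is never written to.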
Use Cases
Disaster Recovery
Automatically sync checkpoints for recovery:
{
  "storage": {
    "backend": {
      "name": "file",
      "config": {
        "sync": {
          "bucket": "backup-bucket/checkpoints",
          "push_interval": 300
        }
      }
    }
  }
}
Cross-Environment Migration
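Sync a checkpoint in production, then load it in staging:
# In production: push a checkpoint and wait until the sync completes
uuid = prod_pipeline.sync_checkpoint(wait=True)
print(f"Synced checkpoint: {uuid}")
# In staging: set "start_from_checkpoint" to this UUID in the staging
# pipeline's sync config (see "Starting from Synced Checkpoint"), then start it
staging_pipeline.start()
Hot Standby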
Keep a standby pipeline ready:
{
  "storage": {
    "backend": {
      "name": "file",
      "config": {
        "sync": {
          "bucket": "checkpoints",
          "start_from_checkpoint": "latest",
          "standby": true,
          "pull_interval": 10
        }
      }
    }
  }
}
What's Next
- Fault Tolerance: Learn about checkpointing and recovery guarantees
- Pipeline Configuration: Configure storage backends and other pipeline settings
- Pipeline Lifecycle: Understand standby mode and pipeline state transitions