Last Updated: 3/19/2026
Pipeline Configuration
Feldera pipelines expose a comprehensive set of runtime configuration options that control how your pipeline executes, manages resources, and handles failures. These settings are specified in the runtime_config section when creating or modifying a pipeline.
Core Configuration Options
Worker Threads
The workers parameter controls the number of DBSP worker threads allocated to the pipeline. Each worker thread is paired with a background thread for LSM merging, effectively doubling the total thread count.
{
  "workers": 8
}
The typical sweet spot is between 4 and 16 workers. Each worker increases memory consumption for data structures used during pipeline steps. The default is 8 workers.
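To make the thread arithmetic above concrete, here is a minimal Python sketch. The helper name `estimated_dbsp_threads` is hypothetical and not part of any Feldera API. It counts only the worker and LSM merge threads described in this section, not HTTP or I/O threads.

```python
def estimated_dbsp_threads(workers: int) -> int:
    """Estimate DBSP threads: one worker thread plus one background merge thread each."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return 2 * workers


# Hypothetical runtime_config fragment using the default from this section.
runtime_config = {"workers": 8}
print(estimated_dbsp_threads(runtime_config["workers"]))  # prints 16
```

Under this assumption, the default of 8 workers implies about 16 DBSP threads, which is worth keeping in mind when sizing CPU allocations.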
Storage Configuration
Storage determines whether pipeline state is kept in memory or persisted to disk. When storage is enabled, the pipeline can work with datasets larger than available RAM and supports checkpointing for fault tolerance.
{
  "storage": {
    "backend": {
      "name": "default"
    },
    "min_storage_bytes": 10485760,
    "compression": "default",
    "cache_mib": 1024
  }
}
Storage backend options:
- default: Uses the local file system (current default)
- file: Explicitly uses the local file system, with optional async I/O configuration
- object: Uses object storage (S3, GCS, Azure Blob Storage)
Storage parameters:
- min_storage_bytes: Minimum estimated bytes before writing a batch to storage (default: 10 MiB)
- min_step_storage_bytes: Minimum bytes for step batches (default: effectively disabled)
- compression: Compression algorithm, one of default, none, or snappy
- cache_mib: Maximum in-memory cache size in MiB (if unset, each thread gets 256 MiB)
When storage is disabled (set to null), all state is kept in memory, which is faster but limits dataset size to available RAM.
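The storage parameters mix units (bytes for min_storage_bytes, MiB for cache_mib), so a small helper can reduce mistakes. The sketch below is illustrative only: `build_storage` is a hypothetical function, and it emits just the fields and compression values listed in this section.

```python
import json

MIB = 1024 * 1024  # bytes per mebibyte

# Compression values as listed in this section.
ALLOWED_COMPRESSION = {"default", "none", "snappy"}


def build_storage(min_storage_mib=10, compression="default", cache_mib=None):
    """Build a `storage` object from human-friendly units (hypothetical helper)."""
    if compression not in ALLOWED_COMPRESSION:
        raise ValueError(f"unsupported compression: {compression!r}")
    storage = {
        "backend": {"name": "default"},
        "min_storage_bytes": min_storage_mib * MIB,  # field is expressed in bytes
        "compression": compression,
    }
    if cache_mib is not None:
        # Leave the field out entirely to fall back to the per-thread default.
        storage["cache_mib"] = cache_mib
    return storage


print(json.dumps({"storage": build_storage(cache_mib=1024)}, indent=2))
```

With the default arguments, min_storage_bytes comes out as 10485760, matching the 10 MiB default shown in the example above.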
Storage Cache Configuration
The cache field within storage_config controls how storage access is cached:
{
  "storage_config": {
    "path": "/data/pipeline-state",
    "cache": "page_cache"
  }
}
- page_cache: Uses the operating system’s page cache (default; currently offers better performance)
- feldera_cache: Uses Feldera’s internal cache implementation (under development)
Fault Tolerance
Fault tolerance enables pipelines to recover from crashes without data loss. Configure it using the fault_tolerance field:
{
  "fault_tolerance": {
    "model": "exactly_once",
    "checkpoint_interval_secs": 60
  }
}
Fault tolerance models:
- exactly_once: Each record is output exactly once (default when fault tolerance is enabled)
- at_least_once: Each record is output at least once; crashes may duplicate output
- none: Disables fault tolerance (set model to "none" or omit the fault_tolerance field entirely)
Checkpoint interval:
The checkpoint_interval_secs parameter controls how often automatic checkpoints are created (default: 60 seconds, range: 1–3600). Set it to null to disable periodic checkpointing.
Fault tolerance requires storage to be enabled and uses the configured storage backend to persist checkpoints and logs.
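The constraints above (a known model, storage enabled, and an interval within range) can be checked before submitting a configuration. The following is a minimal sketch, not Feldera's own validation: `fault_tolerance_problems` is a hypothetical helper, and it assumes that a fault_tolerance object without a model field means exactly_once.

```python
ALLOWED_MODELS = {"exactly_once", "at_least_once", "none"}


def fault_tolerance_problems(runtime_config: dict) -> list:
    """Return human-readable problems with the fault tolerance settings."""
    problems = []
    ft = runtime_config.get("fault_tolerance")
    if ft is None:
        return problems  # field omitted: fault tolerance is disabled
    # Assumption: a missing model falls back to the documented default.
    model = ft.get("model", "exactly_once")
    if model not in ALLOWED_MODELS:
        return [f"unknown fault tolerance model: {model!r}"]
    if model == "none":
        return problems
    if runtime_config.get("storage") is None:
        problems.append("fault tolerance requires storage to be enabled")
    interval = ft.get("checkpoint_interval_secs", 60)
    # null (None) disables periodic checkpointing, so only check real values.
    if interval is not None and not 1 <= interval <= 3600:
        problems.append("checkpoint_interval_secs must be between 1 and 3600")
    return problems


config = {
    "storage": {"backend": {"name": "default"}},
    "fault_tolerance": {"model": "exactly_once", "checkpoint_interval_secs": 60},
}
print(fault_tolerance_problems(config))
```

An empty list means the settings are consistent with the rules stated in this section; the server may still apply checks that this sketch does not model.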
Memory and Resource Limits
Memory Configuration
Control memory usage with the resources field:
{
  "resources": {
    "cpu_cores_min": 2.0,
    "cpu_cores_max": 8.0,
    "memory_mb_min": 2048,
    "memory_mb_max": 16384,
    "storage_mb_max": 102400,
    "storage_class": "fast-nvme"
  }
}
These limits are enforced only in Feldera Cloud deployments. In self-hosted environments, they serve as documentation of expected resource usage.
Batch Size and Buffering
Control how input data is batched and buffered:
{
  "min_batch_size_records": 1000,
  "max_buffering_delay_usecs": 100000
}
- min_batch_size_records: Minimum records to buffer before processing (default: 0)
- max_buffering_delay_usecs: Maximum delay in microseconds to wait for the minimum batch size (default: 0)
The controller delays pushing input records to the circuit until either the minimum batch size is reached or the maximum buffering delay has elapsed.
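The either/or rule above can be expressed as a single predicate. This Python sketch is a simplified model of the decision, not the controller's actual code; `should_push` and its arguments are hypothetical names.

```python
def should_push(buffered_records: int, waited_usecs: int,
                min_batch_size_records: int = 0,
                max_buffering_delay_usecs: int = 0) -> bool:
    """Return True once either threshold described in this section is met."""
    batch_ready = buffered_records >= min_batch_size_records
    delay_expired = waited_usecs >= max_buffering_delay_usecs
    return batch_ready or delay_expired


# Settings from the example above: 1000 records or 100 ms, whichever comes first.
print(should_push(500, 50_000, 1000, 100_000))   # neither threshold reached
print(should_push(1000, 10_000, 1000, 100_000))  # batch size reached
print(should_push(500, 100_000, 1000, 100_000))  # delay elapsed
```

With both defaults at 0, the predicate is always true, which matches the behavior of pushing records without any added buffering delay.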
Advanced Configuration
CPU Pinning
Pin worker threads to specific CPU cores for better performance and consistency:
{
  "pin_cpus": [0, 1, 2, 3, 4, 5, 6, 7]
}
Specify at least twice as many CPU numbers as workers. CPU pinning works best when different pipelines on the same machine are pinned to different CPUs.
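Because the rule is "at least twice as many CPU numbers as workers", the eight CPUs in the example above cover at most 4 workers, not the default 8. A quick check along these lines can catch the mismatch early. `pin_cpus_problems` is a hypothetical helper, and the duplicate check is an added assumption rather than a documented requirement.

```python
def pin_cpus_problems(workers: int, pin_cpus: list) -> list:
    """Check a pin_cpus list against the sizing rule from this section."""
    problems = []
    if len(set(pin_cpus)) != len(pin_cpus):
        # Assumption: repeated CPU numbers are a mistake, not an intended setup.
        problems.append("pin_cpus contains duplicate CPU numbers")
    required = 2 * workers  # one CPU per worker thread plus one per merge thread
    if len(pin_cpus) < required:
        problems.append(
            f"pin_cpus lists {len(pin_cpus)} CPUs but {workers} workers need at least {required}"
        )
    return problems


print(pin_cpus_problems(4, [0, 1, 2, 3, 4, 5, 6, 7]))
print(pin_cpus_problems(8, [0, 1, 2, 3, 4, 5, 6, 7]))
```

The first call reports no problems; the second reports that 8 workers need at least 16 CPU numbers.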
Clock Resolution
For queries using the NOW() function, control how often the clock is updated:
{
  "clock_resolution_usecs": 1000000
}
The pipeline updates the clock value and triggers recomputation at most every clock_resolution_usecs microseconds (default: 1 second). This setting is ignored if the query doesn’t use NOW().
Connector Initialization
Control how many connectors are initialized in parallel during startup:
{
  "max_parallel_connector_init": 10
}
At startup, the pipeline initializes all input and output connectors. This setting controls the maximum number of connectors initialized concurrently (default: 10).
Thread Pools
Configure the number of threads for HTTP and I/O operations:
{
  "http_workers": 8,
  "io_workers": 8
}
- http_workers: Runtime threads for the HTTP server (default: same as workers)
- io_workers: Runtime threads for async I/O tasks (default: same as workers)
These settings rarely need adjustment but can help if ingress, egress, or ad-hoc queries become bottlenecks.
Profiling and Tracing
Enable profiling and distributed tracing:
{
  "cpu_profiler": true,
  "tracing": false,
  "tracing_endpoint_jaeger": "127.0.0.1:6831"
}
- cpu_profiler: Enable CPU profiler (default: true)
- tracing: Enable pipeline tracing (default: false)
- tracing_endpoint_jaeger: Jaeger endpoint for trace data
Logging
Control log filtering with the logging field:
{
  "logging": "info,feldera=debug"
}
This accepts tracing-subscriber filter syntax. If unset or invalid, messages at “info” severity and higher are logged.
Environment Variables
Inject custom environment variables into the pipeline process:
{
  "env": {
    "MY_CUSTOM_VAR": "value"
  }
}
Reserved variable namespaces (FELDERA_, KUBERNETES_, TOKIO_, RUST_LOG) cannot be overridden.
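A pre-flight check for reserved names might look like the sketch below. It is an illustration only: `reserved_env_names` is a hypothetical helper, and it assumes that the entries ending in an underscore are prefixes while RUST_LOG is an exact variable name.

```python
# Assumption: trailing-underscore entries are prefixes; RUST_LOG is an exact name.
RESERVED_PREFIXES = ("FELDERA_", "KUBERNETES_", "TOKIO_")
RESERVED_NAMES = ("RUST_LOG",)


def reserved_env_names(env: dict) -> list:
    """Return the variable names in `env` that fall in a reserved namespace."""
    return sorted(
        name for name in env
        if name.startswith(RESERVED_PREFIXES) or name in RESERVED_NAMES
    )


env = {"MY_CUSTOM_VAR": "value", "RUST_LOG": "debug", "TOKIO_WORKER_THREADS": "4"}
print(reserved_env_names(env))
```

Any names returned should be removed from the env object before the configuration is submitted.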
Configuration Example
Here’s a complete example of a production-ready pipeline configuration:
{
  "workers": 16,
  "storage": {
    "backend": {
      "name": "default"
    },
    "compression": "snappy",
    "cache_mib": 4096
  },
  "storage_config": {
    "path": "/data/pipeline-state",
    "cache": "page_cache"
  },
  "fault_tolerance": {
    "model": "exactly_once",
    "checkpoint_interval_secs": 120
  },
  "resources": {
    "cpu_cores_min": 8.0,
    "cpu_cores_max": 16.0,
    "memory_mb_min": 8192,
    "memory_mb_max": 32768,
    "storage_mb_max": 204800
  },
  "min_batch_size_records": 10000,
  "max_buffering_delay_usecs": 500000,
  "cpu_profiler": true,
  "logging": "info,feldera=debug"
}
This configuration allocates 16 workers, enables storage with Snappy compression, configures exactly-once fault tolerance with 2-minute checkpoint intervals, and sets resource limits appropriate for a medium-scale deployment.
What’s Next
- Pipeline Lifecycle: Learn about pipeline states and transitions
- Fault Tolerance: Deep dive into checkpoint and recovery mechanisms
- Memory Management: Understand how Feldera manages memory and spills to disk