
Last Updated: 3/19/2026


Benchmarks

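Feldera is built for high-throughput incremental computation. This page presents Nexmark benchmark results comparing Feldera with Apache Flink and Beam-based systems, followed by throughput, latency, memory, and scaling characteristics and a set of tuning tips.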

Nexmark Benchmarks

Nexmark is a standard benchmark suite for streaming systems that simulates an online auction platform. It includes 23 queries covering various streaming patterns; the comparison on this page reports nine of them.

Performance Comparison

Feldera significantly outperforms Apache Flink and Beam-based systems:

16-core Streaming Performance (events/second):

Query   Feldera   Flink   Flink on Beam   Dataflow on Beam
Q0      6.97M     2.64M   283K            698K
Q1      6.60M     2.60M   316K            1.02M
Q2      6.75M     3.12M   517K            1.82M
Q3      6.61M     2.04M   555K            794K
Q4      4.23M     501K    94K             63K
Q5      6.59M     662K    252K            115K
Q8      6.64M     2.10M   397K            419K
Q12     6.68M     2.00M   366K            176K
Q13     3.68M     1.57M   226K            1.00M

Key observations:

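  • Feldera is roughly 2-10× faster than Flink on every query listed
  • Feldera is roughly 12-45× faster than Flink on Beam and 4-67× faster than Dataflow on Beam
  • The advantage grows on more complex queries: on Q4 and Q5 Feldera is about 8-10× faster than Flink, compared with about 2-3× on the other queries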
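
The speedup figures can be recomputed directly from the table. The sketch below copies the published throughput numbers into a dictionary and derives, for each competing system, the smallest and largest ratio of Feldera's throughput to that system's. The names `RESULTS`, `speedups`, and `speedup_range` are illustrative and not part of any benchmark tooling.

```python
# Throughput in events/second, copied from the 16-core table above.
SYSTEMS = ("Feldera", "Flink", "Flink on Beam", "Dataflow on Beam")
ROWS = {
    "Q0":  (6.97e6, 2.64e6, 283e3, 698e3),
    "Q1":  (6.60e6, 2.60e6, 316e3, 1.02e6),
    "Q2":  (6.75e6, 3.12e6, 517e3, 1.82e6),
    "Q3":  (6.61e6, 2.04e6, 555e3, 794e3),
    "Q4":  (4.23e6, 501e3, 94e3, 63e3),
    "Q5":  (6.59e6, 662e3, 252e3, 115e3),
    "Q8":  (6.64e6, 2.10e6, 397e3, 419e3),
    "Q12": (6.68e6, 2.00e6, 366e3, 176e3),
    "Q13": (3.68e6, 1.57e6, 226e3, 1.00e6),
}
RESULTS = {query: dict(zip(SYSTEMS, values)) for query, values in ROWS.items()}


def speedups(results, baseline="Feldera"):
    """Return {query: {system: baseline throughput / system throughput}}."""
    out = {}
    for query, row in results.items():
        base = row[baseline]
        out[query] = {
            name: base / value for name, value in row.items() if name != baseline
        }
    return out


def speedup_range(results, system, baseline="Feldera"):
    """Return the (smallest, largest) speedup of the baseline over `system`."""
    ratios = [row[system] for row in speedups(results, baseline).values()]
    return min(ratios), max(ratios)


for system in SYSTEMS[1:]:
    low, high = speedup_range(RESULTS, system)
    print(f"Feldera vs {system}: {low:.1f}x to {high:.1f}x")
```

Working from the raw numbers makes it easy to see which queries drive the extremes of each range.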

Methodology

Benchmarks were run on a 16-core system with:

  • 100 million events per run
  • Streaming mode (continuous processing)
  • Default configurations for all systems
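
Because every run processes the same number of events, a throughput figure translates directly into wall-clock time. The following sketch does that arithmetic for the highest and lowest throughput values in the comparison table; `run_seconds` is a hypothetical helper name.

```python
EVENTS_PER_RUN = 100_000_000  # 100 million events per run, as stated above


def run_seconds(events_per_second, events=EVENTS_PER_RUN):
    """Wall-clock seconds needed to process `events` at a steady rate."""
    if events_per_second <= 0:
        raise ValueError("throughput must be positive")
    return events / events_per_second


# Highest and lowest throughput values in the comparison table.
fastest = run_seconds(6.97e6)  # Feldera, Q0
slowest = run_seconds(63e3)    # Dataflow on Beam, Q4
print(f"fastest run: {fastest:.1f} s")
print(f"slowest run: {slowest / 60:.1f} min")
```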

Performance Characteristics

Throughput

Feldera achieves millions of events per second on a laptop:

  • Simple queries (filters, projections): 5-7M events/sec
  • Aggregations (GROUP BY, COUNT): 3-6M events/sec
  • Joins: 2-4M events/sec
  • Complex queries (nested joins, window functions): 1-3M events/sec

Latency

End-to-end latency (input to output):

  • Typical: Sub-millisecond for simple queries
  • Complex queries: Single-digit milliseconds
  • Large state: Scales with state size but remains low
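
Latency is usually reported as percentiles rather than a single average. As a minimal sketch, the code below times individual calls to a stand-in function and summarizes the median, 99th percentile, and maximum. `measure_latencies`, `summarize`, and the `process` callable are hypothetical; timing a local call is not the same as Feldera's end-to-end latency, which would require timestamps taken at input and at output.

```python
import statistics
import time


def measure_latencies(process, inputs):
    """Time each call to `process` and return the latencies in milliseconds."""
    latencies_ms = []
    for item in inputs:
        start = time.perf_counter()
        process(item)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms):
    """Return the median, 99th percentile, and maximum latency."""
    # quantiles(n=100) yields 99 cut points; index 98 is the 99th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "p50": statistics.median(latencies_ms),
        "p99": cuts[98],
        "max": max(latencies_ms),
    }


# Stand-in workload: replace the lambda with the operation to be timed.
samples = measure_latencies(lambda record: record * 2, range(10_000))
print(summarize(samples))
```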

Memory Usage

Feldera efficiently manages memory:

  • Small datasets: Fits entirely in RAM
  • Large datasets: Automatically spills to disk
  • Compression: Reduces memory footprint by 2-5×

Scalability

Vertical Scaling

Performance scales with CPU cores:

Cores   Throughput   Speedup
1       1.2M/sec     1.0×
4       4.1M/sec     3.4×
8       6.8M/sec     5.7×
16      9.2M/sec     7.7×

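Scaling is close to linear up to 4 cores (3.4×, about 85% efficiency) and tapers beyond that: 8 cores reach 5.7× and 16 cores reach 7.7×.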
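
Parallel efficiency (speedup divided by core count) shows how much of each added core turns into throughput. This sketch recomputes speedup and efficiency from the throughput column of the table; `SCALING` and `scaling_report` are illustrative names.

```python
# Throughput in events/second by core count, copied from the table above.
SCALING = {1: 1.2e6, 4: 4.1e6, 8: 6.8e6, 16: 9.2e6}


def scaling_report(throughput_by_cores):
    """Return {cores: (speedup, efficiency)} relative to the 1-core run."""
    base = throughput_by_cores[1]
    report = {}
    for cores, throughput in sorted(throughput_by_cores.items()):
        speedup = throughput / base
        report[cores] = (speedup, speedup / cores)
    return report


for cores, (speedup, efficiency) in scaling_report(SCALING).items():
    print(f"{cores:>2} cores: {speedup:.1f}x speedup, {efficiency:.0%} efficiency")
```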

Dataset Size

Feldera handles datasets larger than RAM:

  • In-memory: Best performance
  • Spilling to NVMe: 2-3× slower
  • Object storage: 5-10× slower

Performance degrades gracefully as dataset size increases.
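
The slowdown factors above can be turned into a rough planning estimate. The sketch below divides an in-memory throughput figure by the quoted slowdown range for each storage tier. The 6M events/sec baseline is a hypothetical example, and `SLOWDOWN` and `estimated_throughput` are illustrative names; real numbers depend on the query and hardware.

```python
# (best-case, worst-case) slowdown relative to in-memory, from the list above.
SLOWDOWN = {
    "in-memory": (1.0, 1.0),
    "nvme": (2.0, 3.0),
    "object-storage": (5.0, 10.0),
}


def estimated_throughput(in_memory_events_per_sec, tier):
    """Return (low, high) estimated events/second for the given storage tier."""
    best, worst = SLOWDOWN[tier]
    return in_memory_events_per_sec / worst, in_memory_events_per_sec / best


baseline = 6e6  # hypothetical in-memory throughput, events/second
for tier in SLOWDOWN:
    low, high = estimated_throughput(baseline, tier)
    print(f"{tier}: {low / 1e6:.1f}M to {high / 1e6:.1f}M events/sec")
```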

Running Benchmarks

Nexmark Benchmarks

Run Nexmark benchmarks yourself:

    cd benchmark
    ./run-nexmark.sh --runner=feldera --events=100M --cores=16

Compare with other systems:

    # Flink
    ./run-nexmark.sh --runner=flink --events=100M --cores=16

    # Beam with Flink
    ./run-nexmark.sh --runner=beam/flink --events=100M --cores=16

Custom Benchmarks

Benchmark your own queries:

    import time
    from feldera import Pipeline

    # Create and start pipeline
    pipeline = Pipeline.create(...)
    pipeline.start()

    # Measure throughput
    start = time.time()
    for i in range(1000000):
        pipeline.input_json("events", {"id": i, "value": i * 2})
    elapsed = time.time() - start
    print(f"Throughput: {1000000 / elapsed:.0f} events/sec")
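
One way to make this kind of measurement reusable is to separate the timing loop from the pipeline call. The sketch below is an assumption-laden illustration, not part of the Feldera SDK: `measure_throughput` is a hypothetical helper, and `send` is any callable that pushes one record, for example a wrapper around the `pipeline.input_json("events", record)` call shown above. A plain list stands in for the pipeline so the example runs on its own.

```python
import time


def measure_throughput(send, records, warmup=1000):
    """Push `records` through `send` and return events per second.

    The first `warmup` records are sent but excluded from the timing.
    """
    records = list(records)
    for record in records[:warmup]:
        send(record)
    timed = records[warmup:]
    if not timed:
        raise ValueError("need more records than the warmup count")
    start = time.perf_counter()
    for record in timed:
        send(record)
    elapsed = time.perf_counter() - start
    return len(timed) / elapsed


# Stand-in sink: a list's append method plays the role of the pipeline input.
received = []
events = ({"id": i, "value": i * 2} for i in range(50_000))
rate = measure_throughput(received.append, events)
print(f"Throughput: {rate:.0f} events/sec")
```

Note that a loop like this measures how fast records are handed to the pipeline, which is not necessarily the rate at which results are produced.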

Optimization Tips

Maximize Throughput

  1. Use more workers: Increase workers in runtime config
  2. Enable storage: For datasets larger than RAM
  3. Batch inputs: Use larger batch sizes
  4. Tune connectors: Adjust connector-specific settings
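
For the batching tip, a small helper that groups records into fixed-size chunks is often enough on the client side. This is a generic sketch: `in_batches` is a hypothetical name, and whether a given ingestion call accepts a whole list per request is an assumption to verify against the connector or SDK in use.

```python
from itertools import islice


def in_batches(records, batch_size):
    """Yield lists of up to `batch_size` records from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


events = ({"id": i, "value": i * 2} for i in range(10_000))
batch_count = 0
for batch in in_batches(events, 1_000):
    # Send `batch` in one request here instead of one request per record.
    batch_count += 1
print(f"sent {batch_count} batches")
```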

Minimize Latency

  1. Reduce batch size: Lower min_batch_size_records
  2. Reduce buffering delay: Lower max_buffering_delay_usecs
  3. Use fewer workers: Reduces coordination overhead
  4. Optimize queries: Simplify complex queries
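
The throughput and latency tips pull the same settings in opposite directions, which is easiest to see side by side. The sketch below models the settings named above (`workers`, `min_batch_size_records`, `max_buffering_delay_usecs`) as a plain dictionary with two sets of overrides. All values are illustrative rather than recommendations, keeping the three settings in one dictionary is an assumption, and `with_overrides` is a hypothetical helper.

```python
# Illustrative baseline; the numbers are placeholders, not tuned defaults.
BASE_SETTINGS = {
    "workers": 8,
    "min_batch_size_records": 1_000,
    "max_buffering_delay_usecs": 10_000,
}

# Leaning toward throughput: more workers, larger batches.
THROUGHPUT_OVERRIDES = {"workers": 16, "min_batch_size_records": 10_000}

# Leaning toward latency: fewer workers, no minimum batch, no buffering delay.
LATENCY_OVERRIDES = {
    "workers": 4,
    "min_batch_size_records": 0,
    "max_buffering_delay_usecs": 0,
}


def with_overrides(base, overrides):
    """Return a copy of `base` with `overrides` applied; reject unknown keys."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    return {**base, **overrides}


print("throughput:", with_overrides(BASE_SETTINGS, THROUGHPUT_OVERRIDES))
print("latency:   ", with_overrides(BASE_SETTINGS, LATENCY_OVERRIDES))
```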

Reduce Memory Usage

  1. Enable storage: Spill to disk
  2. Enable compression: Reduce memory footprint
  3. Filter early: Reduce data volume
  4. Aggregate data: Reduce cardinality
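
The effect of filtering early and aggregating can be illustrated without a pipeline at all. In the sketch below, which uses made-up records and a hypothetical `filter_then_aggregate` helper, 1,000 input records are reduced to one running total per category, so the retained state is bounded by the number of distinct keys rather than the number of records.

```python
from collections import defaultdict


def filter_then_aggregate(records, keep, key, value):
    """Drop unwanted records first, then keep one running total per key."""
    totals = defaultdict(int)
    for record in records:
        if keep(record):
            totals[record[key]] += record[value]
    return dict(totals)


# Made-up data: 1,000 records spread over 10 categories.
records = [{"category": i % 10, "price": i} for i in range(1_000)]

totals = filter_then_aggregate(
    records,
    keep=lambda record: record["price"] >= 900,  # filter early
    key="category",
    value="price",
)
print(f"{len(records)} records reduced to {len(totals)} totals")
```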

What’s Next