Last Updated: 3/19/2026
Overview
Feldera is a fast query engine for incremental computation. It has the unique ability to evaluate arbitrary SQL programs incrementally, which makes it more powerful, expressive, and performant than existing alternatives such as batch engines, data warehouses, stream processors, and streaming databases.
What Problem Does Feldera Solve?
Traditional data processing systems face a fundamental trade-off: batch systems can handle complex queries but are slow to update, while streaming systems provide low latency but are limited in the queries they can express. Feldera eliminates this trade-off through incremental computation: it continuously processes changes to input data and updates query results by computing only what has changed, rather than reprocessing everything from scratch.
This approach lets Feldera process millions of events per second on a laptop while supporting the full expressiveness of SQL.
Core Concepts
Pipelines
A Feldera pipeline is a set of SQL tables and views. Tables define the schema for input data, while views define queries over those tables and other views. Views can be deeply nested, allowing you to express complex analytical queries in a modular way.
Users start, stop, or pause pipelines to manage and advance computation. While a pipeline is running, you can inspect the results of views at any time.
Tables and Views
Tables are declared using standard SQL CREATE TABLE statements. They define the structure of your input data but don’t specify where that data comes from; connectors handle that.
Views are defined using SQL CREATE VIEW statements. They represent queries over tables and other views. Feldera supports two types of views:
- Regular views: You can observe a stream of changes to the view, but cannot inspect its current contents.
- Materialized views: Feldera stores the entire contents of the view, allowing you to browse and query it at any time.
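As a rough illustration of the table and view structure described above, the sketch below declares one table and two nested views in standard SQL and runs them through Python's built-in sqlite3 module. SQLite is only a stand-in here: it re-evaluates a view each time it is queried rather than maintaining it incrementally, and Feldera's SQL dialect may differ in details. The table, view, and column names are hypothetical.

```python
import sqlite3

# Hypothetical schema for illustration only; none of these names come from Feldera.
SQL_PROGRAM = """
CREATE TABLE orders (
    order_id INTEGER NOT NULL,
    customer VARCHAR NOT NULL,
    amount   INTEGER NOT NULL
);

-- A view defined over a table.
CREATE VIEW customer_totals AS
SELECT customer, SUM(amount) AS total
FROM orders
GROUP BY customer;

-- A view defined over another view (views can be nested).
CREATE VIEW big_customers AS
SELECT customer, total
FROM customer_totals
WHERE total >= 100;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SQL_PROGRAM)

# The table only fixes the schema; here rows are inserted by hand.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 70), (2, "bob", 20), (3, "alice", 40)],
)

big_customers = conn.execute(
    "SELECT customer, total FROM big_customers ORDER BY customer"
).fetchall()
print(big_customers)
```

Splitting the query into customer_totals and big_customers shows the modular style: each view stays small, and later views build on earlier ones.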
Incremental View Maintenance
When new changes arrive in SQL tables, Feldera automatically updates all dependent views. Rather than re-evaluating queries from scratch, Feldera employs incremental algorithms that compute only what has changed. The cost of processing input events is proportional to the size of the changes, not the size of the entire dataset.
Feldera also handles datasets larger than available RAM by spilling state to disk efficiently, taking advantage of modern NVMe storage.
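To make the cost argument concrete, here is a minimal sketch of incremental maintenance for a single grouped sum (SUM(amount) GROUP BY customer). It is not Feldera's implementation; the class and function names are hypothetical, and changes are modeled as rows with a weight of +1 (insert) or -1 (delete). The batch function scans every row, while the incremental version touches only the groups mentioned in the incoming changes.

```python
def batch_totals(rows):
    """Recompute SUM(amount) GROUP BY customer from scratch: cost grows with len(rows)."""
    totals = {}
    for customer, amount in rows:
        totals[customer] = totals.get(customer, 0) + amount
    return totals


class IncrementalTotals:
    """Maintains the same result, touching only the groups named in each change batch."""

    def __init__(self):
        self.totals = {}  # customer -> running sum
        self.counts = {}  # customer -> number of live rows in the group

    def apply(self, changes):
        """Apply (customer, amount, weight) changes; weight +1 inserts a row, -1 deletes it.

        Returns the view delta as {customer: (old_total, new_total)}, where None
        means the group is absent from the view.
        """
        before = {}
        for customer, amount, weight in changes:
            if customer not in before:
                before[customer] = self.totals.get(customer)
            self.totals[customer] = self.totals.get(customer, 0) + weight * amount
            self.counts[customer] = self.counts.get(customer, 0) + weight
        delta = {}
        for customer, old in before.items():
            if self.counts[customer] == 0:  # group became empty: drop it from the view
                del self.totals[customer]
                del self.counts[customer]
            new = self.totals.get(customer)
            if old != new:
                delta[customer] = (old, new)
        return delta


view = IncrementalTotals()
delta1 = view.apply([("alice", 70, +1), ("bob", 20, +1)])
delta2 = view.apply([("alice", 40, +1), ("bob", 20, -1)])

print(delta2)       # only the touched groups appear in the delta
print(view.totals)  # same result as batch_totals over the surviving rows
```

The work done by apply depends on the number of changes and the groups they touch, not on how many rows have been ingested so far. Running batch_totals over the surviving rows gives the same totals.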
Connectors
Connectors link pipelines to external data sources and sinks. Input connectors (sources) feed data into SQL tables, while output connectors (sinks) send query results to external destinations. Feldera supports a wide range of connectors including Kafka, HTTP, CDC streams, S3, Delta Lake, and more.
A single pipeline can connect multiple heterogeneous sources to multiple destinations, enabling real-time data transformation and analytics as data moves from source to sink.
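The fan-in and fan-out shape described above can be pictured with a small conceptual sketch. Nothing here is Feldera's API or connector configuration: the function, the two in-memory "sources", and the two "sinks" are all hypothetical stand-ins, and the view is a simple per-row filter and projection.

```python
def run_pipeline(sources, view, sinks):
    """Feed rows from every source through one view and deliver each output to every sink."""
    for source in sources:
        for row in source:
            out = view(row)
            if out is not None:  # None means the row does not appear in the view
                for sink in sinks:
                    sink(out)


# Two heterogeneous sources, modeled as plain lists (stand-ins for, say, a queue and an HTTP feed).
queue_like_source = [{"user": "alice", "amount": 120}, {"user": "bob", "amount": 15}]
http_like_source = [{"user": "carol", "amount": 300}]


def large_payments(row):
    """Hypothetical view: keep payments of at least 100 and project two columns."""
    if row["amount"] >= 100:
        return (row["user"], row["amount"])
    return None


# Two sinks: an append-only log and a keyed store.
log_sink = []
store_sink = {}


def write_to_store(out):
    user, amount = out
    store_sink[user] = amount


run_pipeline(
    sources=[queue_like_source, http_like_source],
    view=large_payments,
    sinks=[log_sink.append, write_to_store],
)

print(log_sink)
print(store_sink)
```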
Who Is Feldera For?
Feldera is designed for data engineers and backend developers building real-time data pipelines. Common use cases include:
- Real-time feature engineering: Computing features for machine learning models as data arrives
- ETL pipelines: Transforming and loading data between systems with low latency
- Incremental analytics: Running complex analytical queries over continuously updating data
- Change data capture: Processing and transforming database change streams
- Real-time dashboards: Powering live analytics and monitoring applications
Key Features
- Full SQL support: Feldera is the only engine that can evaluate full SQL syntax and semantics completely incrementally. This includes joins, aggregates, GROUP BY, correlated subqueries, window functions, complex data types, time series operators, user-defined functions (UDFs), and recursive queries.
- Fast out-of-the-box performance: Users report implementing complex use cases in 30 minutes or less and achieving millions of events per second on a laptop without tuning.
- Datasets larger than RAM: Feldera efficiently handles datasets that exceed available memory by spilling to disk.
- Strong consistency guarantees: Feldera guarantees that the state of views always corresponds to what you would get if you ran the queries in a batch system for the same input.
- Extensive connector library: Connect to Kafka, HTTP, CDC streams, S3, Data Lakes, Warehouses, and more.
- Fault tolerance: Feldera can gracefully restart from the exact point of an abrupt shutdown or crash, picking up where it left off without dropping or duplicating input or output (preview feature).
- Ad-hoc queries: Run SQL queries on a running or paused pipeline to inspect or debug the state of materialized views.
Architecture
Feldera’s architecture consists of several key components:
- Pipeline Manager: The control plane that manages pipeline lifecycle, compilation, and deployment
- SQL Compiler: Compiles SQL programs into optimized Rust code
- DBSP Runtime: The incremental computation engine that executes compiled pipelines
- Connectors: Input and output adapters for various data sources and sinks
- Web Console: A browser-based interface for creating and managing pipelines
- REST API: Programmatic access to all Feldera functionality
The system is built on a solid mathematical foundation called DBSP (Database Stream Processor), which provides both formal semantics for streaming operators and an algorithm for generating efficient incremental dataflow programs from arbitrary SQL queries.
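The core idea behind DBSP can be sketched in a few lines, assuming its usual formulation: collections are Z-sets (rows mapped to integer weights, where negative weights represent deletions), a stream is a sequence of Z-sets, and the incremental version of a query Q is D ∘ Q ∘ I, where I integrates changes into snapshots and D differentiates snapshots back into changes. The code below is an unoptimized illustration with hypothetical function names, not the DBSP runtime.

```python
def zadd(a, b):
    """Add two Z-sets (dicts of row -> integer weight), dropping zero-weight rows."""
    out = dict(a)
    for row, weight in b.items():
        total = out.get(row, 0) + weight
        if total:
            out[row] = total
        else:
            out.pop(row, None)
    return out


def zneg(a):
    """Negate every weight in a Z-set."""
    return {row: -weight for row, weight in a.items()}


def integrate(changes):
    """I: running sum of a stream of changes, producing a stream of snapshots."""
    acc, out = {}, []
    for delta in changes:
        acc = zadd(acc, delta)
        out.append(acc)
    return out


def differentiate(snapshots):
    """D: difference of consecutive snapshots, producing a stream of changes."""
    prev, out = {}, []
    for snap in snapshots:
        out.append(zadd(snap, zneg(prev)))
        prev = snap
    return out


def lift(query):
    """Apply a per-snapshot query to every element of a stream."""
    return lambda stream: [query(z) for z in stream]


def incrementalize(query):
    """Build D . lift(query) . I: maps input changes to output changes."""
    return lambda changes: differentiate(lift(query)(integrate(changes)))


def distinct(z):
    """Non-linear query: every row with positive weight gets weight 1."""
    return {row: 1 for row, weight in z.items() if weight > 0}


def drop_b(z):
    """Linear query: a filter that removes row 'b'."""
    return {row: weight for row, weight in z.items() if row != "b"}


changes = [{"a": 1, "b": 1}, {"a": 1}, {"a": -2}]

distinct_changes = incrementalize(distinct)(changes)
filter_changes = incrementalize(drop_b)(changes)

print(distinct_changes)
print(filter_changes)
```

For a linear query such as the filter, the incremental version produces the same output as applying the query directly to each change, so no accumulated state is needed. For a non-linear query such as distinct, the second insertion of "a" produces no output change at all. A practical engine derives much more efficient operators than this literal composition.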
Getting Started
The fastest way to get started with Feldera is to run it locally using Docker. With a single command, you can have Feldera running and access the Web Console to create your first pipeline. From there, you can define SQL tables and views, connect data sources, and start processing data in minutes.
Feldera also provides a Python SDK and CLI tool for programmatic access, making it easy to integrate into existing workflows and automation pipelines.