Last Updated: 3/19/2026


Pipelines: Core Concepts

This article explains the fundamental building blocks of Feldera: what a pipeline is, how tables and views work, what incremental view maintenance means, and how the pipeline lifecycle operates.

What Is a Pipeline?

A pipeline in Feldera is a self-contained unit of computation consisting of:

  • SQL tables that define the schema for input data
  • SQL views that define queries over those tables and other views
  • Connectors that link tables and views to external data sources and sinks
  • Runtime configuration that controls resources, performance, and behavior

A pipeline is created from an SQL program written in standard SQL. Once created, you can start, stop, pause, and resume the pipeline to control when computation happens. While running, the pipeline continuously processes changes to input tables and incrementally updates all dependent views.

Tables vs. Views

Tables

Tables are declared using SQL CREATE TABLE statements. They define the structure of your input data but do not specify where that data comes from—that’s handled by connectors or direct HTTP ingestion.

CREATE TABLE orders (
    order_id BIGINT NOT NULL PRIMARY KEY,
    customer_id BIGINT,
    amount DECIMAL(10, 2),
    order_date TIMESTAMP
);

Tables can be marked as materialized using the 'materialized' = 'true' attribute in the WITH clause. Materialized tables store their entire contents, allowing you to browse and query them at any time. Non-materialized tables only track changes.

CREATE TABLE orders (
    order_id BIGINT NOT NULL PRIMARY KEY,
    customer_id BIGINT,
    amount DECIMAL(10, 2),
    order_date TIMESTAMP
) WITH ('materialized' = 'true');

Views

Views are defined using SQL CREATE VIEW statements. They represent queries over tables and other views. Feldera supports deeply nested hierarchies of views, allowing you to build complex analytical queries in a modular way.

CREATE VIEW daily_revenue AS
SELECT
    DATE_TRUNC('day', order_date) AS day,
    SUM(amount) AS total_revenue
FROM orders
GROUP BY DATE_TRUNC('day', order_date);

Like tables, views can be materialized or non-materialized:

  • Materialized views: Feldera stores the entire contents of the view. You can browse and query it at any time using ad-hoc SQL queries.
  • Regular views: You can only observe a stream of changes to the view. You cannot query its current state directly.

CREATE MATERIALIZED VIEW daily_revenue AS
SELECT
    DATE_TRUNC('day', order_date) AS day,
    SUM(amount) AS total_revenue
FROM orders
GROUP BY DATE_TRUNC('day', order_date);

Views declared in your SQL program can reference both materialized and non-materialized tables and views. The materialization choice only affects whether you can query the view’s state using ad-hoc queries.

Incremental View Maintenance

Feldera’s core capability is incremental view maintenance (IVM). When new changes arrive in input tables, Feldera automatically updates all dependent views. Rather than re-evaluating queries from scratch, Feldera employs incremental algorithms that compute only what has changed.

How It Works

When you insert, update, or delete records in a table, Feldera:

  1. Identifies which views depend on that table
  2. Propagates the changes through the query execution plan
  3. Computes only the affected portions of each view
  4. Outputs the changes (inserts and deletes) to the view

The cost of processing input changes is proportional to the size of the changes, not the size of the entire dataset. A small change to a very large table therefore triggers only a small amount of work, which keeps continuous computation over large datasets efficient.
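
The steps above can be illustrated with a toy model in Python. This is a sketch of the idea only, not Feldera's engine. The Change tuple, the apply_changes function, and the weight convention (+1 for an insert, -1 for a delete) are assumptions made for illustration. The sketch maintains a per-day revenue total, similar to the daily_revenue view, and touches only the groups that a batch of changes affects.

```python
from collections import defaultdict
from decimal import Decimal
from typing import NamedTuple


class Change(NamedTuple):
    """One input change; weight +1 is an insert, -1 is a delete (illustrative convention)."""
    day: str
    amount: Decimal
    weight: int


def apply_changes(totals, counts, changes):
    """Update per-day totals in place and return the resulting view changes.

    The work done depends on len(changes), not on how many rows were seen before.
    """
    touched = {}
    for c in changes:
        if c.day not in touched:
            # Remember the old output row (None if the group did not exist yet).
            touched[c.day] = totals[c.day] if counts[c.day] > 0 else None
        totals[c.day] += c.amount * c.weight
        counts[c.day] += c.weight

    output = []
    for day, old in touched.items():
        new = totals[day] if counts[day] > 0 else None
        if old == new:
            continue  # the group was touched but its output row is unchanged
        if old is not None:
            output.append(("delete", day, old))
        if new is not None:
            output.append(("insert", day, new))
    return output


# Hypothetical usage: two orders arrive, then one of them is deleted.
totals, counts = defaultdict(Decimal), defaultdict(int)
first = apply_changes(totals, counts, [
    Change("2026-03-01", Decimal("10.00"), +1),
    Change("2026-03-01", Decimal("5.50"), +1),
])
second = apply_changes(totals, counts, [
    Change("2026-03-01", Decimal("5.50"), -1),
])
```

In this sketch the second batch produces a delete of the old total followed by an insert of the new one, and no other day is examined.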

Change Streams

Feldera operates on changes. A change is any number of inserts, updates, or deletes to a set of tables. When you observe a view’s output, you see a stream of changes—records being inserted or deleted from the view—not the complete view contents.

For example, if a price update causes the preferred vendor for a part to change, you’ll see:

  • A delete for the old vendor record
  • An insert for the new vendor record

This reflects Feldera’s internal computation model: it propagates changes through the query plan rather than recomputing everything.
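
To make the shape of such a change stream concrete, here is a small Python sketch. It is not how Feldera computes the output: for simplicity it recomputes the view before and after the price update and diffs the two results, which is exactly the work Feldera avoids. The function names, the sample data, and the rule that the preferred vendor is the one with the lowest price are all assumptions made for illustration.

```python
def preferred_vendor(prices):
    """Return {(part, vendor, price)} with the cheapest vendor per part.

    `prices` maps (part, vendor) -> price. "Cheapest wins" is an assumed rule.
    """
    best = {}
    for (part, vendor), price in prices.items():
        candidate = (price, vendor)
        if part not in best or candidate < best[part]:
            best[part] = candidate
    return {(part, vendor, price) for part, (price, vendor) in best.items()}


def view_changes(before, after):
    """Express the difference between two view snapshots as deletes and inserts."""
    deletes = [("delete", row) for row in sorted(before - after)]
    inserts = [("insert", row) for row in sorted(after - before)]
    return deletes + inserts


# Hypothetical data: one part offered by two vendors.
prices = {("bolt", "acme"): 10, ("bolt", "globex"): 12}
before = preferred_vendor(prices)

prices[("bolt", "globex")] = 8  # price update makes globex the cheapest
after = preferred_vendor(prices)

changes = view_changes(before, after)
```

Here changes holds one delete for the old vendor row and one insert for the new one, which is what an observer of the view's output would receive.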

Pipeline Lifecycle

A pipeline goes through several states as Feldera allocates compute and storage resources for it. Understanding these states helps you manage pipelines effectively.

Primary States

The main lifecycle states are:

  • Stopped: The pipeline is not running. No compute resources are allocated. Storage may or may not be provisioned.
  • Provisioning: Feldera is allocating compute and storage resources. The SQL program must be successfully compiled before provisioning can begin.
  • Provisioned: Resources are allocated and the pipeline process is running.
  • Stopping: Feldera is deallocating compute resources and shutting down the pipeline.

Runtime States

When a pipeline is in the Provisioned state, it has a runtime status that indicates what the pipeline process is doing:

  • Standby: The pipeline is pulling the latest checkpoint to storage but not processing inputs (Enterprise feature).
  • Initializing: Input and output connectors are establishing connections to their data sources and sinks.
  • Paused: The pipeline is running but input connectors are paused. No new data is being ingested.
  • Running: The pipeline is actively processing input data and updating views.
  • Suspended: The circuit has terminated and a final checkpoint has been made. The pipeline will automatically stop.

Controlling the Lifecycle

You control the pipeline lifecycle through the Web Console, Python SDK, CLI, or REST API:

  • Start: Transitions from Stopped to Provisioning, then to Provisioned with a runtime state (default: Running)
  • Pause: Pauses input connectors while keeping the pipeline running
  • Resume: Resumes paused input connectors
  • Stop: Shuts down the pipeline and deallocates compute resources

When you stop a pipeline, you can choose to keep or clear its storage:

  • Keep storage: The pipeline retains all data and state. When you restart it, it picks up where it left off.
  • Clear storage: All data, state, and computed results are deleted. The next start begins from a clean slate.
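
The transitions described in this section can be summarized as a small state machine. The Python class below is a toy model written for this article, not the Feldera SDK or REST API. The class name, method names, the clear_storage parameter, and the records_processed counter are all hypothetical.

```python
class ToyPipeline:
    """Toy model of the lifecycle described above; not the Feldera API."""

    def __init__(self):
        self.deployment = "Stopped"   # Stopped / Provisioning / Provisioned / Stopping
        self.runtime = None           # runtime status, only set while Provisioned
        self.storage = None           # None means no stored state
        self.history = ["Stopped"]

    def _move(self, state):
        self.deployment = state
        self.history.append(state)

    def _require_provisioned(self):
        if self.deployment != "Provisioned":
            raise RuntimeError("pipeline is not provisioned")

    def start(self, initial="Running"):
        if self.deployment != "Stopped":
            raise RuntimeError("pipeline is already started")
        self._move("Provisioning")
        if self.storage is None:
            self.storage = {"records_processed": 0}  # clean slate
        self._move("Provisioned")
        self.runtime = initial

    def pause(self):
        self._require_provisioned()
        self.runtime = "Paused"

    def resume(self):
        self._require_provisioned()
        self.runtime = "Running"

    def ingest(self, n):
        """Process n records; returns False while inputs are paused."""
        self._require_provisioned()
        if self.runtime != "Running":
            return False
        self.storage["records_processed"] += n
        return True

    def stop(self, clear_storage=False):
        self._require_provisioned()
        self._move("Stopping")
        self.runtime = None
        if clear_storage:
            self.storage = None
        self._move("Stopped")


# Hypothetical usage: keep storage across one restart, clear it on the next.
p = ToyPipeline()
p.start()
p.ingest(3)
p.pause()
ingested_while_paused = p.ingest(2)
p.stop()
p.start()
kept = p.storage["records_processed"]
p.stop(clear_storage=True)
p.start()
fresh = p.storage["records_processed"]
```

In this model a restart with kept storage resumes from the previous count, while a restart after clearing storage begins at zero.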

Compilation

Before a pipeline can start, its SQL program must be compiled. Compilation happens automatically when you create or modify a pipeline. The compilation process has its own status:

  • Pending: Awaiting compilation
  • CompilingSql: SQL is being compiled to Rust code
  • SqlCompiled: SQL compilation succeeded, awaiting Rust compilation
  • CompilingRust: Rust code is being compiled to an executable
  • Success: Compilation succeeded, the pipeline can be started
  • SqlError/RustError/SystemError: Compilation failed

You can only start a pipeline when its program status is Success.
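
A client that automates pipeline startup would typically wait for compilation to reach a final status before attempting to start. The Python sketch below shows one way to structure that check. The status names come from the list above, but the poll callable, the function names, and the polling parameters are hypothetical and stand in for whatever mechanism you use to read the program status.

```python
import time
from enum import Enum


class ProgramStatus(Enum):
    PENDING = "Pending"
    COMPILING_SQL = "CompilingSql"
    SQL_COMPILED = "SqlCompiled"
    COMPILING_RUST = "CompilingRust"
    SUCCESS = "Success"
    SQL_ERROR = "SqlError"
    RUST_ERROR = "RustError"
    SYSTEM_ERROR = "SystemError"


FAILURES = frozenset({
    ProgramStatus.SQL_ERROR,
    ProgramStatus.RUST_ERROR,
    ProgramStatus.SYSTEM_ERROR,
})


def can_start(status):
    """A pipeline can be started only when compilation succeeded."""
    return status is ProgramStatus.SUCCESS


def wait_until_compiled(poll, interval=0.0, max_polls=100):
    """Call `poll()` until it reports success or a failure status.

    `poll` is a hypothetical callable returning the status name as a string.
    """
    for _ in range(max_polls):
        status = ProgramStatus(poll())
        if status is ProgramStatus.SUCCESS or status in FAILURES:
            return status
        time.sleep(interval)
    raise TimeoutError("compilation did not finish in time")


# Hypothetical usage with a canned sequence of statuses instead of a real server.
fake_statuses = iter(["Pending", "CompilingSql", "SqlCompiled", "CompilingRust", "Success"])
final = wait_until_compiled(lambda: next(fake_statuses))
```

Injecting poll as a callable keeps the sketch independent of any particular client library.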

Strong Consistency

Feldera provides strong consistency guarantees. The state of views always corresponds to what you would get if you ran the same queries in a batch system for the same input. There are no eventual consistency issues or stale reads—when you query a materialized view, you see the exact result of applying all processed input changes.

This guarantee holds even when the pipeline is paused or stopped and restarted. Feldera ensures that no input data is lost or duplicated, and all view updates are correctly computed.
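
The batch-equivalence property can be stated as a simple check: after every batch of changes, the incrementally maintained view must equal the result of recomputing the query over the current input. The Python sketch below tests this for a toy "orders per customer" count. It models the property only and says nothing about how Feldera implements it. The function names, the row layout (order_id, customer_id), and the random workload are assumptions made for illustration.

```python
import random
from collections import Counter


def batch_view(rows):
    """Recompute from scratch: number of orders per customer."""
    return dict(Counter(customer for _, customer in rows))


def incremental_step(view, changes):
    """Apply weighted changes (+1 insert, -1 delete) to the view in place."""
    for weight, (_, customer) in changes:
        view[customer] = view.get(customer, 0) + weight
        if view[customer] == 0:
            del view[customer]  # a group with no rows disappears from the view


rng = random.Random(0)  # fixed seed so the run is repeatable
rows = set()            # current table contents: (order_id, customer_id)
view = {}               # incrementally maintained result
next_id = 0

for _ in range(200):
    changes = []
    for _ in range(rng.randint(1, 5)):
        if rows and rng.random() < 0.4:
            row = rng.choice(sorted(rows))  # delete an existing row
            rows.remove(row)
            changes.append((-1, row))
        else:
            row = (next_id, rng.randint(1, 10))  # insert a new row
            next_id += 1
            rows.add(row)
            changes.append((+1, row))
    incremental_step(view, changes)
    # The incremental result must match a full recomputation after every batch.
    assert view == batch_view(rows)
```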

Datasets Larger Than RAM

Feldera is designed to handle datasets that exceed available RAM. It efficiently spills data to disk when memory is full, taking advantage of modern NVMe storage. This allows you to process large datasets on modest hardware without running out of memory.

The spilling mechanism is transparent—you don’t need to configure it or change your SQL. Feldera automatically manages memory and disk usage based on available resources.

What’s Next

  • Connectors Overview: Learn how to connect pipelines to external data sources and sinks like Kafka, HTTP endpoints, and data lakes
  • Python SDK: Discover how to create and manage pipelines programmatically using the Feldera Python SDK
  • Pipeline Configuration: Configure workers, memory, and storage for production pipelines
  • Fault Tolerance: Understand checkpointing and recovery guarantees