Last Updated: 3/19/2026

Connectors Overview

Connectors are the bridges between Feldera pipelines and the outside world. They enable pipelines to ingest data from external sources and send query results to external destinations. This article explains what connectors are, how they work, and what types are available.

What Are Connectors?

A connector is a configuration that links a SQL table or view to an external data source or sink. Feldera supports two kinds of connectors:

Input connectors (sources): Feed data into SQL tables from external systems
Output connectors (sinks): Send query results from SQL views to external systems

A single table can have multiple input connectors, allowing you to ingest data from multiple heterogeneous sources simultaneously. Similarly, a single view can have multiple output connectors, sending results to multiple destinations.

Input Connectors (Sources)

Input connectors attach to SQL tables and push data into the pipeline. Streaming connectors such as Kafka continuously monitor their data source and feed changes (inserts, updates, deletes) into the table; others, such as a one-time HTTP GET, load their data once.

Common input connector types include:

Kafka: Read from Kafka topics
HTTP GET: Fetch data from HTTP URLs (one-time or periodic)
Delta Lake: Read from Delta Lake tables
Debezium: Process change data capture (CDC) streams
S3: Read files from S3 buckets
Data generators: Generate synthetic data for testing

Input connectors are configured in the WITH clause of a CREATE TABLE statement using the 'connectors' attribute.

Example: HTTP GET Connector


CREATE TABLE vendor (
    id BIGINT NOT NULL PRIMARY KEY,
    name VARCHAR,
    address VARCHAR
) WITH ('connectors' = '[{
    "transport": {
        "name": "url_input",
        "config": {"path": "https://example.com/vendors.json"}
    },
    "format": {"name": "json"}
}]');

This table has one input connector that fetches JSON data from an HTTP URL and inserts it into the table.

Example: Multiple Input Connectors


CREATE TABLE price (
    part BIGINT NOT NULL,
    vendor BIGINT NOT NULL,
    price INTEGER
) WITH ('connectors' = '[
{
    "transport": {
        "name": "url_input",
        "config": {"path": "https://example.com/prices.json"}
    },
    "format": {"name": "json"}
},
{
    "transport": {
        "name": "kafka_input",
        "config": {
            "topic": "price_updates",
            "bootstrap.servers": "localhost:9092"
        }
    },
    "format": {"name": "json"}
}]');

This table ingests data from two sources: an HTTP URL (one-time fetch) and a Kafka topic (continuous stream).

Output Connectors (Sinks)

Output connectors attach to SQL views and send query results to external systems. They receive a stream of changes from the view (inserts and deletes) and forward them to the destination.

Common output connector types include:

Kafka: Write to Kafka topics
HTTP POST: Send data to HTTP endpoints
Delta Lake: Write to Delta Lake tables
Snowflake: Write to Snowflake tables
S3: Write files to S3 buckets

Output connectors are configured in the WITH clause of a CREATE VIEW statement, placed before the AS keyword.

Example: Kafka Output Connector


CREATE VIEW preferred_vendor
WITH (
    'connectors' = '[{
        "transport": {
            "name": "kafka_output",
            "config": {
                "topic": "preferred_vendors",
                "bootstrap.servers": "localhost:9092"
            }
        },
        "format": {"name": "json"}
    }]'
)
AS
SELECT 
    part_id,
    part_name,
    vendor_id,
    vendor_name,
    price
FROM price_analysis;

This view sends its results to a Kafka topic. As the view updates incrementally, changes are streamed to Kafka.

Connector Configuration Structure

A connector specification consists of three parts:

1. Generic Attributes

Common to all connectors:

name: A unique name for the connector (optional, defaults to unnamed-{index})
paused: If true, the connector starts in a paused state (default: false)
max_queued_records: Maximum number of records to buffer in memory (default: 1,000,000)
max_batch_size: Maximum records to process in a single batch (optional)

2. Transport Configuration

Specifies the data transport mechanism:


"transport": {
    "name": "kafka_input",
    "config": {
        "topic": "my_topic",
        "bootstrap.servers": "localhost:9092"
    }
}

Available transports include:

kafka_input / kafka_output
url_input
http_output
delta_table_input / delta_table_output
s3_input
datagen (input only)
And more

3. Format Configuration

Specifies the data format:


"format": {
    "name": "json",
    "config": {
        "update_format": "insert_delete"
    }
}

Available formats include:

json
csv
parquet
avro

Some transports (like Delta Lake) use fixed formats and don’t require a format section.
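
The three parts can be combined in a single connector entry. The following is a minimal sketch, assuming the generic attributes sit at the top level of the connector object, next to "transport" and "format". The table, the connector name, the topic, and the attribute values are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table: one Kafka input connector that sets generic attributes,
-- a transport, and a format. All names and values are illustrative.
CREATE TABLE part_event (
    id BIGINT NOT NULL PRIMARY KEY,
    description VARCHAR
) WITH ('connectors' = '[{
    "name": "part_events_kafka",
    "paused": true,
    "max_queued_records": 100000,
    "transport": {
        "name": "kafka_input",
        "config": {
            "topic": "part_events",
            "bootstrap.servers": "localhost:9092"
        }
    },
    "format": {
        "name": "json",
        "config": {"update_format": "insert_delete"}
    }
}]');
```

Because "paused" is true in this sketch, the connector would ingest nothing until it is resumed at runtime.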

HTTP Input and Output (Special Case)

Feldera provides special HTTP connectors that work differently from other connectors. These are not configured in SQL but are automatically available for every pipeline:

HTTP input: Send data to a pipeline via POST /v0/pipelines/{name}/ingress/{table}
HTTP output: Subscribe to view changes via POST /v0/pipelines/{name}/egress/{view}

These endpoints let you push data into tables and receive changes from views with plain HTTP requests.

Connector Orchestration

Feldera provides mechanisms to control when connectors start and stop:

Manual Control

You can pause and resume individual connectors at runtime using the API, Python SDK, or CLI. This is useful for:

Temporarily stopping data ingestion
Coordinating multiple data sources
Debugging and testing

Automatic Orchestration

Connectors support labels and dependencies for automatic orchestration:

labels: Tag connectors with labels
start_after: List the labels of connectors that must reach the end of their input before this one starts

This allows you to build complex ingestion workflows where connectors activate in a specific order based on dependencies.
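
As an illustration, the price table from the earlier example could load its HTTP snapshot first and start reading from Kafka only afterwards. This is a sketch rather than a verified configuration: it assumes that both labels and start_after accept a JSON list of strings, and the label name "backfill" is hypothetical.

```sql
-- Sketch: the Kafka connector waits for every connector labeled "backfill".
-- The label name and the list-valued "start_after" are assumptions.
CREATE TABLE price (
    part BIGINT NOT NULL,
    vendor BIGINT NOT NULL,
    price INTEGER
) WITH ('connectors' = '[
{
    "labels": ["backfill"],
    "transport": {
        "name": "url_input",
        "config": {"path": "https://example.com/prices.json"}
    },
    "format": {"name": "json"}
},
{
    "start_after": ["backfill"],
    "transport": {
        "name": "kafka_input",
        "config": {
            "topic": "price_updates",
            "bootstrap.servers": "localhost:9092"
        }
    },
    "format": {"name": "json"}
}]');
```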

Output Buffering

For output connectors, you can configure buffering to control how frequently data is written to the destination:

enable_output_buffer: Enable output buffering
max_output_buffer_time_millis: Maximum time to buffer data before flushing
max_output_buffer_size_records: Maximum number of records to buffer before flushing

Output buffering is particularly useful for connectors like Delta Lake that benefit from larger batch writes rather than many small writes.
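
A buffered output connector might look like the sketch below. It reuses the Kafka sink from the earlier example for continuity and assumes the buffering attributes are set at the top level of the connector object; the limits of 10 seconds and 100,000 records are illustrative values, not recommendations.

```sql
-- Sketch: buffer view changes before writing them to the sink.
-- The time and size limits are illustrative values only.
CREATE VIEW preferred_vendor
WITH (
    'connectors' = '[{
        "enable_output_buffer": true,
        "max_output_buffer_time_millis": 10000,
        "max_output_buffer_size_records": 100000,
        "transport": {
            "name": "kafka_output",
            "config": {
                "topic": "preferred_vendors",
                "bootstrap.servers": "localhost:9092"
            }
        },
        "format": {"name": "json"}
    }]'
)
AS
SELECT
    part_id,
    part_name,
    vendor_id,
    vendor_name,
    price
FROM price_analysis;
```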

Available Connector Types

Feldera supports a growing library of connectors. Here’s a summary of available types:

Input Connectors

Kafka
HTTP GET (URL)
Delta Lake
S3
Debezium (CDC)
Data generators (for testing)

Output Connectors

Kafka
HTTP POST
Delta Lake
Snowflake
S3

If you need a connector that isn’t supported yet, you can request it on the Feldera GitHub repository.

What’s Next

Kafka Connector: Deep dive into Kafka input and output connector configuration
PostgreSQL/Debezium Connector: Set up change data capture from PostgreSQL
HTTP Connector: Push and pull data via HTTP for testing and browser apps
Connector Orchestration: Control connector startup order and dependencies
Python SDK: Learn how to configure connectors programmatically using the Feldera Python SDK
CLI Reference: Discover how to manage pipelines and connectors using the command-line tool