
Last Updated: 3/19/2026


Connectors Overview

Connectors are the bridges between Feldera pipelines and the outside world. They enable pipelines to ingest data from external sources and send query results to external destinations. This article explains what connectors are, how they work, and what types are available.

What Are Connectors?

A connector is a configuration that links a SQL table or view to an external data source or sink. Feldera supports two types of connectors:

  • Input connectors (sources): Feed data into SQL tables from external systems
  • Output connectors (sinks): Send query results from SQL views to external systems

A single table can have multiple input connectors, allowing you to ingest data from multiple heterogeneous sources simultaneously. Similarly, a single view can have multiple output connectors, sending results to multiple destinations.

Input Connectors (Sources)

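Input connectors attach to SQL tables and push data into the pipeline. Streaming sources such as Kafka are monitored continuously, and their changes (inserts, updates, deletes) are fed into the table as they arrive. Other sources, such as a one-time HTTP fetch, deliver their data once.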

Common input connector types include:

  • Kafka: Read from Kafka topics
  • HTTP GET: Fetch data from HTTP URLs (one-time or periodic)
  • Delta Lake: Read from Delta Lake tables
  • Debezium: Process change data capture (CDC) streams
  • S3: Read files from S3 buckets
  • Data generators: Generate synthetic data for testing

Input connectors are configured in the WITH clause of a CREATE TABLE statement using the 'connectors' attribute.

Example: HTTP GET Connector

CREATE TABLE vendor (
    id BIGINT NOT NULL PRIMARY KEY,
    name VARCHAR,
    address VARCHAR
) WITH ('connectors' = '[{
    "transport": {
        "name": "url_input",
        "config": {"path": "https://example.com/vendors.json"}
    },
    "format": {"name": "json"}
}]');

This table has one input connector that fetches JSON data from an HTTP URL and inserts it into the table.
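
Because the 'connectors' attribute is a JSON array embedded in a SQL string literal, it can be convenient to generate the statement from structured data rather than write it by hand. The following is a minimal Python sketch of that idea. The helper names connectors_attribute and create_table_sql are hypothetical and are not part of any Feldera SDK; the sketch only assumes that single quotes inside a SQL string literal are escaped by doubling them.

```python
import json


def connectors_attribute(connectors):
    """Serialize a list of connector specs into a SQL string literal."""
    payload = json.dumps(connectors, indent=2)
    # Standard SQL escapes a single quote inside a literal by doubling it.
    return "'" + payload.replace("'", "''") + "'"


def create_table_sql(table, columns, connectors):
    """Render a CREATE TABLE statement with a 'connectors' attribute.

    Hypothetical helper for illustration only.
    """
    cols = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE {table} (\n{cols}\n) "
        f"WITH ('connectors' = {connectors_attribute(connectors)});"
    )


# Same connector as in the example above.
vendor_connector = {
    "transport": {
        "name": "url_input",
        "config": {"path": "https://example.com/vendors.json"},
    },
    "format": {"name": "json"},
}

sql = create_table_sql(
    "vendor",
    [
        ("id", "BIGINT NOT NULL PRIMARY KEY"),
        ("name", "VARCHAR"),
        ("address", "VARCHAR"),
    ],
    [vendor_connector],
)
print(sql)
```

Adding a second dictionary to the list passed to create_table_sql would produce a table with multiple input connectors.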

Example: Multiple Input Connectors

CREATE TABLE price (
    part BIGINT NOT NULL,
    vendor BIGINT NOT NULL,
    price INTEGER
) WITH ('connectors' = '[
    {
        "transport": {
            "name": "url_input",
            "config": {"path": "https://example.com/prices.json"}
        },
        "format": {"name": "json"}
    },
    {
        "transport": {
            "name": "kafka_input",
            "config": {
                "topic": "price_updates",
                "bootstrap.servers": "localhost:9092"
            }
        },
        "format": {"name": "json"}
    }
]');

This table ingests data from two sources: an HTTP URL (one-time fetch) and a Kafka topic (continuous stream).

Output Connectors (Sinks)

Output connectors attach to SQL views and send query results to external systems. They receive a stream of changes from the view (inserts and deletes) and forward them to the destination.

Common output connector types include:

  • Kafka: Write to Kafka topics
  • HTTP POST: Send data to HTTP endpoints
  • Delta Lake: Write to Delta Lake tables
  • Snowflake: Write to Snowflake tables
  • S3: Write files to S3 buckets

Output connectors are configured in the WITH clause of a CREATE VIEW statement, placed before the AS keyword.

Example: Kafka Output Connector

CREATE VIEW preferred_vendor
WITH (
    'connectors' = '[{
        "transport": {
            "name": "kafka_output",
            "config": {
                "topic": "preferred_vendors",
                "bootstrap.servers": "localhost:9092"
            }
        },
        "format": {"name": "json"}
    }]'
) AS
SELECT part_id, part_name, vendor_id, vendor_name, price
FROM price_analysis;

This view sends its results to a Kafka topic. As the view updates incrementally, changes are streamed to Kafka.
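
To make the idea of a change stream concrete, the sketch below computes the inserts and deletes that turn one snapshot of a view into the next. It illustrates the concept only and is not how Feldera computes changes internally. The function view_changes and the sample rows are hypothetical, and the {"insert": ...} / {"delete": ...} envelope is assumed to match the insert_delete update format mentioned later in this article.

```python
import json
from collections import Counter


def _key(row):
    # Canonical, hashable representation of a row.
    return json.dumps(row, sort_keys=True)


def view_changes(before, after):
    """Return the deletes and inserts that turn `before` into `after`.

    Snapshots are lists of rows (dicts). Duplicate rows are counted, so
    the snapshots are treated as multisets.
    """
    old = Counter(_key(r) for r in before)
    new = Counter(_key(r) for r in after)
    deletes = [{"delete": json.loads(k)} for k in sorted((old - new).elements())]
    inserts = [{"insert": json.loads(k)} for k in sorted((new - old).elements())]
    return deletes + inserts


before = [{"part_id": 1, "vendor_id": 10, "price": 100}]
after = [
    {"part_id": 1, "vendor_id": 11, "price": 90},
    {"part_id": 2, "vendor_id": 10, "price": 50},
]

changes = view_changes(before, after)
for change in changes:
    print(json.dumps(change))
```

Note that the changed row for part 1 shows up as a delete of the old row followed by an insert of the new one, which is why the stream needs only those two kinds of change.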

Connector Configuration Structure

A connector specification consists of three parts:

1. Generic Attributes

Common to all connectors:

  • name: A unique name for the connector (optional, defaults to unnamed-{index})
  • paused: If true, the connector starts in a paused state (default: false)
  • max_queued_records: Maximum number of records to buffer in memory (default: 1,000,000)
  • max_batch_size: Maximum records to process in a single batch (optional)
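
The defaults listed above can be pictured as a merge step applied to each connector in the array. The sketch below is an illustration under stated assumptions, not Feldera's implementation: the function with_defaults is hypothetical, and the index in unnamed-{index} is assumed to be the zero-based position of the connector in the 'connectors' array.

```python
def with_defaults(connectors):
    """Fill in the generic attributes a connector spec leaves out.

    Assumption: {index} is the zero-based position in the array.
    """
    resolved = []
    for index, spec in enumerate(connectors):
        merged = {
            "name": f"unnamed-{index}",
            "paused": False,
            "max_queued_records": 1_000_000,
        }
        # Explicit settings in the spec override the defaults.
        merged.update(spec)
        resolved.append(merged)
    return resolved


specs = [
    {"transport": {"name": "url_input"}},
    {"name": "live_prices", "paused": True, "transport": {"name": "kafka_input"}},
]

for spec in with_defaults(specs):
    print(spec["name"], spec["paused"], spec["max_queued_records"])
```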

2. Transport Configuration

Specifies the data transport mechanism:

"transport": {
    "name": "kafka_input",
    "config": {
        "topic": "my_topic",
        "bootstrap.servers": "localhost:9092"
    }
}

Available transports include:

  • kafka_input / kafka_output
  • url_input
  • http_output
  • delta_table_input / delta_table_output
  • s3_input
  • datagen (input only)
  • And more

3. Format Configuration

Specifies the data format:

"format": {
    "name": "json",
    "config": {
        "update_format": "insert_delete"
    }
}

Available formats include:

  • json
  • csv
  • parquet
  • avro

Some transports (like Delta Lake) use fixed formats and don’t require a format section.
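
The structure described in this section can be checked locally before a program is submitted. The sketch below is a hypothetical sanity check, not Feldera's own validation. It assumes that only the two Delta Lake transports named above have a fixed format, and it recognizes only the four formats listed in this article, so a real deployment may accept more than this check does.

```python
# Formats and fixed-format transports named in this article (not exhaustive).
KNOWN_FORMATS = {"json", "csv", "parquet", "avro"}
FIXED_FORMAT_TRANSPORTS = {"delta_table_input", "delta_table_output"}


def check_connector(spec):
    """Return a list of structural problems found in one connector spec."""
    transport = spec.get("transport")
    if not isinstance(transport, dict) or "name" not in transport:
        return ["missing transport.name"]

    problems = []
    fmt = spec.get("format")
    if transport["name"] in FIXED_FORMAT_TRANSPORTS:
        if fmt is not None:
            problems.append("format section is unnecessary for " + transport["name"])
    elif fmt is None:
        problems.append("missing format section")
    elif fmt.get("name") not in KNOWN_FORMATS:
        problems.append("unrecognized format: " + str(fmt.get("name")))
    return problems


kafka_spec = {
    "transport": {"name": "kafka_input", "config": {"topic": "my_topic"}},
    "format": {"name": "json", "config": {"update_format": "insert_delete"}},
}
delta_spec = {"transport": {"name": "delta_table_input", "config": {}}}

print(check_connector(kafka_spec))
print(check_connector(delta_spec))
print(check_connector({"transport": {"name": "kafka_input"}}))
```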

HTTP Input and Output (Special Case)

Feldera provides special HTTP connectors that work differently from other connectors. These are not configured in SQL but are automatically available for every pipeline:

  • HTTP input: Send data to a pipeline via POST /v0/pipelines/{name}/ingress/{table}
  • HTTP output: Subscribe to view changes via POST /v0/pipelines/{name}/egress/{view}

These endpoints allow you to push data into tables and pull changes from views using simple HTTP requests, without configuring connectors in your SQL.
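
As a rough illustration, the sketch below builds (but does not send) requests for these two endpoints using Python's standard library. Several details are assumptions rather than facts from this article: the base URL http://localhost:8080, the format=json query parameter, and a request body of newline-delimited {"insert": ...} records. Authentication is omitted. The helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def ingress_request(base_url, pipeline, table, rows):
    """Build a POST request that would push `rows` into a table."""
    url = (
        f"{base_url}/v0/pipelines/{quote(pipeline)}"
        f"/ingress/{quote(table)}?format=json"  # query parameter is an assumption
    )
    # Assumed body layout: one {"insert": row} JSON object per line.
    body = "\n".join(json.dumps({"insert": row}) for row in rows).encode("utf-8")
    return Request(url, data=body, method="POST")


def egress_request(base_url, pipeline, view):
    """Build a POST request that would subscribe to a view's changes."""
    url = (
        f"{base_url}/v0/pipelines/{quote(pipeline)}"
        f"/egress/{quote(view)}?format=json"  # query parameter is an assumption
    )
    return Request(url, method="POST")


push = ingress_request(
    "http://localhost:8080",  # assumed address of a local instance
    "supply_chain",           # hypothetical pipeline name
    "vendor",
    [{"id": 1, "name": "Gravitech Dynamics", "address": "222 Graviton Lane"}],
)
pull = egress_request("http://localhost:8080", "supply_chain", "preferred_vendor")

print(push.get_method(), push.full_url)
print(pull.get_method(), pull.full_url)
```

Passing either request to urllib.request.urlopen would send it; that step is left out here so the sketch runs without a live pipeline.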

Connector Orchestration

Feldera provides mechanisms to control when connectors start and stop:

Manual Control

You can pause and resume individual connectors at runtime using the API, Python SDK, or CLI. This is useful for:

  • Temporarily stopping data ingestion
  • Coordinating multiple data sources
  • Debugging and testing

Automatic Orchestration

Connectors support labels and dependencies for automatic orchestration:

  • labels: Tag connectors with labels
  • start_after: Specify labels of connectors that must finish before this one starts

This allows you to build complex ingestion workflows where connectors activate in a specific order based on dependencies.
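
The ordering that labels and start_after imply can be sketched as a simple dependency resolution. The code below is a model of the behavior described above, not Feldera's scheduler. The function start_waves and the connector names are hypothetical, and both attributes are assumed to be lists of strings.

```python
def start_waves(connectors):
    """Group connector names into waves.

    A connector joins a wave only after every connector that carries a
    label listed in its start_after has been placed in an earlier wave.
    """
    pending = {c["name"]: c for c in connectors}
    done = set()
    waves = []
    while pending:
        ready = []
        for name, spec in pending.items():
            wanted = set(spec.get("start_after", []))
            blockers = {
                other["name"]
                for other in connectors
                if set(other.get("labels", [])) & wanted
            }
            if blockers <= done:
                ready.append(name)
        if not ready:
            raise ValueError(
                "cyclic start_after dependencies: " + ", ".join(sorted(pending))
            )
        waves.append(sorted(ready))
        done.update(ready)
        for name in ready:
            del pending[name]
    return waves


connectors = [
    {"name": "history", "labels": ["backfill"]},
    {"name": "archive", "labels": ["backfill"]},
    {"name": "live", "start_after": ["backfill"]},
]

print(start_waves(connectors))
```

In this example the two backfill connectors form the first wave, and the live connector starts only after both have finished.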

Output Buffering

For output connectors, you can configure buffering to control how frequently data is written to the destination:

  • enable_output_buffer: Enable output buffering
  • max_output_buffer_time_millis: Maximum time to buffer data before flushing
  • max_output_buffer_size_records: Maximum number of records to buffer before flushing

Output buffering is particularly useful for connectors like Delta Lake that benefit from larger batch writes rather than many small writes.
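
One way to read these three settings is as a flush rule: with buffering enabled, output is held back until either the time limit or the size limit is reached. The sketch below models that reading. It is an assumption about how the limits interact, not a description of Feldera's implementation, and the function should_flush is hypothetical.

```python
def should_flush(buffered_records, millis_since_first_record, config):
    """Decide whether buffered output should be written to the sink.

    Assumption: whichever configured limit is reached first triggers
    the flush, and an unset limit never triggers one.
    """
    if buffered_records == 0:
        return False
    if not config.get("enable_output_buffer", False):
        # Without buffering, every batch of changes is forwarded at once.
        return True
    max_records = config.get("max_output_buffer_size_records", float("inf"))
    max_millis = config.get("max_output_buffer_time_millis", float("inf"))
    return buffered_records >= max_records or millis_since_first_record >= max_millis


delta_sink = {
    "enable_output_buffer": True,
    "max_output_buffer_time_millis": 10_000,
    "max_output_buffer_size_records": 100_000,
}

print(should_flush(500, 2_000, delta_sink))      # neither limit reached
print(should_flush(500, 10_000, delta_sink))     # time limit reached
print(should_flush(100_000, 2_000, delta_sink))  # size limit reached
```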

Available Connector Types

Feldera supports a growing library of connectors. Here’s a summary of available types:

Input Connectors

  • Kafka
  • HTTP GET (URL)
  • Delta Lake
  • S3
  • Debezium (CDC)
  • Data generators (for testing)

Output Connectors

  • Kafka
  • HTTP POST
  • Delta Lake
  • Snowflake
  • S3

If you need a connector that isn’t supported yet, you can request it on the Feldera GitHub repository.
