The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Postgres to Elasticsearch/OpenSearch sync
Ecommerce Realtime Data Pipeline (Data Modeling, Workflow Orchestration, Change Data Capture, Analytical Database and Dashboarding)
Repo for the "CDC with Debezium" blog post
Slowly Changing Dimension (SCD) Type 2 in Hive Query Language, using the exclusive join technique with ORC Hive tables, plus a performance comparison of partitioned and clustered Hive tables
Sample project that describes how you can handle schema within your Django application.
Example pipeline to stream the data changes from RDBMS to Apache Iceberg tables
Keeps an RDB table in sync with a Hive structured store, with Kafka added as a buffer between the two.
This is a tryout I prepared to demonstrate CDC (change data capture) using MySQL, Maxwell and Kafka.
This project creates a data stream from MySQL using replication protocols and ingests it into Kafka. You can use it to build an event-driven system.
A provider-agnostic framework to evaluate ordinary CDC (Change Data Capture) features
Change Data Capture (CDC) tool from any source(s) to any target
Data decoding, encoding, conversion, and translation utilities.
Transactional change feeds for SQLite
The Yelp Data Pipeline processes business reviews using Python, Kafka, AWS (DynamoDB, S3, Redshift), PySpark, AWS Lambda, and Power BI. It supports real-time streaming, CDC, daily batch processing, and data visualization for insights into customer sentiment, business performance, and industry trends.
Distributed change data capture (CDC) framework for Google BigQuery
Showcasing CDC with the PostgreSQL pglogical plugin and custom scripts.
Real-time data engineering pipeline for an American hiring platform
Showcasing CDC with PostgreSQL Logical Replication.
This project shows how to capture changes from a Postgres database and stream them into Kafka
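Most of the projects listed above share one pattern: capture row-level changes from a source and apply them to a downstream copy. The following is a minimal sketch of the apply step only, assuming a hypothetical `ChangeEvent` shape and an in-memory dictionary as the target. None of the listed projects is claimed to use this format; each tool (Debezium, Maxwell, and so on) defines its own event structure.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change event; real CDC tools each define their own format."""
    op: str                      # "insert", "update", or "delete"
    key: Any                     # primary key of the affected row
    row: Optional[dict] = None   # full row image; None for deletes


def apply_changes(replica: dict, events: list) -> dict:
    """Apply change events, in order, to a replica keyed by primary key."""
    for event in events:
        if event.op in ("insert", "update"):
            # Treat both as an upsert so replaying the same events is harmless.
            replica[event.key] = dict(event.row)
        elif event.op == "delete":
            # Ignore deletes for keys that are already gone.
            replica.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
    return replica


events = [
    ChangeEvent("insert", 1, {"id": 1, "name": "a"}),
    ChangeEvent("insert", 2, {"id": 2, "name": "b"}),
    ChangeEvent("update", 1, {"id": 1, "name": "c"}),
    ChangeEvent("delete", 2),
]

replica = apply_changes({}, events)
print(replica)
```

Treating inserts and updates as upserts is a design choice in this sketch: it lets a consumer replay the same ordered batch after a restart without corrupting the replica.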