Change Data Capture (CDC) and Kafka

Change Data Capture (CDC) is an approach to data integration based on identifying, capturing, and delivering the changes made to data sources, typically relational databases. A change operation can be the INSERT of a new record, or an UPDATE or DELETE of an existing one. With Apache Kafka, and in particular with the Kafka Connect API and the source connectors available for it, it is easy to create a data pipeline that captures changes from an existing RDBMS and delivers them to a Kafka cluster. From there you can forward those changes to downstream systems, typically NoSQL stores (such as Cassandra, MongoDB, or Couchbase) or search engines (such as Elasticsearch). It is also possible, and advisable, to keep the changes stored or cached in a compacted Kafka topic: if you later want to join against that data via Kafka Streams or KSQL, the joins can be performed easily and efficiently in parallel, with no repartitioning necessary.
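As a sketch of what such a pipeline looks like in practice, the following is a minimal Kafka Connect source connector configuration using the Debezium MySQL connector (one common choice for CDC from an RDBMS; the hostnames, credentials, and table names here are placeholders, not values from this article):

```json
{
  "name": "inventory-cdc-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql.example.internal",
    "database.port": "3306",
    "database.user": "cdc_user",
    "database.password": "cdc_password",
    "database.server.id": "184054",
    "database.server.name": "inventory",
    "table.include.list": "inventory.customers,inventory.orders",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory"
  }
}
```

This JSON would be POSTed to the Kafka Connect REST API (by default on port 8083) to start the connector; each captured table then lands in its own Kafka topic (e.g. `inventory.inventory.customers`), which can be configured with `cleanup.policy=compact` so it retains the latest state per key for the Streams/KSQL joins described above.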
