Integrating Oracle and Kafka



For all its quirks and licence fees, we all love Oracle for what it does. But sometimes we want to get the data out to use elsewhere. Maybe we want to build analytics on it; perhaps we want to drive applications with it; sometimes we might even want to move it to another non-Oracle database—can you imagine that! 😱

With Apache Kafka as our scalable, distributed event streaming platform, we can ingest data from Oracle as a stream of events. We can use Kafka to transform and enrich the events if we want to, even joining them to data from other sources. We can stream the resulting events to target systems, as well as use them to create event-driven microservices.

This talk will show some basics of Kafka and then dive into ingesting data from Oracle into Kafka, applying stream processing with ksqlDB, and then pushing that data to systems including PostgreSQL as well as back into Oracle itself.
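
To make the ingest step concrete: with ksqlDB you can declare a Kafka Connect connector directly in SQL. Below is a minimal sketch of the query-based approach using the JDBC source connector; the connection details, table name, and timestamp column are illustrative rather than taken from the demo.

```sql
-- Query-based ingest from Oracle using the JDBC source connector,
-- declared from ksqlDB (all names and credentials are illustrative)
CREATE SOURCE CONNECTOR ORACLE_JDBC_SOURCE WITH (
  'connector.class'       = 'io.confluent.connect.jdbc.JdbcSourceConnector',
  'connection.url'        = 'jdbc:oracle:thin:@oracle:1521/ORCLPDB1',
  'connection.user'       = 'connect_user',
  'connection.password'   = 'secret',
  'mode'                  = 'timestamp',
  'timestamp.column.name' = 'UPDATE_TS',
  'table.whitelist'       = 'SODA.ORDERS',
  'topic.prefix'          = 'oracle-'
);
```

Declaring connectors this way keeps the whole pipeline (ingest, transform, egress) in a single SQL interface.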

🗣️ As presented at ACEs @ Home meetup on 15th June 2020

📔 Slides and resources:

ℹ️ Table of contents:

1:34 What is Kafka?
10:00 What’re the reasons for integrating Oracle into Kafka?
14:41 Kafka Connect
17:50 The two types of Change Data Capture (CDC)
19:40 Live demo – Oracle into Kafka
24:30 Live demo – Difference between CDC methods illustrated
28:40 Live demo – Streaming data from Kafka to another database (Postgres)
32:59 Live demo – ksqlDB
37:19 Live demo – Joining a stream of events to a table in ksqlDB (sketched just after this list)
40:14 Live demo – Building aggregates in ksqlDB
41:24 Live demo – Creating a sink connector from ksqlDB to Postgres
44:04 Live demo – ksqlDB stream/table duality, push and pull queries
46:29 Live demo – Key/Value lookup against state in ksqlDB using REST API
47:44 CDC recap, how to choose which to use
49:29 ksqlDB overview
52:50 Summary & useful links
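
For a flavour of the stream/table join demo at 37:19, here is a sketch of enriching a stream of order events with customer reference data in ksqlDB; the topic, stream, and column names are illustrative, not the ones from the demo.

```sql
-- Events and reference data both arriving from Kafka topics
-- (topic, stream, and column names are illustrative)
CREATE STREAM ORDERS WITH (KAFKA_TOPIC='oracle-ORDERS', VALUE_FORMAT='AVRO');

CREATE TABLE CUSTOMERS (ID VARCHAR PRIMARY KEY)
  WITH (KAFKA_TOPIC='oracle-CUSTOMERS', VALUE_FORMAT='AVRO');

-- Enrich each order event with customer details as it arrives
CREATE STREAM ORDERS_ENRICHED AS
  SELECT O.ORDER_ID, O.ORDER_TOTAL, C.NAME, C.CITY
    FROM ORDERS O
         LEFT JOIN CUSTOMERS C
         ON O.CUSTOMER_ID = C.ID;
```

The resulting enriched stream is itself just a Kafka topic, so it can be sent onwards to Postgres (or anywhere else) with a sink connector, as in the 41:24 demo.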

☁️ Confluent Cloud ☁️
Confluent Cloud is a managed Apache Kafka and Confluent Platform service. It scales to zero and lets you get started with Apache Kafka at the click of a mouse. You can sign up and use code 60DEVADV for $60 towards your bill.

15 thoughts on “Integrating Oracle and Kafka”
  1. Hi! Thanks for the good explanation. When I use the JDBC source connector for Oracle, the table's column names come through wrapped in double quotes, which causes an insert error at Postgres as a column mismatch (the tables are already created at Postgres). Can we avoid wrapping the column names in double quotes? Do we need to set any other configuration parameter to avoid this? Thanks for your help in advance.

  2. Does this work with Oracle Cloud as well? We are using Informatica CDC today, but our source Oracle system is now moving to the cloud and Informatica CDC does not seem to work with Oracle Cloud. How about Confluent Kafka?

  3. Hi rmoff, when you say that query-based CDC will only pull the records at the first poll and the last poll and not the in-between ones, does that mean that if there is some Oracle table with high throughput (crores, i.e. tens of millions, of records per day), then if we use the JDBC source connector (query-based) it might not pull all records into Kafka? I am facing this problem, in which records get missed during the day, with no other major configuration difference in the source connector.

    Also, when the connector polls (suppose timestamp-based), it does something like select * from table where timestampcol <= new_poll_time and timestampcol >= last_poll_time.
    So how will this lose records while polling?

    Anyway, it's a great video.

  4. Hi rmoff, since a savepoint creates an Oracle SCN, it gets translated as a CSN in OGG. Do we have a way to filter out savepoint events from the Kafka handler? Thanks again for this great demo.

  5. Hi rmoff, thanks for your input. I managed to set up an Oracle Kafka connector, but I get the following error when I try to import a big table:

    "The message is 1320916 bytes when serialized which is larger than 1048576, which is the value of the max.request.size configuration."

    I have been struggling to set this "max.request.size" all day but never managed to. Where can I set this value? I am not using Docker and have Confluent 5.5.1.
    Thanks in advance.

  6. Hi @rmoff,
    I'm using Avro to sink to my Oracle DB, but I don't know how to specify the database schema. I tried putting it in table.name.format, but that failed. Any suggestions? Btw, great demo 👍

    Thanks in advance.

  7. Hi, thanks a lot for your effort on this. I tried to do the same but got stuck on the Oracle Docker image. I could pull the other images from Docker Hub after logging in and start everything from your docker-compose.yml file, but the Oracle one doesn't seem to work. I tried to build the Docker image for Oracle as you pointed out, but I wanted to do it on an AWS EC2 instance and got stuck there, since you cannot wget the installation file as it requires authentication. Can you point me to a solution here? How did you create your Docker image?
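
A few of the questions above touch on things worth sketching in a little more detail. These are illustrative fragments under stated assumptions, not tested fixes.

On the quoted identifiers in comment 1: the Confluent JDBC connector has a quote.sql.identifiers setting that controls whether it wraps table and column names in quotes in the SQL it generates. A minimal sink config fragment (connection details and topic name are illustrative):

```properties
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
connection.url=jdbc:postgresql://postgres:5432/postgres
topics=ORDERS
# stop the connector quoting table/column names in generated SQL
quote.sql.identifiers=never
```

On the missed records in comment 3: in timestamp mode the JDBC source runs, approximately, the query below on each poll (paraphrased; the connector generates the actual SQL). A row whose timestamp falls inside a window that has already been polled, but which commits only after that poll ran, will never match a later poll; and a row updated several times between polls surfaces only once, with its final state. That is the fundamental limitation of query-based CDC, and why log-based CDC is the safer choice for high-throughput tables.

```sql
-- Approximately what timestamp-based polling executes each cycle
-- (table and column names are illustrative)
SELECT *
  FROM MY_TABLE
 WHERE UPDATE_TS >  :last_seen_ts    -- offset stored from the previous poll
   AND UPDATE_TS <= :current_db_ts   -- database time when this poll runs
 ORDER BY UPDATE_TS ASC;
```

On the max.request.size error in comment 5: the producer that hits the limit belongs to the Kafka Connect worker, so the setting goes in the worker configuration (or as a per-connector override), not on the broker:

```properties
# in the Connect worker config, e.g. connect-distributed.properties:
producer.max.request.size=5242880

# or allow per-connector overrides on the worker...
connector.client.config.override.policy=All
# ...and then set this in the individual connector's config:
# producer.override.max.request.size=5242880
```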
