Load data from S3 to RDS using AWS Glue

This video demonstrates how to load data from an S3 bucket into an RDS Oracle database using AWS Glue.
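
For readers who prefer to see the job as a script, here is a minimal sketch of what a Glue ETL job of this kind might look like in Python. It assumes the S3 and RDS crawlers have already populated the Glue Data Catalog. The database names, table names, and column mappings below are hypothetical placeholders, not the ones used in the video.

```python
import sys

# Hypothetical catalog names; replace them with the ones your crawlers created.
SOURCE_DATABASE = "s3_source_db"
SOURCE_TABLE = "employees_csv"
TARGET_DATABASE = "rds_target_db"
TARGET_TABLE = "orcl_admin_employees"

# (source column, source type, target column, target type); illustrative only.
COLUMN_MAPPINGS = [
    ("id", "long", "ID", "long"),
    ("name", "string", "NAME", "string"),
]


def run_job(argv):
    """Read the S3-backed catalog table and write it to the RDS-backed one."""
    # awsglue and pyspark are provided by the Glue job runtime, so they are
    # imported here rather than at module level.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.transforms import ApplyMapping
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Source: the table the S3 crawler registered in the Data Catalog.
    source = glue_context.create_dynamic_frame.from_catalog(
        database=SOURCE_DATABASE, table_name=SOURCE_TABLE
    )

    # Rename and cast columns so they line up with the target table.
    mapped = ApplyMapping.apply(frame=source, mappings=COLUMN_MAPPINGS)

    # Target: the table the JDBC crawler registered for the RDS database.
    # Rows are appended to the existing table.
    glue_context.write_dynamic_frame.from_catalog(
        frame=mapped, database=TARGET_DATABASE, table_name=TARGET_TABLE
    )
    job.commit()
```

In the Glue script editor you would end the script with `run_job(sys.argv)`. The IAM role attached to the job needs permission to read the bucket and use the JDBC connection.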

Creating RDS Data Source:

Creating S3 Data Source:


22 thoughts on “Load data from S3 to RDS using AWS Glue”
  1. Hi,
    thanks for the tips.
    I followed the steps shown in the video. However, in my case I need to write to an on-premises Oracle database (an external database). I created the JDBC connection (it works), I created the crawler pointing to S3 (ok), and I created the crawler pointing to Oracle (ok).
    I created the Glue job reading from S3 and targeting Oracle. However, when I run the job, it generates the following error:
    Thread-5 WARN JNDI lookup class is not available because this JRE does not support JNDI. JNDI string lookups will not be available, continuing configuration. java.lang.ClassNotFoundException: org.apache.logging.log4j.core.lookup.JndiLookup at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    Thread-5 INFO Log4j appears to be running in a Servlet environment, but there's no log4j-web module available. If you want better web container support, please add the log4j-web JAR to your web archive or server lib directory.
    How do I enable log4j?

  2. I am very new to aws and all this kind of stuff. Would I be able to store an image in an s3 bucket and then have some sort of reference of it in my mysql database? I have a question/answer system set up in the db, but I also need to add an image for each question and I've been trying to figure out the best way to do this. I'd appreciate any input about this.

  3. Thanks for this wonderful video. Does glue support delta feed? If so could you please explain how glue handles UPSERT for delta feed?

  4. What is the complete flow of the process? Where does the location "ORCL.JIMMACAULA…" come from?

  5. Thanks a lot for this video, it's very helpful. The issue I have right now is that I want to run a job on a daily basis against a target PostgreSQL RDS instance, but the data is being appended to my target table. I want only the data added to my source table to be appended to my target table. In brief, I want to overwrite my target table. I have enabled the bookmark in my job, but it doesn't seem to work correctly. How would you achieve such a goal?

  6. This is informative and a nice one too. I would like to give you a few suggestions about this video. 1) You did not explain the role "ETL" here, such as which policies it requires. 2) Background music: try to avoid music while explaining in educational videos like this. 3) Try to prepare a small PPT with bullet points on the topics you are going to discuss.

  7. Thanks for making this video. Can you please show how you created this Role 'ETL' for Glue so that Glue can read from S3 and load to RDS (MySQL). Kindly make a video on this. Also, I would request you to please lower the background music or remove it altogether.

  8. Very well explained! Wish you'd come up with a complete example to populate the Redshift (real world situation like updates and deletes to the Redshift)

  9. Can you please point me to which transform I should use to add a row only if it doesn't exist in the RDS table?
