Data in the New Age

My thoughts on data engineering and related topics

Apache Spark’s fine-grained processing of RDBMS records part 14: Generic JDBC Write Facilities

  • Written by admin
  • Posted on October 9, 2018 (updated October 20, 2018)

The previous post introduced the commonalities in implementing the update and insert operations. Now, we translate that insight into some classes presented in this post. First, let’s look at the class sitting at the top…

Apache Spark’s fine-grained processing of RDBMS records part 13: Motivation for JDBC Write Facilities

  • Written by admin
  • Posted on October 6, 2018 (updated October 15, 2018)

Where Are We? So where are we now? Let’s take a step back. We started with the set of requirements for the write operation from a Spark dataframe to a database table. Mainly, we’re required…

Apache Spark’s fine-grained processing of RDBMS records part 12: SparkSQL Values Retrieval Facility

  • Written by admin
  • Posted on October 4, 2018 (updated October 15, 2018)

Now that all the runtime and operational issues have been addressed, let’s move on to code quality. We need to make the code more robust, flexible, and maintainable. In other words, the code needs to…

Apache Spark’s fine-grained processing of RDBMS records part 11: Yet another issue in execSQL()!

  • Written by admin
  • Posted on October 2, 2018 (updated October 15, 2018)

The Problem: The previous post suggested a fix to execSQL() to handle null values, but we are not done yet. Let’s take a look at the main function updateTable() in approach IV: In the call updateDF.foreachPartition()…

Apache Spark’s fine-grained processing of RDBMS records part 10: Fixing an issue in execSQL()

  • Written by admin
  • Posted on September 29, 2018 (updated October 15, 2018)

The Problem: As alluded to in the Approach IV post, that implementation would not work on real production data. What’s going on? Now is the time to deal with it. Take a look at this key line of…

Apache Spark’s fine-grained processing of RDBMS records part 9: Benchmark between Approaches III and IV

  • Written by admin
  • Posted on September 27, 2018 (updated October 14, 2018)

This is a bonus post. It chronicles an activity that doesn’t produce a concrete outcome, yet reflects the considerations that emerge at this point in an engineering project. Motivation: Approach IV has now been chosen. Approach III…

Apache Spark’s fine-grained processing of RDBMS records part 8: Approach IV – Granular Level Record Update

  • Written by admin
  • Posted on September 25, 2018 (updated October 14, 2018)

Approach III, although the best so far, still doesn’t quite make it. Hence, we now move on to the next one. Summary: Each row in the updateDF dataframe gets extracted individually, and explicitly updates the…

Apache Spark’s fine-grained processing of RDBMS records part 7: Approach III – SQL Batch Update via joining with temp table

  • Written by admin
  • Posted on September 22, 2018 (updated October 14, 2018)

The first two approaches don’t cut it. We now move on to the third one, in search of a solution that can perform the Update operation reasonably well. Summary: We dump the updateDF dataframe to a…

Apache Spark’s fine-grained processing of RDBMS records part 6: Approach II – Using Database Trigger

  • Written by admin
  • Posted on September 20, 2018 (updated October 16, 2018)

We now explore the second approach, using the Update operation as the first test. An examination of it follows. Summary: In a nutshell, this approach writes the update data to a temp table, then uses a database trigger…

Apache Spark’s fine-grained processing of RDBMS records part 5: Approach I – Forcing DataFrameWriter API to work

  • Written by admin
  • Posted on September 18, 2018 (updated October 11, 2018)

We now start exploring different approaches, with the Update operation as the first evaluation criterion. The first potential solution follows. Summary: This first approach is the most naive and relies most heavily on available…
