Solving Delta Table Concurrency Issues: Practical Code Solutions & Insights

Canadian Data Guy
5 min read · Sep 29, 2023

Delta Lake is a powerful technology for bringing ACID transactions to your data lakes. It allows multiple operations to be performed on a dataset concurrently. However, concurrent operations can sometimes conflict and raise issues such as `ConcurrentAppendException`, `ConcurrentDeleteReadException`, and `ConcurrentDeleteDeleteException`. In this blog post, we will explore why these issues occur, how to handle them effectively with a Python function, and how to avoid them altogether through table design, isolation levels, and an understanding of write conflicts.
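To give a flavour of the retry-based handling discussed later, here is a minimal sketch of a Python helper that retries a Delta write when one of these exceptions is raised. It assumes the `delta-spark` package is installed (its `delta.exceptions` module exposes these exception classes); the function name `run_with_retries` and its parameters are illustrative, not part of any library.

```python
import time

from delta.exceptions import (
    ConcurrentAppendException,
    ConcurrentDeleteDeleteException,
    ConcurrentDeleteReadException,
)

# Conflicts that are usually safe to retry after a short pause
RETRYABLE = (
    ConcurrentAppendException,
    ConcurrentDeleteReadException,
    ConcurrentDeleteDeleteException,
)


def run_with_retries(write_fn, max_attempts=4, base_backoff_seconds=5):
    """Run a Delta write, retrying when a concurrent-modification conflict occurs."""
    for attempt in range(1, max_attempts + 1):
        try:
            return write_fn()  # write_fn performs the MERGE/UPDATE/DELETE
        except RETRYABLE:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Back off a little longer on each attempt so concurrent
            # writers stop colliding with each other.
            time.sleep(base_backoff_seconds * attempt)
```

You would wrap whatever write you need to protect, for example `run_with_retries(lambda: my_merge(spark))`.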

Why Do These Issues Happen?

  • Concurrent Append Exception (`ConcurrentAppendException`):
    This error occurs when another operation adds files to the same partition (or anywhere in an unpartitioned table) that your operation is reading from. The file additions can come from INSERT, DELETE, UPDATE, or MERGE operations. Under the default WriteSerializable isolation level, appends that don’t read any data first (known as blind INSERT operations) won’t conflict with any other operation, even when they touch the same partition. If the isolation level is raised to Serializable, however, those blind appends can also cause conflicts. This error is…
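A common way to avoid `ConcurrentAppendException` is to make the operation’s condition explicit about the partitions it touches, so that concurrent writers working on different partitions no longer conflict. The sketch below assumes a hypothetical `events` table partitioned by a `date` column; the table, column, and function names are illustrative, not from the post.

```python
from delta.tables import DeltaTable


def upsert_one_partition(spark, updates_df, target_date):
    """MERGE into a single date partition so writers on other dates don't conflict."""
    target = DeltaTable.forName(spark, "events")  # hypothetical partitioned table
    (
        target.alias("t")
        .merge(
            updates_df.alias("s"),
            # Pinning t.date narrows the files the MERGE reads, so appends
            # to other partitions no longer trigger ConcurrentAppendException
            # under the default WriteSerializable isolation level.
            f"t.date = '{target_date}' AND t.id = s.id",
        )
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )
```

The key design choice is putting the partition column directly in the merge condition rather than relying on the join keys alone.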
