Incremental ingestion fails with "Cannot perform MERGE as multiple source rows matched and attempted to update the same target row in the Delta table" error at the merge phase

Problem Description

Incremental ingestion fails at the merge phase with the error below when the CDC data contains multiple records for the same natural key.

20/06/05 05:41:30 ERROR Merger: java.lang.UnsupportedOperationException: Cannot perform MERGE as multiple source rows matched and attempted to update the same target row in the Delta table.
20/06/05 05:41:30 ERROR DistJobsDriver: io.infoworks.saas.ingestion.commons.exception.IWException: java.lang.UnsupportedOperationException: Cannot perform MERGE as multiple source rows matched and attempted to update the same target row in the Delta table.
    at io.infoworks.saas.ingestion.jobs.merger.Merger.databricksMerge(Merger.java:147)
    at io.infoworks.saas.ingestion.jobs.merger.Merger.runMerge(Merger.java:217)
    at io.infoworks.saas.ingestion.jobs.merger.Merger.runJob(Merger.java:204)
    at io.infoworks.saas.ingestion.jobs.DistJobsDriver.submitMergeJob(DistJobsDriver.java:87)
    at io.infoworks.saas.ingestion.jobs.DistJobsDriver.runJobs(DistJobsDriver.java:51)
    at io.infoworks.saas.ingestion.jobs.DistJobsDriver.main(DistJobsDriver.java:28)
    at line06d343ff07b84216ae75d4f415b410f925.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command--1:1)
    at line06d343ff07b84216ae75d4f415b410f925.$read$$iw$$iw$$iw$$iw$$iw.<init>(command--1:44)
    at line06d343ff07b84216ae75d4f415b410f925.$read$$iw$$iw$$iw$$iw.<init>(command--1:46)
    at line06d343ff07b84216ae75d4f415b410f925.$read$$iw$$iw$$iw.<init>(command--1:48)
    at line06d343ff07b84216ae75d4f415b410f925.$read$$iw$$iw.<init>(command--1:50)

Cause

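A Delta Lake MERGE requires that each target row be matched by at most one source row. When the CDC job brings in multiple records for the same natural key, more than one source row matches the same target row and the merge is rejected.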
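
To confirm that this is the cause, it can help to check which natural keys appear more than once in the CDC data. The following is a minimal sketch in plain Python, not Infoworks code: the function name `find_duplicate_keys`, the `customer_id` natural key, and the sample rows are all hypothetical and only illustrate the check.

```python
from collections import Counter


def find_duplicate_keys(rows, natural_key):
    """Return the natural-key values that occur in more than one CDC row."""
    # Build one tuple per row from the natural-key columns and count them.
    counts = Counter(tuple(row[col] for col in natural_key) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical CDC batch: customer_id 1 arrives twice in the same batch.
cdc_rows = [
    {"customer_id": 1, "name": "Ann", "ZIW_TARGET_TIMESTAMP": "2020-06-05 05:40:00"},
    {"customer_id": 1, "name": "Anne", "ZIW_TARGET_TIMESTAMP": "2020-06-05 05:41:00"},
    {"customer_id": 2, "name": "Bob", "ZIW_TARGET_TIMESTAMP": "2020-06-05 05:40:30"},
]

duplicate_keys = find_duplicate_keys(cdc_rows, ["customer_id"])
print(duplicate_keys)
```

Any key returned by such a check corresponds to a target row that several source rows would try to update in the same merge.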

Solution

If the CDC table contains duplicate records, set a table-level configuration to run a dedupe on the CDC table before the merge. The dedupe uses the natural key and the ZIW_TARGET_TIMESTAMP column. Set the following key-value pair at the table level.

Key  :   should_run_dedupe
Value:   true
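
The effect of a dedupe on the natural key and ZIW_TARGET_TIMESTAMP can be illustrated with a small sketch. This is plain Python, not the Infoworks implementation; it assumes that the row with the most recent ZIW_TARGET_TIMESTAMP is the one kept for each natural key, which the article does not state explicitly. The function name `dedupe_latest`, the `customer_id` key, and the sample rows are hypothetical.

```python
def dedupe_latest(rows, natural_key, ts_col="ZIW_TARGET_TIMESTAMP"):
    """Keep one row per natural key, assuming the latest timestamp wins."""
    latest = {}
    for row in rows:
        key = tuple(row[col] for col in natural_key)
        # Timestamps in this fixed 'YYYY-MM-DD HH:MM:SS' form sort correctly as strings.
        if key not in latest or row[ts_col] > latest[key][ts_col]:
            latest[key] = row
    return list(latest.values())


# Hypothetical CDC batch with two records for customer_id 1.
cdc_batch = [
    {"customer_id": 1, "name": "Ann", "ZIW_TARGET_TIMESTAMP": "2020-06-05 05:40:00"},
    {"customer_id": 1, "name": "Anne", "ZIW_TARGET_TIMESTAMP": "2020-06-05 05:41:00"},
    {"customer_id": 2, "name": "Bob", "ZIW_TARGET_TIMESTAMP": "2020-06-05 05:40:30"},
]

deduped = dedupe_latest(cdc_batch, ["customer_id"])
for row in deduped:
    print(row["customer_id"], row["name"])
```

After such a step each natural key contributes a single source row, so no target row is matched more than once during the merge.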

Applicable Infoworks versions

Infoworks Datafoundry v3.x on Databricks
