PROBLEM DESCRIPTION


Refer to this article if the pipeline build fails with the following error:

[main]:[14:53:04,739] [DEBUG] [17ef4473-b170-40aa-80f1-46433d4d14f2] 
BatchPipelineDriver:156 - Adding control table record ControlTable{id='null', 
jobId='e70700dc1da4dd1da5ddf788', jobType='AWB_BATCH', startDate=Fri May 04 
14:53:00 MST 2018, endDate=Fri May 04 14:53:04 MST 2018, watermark=null, 
success=false, failureReason='java.lang.RuntimeException: 
io.infoworks.awb.PipelineExecutionException: Schema mismatch, column 
duedt_ext_start_date not the 32 column in the table!: properties


CAUSE


This issue occurs when the source table metadata changes after the pipeline has been built with the Target Sync type set to Merge mode.


For instance, a pipeline is built in Merge mode, and a later metadata crawl picks up a new column in the source table. If the pipeline is then built again with this newly added column, the build fails with the schema mismatch error.



During the pipeline build, a temporary table is created in Hive and its data is merged with the existing target data. If the metadata of this temporary table differs from that of the actual target table, the subsequent pipeline build fails.
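
The wording of the error ("not the 32 column in the table") suggests that columns are compared by position. The following is a minimal Python sketch of such a positional comparison, intended only to illustrate why an added column breaks a Merge build. It is not the product's actual implementation; the function name first_schema_mismatch and the sample column lists are hypothetical.

```python
from itertools import zip_longest
from typing import List, Optional, Tuple


def first_schema_mismatch(
    temp_columns: List[str], target_columns: List[str]
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Compare two ordered column lists position by position.

    Returns (1-based position, temp table column, target table column) for the
    first position where the two lists differ, or None if they are identical.
    A missing column on either side is reported as None.
    """
    pairs = zip_longest(temp_columns, target_columns)
    for position, (temp_col, target_col) in enumerate(pairs, start=1):
        if temp_col != target_col:
            return position, temp_col, target_col
    return None


# Hypothetical schemas: the target table was created before the metadata
# crawl added the new column, so only the temporary table contains it.
target = ["id", "amount"]
temp = ["id", "amount", "duedt_ext_start_date"]

print(first_schema_mismatch(temp, target))
# prints (3, 'duedt_ext_start_date', None)
```

In this sketch the new column appears at position 3 of the temporary table but has no counterpart in the target table, which corresponds to the kind of mismatch reported in the error above.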



WORKAROUND/RESOLUTION


Change the Target Sync type to OVERWRITE and build the pipeline again. The target table is then dropped and recreated with the newly added column, which should resolve the issue.