Description:
Incremental ingestion is failing with unable to find the column <column_name> of the target table. Error message in the databricks log looks something like below.
Caused by: org.apache.spark.sql.AnalysisException: Unable to find the column 'iwDerivedColumn' of the target table from the INSERT columns: BUSINESS_UNIT, PROJECT_ID, ACTIVITY_ID, RESOURCE_ID, RESOURCE_ID_FROM, BUSINESS_UNIT_GL, JOURNAL_ID, JOURNAL_DATE, UNPOST_SEQ, JOURNAL_LINE, FISCAL_YEAR, ACCOUNTING_PERIOD, ACCOUNT, ALTACCT, DEPTID, OPERATING_UNIT, PRODUCT, FUND_CODE, CLASS_FLD, PROGRAM_CODE, BUDGET_REF, AFFILIATE, AFFILIATE_INTRA1, AFFILIATE_INTRA2, CHARTFIELD1, CHARTFIELD2, CHARTFIELD3, BUS_UNIT_GL_FROM, CURRENCY_CD, STATISTICS_CODE, LEDGER_GROUP, ANALYSIS_TYPE, RESOURCE_TYPE, RESOURCE_CATEGORY, RESOURCE_SUB_CAT, RES_USER1, RES_USER2, RES_USER3, RES_USER4, RES_USER5, TRANS_DT, ACCOUNTING_DT, OPRID, DTTM_STAMP, JRNL_LN_REF, OPEN_ITEM_STATUS, LINE_DESCR, JRNL_LINE_STATUS, JOURNAL_LINE_DATE, FOREIGN_CURRENCY, RT_TYPE, FOREIGN_AMOUNT, RATE_MULT, RATE_DIV, CUR_EFFDT, PROCESS_INSTANCE, PC_DISTRIB_STATUS, GL_DISTRIB_STATUS, PROJ_TRANS_TYPE, PROJ_TRANS_CODE, RESOURCE_STATUS, DESCR, SYSTEM_SOURCE, UNIT_OF_MEASURE, EMPLID, EMPL_RCD, SEQ_NBR, TIME_RPTG_CD, JOBCODE, COMPANY, BUSINESS_UNIT_AP, VENDOR_ID, VOUCHER_ID, VOUCHER_LINE_NUM, APPL_JRNL_ID, PYMNT_CNT, DST_ACCT_TYPE, PO_DISTRIB_STATUS, BUSINESS_UNIT_PO, REQ_ID, REQ_LINE_NBR, REQ_SCHED_NBR, REQ_DISTRIB_NBR, PO_ID, DUE_DATE, LINE_NBR, SCHED_NBR, DISTRIB_LINE_NUM, AM_DISTRIB_STATUS, BUSINESS_UNIT_AM, ASSET_ID, PROFILE_ID, COST_TYPE, BOOK, INCENTIVE_ID, MSTONE_SEQ, CONTRACT_NUM, CONTRACT_LINE_NUM, CONTRACT_PPD_SEQ, BI_DISTRIB_STATUS, BUSINESS_UNIT_BI, BILLING_DATE, INVOICE, REV_DISTRIB_STATUS, BUSINESS_UNIT_AR, CUST_ID, ITEM, ITEM_LINE, ITEM_SEQ_NUM, DST_SEQ_NUM, BUSINESS_UNIT_IN, SCHED_LINE_NO, DEMAND_LINE_NO, INV_ITEM_ID, PAY_END_DT, BUSINESS_UNIT_OM, ORDER_NO, ORDER_INT_LINE_NO, EX_DOC_ID, EX_DOC_TYPE, RESOURCE_QUANTITY, RESOURCE_AMOUNT, BUDGET_HDR_STATUS, KK_AMOUNT_TYPE, KK_TRAN_OVER_FLAG, KK_TRAN_OVER_OPRID, KK_TRAN_OVER_DTTM, BUDGET_OVER_ALLOW, BUDGET_LINE_STATUS, BUDGET_DT, LEDGER, BD_DISTRIB_STATUS, BUSINESS_UNIT_BD, FA_STATUS, ACCOUNTING_DT_Year, ziw_target_timestamp, ziw_is_deleted. INSERT clause must specify value for all the columns of the target table.;
Root Cause:
This happens when CDC brings the column which is not there in the already created target table.
Solution:
To resolve this one needs to truncate, recrawl, and reingest the data for that particular table.
Applicable versions:
All versions of Datafoundry.