Problem statement: The ingest job fails at the merge stage with the following error message in the Databricks log:


20/08/18 18:46:43 ERROR DistJobsDriver: org.apache.spark.sql.AnalysisException: Table or view not found: `SFDC`.`RateSheetCollection__c_2020081718160117`; line 1 pos 14;
'GlobalLimit 0
+- 'LocalLimit 0
   +- 'Project [*]
      +- 'UnresolvedRelation `SFDC`.`RateSheetCollection__c_2020081718160117`

    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:92)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:87)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:148)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:147)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:147)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:147)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:147)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:147)
    at scala.collection.immutable.List.foreach(List.scala:392)

Root cause: This error can occur when the previous ingest job failed at the merge stage and its metadata was not completely cleared. The current merge then refers to the previous job's CDC tables and fails with a "Table or view not found" error. Note that this is not a common occurrence.
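
To check whether a failed job matches this pattern, and to see which tables are involved, the driver log can be scanned for the AnalysisException message. The following is only a minimal sketch in Python: the function name missing_tables and the sample string are illustrative assumptions, not part of Infoworks or Databricks, and the pattern relies solely on the message format shown in the log above.

```python
import re

# Matches the Spark message format seen in the log above:
#   Table or view not found: `<schema>`.`<table>`
NOT_FOUND = re.compile(r"Table or view not found: `([^`]+)`\.`([^`]+)`")


def missing_tables(log_text):
    """Return unique (schema, table) pairs reported as not found.

    Hypothetical helper for illustration; pairs are returned in the
    order in which they first appear in the log text.
    """
    seen = []
    for match in NOT_FOUND.finditer(log_text):
        pair = (match.group(1), match.group(2))
        if pair not in seen:
            seen.append(pair)
    return seen


# Sample taken from the error message in this article.
sample = (
    "20/08/18 18:46:43 ERROR DistJobsDriver: "
    "org.apache.spark.sql.AnalysisException: Table or view not found: "
    "`SFDC`.`RateSheetCollection__c_2020081718160117`; line 1 pos 14;"
)

for schema, table in missing_tables(sample):
    print(f"{schema}.{table}")
```

Each pair printed identifies a table that the merge tried to read but could not find, which indicates the source tables to truncate and reload.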

Solution: 

1. Truncate the tables for which ingest is now failing. The truncate job clears all previous metadata for those tables.

2. Then perform "initialise and ingest" (full load), followed by "ingest" (incremental load), on these tables.


Applicable Infoworks versions
v3.2