Refer to this article if an ingestion job fails during the crawling stage with the error below.

[ERROR] 2018-03-20 21:23:09,120 [main] infoworks.discovery.dbcrawler.rdbms.CrawlController:750 :: Error while creating crawl Threads.
java.lang.NoClassDefFoundError: parquet/hadoop/api/ReadSupport
    at
    at
    at infoworks.discovery.dbcrawler.rdbms.utils.CrawlWorkerThread.<init>(
    at infoworks.discovery.dbcrawler.rdbms.CrawlController.getCrawlWorkerThread(
    at infoworks.discovery.dbcrawler.rdbms.CrawlController.crawlTable(
    at infoworks.discovery.dbcrawler.rdbms.CrawlController.importDB(
    at infoworks.discovery.dbcrawler.rdbms.CrawlController.doCrawl(
    at infoworks.discovery.main.Main.startCrawl(
    at infoworks.discovery.main.Main.main(


This issue occurs when the Parquet-related JARs are missing from iw_jobs_classpath, so the class parquet/hadoop/api/ReadSupport cannot be loaded when the crawl threads are created.
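To confirm the cause before changing any configuration, the classpath value can be scanned for a JAR that actually contains the missing class. The following is a minimal Python sketch, not an Infoworks tool: the function names are hypothetical, and it assumes a ':'-separated classpath whose entries are either JAR files or Java-style wildcard entries ending in '*'. Environment variables inside the classpath are not expanded here.

```python
import glob
import os
import zipfile

# Class named in the NoClassDefFoundError above, as it is stored inside a JAR.
MISSING_CLASS = "parquet/hadoop/api/ReadSupport.class"


def jars_on_classpath(classpath):
    """Expand a ':'-separated classpath into the JAR files it references."""
    jars = []
    for entry in classpath.split(":"):
        entry = entry.strip()
        if not entry:
            continue
        if entry.endswith("*"):
            # Wildcard entry such as /some/dir/* : take every .jar in that directory.
            directory = os.path.dirname(entry)
            jars.extend(sorted(glob.glob(os.path.join(directory, "*.jar"))))
        elif entry.endswith(".jar") and os.path.isfile(entry):
            jars.append(entry)
    return jars


def jars_providing(classpath, class_file=MISSING_CLASS):
    """Return the JARs on the classpath that contain class_file."""
    found = []
    for jar in jars_on_classpath(classpath):
        try:
            with zipfile.ZipFile(jar) as zf:
                if class_file in zf.namelist():
                    found.append(jar)
        except zipfile.BadZipFile:
            # Skip files that have a .jar name but are not valid archives.
            continue
    return found
```

Calling jars_providing() with the current iw_jobs_classpath value and getting an empty list back is consistent with the error above; a non-empty list suggests the Parquet JARs are present and the cause lies elsewhere.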


Perform the following steps to resolve the issue:

  1. Go to $IW_HOME/conf ($IW_HOME refers to the Infoworks home directory).

  2. Open the file and comment out the current iw_jobs_classpath entry.

  3. Uncomment the iw_jobs_classpath entry for Parquet support, found under the commented section shown below:

#The following commented iw_jobs_classpath is for parquet support
#If enabled please ensure that only one configuration with the key iw_jobs_classpath is enabled

#Ensure that all the paths in the classpath are correct including the path for hive jars
#Also ensure any changes made to the original iw_jobs_classpath is copied here as well


  4. Add the entry below to iw_jobs_classpath, then run the ingestion job again.



IWX 2.3.0
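
The comments in the configuration file ask for two things: only one iw_jobs_classpath entry may be enabled, and every path in it must be correct. Both can be checked before re-running the job. The sketch below is illustrative only and is not the classpath entry referred to in step 4. It assumes the file uses key=value lines with '#' for comments, and the function names are hypothetical. The article does not name the configuration file, so its contents are passed in as text.

```python
import os


def active_classpath_entries(conf_text, key="iw_jobs_classpath"):
    """Return the values of all uncommented key=value lines for the given key."""
    values = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or commented-out entry
        name, sep, value = stripped.partition("=")
        if sep and name.strip() == key:
            values.append(value.strip())
    return values


def missing_paths(classpath):
    """Return the ':'-separated classpath entries that do not exist on disk."""
    missing = []
    for entry in classpath.split(":"):
        # Expand variables such as $IW_HOME if they are set in the environment.
        entry = os.path.expandvars(entry.strip())
        if not entry:
            continue
        # For a wildcard entry like /dir/*, check that the directory exists.
        path = os.path.dirname(entry) if entry.endswith("*") else entry
        if not os.path.exists(path):
            missing.append(entry)
    return missing
```

After the edit, active_classpath_entries() should return exactly one value for the file's contents, and missing_paths() on that value should return an empty list.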