Problem Description


A Generic REST API ingestion job fails with the following error in the ingestion job log:


Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/launcher/SparkLauncher
    at infoworks.discovery.filecrawler.generic.Crawler.crawl(Crawler.java:422)
    at infoworks.discovery.filecrawler.generic.Crawler.doCrawl(Crawler.java:305)
    at infoworks.discovery.filecrawler.generic.Crawler.doCrawl(Crawler.java:291)
    at infoworks.discovery.filecrawler.generic.Crawler.main(Crawler.java:319)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)


Root Cause

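This issue occurs when the spark-launcher jar is missing from the iw_jobs_classpath property in the $IW_HOME/conf/conf.properties file. The class org.apache.spark.launcher.SparkLauncher is packaged in that jar, so the crawler cannot load it at runtime.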
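
One way to confirm the root cause before changing anything is to inspect the classpath value programmatically. The following Python is a minimal sketch, not an Infoworks tool: the function names are hypothetical, it assumes conf.properties uses simple key=value lines with ':'-separated classpath entries, and it does not expand wildcard entries such as /some/dir/*, so a jar pulled in through a wildcard would not be detected.

```python
def classpath_entries(properties_text, key="iw_jobs_classpath"):
    """Return the ':'-separated entries of `key` from key=value text.

    Assumption: one property per line, no line continuations.
    """
    for line in properties_text.splitlines():
        stripped = line.strip()
        # Skip comments and lines that are not key=value pairs.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name, _, value = stripped.partition("=")
        if name.strip() == key:
            return [entry for entry in value.strip().split(":") if entry]
    return []


def has_spark_launcher(entries):
    """True if any entry's file name looks like a spark-launcher jar."""
    for entry in entries:
        file_name = entry.rsplit("/", 1)[-1]
        if file_name.startswith("spark-launcher") and file_name.endswith(".jar"):
            return True
    return False


# Example use (path layout taken from this article):
#   import os
#   path = os.path.join(os.environ["IW_HOME"], "conf", "conf.properties")
#   with open(path) as handle:
#       print(has_spark_launcher(classpath_entries(handle.read())))
```

If the check reports that no spark-launcher jar is listed, the fix in the Solution section applies.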


Solution


a) Add /usr/hdp/2.6.5.3003-25/spark2/jars/spark-launcher_2.11-2.3.0.2.6.5.3003-25.jar to the beginning of the iw_jobs_classpath property in the $IW_HOME/conf/conf.properties file, as shown below.


iw_jobs_classpath=/usr/hdp/2.6.5.3003-25/spark2/jars/spark-launcher_2.11-2.3.0.2.6.5.3003-25.jar:/opt/infoworks/lib/extras/ingestion/*:/opt/infoworks/lib/schemacrawler-12.06.03-main/schemacrawler-12.06.03.jar:/opt/infoworks/lib/mongodb/mongo-java-driver-3.8.0.jar:/opt/infoworks/bin/tools.jar:/opt/infoworks/lib/exec/commons-exec-1.2.jar:/opt/infoworks/lib/teradata/tdgssconfig.jar:/opt/infoworks/lib/teradata/teradata-connector-1.1.1.jar:/opt/infoworks/lib/teradata/terajdbc4.jar:/opt/infoworks/lib/mongodblogger/logger.jar:/opt/infoworks/lib/mongodblogger/log4mongo-java-0.7.4.jar:/opt/infoworks/lib/antlr/*:/opt/infoworks/lib/jackson/*:/opt/infoworks/lib/jsqlparser/*:/opt/infoworks/lib/shared/*:/opt/infoworks/lib/jwt/*:/opt/infoworks/lib/parquet-support/parquet-hadoop-bundle-1.6.0.jar:/opt/infoworks/lib/parquet-support/1hive-exec-1.2.1.jar:/usr/hdp/current/hive-client/lib//hive-jdbc.jar:/usr/hdp/current/hive-client/lib//hive-service.jar:/usr/hdp/current/hive-client/lib//hive-metastore.jar:/usr/hdp/current/hive-webhcat/share/hcatalog/*:/etc/hive/conf:/opt/infoworks/lib/commons-lang3/*:/usr/hdp/current/hive-client/lib/ant-1.9.1.jar:/opt/infoworks/platform/bin/notification-common.jar:/opt/infoworks/platform/bin/platform-common.jar:/opt/infoworks/platform/lib/notification-client/*:/opt/infoworks/lib/ignite/*:/opt/infoworks/lib/ingestion/commons-configuration-1.10.jar


Note: The name of the spark-launcher jar (spark-launcher_2.11-2.3.0.2.6.5.3003-25.jar in this example) varies with the HDP version. Use the spark-launcher jar that is present in the $SPARK_HOME/jars directory on your Infoworks DataFoundry edge node, and validate both the path and the jar name before adding them to iw_jobs_classpath.
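
To find the exact jar name on a given edge node, the $SPARK_HOME/jars directory can be listed for matching files. The Python below is a small sketch under the assumption that the jar file name starts with spark-launcher, as in the example above; the function name is hypothetical.

```python
import glob
import os


def find_spark_launcher_jars(spark_home):
    """List spark-launcher jars under <spark_home>/jars, sorted by name."""
    pattern = os.path.join(spark_home, "jars", "spark-launcher*.jar")
    return sorted(glob.glob(pattern))


# Example use on the edge node:
#   print(find_spark_launcher_jars(os.environ["SPARK_HOME"]))
```

An empty list means that no matching jar exists at that location, in which case the SPARK_HOME value should be checked first.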


b) Save the conf.properties file and rerun the ingestion job. This should resolve the issue.
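
If the edit in step a) needs to be repeated on several nodes, it could be scripted. The Python below is an illustrative sketch only: the function name is hypothetical, it assumes simple key=value lines with ':'-separated entries, and it leaves the text unchanged when the jar is already listed. Back up conf.properties before writing any generated content to it.

```python
def prepend_to_classpath(properties_text, jar_path, key="iw_jobs_classpath"):
    """Return properties text with jar_path placed first in the `key` value."""
    updated = []
    for line in properties_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            entries = [entry for entry in value.strip().split(":") if entry]
            # Only add the jar if it is not already on the classpath.
            if jar_path not in entries:
                entries.insert(0, jar_path)
            line = f"{key}=" + ":".join(entries)
        updated.append(line)
    result = "\n".join(updated)
    # Preserve a trailing newline if the input had one.
    if properties_text.endswith("\n"):
        result += "\n"
    return result
```

Because the function returns new text rather than modifying the file, the result can be reviewed or diffed against the original before it is saved.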


Applicable Infoworks DataFoundry Versions:


IWX 2.8.x, 2.9.x, 3.1.x