Problem Statement: A Spark-based pipeline fails, with the following messages appearing in the log:

 

'''
[main]:[08:47:35,294] [DEBUG] [AwbLogger] (AwbLogger.java:181) - Skipping log file close
[main]:[08:47:35,295] [INFO ] [BatchPipelineDriver] (BatchPipelineDriver.java:155) - Done unset in 4s
[main]:[08:47:35,540] [INFO ] [BatchPipelineDriver] (BatchPipelineDriver.java:158) - Done spark shutdown in 5s

'''


Root cause: A conflicting antlr-runtime jar on the classpath causes the job to fail. Missing configuration entries in the env.sh file, located in the $IW_HOME/bin directory, can also contribute to the failure.


Solution:

1. On the Edge Node machine, navigate to the $IW_HOME/bin directory and open the file env.sh.


2. Ensure the properties below are present in env.sh, and add any that are missing.

export SPARK_DIST_CLASSPATH=`hadoop classpath`

export SPARK_CONF_DIR=/usr/hdp/current/spark2-client/conf/  [ Please set the right path for your installation ]

export SPARK_HOME=/usr/hdp/current/spark2-client  [ Please set the right path for your installation ]
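Before saving env.sh, it can help to confirm that the directories you point these variables at actually exist. The sketch below is a minimal check; the HDP paths are the examples from this article and the check_dir helper is hypothetical, so adjust both for your installation.

```shell
#!/bin/sh
# Sketch: confirm a directory exists before pointing an env var at it.
# check_dir is a hypothetical helper; the paths below are example HDP
# locations from this article, not guaranteed for your install.
check_dir() {
  if [ -d "$1" ]; then
    echo "OK: $1"
  else
    echo "MISSING: $1"
  fi
}

# Example checks (replace with your actual Spark client locations):
check_dir /usr/hdp/current/spark2-client
check_dir /usr/hdp/current/spark2-client/conf
```

If either path prints MISSING, locate the correct Spark client directory first, then set SPARK_HOME and SPARK_CONF_DIR accordingly.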


3. The $IW_HOME/lib/shared/ directory may contain an antlr-runtime.jar, which can be another source of jar conflicts. To avoid this, navigate to the $IW_HOME/conf directory and open the file conf.properties.
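To see which antlr copies are actually present in the shared-lib directory, a quick listing like the sketch below can help. The list_antlr_jars helper is hypothetical; in practice you would point it at $IW_HOME/lib/shared.

```shell
#!/bin/sh
# Sketch: list any antlr jars bundled in a shared-lib directory so you can
# see exactly which copies might conflict with the Spark-provided one.
# list_antlr_jars is a hypothetical helper for illustration.
list_antlr_jars() {
  find "$1" -maxdepth 1 -name 'antlr*.jar' 2>/dev/null
}

# In practice, run: list_antlr_jars "$IW_HOME/lib/shared"
```

Any jar this prints is a candidate for the classpath conflict described above.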


4. In the conf.properties file, prepend antlr-runtime-<version>.jar to the df_batch_classpath setting. The path of the antlr runtime jar should be taken from the Hadoop/Spark classpath.

example: df_batch_classpath=/usr/hdp/current/spark2-client/jars/antlr4-runtime-4.7.jar [ Please set the right path for your installation ]
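The example above can be generated rather than typed by hand: find the antlr runtime jar under the Spark client's jars directory and emit the matching df_batch_classpath line. The sketch below does this; antlr_classpath_line is a hypothetical helper and the /usr/hdp path is only an example, so adjust both for your installation.

```shell
#!/bin/sh
# Sketch: locate the antlr runtime jar shipped with the Spark client and
# print the df_batch_classpath line to prepend in conf.properties.
# antlr_classpath_line is a hypothetical helper; pass it your Spark jars dir.
antlr_classpath_line() {
  jar=$(find "$1" -maxdepth 1 -name 'antlr*-runtime-*.jar' 2>/dev/null | head -n 1)
  if [ -n "$jar" ]; then
    echo "df_batch_classpath=$jar"
  else
    echo "no antlr runtime jar found in $1" >&2
    return 1
  fi
}

# Example (hypothetical path): antlr_classpath_line /usr/hdp/current/spark2-client/jars
```

Copy the printed line into conf.properties, keeping any other entries that df_batch_classpath already carries after the antlr jar.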


5. Stop and restart the hangman service, then resubmit the Spark pipeline job.

cd $IW_HOME/bin/ && source env.sh && ./stop.sh hangman && ./start.sh hangman


Thanks,

Sri