Job-Specific Logs for Infoworks DataFoundry v3.x, v4.x, v5.x on Databricks, GCP, and EMR


Ingestion

How to collect and analyze the ingestion job logs in Infoworks DataFoundry v3.x, v4.x, v5.x on Databricks, GCP, and EMR


Infoworks DataFoundry on Databricks submits a Spark job during the ingestion process. If the ingestion job fails, the failure can occur at either of two points:


  1. At the Infoworks end, while dispatching the Spark job (the Infoworks driver program)


  2. In the Spark job itself, after it has been dispatched to the cluster


Steps to collect and analyze the job logs


i) Download the ingestion job log from the Infoworks DataFoundry web UI as shown below.


Note: The same job log can also be found on the Infoworks edge node at the location mentioned below.


$IW_HOME/logs/job/job_<job_id>
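If you are working directly on the edge node, a minimal Python sketch such as the one below lists the .log and .out files in that job directory and prints the tail of each. The job ID and the default IW_HOME path used here are placeholder assumptions, not values from your installation.

import os
from pathlib import Path

# Assumed default install path and a hypothetical job id; adjust both for your environment.
iw_home = Path(os.environ.get("IW_HOME", "/opt/infoworks"))
job_id = "12345"
job_dir = iw_home / "logs" / "job" / f"job_{job_id}"

# Print the last 20 lines of every .log/.out file in the job directory.
for log_file in sorted(job_dir.glob("*")):
    if log_file.suffix in (".log", ".out"):
        print(f"===== {log_file.name} =====")
        lines = log_file.read_text(errors="replace").splitlines()
        print("\n".join(lines[-20:]))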




ii) If the failure is in the Spark job, click the Download icon below the Status tab to get the Databricks job log.


iii) Open the Infoworks job log and look for related errors in the .log and .out files. Search for the keyword “ERROR” and examine the stack trace, along with roughly 10 lines before the ERROR message, to understand the flow of the job. A scripted version of this search is sketched below.
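If you prefer to script the search, here is a minimal Python sketch that scans a downloaded log for the keyword “ERROR” and prints roughly 10 preceding lines of context for each match. The file name is a placeholder for whichever .log or .out file you downloaded.

from collections import deque

LOG_FILE = "job_12345.log"   # placeholder: point this at the downloaded .log or .out file
KEYWORD = "ERROR"
CONTEXT_LINES = 10           # number of preceding lines to show for each match

def scan_log(path: str) -> None:
    context = deque(maxlen=CONTEXT_LINES)   # rolling window of the most recent lines
    with open(path, errors="replace") as handle:
        for line_number, line in enumerate(handle, start=1):
            if KEYWORD in line:
                print(f"----- {KEYWORD} at line {line_number} -----")
                print("".join(context) + line, end="")
            context.append(line)

scan_log(LOG_FILE)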


iv) Similarly, open the downloaded Databricks job log and search for the keyword “ERROR”.


v) Check the articles and solutions in the Infoworks knowledge base at the link below; new articles are added every week. You can search for the error message and often find a solution before opening a ticket.


https://infoworks.freshdesk.com/a/solutions


Pipeline build

How to collect and analyze the pipeline build job logs in Infoworks DataFoundry v3.x, v4.x, v5.x on Databricks


Infoworks DataFoundry on Databricks submits a Spark job during the pipeline build. If the job fails, the failure can occur at either of two points:


  1. At the Infoworks end, while dispatching the Spark job (the Infoworks driver program)


  2. In the Spark job itself, after it has been dispatched to the cluster


i) Download the pipeline build job log from the UI as shown below.


The same pipeline build job log can also be found on the edge node at $IW_HOME/logs/job/job_<job_id>.


ii) If the failure is in the Spark job, click the Download icon below the Status tab to get the Databricks job log.


iii) Open the Infoworks job log and look for related errors in the .log and .out files. Search for the keyword “ERROR” and examine the stack trace, along with roughly 10 lines before the ERROR message, to understand the flow of the job.


iv) Similarly, open the downloaded Databricks job log and search for the keyword “ERROR”.


v) Interactive jobs (such as sample data generation, browsing the source schema to onboard more tables, and file preview jobs for file-based sources) log their messages in $IW_HOME/logs/ingestion/ingest.log and $IW_HOME/logs/ingestion/ingest.out.


The above-mentioned jobs use the interactive cluster that you set up after the Infoworks install. If any of these jobs fail, look for the messages below in the interactive log to get the cluster ID and the run ID, then go to the respective cloud environment (GCP, EMR, or Databricks on AWS/Azure) to retrieve the corresponding cluster job log. A sketch for extracting these IDs follows the log excerpt below.


Submitting an application with the following configuration:

{

  "existing_cluster_id" : "0928-075015-nits385",



[INFO] 2021-10-07 09:41:39,800 [pool-8-thread-1] io.infoworks.platform.job.dispatcher.core.cluster.impl.databricks.DatabricksInteractiveCluster:88 :: Started databricks run with id: 34783
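Based on the log lines shown above, a minimal Python sketch like the following can pull the existing_cluster_id and the Databricks run ID out of the interactive log so you can look up the run in your workspace. The log path is a placeholder: point it at the interactive log location on your edge node.

import re

LOG_PATH = "interactive.log"   # placeholder: use the interactive log path on your edge node

cluster_pattern = re.compile(r'"existing_cluster_id"\s*:\s*"([^"]+)"')
run_pattern = re.compile(r"Started databricks run with id:\s*(\d+)")

# Scan the log once and print every cluster id and run id that appears.
with open(LOG_PATH, errors="replace") as handle:
    for line in handle:
        match = cluster_pattern.search(line)
        if match:
            print("cluster id:", match.group(1))
        match = run_pattern.search(line)
        if match:
            print("run id:", match.group(1))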


vi) Check the articles and solutions in the Infoworks knowledge base at the link below; new articles are added every week. You can search for the error message and often find a solution before opening a ticket.


https://infoworks.freshdesk.com/a/solutions


Pipeline Interactive/sample data generation failure


Whenever you run a preview data job, or when you see errors on the Pipeline Editor page while creating a pipeline in editor mode, an interactive job is executed in the background. You can collect its logs from the locations below on the DataFoundry edge node.


i) $IW_HOME/logs/df/job_<job_id>


Note: Sample data generation does not show a job ID in the UI, so check the most recently generated log file in the above directory (see the sketch after this list).


ii) $IW_HOME/logs/dt/interactive.out (if it exists)
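Because the sample data job has no job ID in the UI, a small Python sketch like the one below can locate the most recently written job log under $IW_HOME/logs/df/. The default IW_HOME path is an assumption; adjust it for your installation.

import os
from pathlib import Path

# Assumed default install path; adjust IW_HOME for your environment.
iw_home = Path(os.environ.get("IW_HOME", "/opt/infoworks"))
df_logs = iw_home / "logs" / "df"

# Pick the job_<id> entry that was modified most recently.
latest = max(df_logs.glob("job_*"), key=lambda p: p.stat().st_mtime, default=None)
print("Most recent interactive/sample data log:", latest)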


Workflow Failure logs


a) Whenever a workflow run fails at a particular node, select the failed run, go to the failed node, click View Task Logs, copy the content into a text file, and check it for ERROR messages.



b) If an ingest node or a pipeline build node fails during job execution, also collect the job logs along with the task log by clicking View Job Logs as shown below.



This takes you to the ingestion/pipeline job log page, where you can click Download to collect the job logs.



c) Share the following logs from the DataFoundry edge node along with the logs above (a sketch for bundling them follows the list).


$IW_HOME/orchestrator-engine/airflow-scheduler.log

$IW_HOME/orchestrator-engine/airflow-webserver.err

$IW_HOME/orchestrator-engine/airflow-webserver.out

$IW_HOME/orchestrator-engine/airflow-worker.out

$IW_HOME/orchestrator-engine/airflow-scheduler.out

$IW_HOME/orchestrator-engine/airflow-worker.err

$IW_HOME/orchestrator-engine/airflow-scheduler.err
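To make it easier to attach these files to a ticket, a minimal Python sketch along the following lines can bundle whatever orchestrator logs exist on the node into a single archive. The default IW_HOME path and the archive name are assumptions.

import os
import tarfile
from pathlib import Path

# Assumed default install path; adjust IW_HOME for your environment.
iw_home = Path(os.environ.get("IW_HOME", "/opt/infoworks"))
engine_dir = iw_home / "orchestrator-engine"

log_names = [
    "airflow-scheduler.log", "airflow-scheduler.out", "airflow-scheduler.err",
    "airflow-webserver.err", "airflow-webserver.out",
    "airflow-worker.out", "airflow-worker.err",
]

# Add every log that exists on this node to orchestrator-logs.tar.gz.
with tarfile.open("orchestrator-logs.tar.gz", "w:gz") as bundle:
    for name in log_names:
        path = engine_dir / name
        if path.exists():
            bundle.add(path, arcname=name)

print("Wrote orchestrator-logs.tar.gz")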



Locations for all the Service Logs in IWX DataFoundry on Databricks


Starting with IWX DataFoundry v2.7.x, the following new IWX services have been added:


Platform

Configuration service

Notification Service

Ingestion Service (Only for DataFoundry on Databricks v3.x,v4.x)


Service logs and their locations:


Service info for UI: $IW_HOME/logs/apricot/apricot-out.log

Service info for Governor: $IW_HOME/logs/governor/governor.log

Service info for Hangman: $IW_HOME/logs/hangman/hangman.log

Service info for Rest API: $IW_HOME/logs/rest-api/iw-rest-api.log

Service info for DT (Data Transformation): $IW_HOME/logs/dt/interactive.out

Service info for Monitoring Service: $IW_HOME/logs/monitor/monitor.log

Service info for Postgres: $IW_HOME/logs/pgsql.log

Service info for Orchestrator Web Server: $IW_HOME/logs/orchestrator/orchestrator.log

Service info for Orchestrator Engine Web Server: $IW_HOME/orchestrator-engine/airflow-webserver.log

Service info for Orchestrator Engine Scheduler: $IW_HOME/orchestrator-engine/airflow-scheduler.log

Service info for Orchestrator Engine Worker: $IW_HOME/orchestrator-engine/airflow-worker.log

Service info for RabbitMQ: $IW_HOME/logs/rabbit*.log

Service info for Nginx: $IW_HOME/resources/Nginx-portable/logs/*.log

Service info for MongoDB: $IW_HOME/logs/mongod.log

Service info for Platform Services: $IW_HOME/logs/platform/platform-server.log

Service info for Notification User Consumer: $IW_HOME/logs/platform/user-consumer.log

Service info for Notification Artifact Consumer: $IW_HOME/logs/platform/artifact-consumer.log

Service info for Configuration Service: $IW_HOME/logs/platform/config-service.log

Service info for Ingestion Service (only for DataFoundry on Databricks v3.x, v4.x): $IW_HOME/logs/ingestion/ingestion.log
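As a quick check before gathering files, a Python sketch like the one below walks a subset of the locations in the list above, reports which logs exist on the edge node, and shows when each was last written. The default IW_HOME path is an assumption; adjust it for your installation.

import glob
import os
import time

# Assumed default install path; adjust IW_HOME for your environment.
iw_home = os.environ.get("IW_HOME", "/opt/infoworks")

# A subset of the service log locations listed above.
SERVICE_LOGS = {
    "UI": "logs/apricot/apricot-out.log",
    "Governor": "logs/governor/governor.log",
    "Rest API": "logs/rest-api/iw-rest-api.log",
    "DT (interactive)": "logs/dt/interactive.out",
    "Monitoring": "logs/monitor/monitor.log",
    "Postgres": "logs/pgsql.log",
    "RabbitMQ": "logs/rabbit*.log",          # wildcard: may match several files
    "MongoDB": "logs/mongod.log",
    "Platform Services": "logs/platform/platform-server.log",
}

for service, relative in SERVICE_LOGS.items():
    pattern = os.path.join(iw_home, relative)
    matches = glob.glob(pattern)
    if not matches:
        print(f"{service:20s} MISSING  ({pattern})")
        continue
    for path in matches:
        modified = time.strftime("%Y-%m-%d %H:%M", time.localtime(os.path.getmtime(path)))
        print(f"{service:20s} {path}  (last written {modified})")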