Job Specific Logs for Infoworks DataFoundry on v2.x,v3.x (GCP,HDI,CDH,AZURE,EMR)
Ingestion
How to collect and analyze the Ingestion Job logs in Infoworks DataFoundry v2.x,v3.x (GCP,HDI,CDH,AZURE,EMR)
An ingestion job in Infoworks DataFoundry versions v2.x,v3.x (Azure, HDI, CDH, GCP, EMR) submits a map-reduce job on the customer's Hadoop cluster to extract the data from the source system and ingest it into Hive.
If the ingestion job fails, collect the below logs.
Click on the Download logs option to collect the Ingestion job logs.
Unzip the downloaded zip file and open the Infoworks job log. Look for any related ERRORs in the .log and .out files: search for the keyword “ERROR” and examine the stack trace. Also review about 10 lines before the ERROR message to understand the flow of the job.
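For example, assuming the unzipped directory contains the job_<job_id>.log and job_<job_id>.out files, you can search for ERROR along with the preceding lines of context:
grep -n -B 10 "ERROR" job_<job_id>.log job_<job_id>.out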
You can also check the source configuration details in the job_object.json file, which is present in the same zip directory.
If you do not see any ERRORs in the .out or .log files, click the MR Jobs tab in the UI and check whether the submitted Map-Reduce job has failed.
If the status under the MR Jobs tab appears as shown below and does not show 100% completion for the Map or Reduce phase, the map-reduce job might have failed.
You can get the corresponding map-reduce application id from the MR Jobs section (1603205573328_0076), or open the ingestion job_<id>.log file and look for the below message in the log.
[pool-6-thread-1] org.apache.hadoop.mapreduce.Job:1294 :: The url to track the job: http://cs-internal-headnode.c.gcp-cs-shared-resources.internal:8088/proxy/application_1603205573328_0076/
Log in to the Hadoop Resource Manager, search for the above application id, open the logs, and check for error messages.
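If the YARN client is configured on the Infoworks edge node, you can also check the application's final status without opening the Resource Manager UI, for example:
yarn application -status application_1603205573328_0076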
You can also collect the same map-reduce application job log from the Infoworks DataFoundry edge node by running the below command as the user who starts Infoworks services.
yarn logs -applicationId <application_id>
For example:
yarn logs -applicationId application_1603205573328_0076 > application_log.txt
Share the application_log.txt file.
Refer to the below article on how to analyze the logs to find out the status of the map/reduce tasks, job status transitions, java exception messages, and any kind of informational or warning messages by tracing syslogs of applications.
http://hadooptutorial.info/tracing-logs-through-yarn-web-ui/
Take a look at the articles and solutions in the IW knowledge base (link below). New articles are added every week, and you can search for error messages to find solutions.
Search for a solution before you open a ticket.
https://infoworks.freshdesk.com/a/solutions
Pipeline build
How to collect and analyze the Pipeline build Job logs in Infoworks DataFoundry v2.x,v3.x (GCP,HDI,CDH,AZURE,EMR)
Infoworks DataFoundry pipeline build jobs might run with Hive or Spark as the underlying execution engine in v2.x,v3.x (Azure, HDI, CDH, GCP, EMR).
If the pipeline build fails, collect and analyze the below logs.
a) Download the pipeline build job log from Infoworks DataFoundry web UI as shown below.
Hive Execution Engine
b) If the execution engine is Hive and the pipeline build job fails while running a Tez job, look for the below message in the pipeline build job_<job_id>.log file.
UUID:dfdb15a4-2bf9-4e3f-a002-a192c0781fee INFO : Status: Running (Executing on YARN cluster with App id application_1603205573328_1198)
Log in to the Hadoop Resource Manager, search for the above application id, open the logs, and check for error messages.
You can also collect the same application job log from the Infoworks DataFoundry edge node by running the below command as the user who starts Infoworks services.
yarn logs -applicationId <application_id>
For example:
yarn logs -applicationId application_1603205573328_1198 > application_log.txt
Share the application_log.txt file.
Refer to the below article for more information on how to analyze the Tez application.
https://cwiki.apache.org/confluence/display/TEZ/How+to+Diagnose+Tez+App
Spark Execution Engine
If the execution engine for the pipeline build job is Spark and the underlying Spark job is failing, get the Spark application job log from the Resource Manager page, or run the below command from the Infoworks Edge node to get the Spark application id.
yarn application -list -appStates ALL | grep <pipeline_build_job_id>
You would see the output as shown below.
application_1600952462285_19507 DF_InteractiveApp_b2538de45dc7ddc81f1d8cce SPARK ceinfoworks default FINISHED SUCCEEDED 100% cs-internal-workernode3.c.gcp-cs-shared-resources.internal:18081/history/application_1600952462285_19507/1
Get the application_id and run the below yarn command to get the Spark application job log.
yarn logs -applicationId application_1600952462285_19507
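If you prefer to do this in one step, the two commands can be combined. This is only a sketch: it assumes the application id is the first column of the yarn application -list output (as in the example above) and that a single line matches, and the output file name spark_application_log.txt is just an example.
app_id=$(yarn application -list -appStates ALL | grep <pipeline_build_job_id> | awk '{print $1}' | head -n 1)
yarn logs -applicationId "$app_id" > spark_application_log.txt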
Take a look at the articles and solutions in the IW knowledge base (link below). New articles are added every week, and you can search for error messages to find solutions.
Search for a solution before you open a ticket.
https://infoworks.freshdesk.com/a/solutions
Pipeline Interactive/sample data generation failure
Whenever you run a preview data job, or when you see errors on the Pipeline Editor page while creating a pipeline in editor mode, an interactive job is executed in the background. You can collect its logs from the below locations on the DataFoundry Edge node.
i) $IW_HOME/logs/df/job_<job_id>
Note: The sample data generation job will not have a job id shown in the UI. Check the latest log file generated in the above directory.
ii) $IW_HOME/logs/dt/<interactive.out> (If it exists)
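Because the sample data generation job does not show a job id in the UI, one way to locate the most recent interactive job log is to sort the log directory by modification time (a sketch, assuming $IW_HOME is set in your shell):
ls -lt $IW_HOME/logs/df/ | head -n 5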
Workflow Failure logs
a) Whenever a workflow run fails at a particular node, select the failed run, go to the failed node, click on View Task Logs, copy the content into a text file, and check for ERROR messages.
b) If an ingest node or a pipeline build node fails during the job execution, also collect the job logs along with the task log by clicking on View Job Logs as shown below.
This takes you to the ingestion/pipeline job log page; click Download to collect the job logs.
c) Along with the above logs, share the below logs from the DataFoundry Edge node (a sample command to bundle them follows the list).
$IW_HOME/orchestrator-engine/airflow-scheduler.log
$IW_HOME/orchestrator-engine/airflow-webserver.err
$IW_HOME/orchestrator-engine/airflow-webserver.out
$IW_HOME/orchestrator-engine/airflow-worker.out
$IW_HOME/orchestrator-engine/airflow-scheduler.out
$IW_HOME/orchestrator-engine/airflow-worker.err
$IW_HOME/orchestrator-engine/airflow-scheduler.err
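To share all of the above orchestrator logs at once, you can bundle them into a single archive. This is only a sketch, and the archive path /tmp/orchestrator_logs.tar.gz is just an example:
tar -czf /tmp/orchestrator_logs.tar.gz -C $IW_HOME/orchestrator-engine airflow-scheduler.log airflow-scheduler.out airflow-scheduler.err airflow-webserver.out airflow-webserver.err airflow-worker.out airflow-worker.err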
Cube build jobs
i) Download the cube build job logs from the UI using the Download Logs option as shown below.
OR
Collect the job log from the below location on the Edge node.
$IW_HOME/logs/job/job_<job_id>
Along with the job log, share the below logs.
ii) $IW_HOME/cube-engine/logs/iw_cube.log
iii) $IW_HOME/cube-engine/logs/iw_cube.out
iv) $IW_HOME/logs/cube-engine/access-server.log
Scheduled Jobs not getting triggered in Infoworks DataFoundry
i) The Platform service is responsible for running the scheduled jobs (Workflows, Ingestion, Pipeline build).
ii) If the scheduled workflows or the Ingestion & Pipeline jobs are not running on time, check the below log.
$IW_HOME/platform/logs/scheduler-service.log
iii) Look for messages as shown below to check if the workflows are getting triggered.
[QuartzScheduler_Worker-1] job.IWSingleJob:19 : Executing command: /opt/infoworks/apricot-meteor/infoworks_python/infoworks/cli/infoworks.sh workflow-execute --workflow-action run --workflow Ingest_Tables --domain gd_domain -at AU66mMKxL9fEwHEUe9cMnlK9RTDNLkIfp7vcYdT9cSs=
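For a quick check from the edge node, you can filter the scheduler log for these trigger messages (assuming $IW_HOME is set in your shell):
grep "Executing command" $IW_HOME/platform/logs/scheduler-service.log | tail -n 20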
Infoworks DataFoundry Service logs
For Data Foundry Versions v2.4.x, v2.5.x, v2.6.x
Locations for all the Service Logs in IWX DataFoundry
For Data Foundry Versions v2.7.x,v2.8.x,v2.9.x,v3.x,v4.x,v5.x
Locations for all the Service Logs in IWX DataFoundry
Starting from IWX DataFoundry v2.7.x, the following new IWX services have been added:
Platform
Configuration service
Notification Service
Ingestion Service (Only for DataFoundry on Databricks v3.x,v4.x)
How to Collect/Analyze the Infoworks DataFoundry Export Job logs
Export to a delimited file
For export jobs, the job can be divided into two stages, and the exact cause of a failure can usually be located based on the stage in which the job failed.
Pre-MR job stage: If no MR job was launched by DataFoundry, as shown in the below figure, the job failed in this stage.
In this case, download the logs and search the job_<jobid>.log file for messages with the tag [ERROR] to find the exact cause of the failure.
If job_<jobid>.log is empty or uninformative, search the job_<jobid>.out file for error messages or exceptions.
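For example, assuming the downloaded log files follow the job_<jobid> naming shown above, you can locate the tagged errors and any exceptions with:
grep -n "\[ERROR\]" job_<jobid>.log
grep -n -iE "error|exception" job_<jobid>.out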
MR-Job submission and post-submission stage: If MR jobs were launched by DataFoundry, as shown in the below figure, the job failed in this stage.
In this case, reach out to the Hadoop admin to download the MR job logs for the job id shown in the Identifier column of the above image. You can also get the corresponding map-reduce application id from the job_<id>.log file by searching for the below message in the log.
The URL to track the job: http://cs-internal-headnode.c.gcp-cs-shared-resources.internal:8088/proxy/application_1603205573328_0076/
After this, look for messages with the tag ERROR, following the steps mentioned in stage 1, to find the exact cause of the failure.
Once the error message is found, search for its keywords in the Infoworks Freshdesk Knowledge base to see if it has been addressed before.
Export to BigQuery: Refer to the below doc to collect the BigQuery audit log:
https://support.infoworks.io/a/solutions/articles/14000100766
How to collect and analyze the TPT logs for the Teradata TPT ingestion jobs in DataFoundry
Here is a Knowledge Base article on how to collect the TPT logs for a Teradata ingestion job:
https://infoworks.freshdesk.com/a/solutions/articles/14000105966