Problem Description:


A DFI ingestion job fails with the following error in the job log.


Running SQL: select `ziw_filename` ,count(*) from `FLML_AUTH_GTM_V_STG_temp`  group by `ziw_filename`
[ERROR] 2019-10-15 21:49:31,923 [pool-2-thread-4] infoworks.tools.hive.HiveUtils:464 :: Error while trying to execute hive queries java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1571148567598_0068_1_00, diagnostics=[Task failed, taskId=task_1571148567598_0068_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.IllegalArgumentException: tez.runtime.io.sort.mb 6082 should be larger than 0 and should be less than the available task memory (MB):1010


Root cause:


The value of tez.runtime.io.sort.mb should be approximately 40% of the value of hive.tez.container.size.

For details, refer to the Tez memory tuning guide below.

https://community.cloudera.com/t5/Community-Articles/Demystify-Apache-Tez-Memory-Tuning-Step-by-Step/ta-p/245279
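As an illustration, assuming a hypothetical hive.tez.container.size of 2048 MB (this number is only for the example and is not taken from the environment above), the sort buffer would be roughly 0.4 x 2048 ≈ 819 MB. Expressed as session-level Hive settings, the relationship looks like this:

-- Illustrative only: keep tez.runtime.io.sort.mb at roughly 40% of hive.tez.container.size
SET hive.tez.container.size=2048;  -- container memory in MB (hypothetical value)
SET tez.runtime.io.sort.mb=819;    -- roughly 40% of 2048 MB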

When hive.tez.container.size is changed at the Ambari level, Ambari prompts you to update the corresponding tez.runtime.io.sort.mb (40% of the new container size). If you do not update this value and click Proceed Anyway, tez.runtime.io.sort.mb keeps its old value, which causes this error.

In this case, tez.runtime.io.sort.mb is most likely still set to the old value of 6082 MB at the Ambari level even after hive.tez.container.size was changed, because Proceed Anyway was clicked while changing the container size.
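To confirm this, the effective values can be checked from a Hive (Beeline) session; issuing SET with only a property name prints its current value:

-- Print the current effective values for this session
SET hive.tez.container.size;
SET tez.runtime.io.sort.mb;
-- If tez.runtime.io.sort.mb (6082 in this case) is not well below the container size,
-- the two settings are inconsistent and the Tez vertex will fail as shown in the job log.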


Solution:


a) Make sure that hive.tez.container.size is not being passed in the /opt/infoworks/conf/conf.properties file.
b) Set tez.runtime.io.sort.mb to 800 MB in Ambari and then restart Tez and the dependent services.
c) Run the ingestion job (the fix can be sanity-checked from a Hive session first, as shown below).
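Before re-running the ingestion job, the corrected setting can be sanity-checked manually by overriding the sort buffer for a Hive session and re-running the aggregate query from the job log (the 800 MB value mirrors step b above; the staging table name is taken from the log):

-- Manual sanity check of the corrected setting (session-level override)
SET tez.runtime.io.sort.mb=800;
select `ziw_filename`, count(*) from `FLML_AUTH_GTM_V_STG_temp` group by `ziw_filename`;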


Applicable Infoworks Versions:


IWX v2.4.x, 2.5.x, 2.6.x, 2.7.x, 2.8.x