Target Audience: Hadoop Admins/Infoworks Admins


Infoworks Ingestion Jobs Fail with org.apache.hadoop.util.DiskChecker$DiskErrorException


When an Infoworks user configures a source, table, and table group, and then triggers a job from either the UI or the REST API, the job fails and the log files show the following or a similar message.


[ERROR] 2018-09-06 02:13:55,473 [pool-5-thread-1] infoworks.tools.hadoop.mapreduce.IWJob:187 :: AttemptID:attempt_628437377773_8383_m_000223_0 Info:Error: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/attempt_628437377773_8383_m_000223_0/file.out
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:441)


This error is caused by a lack of space in the local directories that MapReduce uses for intermediate data.

It usually occurs when the free space in those directories is insufficient for the volume of data being processed.
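
To confirm, check the free space on each directory configured for MapReduce local storage on the worker nodes. For example (the paths below are placeholders; substitute the directories configured on your cluster):

df -h /data/1/mapred/local /data/2/mapred/local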

 

MapReduce stores intermediate data for a job in the local directories configured by the mapreduce.cluster.local.dir property in the mapred-site.xml file. When this issue occurs while using Infoworks, admins should review this setting and adjust the value as needed, as shown in the sketch below.
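
As a sketch, the property takes a comma-separated list of directories, so intermediate data can be spread across several mounts. The paths below are examples only; replace them with local mounts that have enough free space:

<property>
  <name>mapreduce.cluster.local.dir</name>
  <!-- Comma-separated list of local directories for intermediate MapReduce data -->
  <value>/data/1/mapred/local,/data/2/mapred/local</value>
</property>

Depending on the distribution, a restart of the affected services may be required for the change to take effect.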

Additional supporting write-ups are available in the articles below:


https://community.pivotal.io/s/article/Map-Reduce-job-failed-with-Could-not-find-any-valid-local-directory-for-output-attempt-xxxx-xxxx-m-x-file-out

https://stackoverflow.com/questions/30372867/could-not-find-any-valid-local-directory-for-jobcache-exception-hadoop


Thanks,

Sri