Problem Description:


During CSV file ingestion, if your data contains the string value 'NULL', it is ingested as NULL data in Hive.


Root cause:

Infoworks EDO2 (Infoworks Enterprise Data Operations and Orchestration) treats the string value NULL in a CSV file as NULL data and ingests it as such in Hive, unless it is explicitly configured to treat it as a string value.


Solution:


To ingest the string NULL as a string value, set the following advanced configuration at the table level, and then initialize and ingest the table.


Key: CSV_NULL_STRING
Value: ****


The default value for this configuration in Infoworks is NULL, so whenever Infoworks EDO2 encounters the string NULL, it replaces it with NULL data. This is the cause of the issue described above.


When this advanced configuration is added with a value other than NULL, the string value NULL is no longer replaced, and Infoworks EDO2 ingests it as a string in Hive.
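
The effect of changing the null marker can be illustrated with a small Python sketch. This is not Infoworks code; the function parse_csv and its null_string parameter are hypothetical names used only to mimic the behaviour described above, assuming the marker is compared against the whole field value.

```python
import csv
import io


def parse_csv(text, null_string="NULL"):
    """Parse CSV text, mapping any field equal to null_string to None.

    Hypothetical stand-in for the CSV_NULL_STRING behaviour described
    in this article; it is not the Infoworks implementation.
    """
    reader = csv.reader(io.StringIO(text))
    return [
        [None if field == null_string else field for field in row]
        for row in reader
    ]


sample = "id,status\n1,NULL\n2,ACTIVE\n"

# Default marker "NULL": the string NULL becomes a null value (None).
default_rows = parse_csv(sample)

# Marker overridden with "****": the string NULL is kept as a string.
override_rows = parse_csv(sample, null_string="****")

print(default_rows[1])
print(override_rows[1])
```

With the default marker, the second row carries a null in the status column; with the marker overridden, the same row keeps the literal string NULL.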





Note: Make sure that the data in your CSV file does not contain the value ****, since that value would then be treated as NULL data.
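
One way to check the note above before ingestion is to scan the CSV for the chosen marker. The following Python sketch is only an illustration; find_token is a hypothetical helper, and it assumes the marker is matched against whole field values.

```python
import csv
import io


def find_token(text, token="****"):
    """Return (row_number, column_index) for every field equal to token.

    Row numbers start at 1 (including any header row); column indexes
    start at 0. Hypothetical helper, not part of Infoworks.
    """
    hits = []
    reader = csv.reader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=1):
        for column_index, field in enumerate(row):
            if field == token:
                hits.append((row_number, column_index))
    return hits


sample = "id,code\n1,****\n2,ok\n"
hits = find_token(sample)
print(hits)
```

An empty result means the marker does not collide with any value in the data. For a file on disk, the same loop can read from open(path, newline="") instead of io.StringIO.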


Applicable Infoworks EDO2 Versions:


v2.3.x, v2.4.x, v2.5.x, v2.6.x, v2.7.x, v2.8.x