Problem description:

Macy's COBOL copybook ingestion fails while parsing the datafiles, with the following error:

A mismatch between the number of columns in the specified schema and the record. Record has total 5 which is less than the specified 957 columns.

Because parsing fails, every record is routed to error records, and the ingestion exceeds the error threshold.


Solution:

This error occurs when the layout (copybook) file does not match the datafile being ingested.
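
As a quick first check, the arithmetic behind such a mismatch can be sketched in Java. This is a minimal sketch, not part of the attached tool: the class and method names are hypothetical, the record length has to be worked out from the copybook by hand, and it assumes fixed-length records with no line terminators or record headers. Leftover bytes point to a length mismatch; zero leftover bytes is necessary but not sufficient for a match.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: checks whether a datafile holds a whole number of
// fixed-length records of the length implied by the copybook.
final class RecordLengthCheck {

    // Number of complete records that fit in the file.
    static long wholeRecords(long fileSizeBytes, int recordLength) {
        validate(fileSizeBytes, recordLength);
        return fileSizeBytes / recordLength;
    }

    // Bytes left over after the last complete record; 0 means the sizes are consistent.
    static long leftoverBytes(long fileSizeBytes, int recordLength) {
        validate(fileSizeBytes, recordLength);
        return fileSizeBytes % recordLength;
    }

    private static void validate(long fileSizeBytes, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        if (fileSizeBytes < 0) {
            throw new IllegalArgumentException("fileSizeBytes must not be negative");
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: RecordLengthCheck <data file path> <record length in bytes>");
            return;
        }
        long size = Files.size(Paths.get(args[0]));
        int recordLength = Integer.parseInt(args[1]);
        System.out.println("file size (bytes): " + size);
        System.out.println("whole records:     " + wholeRecords(size, recordLength));
        System.out.println("leftover bytes:    " + leftoverBytes(size, recordLength));
    }
}
```
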

Use the jar below to check whether the record length defined by the layout file differs from the record length of the datafile.

Run the MainFrameReaderNew-1.0-SNAPSHOT.jar attached to this KB as follows:


java -cp MainFrameReaderNew-1.0-SNAPSHOT.jar MainFrameReader -l <copybook layout file path> -d <data file path corresponding to layout> -o <output file path>


The options available for the parser are:


 -b,--bytesToIgnore          ------->  Number of bytes to ignore from each line in the datafile

 -d,--datafile <arg>         ------->  The path where the datafile is stored

 -e,--encoding <arg>         ------->  The encoding of the layout file (copybook file)

 -f,--fileorg <arg>          ------->  The file organization, i.e. the type of file being used

 -fd,--fileDialect <arg>     ------->  The dialect being used. The default is the mainframe dialect

 -l,--layoutfile <arg>       ------->  The path where the copybook file is stored

 -o,--outputfile <arg>       ------->  The path of the file where the parser output is written

 -r,--recordsToParse <arg>   ------->  Number of records to parse from the datafile


The required options are -l, -d and -o. All others are optional.


For the -f (--fileorg) option, the possible values are:


int IO_FIXED_BYTE_ENTER_FONT = 35;  <default value, used because the data is fixed-length binary>

int IO_FIXED_CHAR_ENTER_FONT = 36;

int IO_TEXT_BYTE_ENTER_FONT = 37;

int IO_TEXT_CHAR_ENTER_FONT = 38;


If everything is correct, the output file contains the parsed output. Otherwise, it contains messages along the lines of "error while parsing".
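
For large output files, checking for such messages can be scripted. The following is a minimal sketch under stated assumptions: the class name is hypothetical, and the marker text "error while parsing" is only the approximate wording given above, so adjust it to match the messages the tool actually writes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: counts lines in the parser output that look like parse errors.
final class ParserOutputScan {

    // Assumed marker text; the exact message format of the tool is not documented here.
    static final String ERROR_MARKER = "error while parsing";

    // Case-insensitive count of lines containing the marker.
    static long countErrorLines(List<String> lines) {
        return lines.stream()
                .filter(line -> line.toLowerCase(Locale.ROOT).contains(ERROR_MARKER))
                .count();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ParserOutputScan <output file path>");
            return;
        }
        // ISO-8859-1 maps every byte to a character, so unusual bytes cannot abort the read.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        long errors = countErrorLines(lines);
        System.out.println("lines read:  " + lines.size());
        System.out.println("error lines: " + errors);
    }
}
```

A non-zero count suggests that the layout file and the datafile still do not agree.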


The jar file is attached here:

https://mail.google.com/mail/u/0/#search/manivas%40infoworks.io/FMfcgxwCgLthCbzpLkHfkRfPqMWSLBKg?projector=1