The Infoworks XML connector lets you load XML files from a local machine, ADLS, S3, Google Cloud Storage, Dropbox, SFTP, and other locations. This article shows how to load XML data from ADLS and S3.
1. JDBC connection string:
For ADLS, the JDBC connection string looks like the following:
jdbc:xml:DataModel=Relational;URI=abfss://
Here the DataModel parameter can take one of three values: Relational, FlattenedDocuments, or Document. Read more about each model at http://cdn.cdata.com/help/DVF/jdbc/RSBXML_p_DataModel.htm and choose the one that fits your use case. The URI parameter specifies the location of the XML files. For ADLS, use the format shown in the example above. To load multiple XML files within a subfolder, set the URI as URI=abfss://
For S3, the JDBC connection string can be set as follows:
jdbc:cdata:xml:DataModel=Relational;URI=s3://iw-support/tables/*.xml;AWSAccessKey=####;AWSSecretKey=$$$;AWSRegion=xxx;
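The connection string is a prefix followed by semicolon-terminated key=value pairs, so it can be assembled programmatically. The following is a minimal sketch, not part of Infoworks: the helper name build_xml_jdbc_url and its parameters are hypothetical, and the prefix is left configurable because the examples in this article show both jdbc:xml: and jdbc:cdata:xml:. The credentials are the same masked placeholders used above.

```python
def build_xml_jdbc_url(uri, data_model="Relational", extra_properties=None,
                       prefix="jdbc:cdata:xml:"):
    """Assemble a semicolon-delimited connection string (hypothetical helper)."""
    allowed_models = {"Relational", "FlattenedDocuments", "Document"}
    if data_model not in allowed_models:
        raise ValueError(f"Unsupported DataModel: {data_model!r}")
    # Python dicts preserve insertion order, so properties appear as supplied.
    properties = {"DataModel": data_model, "URI": uri}
    properties.update(extra_properties or {})
    return prefix + "".join(f"{key}={value};" for key, value in properties.items())


# Placeholder credentials, mirroring the masked values in the S3 example.
s3_url = build_xml_jdbc_url(
    "s3://iw-support/tables/*.xml",
    extra_properties={"AWSAccessKey": "####", "AWSSecretKey": "$$$"},
)
print(s3_url)
```

Any further driver property, such as the region, can be passed through extra_properties in the same way.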
2. Username, Password and Source Schema:
Use the following values for Username, Password, and Source Schema:
Username=*
Password=*
Source Schema=XML
3. Connection Parameters:
Key=RowScanDepth
This sets the number of rows that Infoworks scans to determine the data type of each column. The default value is 100. If set to 0, all the data is scanned.
Key=TypeDetectionScheme
This determines how the data types of the columns are set. By default, a row scan is performed. If set to "None", all columns are returned with the String data type.
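To make the effect of these two keys concrete, here is a conceptual sketch of sample-based type detection. It is an illustration only and does not reproduce the driver's actual logic: the function name, the "RowScan" label for the default scheme, and the type names "integer", "double", and "string" are all assumptions made for this example.

```python
def infer_column_type(values, row_scan_depth=100, type_detection_scheme="RowScan"):
    """Illustrative only: guess a column type from a sample of its values."""
    if type_detection_scheme == "None":
        return "string"  # no scanning: every column is reported as a string
    # A depth of 0 means "scan everything"; otherwise look at the first N rows.
    sample = values if row_scan_depth == 0 else values[:row_scan_depth]
    for type_name, parser in (("integer", int), ("double", float)):
        try:
            for value in sample:
                parser(value)
            return type_name
        except ValueError:
            continue
    return "string"


column = ["1", "2", "3", "n/a"]
print(infer_column_type(column, row_scan_depth=3))  # only the first 3 rows are seen
print(infer_column_type(column, row_scan_depth=0))  # all rows are seen
print(infer_column_type(column, type_detection_scheme="None"))
```

The first call samples only numeric values, while the second also sees the non-numeric fourth row. This is why a small scan depth can assign a type that later rows do not fit, and why a depth of 0 is the safer choice for irregular data at the cost of a longer scan.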
Key=XPath
If you need specific tables to be created by the Infoworks metadata crawl, set the XPath parameter.
For example:
<root>
  <child>
    <subchild>.....</subchild>
    <subchild2>.....</subchild2>
    <subchild3>.....</subchild3>
  </child>
</root>
If you require three tables for the nested elements subchild, subchild2, and subchild3,
you can set XPath=/root/child/subchild/;/root/child/subchild2/;/root/child/subchild3/;
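Before running a metadata crawl, it can help to check which elements each path in the XPath value actually selects. The sketch below does this locally with Python's standard xml.etree.ElementTree module, with one path per nested element. The helper name elements_per_xpath and the sample element contents are made up for illustration, and ElementTree supports only a subset of XPath, so this is a sanity check under those assumptions rather than a reproduction of how the connector evaluates paths.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """
<root>
  <child>
    <subchild>a</subchild>
    <subchild2>b</subchild2>
    <subchild3>c</subchild3>
  </child>
</root>
"""


def elements_per_xpath(xml_text, xpath_setting):
    """Map each path in a ';'-separated XPath value to the elements it matches."""
    root = ET.fromstring(xml_text)
    matches = {}
    for path in filter(None, (p.strip() for p in xpath_setting.split(";"))):
        segments = path.strip("/").split("/")
        # ElementTree paths are relative to the root element, so drop its name.
        if segments[0] != root.tag:
            matches[path] = []
            continue
        relative = "./" + "/".join(segments[1:]) if len(segments) > 1 else "."
        matches[path] = [el.tag for el in root.findall(relative)]
    return matches


setting = "/root/child/subchild/;/root/child/subchild2/;/root/child/subchild3/;"
for path, tags in elements_per_xpath(SAMPLE_XML, setting).items():
    print(path, "->", tags)
```

A path that maps to an empty list matches nothing in the sample file, which usually points to a typo or a repeated entry in the XPath value.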
Applicable versions: 4.x