Upload the jar containing the custom logic to a folder on the Infoworks edge node. Jars related to Hadoop, Spark, etc., are already available on the Infoworks DF classpath and therefore do not need to be placed in this folder.
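The upload step might look like the following sketch. The folder path is an assumption (any folder readable by the Infoworks service user will do), and the jar is simulated with a placeholder file; on a real deployment you would scp your built jar to the edge node instead.

```shell
# Hypothetical edge-node folder for custom UDF jars (assumption).
UDF_DIR=/tmp/infoworks/custom-udfs
mkdir -p "$UDF_DIR"

# Stand-in for your built UDF jar; replace with your actual artifact.
touch my-udfs-1.0.jar
cp my-udfs-1.0.jar "$UDF_DIR"/

ls "$UDF_DIR"
```

Note the Folder Path entered later in the registration form must match the folder used here.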

Registering a custom UDF:

To register the UDF, navigate in the Infoworks UI to Admin > External Script > Pipeline Extensions.

Click Add An Extension.

Enter the required fields:

Extension Type: Choose Custom UDF

Name: Name of the jar file.

Folder Path: Absolute path to the folder where the jar has been uploaded.

Alias: A user-friendly name for the UDF; this is what you will use in the transformation node.

Class Name: Fully qualified class name.
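Filled in, the form might look like this (all values are illustrative, not defaults):

```
Extension Type : Custom UDF
Name           : my-udfs-1.0.jar
Folder Path    : /tmp/infoworks/custom-udfs
Alias          : to_lower
Class Name     : com.example.udf.ToLower
```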

Adding the custom UDF to a domain:

Navigate to Admin > Domains.

Click the Manage Artifacts button for the required domain.

Click the Add Pipeline Extensions button in the Accessible Pipeline Extensions section.

Select the Custom UDF to be made accessible in the domain.

Using the custom UDF:

Navigate to the pipeline page and select the required pipeline.

Click on any node with a Derivations page.

In the Derivations page, click on Add Derivation.

In the Expression text box, call the custom UDF by its Alias, passing any required parameters.

Click Save.
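For example, with the hypothetical alias to_lower from the registration step, the derivation expression could be (column name is illustrative):

```
to_lower(customer_name)
```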

Reference docs:

Creating a sample Java UDF for Hive: https://docs.microsoft.com/en-us/azure/hdinsight/hadoop/apache-hadoop-hive-java-udf

Infoworks doc: https://docs2x.infoworks.io/data-transformation/custom-udf


Attached is an example jar that converts data to lowercase in Hive.
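A UDF like the attached lowercase example can be sketched as below. In a real Hive UDF the class extends org.apache.hadoop.hive.ql.exec.UDF (from the hive-exec dependency) and typically accepts and returns org.apache.hadoop.io.Text; plain String is used here only so the sketch compiles without Hive on the classpath, and the class name ToLower is an assumption.

```java
// Minimal sketch of a lowercase UDF. Hive locates the evaluate() method
// by reflection, so its name and per-row calling convention matter.
public class ToLower {
    // Called once per row with the column value; returns the lowercased
    // string, passing nulls through as Hive UDFs conventionally do.
    public String evaluate(String input) {
        if (input == null) {
            return null;
        }
        return input.toLowerCase();
    }
}
```

After building this class into a jar, it would be uploaded to the edge-node folder and registered as described above, with the fully qualified class name entered in the Class Name field.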