Infoworks systems are configured to connect directly to a single Hiveserver host on port 10000, so ingestion and pipeline jobs depend on that Hiveserver being up. HDP clusters offer the option of high availability (HA) for Hive through Zookeeper. In this setup, if one Hiveserver is down, Zookeeper routes connections to another one that is running. Infoworks supports Hive connections via Zookeeper.
STEPS TO CONFIGURE INFOWORKS WITH ZOOKEEPER FOR HIVESERVER2:
We need to configure Infoworks to connect to Hive via the Zookeeper URL instead of the Hiveserver2 URL.
The Zookeeper URL will be in the format below:
NOTE: <zk0> through <zk2> are placeholders for the Zookeeper hosts.
The portion of the URL in bold is the Zookeeper quorum; the portion not in bold is the extra information. The two must be supplied to IW separately so that the URL gets created properly. Set these properties on the Admin page as below:
This helps IWX build the correct URL. Please see the screenshot below.
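The split between the Zookeeper quorum and the extra information can be illustrated with a short Python sketch. It assumes the URL follows the usual Hive JDBC service-discovery form, `jdbc:hive2://<host:port list>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=<namespace>`. The hosts `zk0` to `zk2`, port 2181, and the namespace `hiveserver2` are illustrative values only, and the helper name `split_zookeeper_hive_url` is hypothetical; it is not part of Infoworks. Whether IW expects the extra information with or without the leading `/` is not stated here, so the sketch simply drops it.

```python
from typing import Tuple

JDBC_PREFIX = "jdbc:hive2://"


def split_zookeeper_hive_url(url: str) -> Tuple[str, str]:
    """Split a Hive JDBC Zookeeper URL into (quorum, extra information).

    Hypothetical helper for illustration; not an Infoworks API.
    """
    if not url.startswith(JDBC_PREFIX):
        raise ValueError("expected a URL starting with " + JDBC_PREFIX)
    rest = url[len(JDBC_PREFIX):]
    # The quorum is the comma-separated host:port list before the first "/".
    quorum, sep, extra = rest.partition("/")
    if not sep or not quorum:
        raise ValueError("URL has no '/' separating the quorum from the rest")
    return quorum, extra


# Example values are assumptions: replace the hosts, port, and namespace
# with the ones used by your cluster.
example_url = (
    "jdbc:hive2://zk0:2181,zk1:2181,zk2:2181/"
    ";serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
)
quorum, extra = split_zookeeper_hive_url(example_url)
print("Quorum:", quorum)
print("Extra information:", extra)
```

The two printed values correspond to the two parts that are entered separately on the Admin page; the actual property names are the ones shown there.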
IWX versions: 2.4.x and 2.5.x