Description
We can build a custom target for pipelines in Infoworks on Databricks. Follow the steps below to build a custom target for SQL Server.
Usage:
Custom target extension JAR: https://drive.google.com/open?id=13FKgb_qUfFOstAxHH-3s6NzKJbbZICYg
Custom target extension class name: io.infoworks.extensions.spark.custom.target.impl.SQLServerCustomTarget
Set the following properties:
jdbc_url (Required): JDBC URL used to connect to SQL Server.
driver_name (Optional): JDBC driver class name. Default is com.microsoft.sqlserver.jdbc.SQLServerDriver.
schema (Required): Target schema name.
table (Required): Target table name.
sync_mode (Required): Target table sync mode (overwrite, append, or merge).
natural_columns (Optional): Comma-separated list of natural key columns. Required only if sync_mode is merge.
partition_column (Optional): Name of a single partition column.
partition_range_type (Optional): Partition range type, LEFT or RIGHT. Required only if partition_column is set.
partition_range_values (Optional): Comma-separated list of partition range boundary values. Required only if partition_column is set.
indexing_type (Optional): Index type, CLUSTERED or NONCLUSTERED.
index_columns (Optional): Comma-separated list of index columns. Required only if indexing_type is set.
create_table_if_not_exists (Optional): Determines whether the target table is created when it does not exist. Default is true.
spark_write_options (Optional): Semicolon-separated Spark write options, for example: numPartition=10;batchsize=10000
You can add as many options as you want.
numPartition sets the maximum number of concurrent JDBC connections used while writing to the target table.
batchsize determines how many rows are inserted per round trip. Default is 1000.
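Putting the properties together, a minimal configuration might look like the Python dict below, with a small sanity check that mirrors the rules above (natural_columns is required only when sync_mode is merge). The server, database, schema, table, and column values are illustrative assumptions, not values prescribed by the extension:

```python
# Hypothetical property values for a SQL Server custom target (illustrative only).
properties = {
    "jdbc_url": "jdbc:sqlserver://sqlhost:1433;databaseName=sales_db",  # assumed host/db
    "driver_name": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "schema": "dbo",
    "table": "orders",
    "sync_mode": "merge",
    "natural_columns": "order_id",  # required because sync_mode is merge
    "spark_write_options": "numPartition=10;batchsize=10000",
}

def validate(props):
    """Return the list of missing required properties, following the rules above."""
    required = ["jdbc_url", "schema", "table", "sync_mode"]
    missing = [k for k in required if not props.get(k)]
    # natural_columns is conditionally required for merge mode
    if props.get("sync_mode") == "merge" and not props.get("natural_columns"):
        missing.append("natural_columns")
    return missing

print(validate(properties))  # -> []
```

A configuration missing a conditionally required property (for example, sync_mode set to merge without natural_columns) would fail this check.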
Applicable Infoworks on Databricks versions
v4.x
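For reference, the merge sync mode described above, which matches rows on the natural key columns, typically corresponds to a SQL Server MERGE statement. The extension's actual SQL is not exposed, so the following Python sketch only illustrates the general shape of such a statement; the staging table name and all identifiers are hypothetical:

```python
def build_merge_sql(schema, table, natural_columns, all_columns):
    """Sketch of a MERGE statement that a merge sync_mode could issue against
    SQL Server, matching on the natural key columns. Illustrative only."""
    target = f"[{schema}].[{table}]"
    keys = [c.strip() for c in natural_columns.split(",")]
    # Match target (t) and staging (s) rows on every natural key column
    on = " AND ".join(f"t.[{c}] = s.[{c}]" for c in keys)
    # Update the non-key columns when a match is found
    updates = ", ".join(f"t.[{c}] = s.[{c}]" for c in all_columns if c not in keys)
    cols = ", ".join(f"[{c}]" for c in all_columns)
    vals = ", ".join(f"s.[{c}]" for c in all_columns)
    return (
        f"MERGE {target} AS t USING #staging AS s ON {on} "
        f"WHEN MATCHED THEN UPDATE SET {updates} "
        f"WHEN NOT MATCHED THEN INSERT ({cols}) VALUES ({vals});"
    )

print(build_merge_sql("dbo", "orders", "order_id", ["order_id", "amount"]))
```

With multiple natural_columns, the ON clause would AND together one equality predicate per key column.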