Working with AzureSQL Database
Azure SQL is a family of managed, secure, and intelligent products that use the SQL Server database engine in the Azure cloud. Azure SQL is built upon the familiar SQL Server engine, so you can migrate applications with ease and continue to use the tools, languages, and resources you're familiar with.
#
Azure SQL as a SourceProperty | Description | Default Value | Required |
---|---|---|---|
Table + Filter | Specify the table from where to read data. The table can contain schema or even database name. If you select Table as input, data flow fetches all the data from the table specified in the dataset. Example: customerdb.[dbo].[customer] [customer] (default schema name dbo will be assumed) [dbo].[customer] (database name shall be as per Datastore config) Also when the Table option is selected, you can also specify the Filter which needs to be included when querying data from synapse. The filter can be used to select only a part of the table. We may basically enter a condition and only the part of the table that satisfies the condition will be seen in the Target Section. | None | Yes. |
SQL | Specify the SQL query which will be run as-is on the source. | ||
Clean Column Name | To specify whether to clean the column names of source data (both file headers and column names from table). All the characters other than alphanumeric will be converted to _. Ex. 'col.1' will become 'col_1', 'col@1' will become 'col_1' | False | No |
Configure Pre and Post SQL | Guzzle supports Pre-SQL and Post-SQL for source and target and their execution in ingestion. It is mainly used for pre and post formatting of data in database. For more information click here. | NULL | No |
Configure watermark | A watermark represents tracking the last loaded value for one or more columns for a given source table or source SQL to enable loading data incrementally. Using watermark columns is one of the mechanisms used for changed data capture (CDC). For more information click here. | NULL | No |
Configure columns restriction | Mention column names and choose from the options to exclude and include columns. For more information click here. | NULL | No |
Configure table dependency | NULL | No |
#
Interface for Azure Source#
Azure SQL as a Target or RejectProperty | Description | Default Value | Required |
---|---|---|---|
Table | This is used to specify the Target Table we would like to create based on our Source. | None | Yes |
Operation | Provides two options that determines whether the source content or records are expected to be appended or overwritten in order to disallow existing data- Append: To atomically add new data to an existing target table. Overwrite: To atomically replace all the data in a target table with source table. For more information click here. | Append | Yes |
Reliability Level | Allowed Values include BEST_EFFORT and NO_DUPLICATES. NO_DUPLICATES implements a reliable insert in executor restart scenarios. It also ensures higher reliability and availability of Data. BEST_EFFORT is a Reduplication mechanism. It seeks to minimize the number of duplicates in our Data. This means if there is the same piece of Data more than 1 time in a row BEST_EFFORT reduplicates these instances and will only retain one of these occurrences. However, it does guarantee an absence of duplicates. | BEST_EFFORT | Yes |
Isolation Level | Specifies the transaction locking behavior for the SQL source. The allowed values are: Recommitted, ReadUncommitted, RepeatableRead and Serializable. If not specified, the database's default isolation level is used. Refer to this doc for more details. | READ_COMMITTED | Yes |
Batch Size | We can use this to specify how many rows are being written in each batch. Larger batch sizes may improve memory optimization. Allowed values are Integers. | 2500 | Yes |
Bulk Copy Timeout | This property specifies the wait time for the insert operation for each batch to complete before time runs out. | 600 | Yes |