Microsoft Azure SQL Data Warehouse is a fast, fully managed, petabyte-scale data warehouse. It’s ideal for batch-based data warehouse workloads, and its decoupled storage and compute model allows it to scale quickly and be maintained cost-effectively.


Pricing

Azure SQL Data Warehouse bases its pricing on your compute and storage usage. Compute usage is billed at an hourly rate, in one-hour increments, meaning you’ll only be billed for the hours your data warehouse is active.

Storage charges include the size of your primary database and seven days of incremental snapshots. Microsoft Azure rounds charges to the nearest terabyte (TB). For example: If the data warehouse is 1.5 TB and you have 100 GB of snapshots, your bill will be for 2 TB of data.
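The rounding rule above can be sketched as a quick estimate. This is illustrative only and assumes charges round up to the next whole TB, as the example suggests; consult your Azure bill for actual amounts.

```python
import math

def billed_storage_tb(database_gb, snapshot_gb):
    """Estimate billed storage by rounding total usage up to the next
    terabyte -- an illustrative sketch of the rounding rule described
    above, not Azure's actual billing logic."""
    total_tb = (database_gb + snapshot_gb) / 1024
    return math.ceil(total_tb)

# 1.5 TB database + 100 GB of snapshots bills as 2 TB
print(billed_storage_tb(1536, 100))
```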

Refer to Microsoft’s documentation for more info and examples.


Setup info

High-level details about Stitch’s implementation of Azure SQL Data Warehouse, such as supported connection methods, availability on Stitch plans, etc.

Release status

Released

Pricing tier

All Stitch plans

Supported versions

Not applicable

Currently, Stitch supports all versions of Azure SQL Data Warehouse.

SSH connections

Supported

Stitch supports using SSH tunnels to connect to Azure SQL Data Warehouse destinations. Refer to these instructions to set up an SSH tunnel for Azure SQL Data Warehouse.

SSL connections

Supported

VPN connections

Unsupported

Setup documentation


Replication

Details about Stitch and replication for Azure SQL Data Warehouse destinations.

Incremental Replication - Upserts

Supported

Using a table’s Primary Keys, Stitch will perform an update operation on existing rows to overwrite the data.

Primary Keys

Unsupported

Azure SQL Data Warehouse destinations do not have native support for Primary Keys.

To ensure existing rows can be updated in incrementally-replicated tables, Stitch will create an _sdc_primary_keys table in every integration schema. This table contains the Primary Key data for every table in the given schema, which Stitch will use to de-dupe rows during loading.

Note: De-duplicating doesn’t apply to tables that are replicated in full.

Refer to our Azure Primary Key handling documentation for more info.
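The de-duplication described above can be sketched as keeping one row per Primary Key value. This is a simplified illustration that assumes later rows in a batch are newer; Stitch’s actual loader logic may differ.

```python
def dedupe_by_primary_key(rows, primary_keys):
    """Keep only the most recent row per Primary Key value -- a
    simplified sketch of the de-duplication described above."""
    latest = {}
    for row in rows:
        key = tuple(row[pk] for pk in primary_keys)
        latest[key] = row  # later rows overwrite earlier ones
    return list(latest.values())

rows = [
    {"id": 1, "status": "pending"},
    {"id": 2, "status": "shipped"},
    {"id": 1, "status": "delivered"},  # updated version of id 1
]
print(dedupe_by_primary_key(rows, ["id"]))
```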

Multiple data types

Unsupported

Azure SQL Data Warehouse destinations do not support multiple data types in a single column.

To accommodate this scenario, Stitch will create additional columns for each data type:

  • The original column’s name will remain as-is. It will be typed according to the first data type Stitch detected for the column.
  • Subsequent columns will have the data type appended to their name, and only data of that type will be stored in them. For example: customer_id__bi would contain data determined to be a big integer. Note: If multiple data types are loaded during the creation of the column, all columns will be created with the data type appended to their name.

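The column-splitting behavior can be sketched as routing each value into a type-suffixed column. The `__st` and `__db` suffixes below are hypothetical, modeled on the `customer_id__bi` example above; Stitch’s full suffix list may differ.

```python
# Hypothetical suffixes modeled on the customer_id__bi example above;
# only __bi appears in the text, the others are assumptions.
TYPE_SUFFIXES = {int: "__bi", str: "__st", float: "__db"}

def split_column_by_type(column, values):
    """Route each value into a type-specific column, appending a type
    suffix to the column name as described above (simplified sketch)."""
    columns = {}
    for value in values:
        suffix = TYPE_SUFFIXES[type(value)]
        columns.setdefault(column + suffix, []).append(value)
    return columns

print(split_column_by_type("customer_id", [123, "A-456", 789]))
```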
Nested data structure support

Unsupported

Azure SQL Data Warehouse destinations do not have native support for nested data structures.

To ensure nested data can be loaded, Stitch will flatten objects and arrays into columns and subtables, respectively. For more info and examples, refer to the Handling nested data structures guide.
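The flattening described above can be sketched as follows: objects become underscore-separated columns and arrays are routed to subtables. This is a simplified illustration; see the nested-data guide for Stitch’s exact rules.

```python
def flatten(record, prefix=""):
    """Flatten nested objects into underscore-separated columns and
    split arrays out as subtable rows -- a simplified sketch of the
    behavior described above."""
    columns, subtables = {}, {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            sub_cols, sub_tables = flatten(value, prefix=f"{name}__")
            columns.update(sub_cols)
            subtables.update(sub_tables)
        elif isinstance(value, list):
            subtables[name] = value  # arrays become subtables
        else:
            columns[name] = value
    return columns, subtables

cols, subs = flatten({"id": 1, "address": {"city": "Boston"}, "items": [{"sku": "a"}]})
print(cols)  # flattened columns
print(subs)  # arrays routed to subtables
```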

Incompatible sources

No reported incompatibilities

No integration compatibility issues have been reported for Azure SQL Data Warehouse.

Loading process overview

After the data is extracted from your integrations, Stitch will perform the following steps to prepare and load that data into your Azure SQL Data Warehouse destination.

Step 1: Load data into Azure Blob Storage

The first step in the loading process for Azure SQL Data Warehouse destinations is to load the extracted data into Azure Blob Storage.

Blob storage is intended for storing massive amounts of unstructured data. In the next step, Stitch will use PolyBase to retrieve the data from Blob Storage and prepare it for loading into Azure SQL Data Warehouse.

Step 2: Prep data using PolyBase

PolyBase is a Microsoft offering that integrates Microsoft SQL products with Hadoop. PolyBase is needed to query data from Azure Blob Storage.

In this step, Stitch will perform the following:

  1. Create an external data source. This creates an external data source for the PolyBase queries Stitch will run.
  2. Create an external file format. This creates an object that defines the external (extracted) data Stitch will load. This is used in the next step to create an external table.
  3. Create an external table. Using the external file format, this will create an external table. The external table is used to stage the data from Azure Blob Storage before it is loaded into your data warehouse.
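The three CREATE statements above can be sketched in T-SQL. The object names, credentials, location, and table definition below are placeholders for illustration, not the statements Stitch actually issues.

```python
# Illustrative T-SQL for the three PolyBase steps above. All object
# names and the column list are placeholders, not Stitch's actual DDL.
statements = [
    # Step 1: external data source pointing at the Blob Storage container
    """CREATE EXTERNAL DATA SOURCE stitch_blob
       WITH (TYPE = HADOOP,
             LOCATION = 'wasbs://container@account.blob.core.windows.net');""",
    # Step 2: external file format describing the extracted files
    """CREATE EXTERNAL FILE FORMAT stitch_format
       WITH (FORMAT_TYPE = DELIMITEDTEXT);""",
    # Step 3: external table staging the data for the final INSERT
    """CREATE EXTERNAL TABLE stitch_stage (id INT, name NVARCHAR(100))
       WITH (LOCATION = '/extracts/',
             DATA_SOURCE = stitch_blob,
             FILE_FORMAT = stitch_format);""",
]
for sql in statements:
    print(sql)
```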

Step 3: Insert data into the data warehouse

Lastly, Stitch will insert the data from the PolyBase external table into your Azure SQL Data Warehouse destination.


Limitations

In this section:

  • Object name limits - Details about object naming rules imposed by Azure SQL Data Warehouse destinations.

  • Table limits - Details about table rules imposed by Azure SQL Data Warehouse destinations.

  • Data limits - Details about data limitations imposed by Azure SQL Data Warehouse destinations.

  • Column naming - Details about transformations performed by Stitch to ensure source column names adhere to the object naming rules imposed by Azure SQL Data Warehouse destinations.

Object name limits

Details about object naming rules imposed by Azure SQL Data Warehouse destinations.

Case sensitivity

Insensitive

This means that Azure SQL Data Warehouse destinations treat upper- and lowercase characters as the same. For example: CUSTOMERID and customerid are equal in Azure SQL Data Warehouse.

When column names are loaded into Azure SQL Data Warehouse, any uppercase characters will be forced into lowercase. This means that Id and id will canonicalize to the same name, which can lead to object collision errors. Refer to the Column naming section of this guide for examples.
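The collision scenario above can be sketched as a simple check: lowercase each column name and flag any that canonicalize to the same name. This is an illustrative sketch, not Stitch’s actual collision handling.

```python
def detect_collisions(column_names):
    """Flag source columns that canonicalize to the same lowercased
    name and would therefore collide on load, as described above."""
    seen = {}
    collisions = []
    for name in column_names:
        canonical = name.lower()
        if canonical in seen:
            collisions.append((seen[canonical], name))
        else:
            seen[canonical] = name
    return collisions

print(detect_collisions(["Id", "id", "CustomerName"]))
```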

Table name length

112 characters

Tables with names that exceed this limit will be rejected. Rejected tables are logged in the _sdc_rejected table.

Column name length

128 characters

Columns with names that exceed this limit will be rejected. Rejected columns are logged in the _sdc_rejected table.
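The two length limits above can be sketched as a simple pre-load check. The rejection logging itself is handled by Stitch; this only illustrates the length comparison.

```python
# Limits from the sections above.
TABLE_NAME_LIMIT = 112
COLUMN_NAME_LIMIT = 128

def rejected_names(tables, columns):
    """Return the table and column names that exceed the limits above
    and would be logged in the _sdc_rejected table (simplified check)."""
    bad_tables = [t for t in tables if len(t) > TABLE_NAME_LIMIT]
    bad_columns = [c for c in columns if len(c) > COLUMN_NAME_LIMIT]
    return bad_tables, bad_columns

print(rejected_names(["orders", "x" * 113], ["id"]))
```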

Table limits

Details about table rules imposed by Azure SQL Data Warehouse destinations.

Max # of columns per table

1,024 columns

Max table size

60 TB compressed on disk

Max # of tables per database

10,000 tables

Data limits

Details about data limitations imposed by Azure SQL Data Warehouse destinations.

Max record size

1 MB

This limit is imposed by PolyBase, which is required to load data into Azure SQL Data Warehouse destinations.

Boolean values

1, 0, NULL

Azure SQL Data Warehouse will store data typed as BOOLEAN using only the above values. These columns will be BIT columns in the destination.

For example: If a BOOLEAN source value is true, it will be stored as 1 in Azure SQL Data Warehouse.
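The BOOLEAN-to-BIT mapping above can be sketched as:

```python
def to_bit(value):
    """Map a source BOOLEAN to the values a BIT column stores
    (1, 0, or NULL), as described above (illustrative sketch)."""
    if value is None:
        return None  # NULL stays NULL
    return 1 if value else 0

print([to_bit(v) for v in (True, False, None)])
```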

Timezone support

Unsupported

Azure SQL Data Warehouse will store the value along with its UTC offset. These columns are typed as DATETIMEOFFSET.

More info about timestamp data types can be found in Microsoft’s documentation.

Date ranges

January 1, 0001 AD, through December 31, 9999 AD

Date values outside of this range will be rejected and logged in the _sdc_rejected table.

More info about Azure SQL Data Warehouse date data types can be found in Microsoft’s documentation.

Timestamp ranges

00:00:00 through 23:59:59.997

Timestamp values outside of this range will be rejected and logged in the _sdc_rejected table.

More info about Azure SQL Data Warehouse time data types can be found in Microsoft’s documentation.

Decimal precision

Must be between 1 and 38 (38 is the default)

Decimal values outside of this range will be rejected and logged in the _sdc_rejected table.

Decimal scale

Must be between 0 and the precision value

Decimal values outside of this range will be rejected and logged in the _sdc_rejected table.
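The precision and scale rules above can be sketched as a validity check:

```python
def valid_decimal(precision, scale):
    """Check a DECIMAL definition against the limits above:
    precision between 1 and 38, scale between 0 and the precision."""
    return 1 <= precision <= 38 and 0 <= scale <= precision

print(valid_decimal(38, 10))
print(valid_decimal(40, 2))
```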

VARCHAR max width

4,000 bytes

String values wider than this limit will be truncated to the maximum allowed width.
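The truncation can be sketched as a byte-width cut. The UTF-8 handling below is our assumption for illustration; Stitch’s actual truncation behavior may differ.

```python
def truncate_to_width(value, max_bytes=4000):
    """Truncate a string to the maximum byte width described above,
    without splitting a multi-byte UTF-8 character (simplified sketch;
    the encoding assumption is ours, not Stitch's documented behavior)."""
    encoded = value.encode("utf-8")
    if len(encoded) <= max_bytes:
        return value
    return encoded[:max_bytes].decode("utf-8", errors="ignore")

print(len(truncate_to_width("x" * 5000)))
```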

Column naming

Details about transformations performed by Stitch to ensure source column names adhere to the object naming rules imposed by Azure SQL Data Warehouse destinations.

Scenario                          Source column name   Destination column name
Mixed case                        CuStOmErID           customerid
Contains spaces                   customer id          customerid
Contains unsupported characters   !customer# id        customerid
Name begins with an underscore    _customerid          customerid
Name begins with a non-letter     123customerid        customerid
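The transformations in the table above can be sketched with a few string rules: lowercase the name, strip spaces and unsupported characters, and drop leading non-letters. This is a simplified approximation of Stitch’s actual renaming logic.

```python
import re

def canonicalize(name):
    """Approximate the column-name transformations shown above
    (simplified sketch, not Stitch's exact renaming rules)."""
    name = name.lower()
    name = re.sub(r"[^a-z0-9_]", "", name)  # drop spaces/unsupported chars
    name = re.sub(r"^[^a-z]+", "", name)    # drop leading non-letters
    return name

for source in ["CuStOmErID", "customer id", "!customer# id",
               "_customerid", "123customerid"]:
    print(source, "->", canonicalize(source))
```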

Compare destinations

Not sure if Azure SQL Data Warehouse is the data warehouse for you? Check out the Choosing a Stitch Destination guide to compare each of Stitch’s destination offerings.


Questions? Feedback?

Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.