Primary Keys

IF

A table without a Primary Key is replicated.

THEN

Table is created. BigQuery doesn’t use Primary Keys.

IF

A table with a single Primary Key is replicated.

THEN

Table is created. BigQuery doesn’t use Primary Keys.

IF

A table with multiple Primary Keys is replicated.

THEN

Table is created. BigQuery doesn’t use Primary Keys.

IF

The table’s Primary Key(s) is/are changed.

THEN

This scenario is unremarkable to BigQuery, as BigQuery doesn’t use Primary Keys.

IF

The table’s Primary Key comment is either removed or changed to be incorrectly formatted.

THEN

No effect. Data will continue to load into the table.


Replication Keys

IF

A table is replicated where the Replication Key column contains NULL values.

THEN
  • During the initial job, the table will be created and all rows will be replicated.
  • During subsequent jobs, only rows with populated Replication Keys will be replicated and persisted to BigQuery.
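
The selection behavior for subsequent jobs can be sketched as follows. This is an illustrative Python model of the rule described above, not Stitch's actual implementation; the row values and bookmark are hypothetical:

```python
from datetime import datetime

# Hypothetical source rows; updated_at is the Replication Key.
rows = [
    {"id": 1, "updated_at": datetime(2017, 6, 1)},
    {"id": 2, "updated_at": None},  # NULL Replication Key
    {"id": 3, "updated_at": datetime(2017, 6, 2)},
]

# Last saved maximum Replication Key value from the previous job.
bookmark = datetime(2017, 6, 1)

# Subsequent jobs only pick up rows whose Replication Key is populated
# and at or past the bookmark; rows with NULL keys are skipped.
selected = [
    r for r in rows
    if r["updated_at"] is not None and r["updated_at"] >= bookmark
]
```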

Tables

IF

A table is replicated where the Replication Key column contains NULL values.

THEN
  • During the initial job, the table will be created and all rows will be replicated.
  • During subsequent jobs, only rows with populated Replication Keys will be replicated and persisted to BigQuery.

IF

A table name contains more characters than allowed by BigQuery.

THEN

BigQuery will reject all data for the table.

Rejected records are logged in the _sdc_rejected table of the integration's schema. Learn more.

AND

BigQuery will surface the following error:

Table name [TABLE] is too long for BigQuery
FIX IT

If possible, change the table name in the source to be less than BigQuery’s character limit of 1,024 characters.

Use the _sdc_rejected table to identify the root of the issue.

IF

A table arrives with more columns than BigQuery allows.

THEN

BigQuery will reject all data for the table.

Rejected records are logged in the _sdc_rejected table of the integration's schema. Learn more.

AND

BigQuery will surface the following error:

ERROR: too many columns 
FIX IT

If possible, deselect some columns to allow Stitch to load data into BigQuery for the table. BigQuery has a limit of 10,000 columns per table.

Use the _sdc_rejected table to identify the root of the issue.

IF

A column contains only NULL values.

THEN

Only columns that are populated in at least one record will be created in BigQuery.

For example: On the left is a table as it exists in the source; on the right is the table in BigQuery as replicated by Stitch:

SOURCE

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-01           |           |

FIRST JOB

| id | updated_at [rep-key] |
|----+----------------------|
| 1  | 2017-06-01           |

Prior to the next job, a new record is added to the table with a populated value in the confirmed column. Because the confirmed column contains at least one value, Stitch will create this column in BigQuery:

SOURCE

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-01           |           |
| 2  | 2017-06-02           | true      |

SECOND JOB

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-01           |           |
| 2  | 2017-06-02           | true      |

Note: In this example, confirmed is NULL for record 1 in BigQuery. If record 1’s confirmed value is ever updated, its Replication Key value must also change to ensure Stitch detects the changes.

IF

A table is tracked that contains entirely NULL columns.

THEN

No table is created in BigQuery.

IF

A new column is added in a SaaS integration/data source to a table that’s currently set to replicate.

THEN
  1. Stitch will detect the new column during the integration’s next structure sync.
  2. The column will be automatically set to replicate.
  3. In the destination, the column will be appended to the end of the table.

If the table uses Key-based Incremental or Log-based Incremental Replication, only records with a Replication Key value greater than or equal to the last saved bookmark value will be selected for replication.

This means that if values for the new column were backfilled in existing records, the only way to have Stitch replicate these values is to either:

  • Update the Replication Key values for all records in the source, OR
  • Reset the integration’s Replication Keys in Stitch

Note: Records already in the destination will not be updated with backfilled values due to the Append-Only nature of BigQuery. In this case, updated records will be appended to the end of the table as new rows.

For example - these rows are replicated during the initial job:

| updated_at [rep key] | name      | age | type     |
|----------------------+-----------+-----+----------|
| 2017-08-11 00:00:00  | Finn      | 14  | Human    |
| 2017-08-11 00:00:00  | Jake      | 6   | Dog      |

Before the next job, a new column named magic is added in the source. Values for these records are backfilled in the source and as a result, their Replication Key values are updated. After the next job, the table in the destination would look like the following:

| updated_at [rep key] | name      | age | type     | magic |
|----------------------+-----------+-----+----------+-------|
| 2017-08-11 00:00:00  | Finn      | 14  | Human    |       |
| 2017-08-11 00:00:00  | Jake      | 6   | Dog      |       |
| 2017-08-13 00:13:00  | Finn      | 14  | Human    | false |
| 2017-08-13 00:13:00  | Jake      | 6   | Dog      | true  |

Querying Append-Only tables requires a different strategy than you would normally use. For more info, check out the Querying Append-Only Tables Guide.
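
A minimal sketch of that strategy: because rows are appended rather than updated, a query must keep only the most recent version of each record. The Python below models this with a dictionary keyed on the record's identifier; in BigQuery itself the equivalent is typically a ROW_NUMBER() window ordered by a timestamp or _sdc sequence column. The row values mirror the example table above:

```python
# Append-Only table contents after the second job (from the example above).
rows = [
    {"name": "Finn", "updated_at": "2017-08-11 00:00:00", "magic": None},
    {"name": "Jake", "updated_at": "2017-08-11 00:00:00", "magic": None},
    {"name": "Finn", "updated_at": "2017-08-13 00:13:00", "magic": "false"},
    {"name": "Jake", "updated_at": "2017-08-13 00:13:00", "magic": "true"},
]

# Keep only the newest row per record; older appended versions are ignored.
latest = {}
for row in rows:
    key = row["name"]  # stand-in for the table's Primary Key
    if key not in latest or row["updated_at"] > latest[key]["updated_at"]:
        latest[key] = row
```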

IF

A new column is added by you to a Stitch-generated table in BigQuery.

THEN

Columns may be added to tables created by Stitch as long as they are nullable, meaning columns don’t have NOT NULL constraints.

IF

A column is deleted at the source.

THEN

How a deleted column is reflected in BigQuery depends on the table’s Replication Method.

Consider the example below. On the left is a table as it exists in the source; on the right is the table in BigQuery:

SOURCE

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-03           | true      |
| 2  | 2017-06-03           | false     |

DESTINATION

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-03           | true      |
| 2  | 2017-06-03           | false     |

Before the next job, the confirmed column will be deleted in the source. Additionally, new records will be added to the table.

Log-based Incremental Replication

Due to how binary logs are structured, changes to a source table - including adding or removing columns, changing data types, etc. - require manual intervention before replication can continue. Refer to the Log-based Incremental Replication documentation for more info.

Key-based Incremental Replication

For tables using Key-based Incremental Replication, default NULLs will be placed into the column going forward and the column will remain in the destination.

Below is the source table with the deleted confirmed column and how this would be reflected in BigQuery:

SOURCE

| id | updated_at [rep-key] |
|----+----------------------|
| 1  | 2017-06-03           |
| 2  | 2017-06-03           |
| 3  | 2017-06-04           |
| 4  | 2017-06-04           |

DESTINATION

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-03           | true      |
| 2  | 2017-06-03           | false     |
| 3  | 2017-06-04           |           |
| 4  | 2017-06-04           |           |

Full Table Replication

For tables using Full Table Replication: during the next job, the entire table will be overwritten. The table’s schema will reflect the column’s removal in the source:

SOURCE

| id | updated_at [rep-key] |
|----+----------------------|
| 1  | 2017-06-03           |
| 2  | 2017-06-03           |
| 3  | 2017-06-04           |
| 4  | 2017-06-04           |

DESTINATION

| id | updated_at [rep-key] |
|----+----------------------|
| 1  | 2017-06-03           |
| 2  | 2017-06-03           |
| 3  | 2017-06-04           |
| 4  | 2017-06-04           |

IF

You remove a column from a Stitch-replicated table in your destination.

THEN

The result of deleting a column from a Stitch-generated table depends on the type of column being removed:

  • Primary Key columns: Removing Primary Key columns will result in data being added to the table in an Append-Only fashion. As Stitch uses Primary Keys to de-dupe data, deleting Primary Key columns may result in duplicate data.
  • General columns: This refers to all columns that are not prefixed with _sdc or suffixed with a data type. For example: customer_zip, but not customer_zip__st

    If new data is detected for the removed column, Stitch will re-create it in BigQuery.

    Note: An integration must support tracking individual columns AND you must deselect the column in Stitch for the column removal to be permanent.

  • _sdc columns: Removing a Stitch replication column will prevent Stitch from upserting replicated data into BigQuery.
  • Columns with data type suffixes: Removing a column created as a result of accommodating multiple data types will prevent Stitch from loading replicated data into the table. This applies to columns with names such as: customer_zip__st, customer_zip__int, etc.

Columns

IF

A table is replicated where the Replication Key column contains NULL values.

THEN
  • During the initial job, the table will be created and all rows will be replicated.
  • During subsequent jobs, only rows with populated Replication Keys will be replicated and persisted to BigQuery.

IF

Stitch detects multiple data types for a single column.

THEN

For example: Stitch first detected that order_confirmed contained BOOLEAN data, but during a subsequent job, detected STRING values. To accommodate data of varying types, Stitch will:

  1. Store data for the original data type in the original column. In this example, only BOOLEAN values will be stored in order_confirmed.

  2. Create additional columns to store the other data types - one for each data type detected - and append the data type to the column name. In this example, an order_confirmed__st column will be created to store STRING values.
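
When reading such a table back, the split columns can be folded into one logical value, since at most one of them is populated per record. An illustrative sketch (the records here are hypothetical):

```python
# Hypothetical records with Stitch's type-split columns.
records = [
    {"order_confirmed": True, "order_confirmed__st": None},
    {"order_confirmed": None, "order_confirmed__st": "yes"},
]

def coalesce(record, *columns):
    """Return the first non-NULL value among the given columns."""
    for col in columns:
        if record.get(col) is not None:
            return record[col]
    return None

values = [coalesce(r, "order_confirmed", "order_confirmed__st") for r in records]
```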

IF

A VARCHAR column is replicated to BigQuery.

THEN

BigQuery will store all VARCHAR data as TEXT.

IF

VARCHAR data is loaded that exceeds the current maximum size for the column.

THEN

No widening will occur. BigQuery will store all VARCHAR data as TEXT.

IF

A column containing date data with timezone info is replicated to BigQuery.

THEN

BigQuery has no support for timezones.

IF

A column contains timestamp data that is outside BigQuery’s supported range.

THEN

BigQuery will reject the records that fall outside the supported range.

Rejected records are logged in the _sdc_rejected table of the integration's schema. Learn more.

AND

BigQuery will surface the following error:

timestamp out of range for BigQuery on [TIMESTAMP]
FIX IT

To resolve the error, offending values in the source must be changed to be within BigQuery’s timestamp range of 0001-01-01 00:00:00 to 9999-12-31 23:59:59.999999 UTC.

Use the _sdc_rejected table to identify the root of the issue.
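
One way to spot offending values before they reach BigQuery is to parse them: Python's datetime covers exactly years 1 through 9999, so a timestamp that fails to parse (for example, MySQL's zero date 0000-00-00 00:00:00) would also fall outside BigQuery's range. This is an illustrative pre-flight check, assuming timestamps arrive as YYYY-MM-DD HH:MM:SS strings:

```python
from datetime import datetime

def in_bigquery_ts_range(value: str) -> bool:
    # datetime.strptime rejects year 0 and other impossible dates, which
    # lines up with BigQuery's 0001-01-01 .. 9999-12-31 TIMESTAMP range.
    try:
        datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return False
    return True
```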

IF

A column contains integer data that is outside BigQuery’s supported range.

THEN

BigQuery will reject all records in the batch.

Rejected records are logged in the _sdc_rejected table of the integration's schema. Learn more.

AND

BigQuery will surface the following error:

integer out of range for BigQuery on [INTEGER]
FIX IT

To resolve the error, offending values in the source must be changed to be within BigQuery’s range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.

Use the _sdc_rejected table to identify the root of the issue.
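
A quick pre-flight check for the INT64 bounds quoted above, as an illustrative sketch:

```python
# BigQuery INT64 bounds, per the error message above.
INT64_MIN = -9_223_372_036_854_775_808
INT64_MAX = 9_223_372_036_854_775_807

def fits_int64(value: int) -> bool:
    """True if the value can load into a BigQuery INTEGER column."""
    return INT64_MIN <= value <= INT64_MAX
```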

IF

A column contains decimal data.

THEN

Decimal values will be loaded to BigQuery as the data type NUMERIC.

IF

A column contains decimal data that is outside BigQuery’s supported range.

THEN

BigQuery will accept the maximum supported range for the NUMERIC data type. Decimals that exceed this range will be logged in the rejected records table.
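
The same kind of check can be sketched for NUMERIC, assuming BigQuery's documented NUMERIC bounds of 38 digits of precision with 9 decimal digits of scale:

```python
from decimal import Decimal

# Assumed BigQuery NUMERIC bounds: 29 integer digits, 9 fractional digits.
NUMERIC_MAX = Decimal("99999999999999999999999999999.999999999")
NUMERIC_MIN = -NUMERIC_MAX

def fits_numeric(value: Decimal) -> bool:
    # Values outside this range end up in the rejected records table.
    return NUMERIC_MIN <= value <= NUMERIC_MAX
```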

IF

A table arrives with more columns than BigQuery allows.

THEN

BigQuery will reject all data for the table.

Rejected records are logged in the _sdc_rejected table of the integration's schema. Learn more.

AND

BigQuery will surface the following error:

ERROR: too many columns 
FIX IT

If possible, deselect some columns to allow Stitch to load data into BigQuery for the table. BigQuery has a limit of 10,000 columns per table.

Use the _sdc_rejected table to identify the root of the issue.

IF

A column contains only NULL values.

THEN

Only columns that are populated in at least one record will be created in BigQuery.

For example: On the left is a table as it exists in the source; on the right is the table in BigQuery as replicated by Stitch:

SOURCE

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-01           |           |

FIRST JOB

| id | updated_at [rep-key] |
|----+----------------------|
| 1  | 2017-06-01           |

Prior to the next job, a new record is added to the table with a populated value in the confirmed column. Because the confirmed column contains at least one value, Stitch will create this column in BigQuery:

SOURCE

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-01           |           |
| 2  | 2017-06-02           | true      |

SECOND JOB

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-01           |           |
| 2  | 2017-06-02           | true      |

Note: In this example, confirmed is NULL for record 1 in BigQuery. If record 1’s confirmed value is ever updated, its Replication Key value must also change to ensure Stitch detects the changes.

IF

A table is tracked that contains entirely NULL columns.

THEN

No table is created in BigQuery.

IF

A new column is added in a SaaS integration/data source to a table that’s currently set to replicate.

THEN
  1. Stitch will detect the new column during the integration’s next structure sync.
  2. The column will be automatically set to replicate.
  3. In the destination, the column will be appended to the end of the table.

If the table uses Key-based Incremental or Log-based Incremental Replication, only records with a Replication Key value greater than or equal to the last saved bookmark value will be selected for replication.

This means that if values for the new column were backfilled in existing records, the only way to have Stitch replicate these values is to either:

  • Update the Replication Key values for all records in the source, OR
  • Reset the integration’s Replication Keys in Stitch

Note: Records already in the destination will not be updated with backfilled values due to the Append-Only nature of BigQuery. In this case, updated records will be appended to the end of the table as new rows.

For example - these rows are replicated during the initial job:

| updated_at [rep key] | name      | age | type     |
|----------------------+-----------+-----+----------|
| 2017-08-11 00:00:00  | Finn      | 14  | Human    |
| 2017-08-11 00:00:00  | Jake      | 6   | Dog      |

Before the next job, a new column named magic is added in the source. Values for these records are backfilled in the source and as a result, their Replication Key values are updated. After the next job, the table in the destination would look like the following:

| updated_at [rep key] | name      | age | type     | magic |
|----------------------+-----------+-----+----------+-------|
| 2017-08-11 00:00:00  | Finn      | 14  | Human    |       |
| 2017-08-11 00:00:00  | Jake      | 6   | Dog      |       |
| 2017-08-13 00:13:00  | Finn      | 14  | Human    | false |
| 2017-08-13 00:13:00  | Jake      | 6   | Dog      | true  |

Querying Append-Only tables requires a different strategy than you would normally use. For more info, check out the Querying Append-Only Tables Guide.

IF

A new column is added by you to a Stitch-generated table in BigQuery.

THEN

Columns may be added to tables created by Stitch as long as they are nullable, meaning columns don’t have NOT NULL constraints.

IF

A column is deleted at the source.

THEN

How a deleted column is reflected in BigQuery depends on the table’s Replication Method.

Consider the example below. On the left is a table as it exists in the source; on the right is the table in BigQuery:

SOURCE

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-03           | true      |
| 2  | 2017-06-03           | false     |

DESTINATION

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-03           | true      |
| 2  | 2017-06-03           | false     |

Before the next job, the confirmed column will be deleted in the source. Additionally, new records will be added to the table.

Log-based Incremental Replication

Due to how binary logs are structured, changes to a source table - including adding or removing columns, changing data types, etc. - require manual intervention before replication can continue. Refer to the Log-based Incremental Replication documentation for more info.

Key-based Incremental Replication

For tables using Key-based Incremental Replication, default NULLs will be placed into the column going forward and the column will remain in the destination.

Below is the source table with the deleted confirmed column and how this would be reflected in BigQuery:

SOURCE

| id | updated_at [rep-key] |
|----+----------------------|
| 1  | 2017-06-03           |
| 2  | 2017-06-03           |
| 3  | 2017-06-04           |
| 4  | 2017-06-04           |

DESTINATION

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-03           | true      |
| 2  | 2017-06-03           | false     |
| 3  | 2017-06-04           |           |
| 4  | 2017-06-04           |           |

Full Table Replication

For tables using Full Table Replication: during the next job, the entire table will be overwritten. The table’s schema will reflect the column’s removal in the source:

SOURCE

| id | updated_at [rep-key] |
|----+----------------------|
| 1  | 2017-06-03           |
| 2  | 2017-06-03           |
| 3  | 2017-06-04           |
| 4  | 2017-06-04           |

DESTINATION

| id | updated_at [rep-key] |
|----+----------------------|
| 1  | 2017-06-03           |
| 2  | 2017-06-03           |
| 3  | 2017-06-04           |
| 4  | 2017-06-04           |

IF

You remove a column from a Stitch-replicated table in your destination.

THEN

The result of deleting a column from a Stitch-generated table depends on the type of column being removed:

  • Primary Key columns: Removing Primary Key columns will result in data being added to the table in an Append-Only fashion. As Stitch uses Primary Keys to de-dupe data, deleting Primary Key columns may result in duplicate data.
  • General columns: This refers to all columns that are not prefixed with _sdc or suffixed with a data type. For example: customer_zip, but not customer_zip__st

    If new data is detected for the removed column, Stitch will re-create it in BigQuery.

    Note: An integration must support tracking individual columns AND you must deselect the column in Stitch for the column removal to be permanent.

  • _sdc columns: Removing a Stitch replication column will prevent Stitch from upserting replicated data into BigQuery.
  • Columns with data type suffixes: Removing a column created as a result of accommodating multiple data types will prevent Stitch from loading replicated data into the table. This applies to columns with names such as: customer_zip__st, customer_zip__int, etc.

IF

Two columns are replicated that canonicalize to the same name.

THEN

For example: a table containing both CustomerId and customerid columns.

BigQuery will reject the records and create a log for the rejected records in the _sdc_rejected table in that integration’s schema.

Rejected records are logged in the _sdc_rejected table of the integration's schema. Learn more.

AND

BigQuery will surface the following error:

Field collision on [COLUMN_NAME]
FIX IT

If possible, rename one of the columns in the source so that both column names will be unique when replicated to BigQuery.

Use the _sdc_rejected table to identify the root of the issue.
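
Collisions like this can be found ahead of time by grouping source column names on their lowercased form. An illustrative sketch:

```python
from collections import defaultdict

def find_collisions(columns):
    """Group source column names that become identical once lowercased."""
    groups = defaultdict(list)
    for name in columns:
        groups[name.lower()].append(name)
    return {k: v for k, v in groups.items() if len(v) > 1}
```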

IF

A column is replicated that has a mixed-case name.

THEN

BigQuery will convert letters to lowercase. For example:

| Columns in Source | Columns in BigQuery |
|-------------------+---------------------|
| CuStOmErId        | customerid          |
| customerID        | customerid          |

IF

A column is replicated that has a name with spaces.

THEN

BigQuery will convert spaces to underscores. For example:

| Columns in Source | Columns in BigQuery |
|-------------------+---------------------|
| customer id       | customer_id         |
| CUSTOMER ID       | customer_id         |

IF

A column is replicated with a name that contains unsupported special characters.

THEN

BigQuery will convert special characters to underscores. For example:

| Columns in Source | Columns in BigQuery |
|-------------------+---------------------|
| customer!id       | customer_id         |
| !CUSTOMERID       | _customerid         |

IF

A column is replicated with a name that begins with a non-letter.

THEN

BigQuery will replace leading non-letter characters with an underscore, with the exception of leading underscores, which are preserved. For example:

| Columns in Source | Columns in BigQuery |
|-------------------+---------------------|
| 123customerid     | _customerid         |
| _customerid       | _customerid         |
| _987CUSTOMERID    | _987CUSTOMERID      |

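
Taken together, the naming transformations above can be approximated in a few lines. This is an illustrative sketch of the rules as described, not Stitch's or BigQuery's exact implementation, and edge cases may differ:

```python
import re

def canonicalize_column_name(name: str) -> str:
    name = name.lower()                      # mixed case -> lowercase
    name = re.sub(r"[^a-z0-9_]", "_", name)  # spaces / special chars -> underscores
    name = re.sub(r"^[0-9]+", "_", name)     # leading digits -> a single underscore
    return name
```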
IF

A column name contains more characters than allowed by BigQuery.

THEN

BigQuery will reject columns with names that exceed the column character limit. Other columns in the table will persist to BigQuery.

Rejected records are logged in the _sdc_rejected table of the integration's schema. Learn more.

AND

BigQuery will surface the following error:

Column name [COLUMN] is too long for BigQuery
FIX IT

If possible, change the column name in the source to be less than BigQuery’s character limit of 128 characters.

Use the _sdc_rejected table to identify the root of the issue.


Data Typing

IF

Stitch detects multiple data types for a single column.

THEN

For example: Stitch first detected that order_confirmed contained BOOLEAN data, but during a subsequent job, detected STRING values. To accommodate data of varying types, Stitch will:

  1. Store data for the original data type in the original column. In this example, only BOOLEAN values will be stored in order_confirmed.

  2. Create additional columns to store the other data types - one for each data type detected - and append the data type to the column name. In this example, an order_confirmed__st column will be created to store STRING values.

IF

A VARCHAR column is replicated to BigQuery.

THEN

BigQuery will store all VARCHAR data as TEXT.

IF

VARCHAR data is loaded that exceeds the current maximum size for the column.

THEN

No widening will occur. BigQuery will store all VARCHAR data as TEXT.

IF

A column containing date data with timezone info is replicated to BigQuery.

THEN

BigQuery has no support for timezones.

IF

A column contains timestamp data that is outside BigQuery’s supported range.

THEN

BigQuery will reject the records that fall outside the supported range.

Rejected records are logged in the _sdc_rejected table of the integration's schema. Learn more.

AND

BigQuery will surface the following error:

timestamp out of range for BigQuery on [TIMESTAMP]
FIX IT

To resolve the error, offending values in the source must be changed to be within BigQuery’s timestamp range of 0001-01-01 00:00:00 to 9999-12-31 23:59:59.999999 UTC.

Use the _sdc_rejected table to identify the root of the issue.

IF

A column contains integer data that is outside BigQuery’s supported range.

THEN

BigQuery will reject all records in the batch.

Rejected records are logged in the _sdc_rejected table of the integration's schema. Learn more.

AND

BigQuery will surface the following error:

integer out of range for BigQuery on [INTEGER]
FIX IT

To resolve the error, offending values in the source must be changed to be within BigQuery’s range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.

Use the _sdc_rejected table to identify the root of the issue.

IF

A column contains decimal data.

THEN

Decimal values will be loaded to BigQuery as the data type NUMERIC.

IF

A column contains decimal data that is outside BigQuery’s supported range.

THEN

BigQuery will accept the maximum supported range for the NUMERIC data type. Decimals that exceed this range will be logged in the rejected records table.


Schema Changes

IF

Stitch detects multiple data types for a single column.

THEN

For example: Stitch first detected that order_confirmed contained BOOLEAN data, but during a subsequent job, detected STRING values. To accommodate data of varying types, Stitch will:

  1. Store data for the original data type in the original column. In this example, only BOOLEAN values will be stored in order_confirmed.

  2. Create additional columns to store the other data types - one for each data type detected - and append the data type to the column name. In this example, an order_confirmed__st column will be created to store STRING values.

IF

Data is replicated to BigQuery that is nested, containing many top-level properties and potentially nested sub-properties.

THEN

Nested data will be maintained.

If the data contains nested arrays, BigQuery will reject the data.

Rejected records are logged in the _sdc_rejected table of the integration's schema. Learn more.

AND

BigQuery will surface the following error:

JSON table encountered too many errors, giving up. Rows: 1; errors: 1. JSON parsing error in row starting at position 0: . Nested arrays not allowed.
FIX IT

If possible, remove the offending record from the source.

Use the _sdc_rejected table to identify the root of the issue.
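
Offending records can be screened before loading by walking the JSON and looking for an array directly inside another array (nested objects, and arrays of objects, are fine). An illustrative sketch:

```python
def has_nested_arrays(value, inside_array=False):
    """True if an array appears directly inside another array."""
    if isinstance(value, list):
        if inside_array:
            return True
        return any(has_nested_arrays(v, inside_array=True) for v in value)
    if isinstance(value, dict):
        # Entering an object "resets" the array nesting.
        return any(has_nested_arrays(v) for v in value.values())
    return False
```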

IF

VARCHAR data is loaded that exceeds the current maximum size for the column.

THEN

No widening will occur. BigQuery will store all VARCHAR data as TEXT.

IF

A new column is added in a SaaS integration/data source to a table that’s currently set to replicate.

THEN
  1. Stitch will detect the new column during the integration’s next structure sync.
  2. The column will be automatically set to replicate.
  3. In the destination, the column will be appended to the end of the table.

If the table uses Key-based Incremental or Log-based Incremental Replication, only records with a Replication Key value greater than or equal to the last saved bookmark value will be selected for replication.

This means that if values for the new column were backfilled in existing records, the only way to have Stitch replicate these values is to either:

  • Update the Replication Key values for all records in the source, OR
  • Reset the integration’s Replication Keys in Stitch

Note: Records already in the destination will not be updated with backfilled values due to the Append-Only nature of BigQuery. In this case, updated records will be appended to the end of the table as new rows.

For example - these rows are replicated during the initial job:

| updated_at [rep key] | name      | age | type     |
|----------------------+-----------+-----+----------|
| 2017-08-11 00:00:00  | Finn      | 14  | Human    |
| 2017-08-11 00:00:00  | Jake      | 6   | Dog      |

Before the next job, a new column named magic is added in the source. Values for these records are backfilled in the source and as a result, their Replication Key values are updated. After the next job, the table in the destination would look like the following:

| updated_at [rep key] | name      | age | type     | magic |
|----------------------+-----------+-----+----------+-------|
| 2017-08-11 00:00:00  | Finn      | 14  | Human    |       |
| 2017-08-11 00:00:00  | Jake      | 6   | Dog      |       |
| 2017-08-13 00:13:00  | Finn      | 14  | Human    | false |
| 2017-08-13 00:13:00  | Jake      | 6   | Dog      | true  |

Querying Append-Only tables requires a different strategy than you would normally use. For more info, check out the Querying Append-Only Tables Guide.

IF

A new column is added by you to a Stitch-generated table in BigQuery.

THEN

Columns may be added to tables created by Stitch as long as they are nullable, meaning columns don’t have NOT NULL constraints.

IF

A column is deleted at the source.

THEN

How a deleted column is reflected in BigQuery depends on the table’s Replication Method.

Consider the example below. On the left is a table as it exists in the source; on the right is the table in BigQuery:

SOURCE

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-03           | true      |
| 2  | 2017-06-03           | false     |

DESTINATION

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-03           | true      |
| 2  | 2017-06-03           | false     |

Before the next job, the confirmed column will be deleted in the source. Additionally, new records will be added to the table.

Log-based Incremental Replication

Due to how binary logs are structured, changes to a source table - including adding or removing columns, changing data types, etc. - require manual intervention before replication can continue. Refer to the Log-based Incremental Replication documentation for more info.

Key-based Incremental Replication

For tables using Key-based Incremental Replication, default NULLs will be placed into the column going forward and the column will remain in the destination.

Below is the source table with the deleted confirmed column and how this would be reflected in BigQuery:

SOURCE

| id | updated_at [rep-key] |
|----+----------------------|
| 1  | 2017-06-03           |
| 2  | 2017-06-03           |
| 3  | 2017-06-04           |
| 4  | 2017-06-04           |

DESTINATION

| id | updated_at [rep-key] | confirmed |
|----+----------------------+-----------|
| 1  | 2017-06-03           | true      |
| 2  | 2017-06-03           | false     |
| 3  | 2017-06-04           |           |
| 4  | 2017-06-04           |           |

Full Table Replication

For tables using Full Table Replication: during the next job, the entire table will be overwritten. The table’s schema will reflect the column’s removal in the source:

SOURCE

| id | updated_at [rep-key] |
|----+----------------------|
| 1  | 2017-06-03           |
| 2  | 2017-06-03           |
| 3  | 2017-06-04           |
| 4  | 2017-06-04           |

DESTINATION

| id | updated_at [rep-key] |
|----+----------------------|
| 1  | 2017-06-03           |
| 2  | 2017-06-03           |
| 3  | 2017-06-04           |
| 4  | 2017-06-04           |

IF

You remove a column from a Stitch-replicated table in your destination.

THEN

The result of deleting a column from a Stitch-generated table depends on the type of column being removed:

  • Primary Key columns: Removing Primary Key columns will result in data being added to the table in an Append-Only fashion. As Stitch uses Primary Keys to de-dupe data, deleting Primary Key columns may result in duplicate data.
  • General columns: This refers to all columns that are not prefixed with _sdc or suffixed with a data type. For example: customer_zip, but not customer_zip__st

    If new data is detected for the removed column, Stitch will re-create it in BigQuery.

    Note: An integration must support tracking individual columns AND you must deselect the column in Stitch for the column removal to be permanent.

  • _sdc columns: Removing a Stitch replication column will prevent Stitch from upserting replicated data into BigQuery.
  • Columns with data type suffixes: Removing a column created as a result of accommodating multiple data types will prevent Stitch from loading replicated data into the table. This applies to columns with names such as: customer_zip__st, customer_zip__int, etc.

Data Warehouse Changes

IF

The table’s Primary Key(s) is/are changed.

THEN

This scenario is unremarkable to BigQuery, as BigQuery doesn’t use Primary Keys.

IF

The table’s Primary Key comment is either removed or changed to be incorrectly formatted.

THEN

No effect. Data will continue to load into the table.

IF

A new column is added by you to a Stitch-generated table in BigQuery.

THEN

Columns may be added to tables created by Stitch as long as they are nullable, meaning columns don’t have NOT NULL constraints.

IF

You remove a column from a Stitch-replicated table in your destination.

THEN

The result of deleting a column from a Stitch-generated table depends on the type of column being removed:

  • Primary Key columns: Removing Primary Key columns will result in data being added to the table in an Append-Only fashion. As Stitch uses Primary Keys to de-dupe data, deleting Primary Key columns may result in duplicate data.
  • General columns: This refers to all columns that are not prefixed with _sdc or suffixed with a data type. For example: customer_zip, but not customer_zip__st

    If new data is detected for the removed column, Stitch will re-create it in BigQuery.

    Note: An integration must support tracking individual columns AND you must deselect the column in Stitch for the column removal to be permanent.

  • _sdc columns: Removing a Stitch replication column will prevent Stitch from upserting replicated data into BigQuery.
  • Columns with data type suffixes: Removing a column created as a result of accommodating multiple data types will prevent Stitch from loading replicated data into the table. This applies to columns with names such as: customer_zip__st, customer_zip__int, etc.

IF

Indices are applied to Stitch-generated columns in the destination.

THEN

Stitch will respect the index application.

IF

SORT and/or DISTRIBUTION Keys are applied to Stitch-generated columns in the destination.

THEN

This is not applicable to BigQuery.

IF

You switch to a different destination of the same type.

THEN

This means the destination type is still BigQuery; Stitch may simply be connected to a different database in BigQuery.

  • For tables using Key-based or Log-based Incremental Replication, replication will continue using the Replication Key’s last saved maximum value. To re-replicate historical data, resetting Replication Keys is required.
  • For tables using Full Table Replication, the table will be fully replicated into the new destination during the next successful job.
  • For webhook integrations, some data loss may be possible due to the continuous, real-time nature of webhooks. Historical data must either be backfilled or replayed.


Questions? Feedback?

Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.