Google Analytics feature snapshot

A high-level look at Stitch's Google Analytics integration, including release status, useful links, and the features supported in Stitch.

STITCH
Release Status: Released
Supported By: Stitch
Stitch Plan: Free

DATA SELECTION
Table Selection: Unsupported
Column Selection: Unsupported

REPLICATION SETTINGS
Anchor Scheduling: Unsupported
Advanced Scheduling: Unsupported
Table-level Reset: Unsupported
Configurable Replication Methods: Unsupported

TRANSPARENCY
Extraction Logs: Unsupported
Loading Reports: Supported

Connecting Google Analytics

Connecting your Google Analytics data to Stitch is a five-step process:

  1. Add Google Analytics as a Stitch data source
  2. Define the Historical Sync
  3. Create a replication schedule
  4. Authorize Stitch & select a Google Analytics profile to track
  5. Select metrics and dimensions to sync

Prerequisites

Before you get started, you should verify that:

  1. The user creating the integration has at least Read & Analyze permissions and that there’s recent data in the account. If the profile you use to connect doesn’t have these permissions (or there’s no data in the account), you’ll receive an error message like this:

    “Something went wrong. None of the Google Analytics profiles associated with the credentials you’ve supplied contain data that Stitch can access. Please make sure that the credentials you’ve supplied have appropriate access.”

  2. Any ad-blocking software you use is paused. Because Google authentication uses pop-ups, you may encounter issues if ad blockers aren’t disabled during setup.

Add Google Analytics as a Stitch data source

  1. Sign in to your Stitch account.
  2. On the Stitch Dashboard page, click the Add Integration button.

  3. Click the Google Analytics icon.

  4. Enter a name for the integration. This is the name that will display on the Stitch Dashboard for the integration; it’ll also be used to create the schema in your destination.

    For example, the name “Stitch Google Analytics” would create a schema called stitch_google_analytics in the destination. Note: Schema names cannot be changed after you save the integration.
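
The naming rule can be approximated with the sketch below. It assumes the schema name is derived by lowercasing the integration name and replacing each run of non-alphanumeric characters with an underscore; schema_name_for is a hypothetical helper, not part of Stitch, and the exact normalization Stitch applies may differ.

```python
import re


def schema_name_for(integration_name):
    """Approximate the destination schema name for an integration name.

    Assumption (not confirmed by Stitch): lowercase the name and collapse
    each run of characters other than a-z or 0-9 into one underscore.
    """
    lowered = integration_name.strip().lower()
    return re.sub(r"[^a-z0-9]+", "_", lowered).strip("_")


print(schema_name_for("Stitch Google Analytics"))  # stitch_google_analytics
```

Because schema names can't be changed after saving, previewing the name this way before creating the integration may help avoid a rename later.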

Define the Historical Sync

The Sync Historical Data setting defines the starting date for your Google Analytics integration. Data dated on or after this date will be replicated to your data warehouse.

Change this setting if you want to replicate data beyond the Google Analytics integration’s default setting of 30 days. For a detailed look at historical replication jobs, check out the Syncing Historical SaaS Data guide.
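
As a rough illustration of the rule above, the sketch below computes the default starting date and checks whether a given day falls inside the historical range. The names default_start_date and is_in_scope are hypothetical, and the code only models the "equal to or newer than" comparison described here, not how Stitch evaluates it internally.

```python
from datetime import date, timedelta

DEFAULT_HISTORY_DAYS = 30  # default described in the text above


def default_start_date(today):
    """Starting date when the Sync Historical Data setting is left at its default."""
    return today - timedelta(days=DEFAULT_HISTORY_DAYS)


def is_in_scope(record_date, start_date):
    """Data equal to or newer than the starting date is replicated."""
    return record_date >= start_date


start = default_start_date(date(2024, 3, 31))
print(start)                                 # 2024-03-01
print(is_in_scope(date(2024, 2, 29), start)) # False
```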

Create a replication schedule

In the Replication Frequency section, you’ll create the integration’s replication schedule. An integration’s replication schedule determines how often Stitch runs a replication job, and the time that job begins.

Google Analytics integrations support the Replication Frequency scheduling method. Anchor Scheduling and Advanced Scheduling aren’t supported for this integration.

To keep your row usage low, consider setting the integration to replicate less frequently. See the Understanding and Reducing Your Row Usage guide for tips on reducing your usage.

Authorize Stitch & Select a Google Analytics Profile

  1. Next, you’ll be prompted to log in to your Google account and to approve Stitch’s access to your Google Analytics data. Stitch will only ever read your data.
  2. Click Allow to continue.
  3. After your credentials are validated, you’ll be prompted to select the Google Analytics profile you want to connect to Stitch:

    Selecting a Google Analytics profile.

    Remember: profiles need to have Read & Analyze permissions to be detected by Stitch. If you don’t see the profile you want in this list, we recommend that you double-check the permission settings.

  4. When finished, click Continue.

Select Metrics & Dimensions to Sync

After you grant Stitch access to your Google Analytics profile, you can select the specific Metrics and Dimensions you want to replicate to your destination.

Before you get started, note the following:

  1. Only Metric/Dimension combos that adhere to Google’s compatibility rules can be saved. Stitch will display a notification if a conflict is found while you add Metrics and Dimensions, and integrations with incompatible combos can’t be saved.

    We recommend creating additional Google Analytics integrations for different reports if you run into compatibility issues.

  2. Metric/Dimension combos can’t be changed after the integration is saved. The Primary Key Stitch creates for Google Analytics integration tables is a composite key composed of the Dimensions selected during setup. Adding or removing Dimensions will change the Primary Key, potentially leading to issues with identifying new data for replication or de-duping data.

  3. Google limits the number of Metrics and Dimensions you can select. You can select up to 10 Metrics and 7 Dimensions per integration. Refer to Google’s documentation for more info on these limits.

  4. Segments and Filters aren’t currently supported. If you’re interested in us adding these features, please get in touch with us.
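
The count limits in the notes above can be checked before you open the setup page. The sketch below is a hypothetical pre-flight check (check_selection_limits is not a Stitch or Google function); it only enforces the 10 Metric and 7 Dimension limits and does not evaluate Google's compatibility rules.

```python
MAX_METRICS = 10     # limit stated in the notes above
MAX_DIMENSIONS = 7   # limit stated in the notes above


def check_selection_limits(metrics, dimensions):
    """Return a list of problems; an empty list means the counts are within limits."""
    problems = []
    if len(metrics) > MAX_METRICS:
        problems.append(
            "Too many Metrics: %d selected, limit is %d" % (len(metrics), MAX_METRICS)
        )
    if len(dimensions) > MAX_DIMENSIONS:
        problems.append(
            "Too many Dimensions: %d selected, limit is %d"
            % (len(dimensions), MAX_DIMENSIONS)
        )
    return problems


print(check_selection_limits(
    ["ga:sessions", "ga:pageviews"],
    ["ga:referralPath", "ga:country"],
))  # []
```

If a planned report exceeds either limit, splitting it across additional Google Analytics integrations is the approach recommended above for compatibility conflicts, and it applies here as well.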

Select Metrics & Dimensions

  1. In the Metrics and Dimensions fields, you can search or use the drop-down to explore your options:

    Selecting Metrics & Dimensions.

  2. Click the Metric or Dimension in the menu to add it to the configuration.
  3. To add a Custom Metric or Dimension, type its full name exactly. If you instead search for it and add an entry that looks like ga:customMetricXX, you’ll run into issues.

    For example: let’s say you want to add custom metric 10 to the configuration. To add it, you would type ga:metric10 in the Choose Metrics field like this:

    Adding a custom metric.

  4. As you add Metrics and Dimensions, Stitch will check for compatibility. If there are any conflicts, you’ll need to resolve them before you can save the integration. Use Google’s Dimensions & Metrics Explorer as a guide when selecting Metrics and Dimensions.
  5. Review your selections. Remember: once saved, Metrics and Dimensions can’t be added or removed.
  6. When you’re finished, click Save Integration.
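
To avoid typos when typing custom names in step 3, the name can be built from its index. This is a small sketch: custom_field_name is a hypothetical helper, the ga:metric10 form comes from the example above, and the ga:dimensionNN form is assumed to follow the same pattern.

```python
def custom_field_name(kind, index):
    """Build the name to type for a custom Metric or Dimension.

    kind  -- "metric" or "dimension"
    index -- the custom field's index in Google Analytics (1 or greater)
    """
    if kind not in ("metric", "dimension"):
        raise ValueError("kind must be 'metric' or 'dimension'")
    if index < 1:
        raise ValueError("index must be 1 or greater")
    return "ga:%s%d" % (kind, index)


print(custom_field_name("metric", 10))  # ga:metric10
```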

Initial and historical replication jobs

After you finish setting up Google Analytics, its Sync Status may show as Pending on either the Stitch Dashboard or the Integration Details page.

For a new integration, a Pending status indicates that Stitch is in the process of scheduling the initial replication job for the integration. This may take some time to complete.

Free historical data loads

The first seven days of replication, beginning when data is first replicated, are free. Rows replicated from the new integration during this time won’t count towards your quota. Stitch offers this as a way of testing new integrations, measuring usage, and ensuring historical data volumes don’t quickly consume your quota.


Replicating Google Analytics Data

Every time Stitch runs a replication job for Google Analytics, the last 15 days’ worth of data will be replicated.

This is applicable to all tables in the integration.

Stitch replicates data in this way to account for updates made to existing records within the default attribution window of 15 days, thus ensuring you won’t make decisions based on stale (or false) data. As a result, you may see a higher number of replicated rows than what’s being generated in Google Analytics.

Setting a higher Replication Frequency, such as every 30 minutes, re-replicates the same recent data more often and contributes to greater row usage. Selecting a lower frequency can help keep your row count low.
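
The effect of frequency on row usage can be estimated with simple arithmetic. The sketch below assumes that every job replicates every row in the 15-day window and that the report produces the same number of rows each day; both are simplifications, and estimated_rows_per_day is a hypothetical helper rather than a Stitch calculation.

```python
WINDOW_DAYS = 15  # days of data replicated by each job, per the text above


def estimated_rows_per_day(rows_per_report_day, frequency_minutes):
    """Estimate rows replicated in 24 hours for a given Replication Frequency."""
    jobs_per_day = (24 * 60) // frequency_minutes
    return jobs_per_day * WINDOW_DAYS * rows_per_report_day


# A report with 100 rows per day:
print(estimated_rows_per_day(100, 30))    # 72000 (48 jobs per day)
print(estimated_rows_per_day(100, 1440))  # 1500  (1 job per day)
```

Under these assumptions, moving from a 30-minute frequency to a daily one reduces the estimate by a factor of 48.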


Google Analytics Schema

A single table - called report - will be created in your data warehouse for each Google Analytics integration you create.

The schema of this table will be composed of the Metrics and Dimensions you selected during the setup process and two other columns: start_date and end_date.

If, for example, you selected the following Metrics and Dimensions during setup:

  • Metrics: ga:sessions, ga:pageviews
  • Dimensions: ga:referralPath, ga:country

The table would look like the SAMPLE_TABLE below.

Primary Keys

The Primary Key for a Google Analytics table is a composite key made up of the Dimension columns and the start_date and end_date columns.

For example: the Primary Key for the SAMPLE_TABLE would be: referralpath:country:start_date:end_date
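
The composition of the key can be sketched as follows. The column naming (dropping the ga: prefix and lowercasing) is inferred from the example in this section, and both function names are hypothetical.

```python
def column_name(ga_field):
    """Map a Google Analytics field to its column name, e.g. ga:referralPath -> referralpath."""
    if ga_field.startswith("ga:"):
        ga_field = ga_field[len("ga:"):]
    return ga_field.lower()


def primary_key_columns(dimensions):
    """Dimension columns followed by start_date and end_date."""
    return [column_name(d) for d in dimensions] + ["start_date", "end_date"]


key = primary_key_columns(["ga:referralPath", "ga:country"])
print(":".join(key))  # referralpath:country:start_date:end_date
```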

Table Rows & Data Pagination

Google Analytics data is paginated by day, so each row in the report table pertains to a single day. Use the start_date and end_date columns to identify which day a row covers.
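
Because each row covers one day, daily totals can be computed by grouping on start_date. The sketch below works on rows already loaded into Python as dictionaries; the sample values and the date string format are made up for illustration, and sessions_by_day is a hypothetical helper.

```python
from collections import defaultdict


def sessions_by_day(rows):
    """Sum the sessions column for each day, keyed by start_date."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["start_date"]] += row["sessions"]
    return dict(totals)


# Illustrative rows only; real values and date formats depend on your destination.
rows = [
    {"referralpath": "/a", "country": "US", "start_date": "2024-03-01", "end_date": "2024-03-01", "sessions": 5},
    {"referralpath": "/b", "country": "CA", "start_date": "2024-03-01", "end_date": "2024-03-01", "sessions": 2},
    {"referralpath": "/a", "country": "US", "start_date": "2024-03-02", "end_date": "2024-03-02", "sessions": 4},
]
print(sessions_by_day(rows))  # {'2024-03-01': 7, '2024-03-02': 4}
```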


SAMPLE_TABLE

Replication Method: Key-based Incremental
Primary Key: referralpath:country:start_date:end_date
Contains Nested Structures?: No

This is a sample table. The section below contains the real info for the report table that will be created in your data warehouse.

Table Info & Attributes

SAMPLE_TABLE Attributes

If the Metrics ga:sessions and ga:pageviews and the Dimensions ga:referralPath and ga:country were selected during setup, the schema of the table would include these attributes:

  • sessions

  • pageviews

  • referralpath

  • country

  • start_date

  • end_date

report

Replication Method: Key-based Incremental
Primary Key: dimension_columns:start_date:end_date
Contains Nested Structures?: No

This is the table that will be created in your data warehouse. The columns in this table will be the Metrics and Dimensions you selected during setup.

Table Info & Attributes

Replication & Attribution Windows

Every time a replication job runs for Google Analytics, the past 15 days' worth of data will be replicated for this table. As a result, you may see a higher number of replicated rows than what's being generated in Google Analytics.

Stitch replicates data in this way to account for updates made to existing records within Google Analytics's default attribution window, thus ensuring you won't make decisions based on stale (or false) data.
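
The sketch below illustrates why re-replicated rows can update existing records rather than duplicate them. It assumes a destination that overwrites rows sharing the same Primary Key; how rows are actually merged depends on your destination, and upsert here is a hypothetical in-memory stand-in, not Stitch's loading code.

```python
def upsert(table, batch, key_columns):
    """Insert or overwrite rows in table, a dict keyed by Primary Key tuple."""
    for row in batch:
        key = tuple(row[column] for column in key_columns)
        table[key] = row
    return table


key_columns = ["country", "start_date", "end_date"]
table = {}

# First job: the day is reported with 5 sessions.
upsert(table, [{"country": "US", "start_date": "2024-03-01", "end_date": "2024-03-01", "sessions": 5}], key_columns)
# A later job inside the 15-day window: the same day now shows 8 sessions.
upsert(table, [{"country": "US", "start_date": "2024-03-01", "end_date": "2024-03-01", "sessions": 8}], key_columns)

print(len(table))  # 1
print(table[("US", "2024-03-01", "2024-03-01")]["sessions"])  # 8
```

Both jobs count toward replicated rows, which is why the replicated row count can exceed the number of rows Google Analytics generates.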

report Attributes

For more info on the Metrics and Dimensions in Google Analytics, check out Google’s documentation.

  • Metrics selected by you during setup

  • Dimensions selected by you during setup

  • start_date

  • end_date


Google Analytics & (not set) Values

According to Google’s documentation:

(not set) is a placeholder name that Analytics uses when it hasn’t received any information for a dimension.

In Google Analytics, the reasons for a (not set) value can vary depending on the report you’re looking at. More info for the various report types can be found in Google’s documentation.
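
When analyzing replicated data, it can help to separate rows whose Dimension value is (not set) from the rest. The sketch below assumes the placeholder arrives in your destination as the literal string "(not set)" and operates on rows loaded as dictionaries; split_not_set is a hypothetical helper.

```python
NOT_SET = "(not set)"  # assumed to be stored as this literal string


def split_not_set(rows, dimension):
    """Return (known, missing): rows with a real value and rows with (not set)."""
    known = [row for row in rows if row.get(dimension) != NOT_SET]
    missing = [row for row in rows if row.get(dimension) == NOT_SET]
    return known, missing


rows = [
    {"country": "US", "sessions": 5},
    {"country": "(not set)", "sessions": 2},
]
known, missing = split_not_set(rows, "country")
print(len(known), len(missing))  # 1 1
```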



Questions? Feedback?

Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.