Connect Google BigQuery

Warehouse Native Experimentation allows you to run experiments on data that already lives in your data warehouse (such as Snowflake, BigQuery, Redshift, or Databricks), leveraging its processing power and existing data infrastructure. By connecting Harness FME directly to your Google BigQuery instance, you can securely query and analyze experiment data from your source of truth.

To begin, connect your Google BigQuery instance as a data source.

Prerequisites

Ensure that you have the following before getting started:

  • Access to your Google Cloud project with BigQuery enabled
  • A dataset containing your experiment data
  • A designated results table where experiment results are stored
  • A service account with Read access to assignment and metric source tables, and Write access to a results table
  • A service account key in JSON format
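If you still need to create the service account and its JSON key, the following gcloud commands are one way to do it. The account name, project ID, and file name below are placeholders, not values required by Harness FME:

```shell
# Create a dedicated service account for Warehouse Native Experimentation.
gcloud iam service-accounts create fme-whn \
  --project=my-gcp-project \
  --display-name="Harness FME Warehouse Native"

# Generate a JSON key for that service account. This is the file you
# upload to Harness FME in the authentication step.
gcloud iam service-accounts keys create fme-whn-key.json \
  --iam-account=fme-whn@my-gcp-project.iam.gserviceaccount.com
```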

Setup

Harness recommends the following best practices:

  • Use a service account rather than a personal Google Cloud user.
  • Grant read-only access to the datasets containing the assignment and event data.
  • Grant write access only to the experiment results table.
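One way to implement this least-privilege setup is with BigQuery's SQL GRANT statements, run by a project administrator. The project, dataset, table, and service account names here are placeholders:

```sql
-- Read-only access to the dataset holding assignment and event data.
GRANT `roles/bigquery.dataViewer`
ON SCHEMA `my-gcp-project.analytics_dataset`
TO "serviceAccount:fme-whn@my-gcp-project.iam.gserviceaccount.com";

-- Write access scoped to the results table only.
GRANT `roles/bigquery.dataEditor`
ON TABLE `my-gcp-project.analytics_dataset.metric_results`
TO "serviceAccount:fme-whn@my-gcp-project.iam.gserviceaccount.com";
```

The service account also needs permission to run query jobs in the project (typically `roles/bigquery.jobUser`), which is granted at the project level rather than on a dataset.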

To integrate BigQuery as a data warehouse for Warehouse Native Experimentation:

  1. From the Harness FME navigation menu, click FME Settings, then click View on a project on the Projects page and navigate to the Data Sources tab.

  2. In the Data Sources tab of your Harness FME project, select BigQuery from the list of supported data warehouses.

    Project experimentation type

    A project uses a single experimentation type based on the metric source used.

    When you add a data source to a project, the project’s experimentation type is set to Warehouse Native. All metrics in the project must then use Warehouse Native metric sources.

    If a project instead uses metrics created from an ingested event source, the project’s experimentation type is set to Cloud.

  3. Enter the following connection details:

    | Field | Description | Example |
    | --- | --- | --- |
    | Project ID | Your Google Cloud project ID. | my-gcp-project |
    | Dataset | The BigQuery dataset containing experiment data. | analytics_dataset |
    | Service Account Email | The Google Cloud service account used for authentication. | whn-team-access@project-id.iam.gserviceaccount.com |
    | Results Table Name | The table where experiment results are stored. | metric_results |

    Harness FME respects BigQuery IAM permissions. The connection only has access to resources granted to the service account.

  4. Provide authentication credentials by clicking Upload file to upload a JSON key file for your service account or clicking Paste text to enter the JSON key contents. Ensure the key corresponds to the service account email provided.
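A Google Cloud service account key file generated by gcloud or the Cloud Console typically has the following shape (sensitive values redacted here). The `client_email` field must match the Service Account Email entered in the connection details:

```json
{
  "type": "service_account",
  "project_id": "my-gcp-project",
  "private_key_id": "…",
  "private_key": "-----BEGIN PRIVATE KEY-----\n…\n-----END PRIVATE KEY-----\n",
  "client_email": "fme-whn@my-gcp-project.iam.gserviceaccount.com",
  "client_id": "…",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token"
}
```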

  5. Test the connection by clicking Test Connection. Harness FME confirms the following:

    • The service account credentials are valid.
    • The dataset exists and is accessible.
    • The service account has required read and write permissions.

    If the test fails, verify that:

    • The service account has sufficient BigQuery permissions.
    • The Project ID and dataset are correct.
    • The JSON key is valid and active.
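You can check each of these from your own machine with the bq CLI, authenticated as the same service account (names below are placeholders):

```shell
# Authenticate as the service account using its JSON key.
gcloud auth activate-service-account --key-file=fme-whn-key.json

# Verify the dataset exists and the service account can see it.
bq show my-gcp-project:analytics_dataset

# Confirm the assignment and metric source tables are visible.
bq ls my-gcp-project:analytics_dataset

# Confirm the account can run query jobs in the project.
bq query --project_id=my-gcp-project --use_legacy_sql=false 'SELECT 1'
```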
  6. Select a dataset. After authentication, you can browse available datasets and tables based on your permissions. Select the dataset containing your assignment and metric source tables.

  7. Specify a results table. Create a results table where Harness FME will write experiment analysis results, and ensure that:

    • The table exists in your BigQuery dataset.
    • The schema matches the expected format for experiment results below.

    | Field | Type | Description |
    | --- | --- | --- |
    | METRICID | STRING | Unique identifier for the metric being calculated. |
    | METRICNAME | STRING | Human-readable name of the metric being calculated. |
    | METRICRESULTID | STRING | Unique identifier representing a specific calculation per metric, per experiment, per analysis run. |
    | EXPID | STRING | Unique identifier for the experiment associated with this metric calculation. |
    | EXPNAME | STRING | Human-readable name of the experiment associated with this metric calculation. |
    | TREATMENT | STRING | The experiment variant (e.g., Control or Treatment) associated with the metric results. |
    | DIMENSIONNAME | STRING | The name of the dimension being analyzed (e.g., country, platform). |
    | DIMENSIONVALUE | STRING | The corresponding value of the analyzed dimension. |
    | ATTRIBUTEDKEYSCOUNT | INT64 | Count of unique keys (users, sessions, etc.) attributed to this metric result. |
    | REQUESTTIMESTAMP | TIMESTAMP | Timestamp when the metric computation request occurred. |
    | MIN | FLOAT64 | Minimum observed value for the metric. |
    | MAX | FLOAT64 | Maximum observed value for the metric. |
    | COUNT | INT64 | Total number of observations included in the metric calculation. |
    | SUM | FLOAT64 | Sum of all observed metric values. |
    | MEAN | FLOAT64 | Average (mean) of the metric values. |
    | P50 | FLOAT64 | 50th percentile (median) metric value. |
    | P95 | FLOAT64 | 95th percentile metric value. |
    | P99 | FLOAT64 | 99th percentile metric value. |
    | VARIANCE | FLOAT64 | Variance of the metric values. |
    | EXCLUDEDUSERCOUNT | INT64 | Number of users excluded from the analysis (due to filters, SRM, etc.). |
    | ASOFTIMESTAMP | TIMESTAMP | Timestamp representing when the result snapshot was written. |

    To create the results table with the correct structure, run the following SQL statement in Google BigQuery:

    CREATE OR REPLACE TABLE `your_project.your_dataset.metric_results` (
      METRICID STRING,
      METRICNAME STRING,
      METRICRESULTID STRING,
      EXPID STRING,
      EXPNAME STRING,
      TREATMENT STRING,
      DIMENSIONNAME STRING,
      DIMENSIONVALUE STRING,
      ATTRIBUTEDKEYSCOUNT INT64,
      REQUESTTIMESTAMP TIMESTAMP,
      MIN FLOAT64,
      MAX FLOAT64,
      COUNT INT64,
      SUM FLOAT64,
      MEAN FLOAT64,
      P50 FLOAT64,
      P95 FLOAT64,
      P99 FLOAT64,
      VARIANCE FLOAT64,
      EXCLUDEDUSERCOUNT INT64,
      ASOFTIMESTAMP TIMESTAMP
    );
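After running the statement, you can confirm the table exists with the expected columns by querying BigQuery's INFORMATION_SCHEMA (using the same placeholder project and dataset names):

```sql
SELECT column_name, data_type
FROM `your_project.your_dataset`.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'metric_results'
ORDER BY ordinal_position;
```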
  8. Save and activate. Once the test passes, click Save to create the connection.

Your BigQuery data source can now be used to create assignment and metric sources for Warehouse Native Experimentation.
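If you manage the results table through scripts or infrastructure-as-code, a small helper can generate the documented DDL so it stays consistent with the schema table above. This is a plain-Python sketch with placeholder names, not part of any Harness FME or Google API:

```python
# Expected results-table schema, mirroring the documented field list.
EXPECTED_SCHEMA = [
    ("METRICID", "STRING"), ("METRICNAME", "STRING"),
    ("METRICRESULTID", "STRING"), ("EXPID", "STRING"),
    ("EXPNAME", "STRING"), ("TREATMENT", "STRING"),
    ("DIMENSIONNAME", "STRING"), ("DIMENSIONVALUE", "STRING"),
    ("ATTRIBUTEDKEYSCOUNT", "INT64"), ("REQUESTTIMESTAMP", "TIMESTAMP"),
    ("MIN", "FLOAT64"), ("MAX", "FLOAT64"), ("COUNT", "INT64"),
    ("SUM", "FLOAT64"), ("MEAN", "FLOAT64"), ("P50", "FLOAT64"),
    ("P95", "FLOAT64"), ("P99", "FLOAT64"), ("VARIANCE", "FLOAT64"),
    ("EXCLUDEDUSERCOUNT", "INT64"), ("ASOFTIMESTAMP", "TIMESTAMP"),
]

def build_ddl(table_fqn: str) -> str:
    """Render a CREATE OR REPLACE TABLE statement for the given
    fully-qualified table name (project.dataset.table)."""
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in EXPECTED_SCHEMA)
    return f"CREATE OR REPLACE TABLE `{table_fqn}` (\n  {cols}\n);"

print(build_ddl("your_project.your_dataset.metric_results"))
```

Running the script prints the same CREATE OR REPLACE TABLE statement shown in step 7, so schema changes only need to be made in one place.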

Example BigQuery configuration

| Setting | Example |
| --- | --- |
| Vendor | BigQuery |
| Project ID | my-gcp-project |
| Dataset | analytics_dataset |
| Service Account | fme-sa@project.iam.gserviceaccount.com |
| Results Table | metric_results |