Create materialized view replicas

This document describes how to create materialized view replicas in BigQuery. A materialized view replica is a replication of external Amazon Simple Storage Service (Amazon S3), Apache Iceberg, or Salesforce Data Cloud data in a BigQuery dataset so that the data is available locally in BigQuery, which can help you avoid data egress costs and improve query performance.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Google Cloud project.

  4. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  5. Make sure that billing is enabled for your Google Cloud project.

  6. Ensure that you have the required Identity and Access Management (IAM) permissions to perform the tasks in this document.

Required roles

To get the permissions that you need to perform the tasks in this document, ask your administrator to grant you the BigQuery Admin (roles/bigquery.admin) IAM role. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the permissions required to perform the tasks in this document. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to perform the tasks in this document:

  • bigquery.tables.create
  • bigquery.tables.get
  • bigquery.tables.getData
  • bigquery.tables.replicateData
  • bigquery.jobs.create

You might also be able to get these permissions with custom roles or other predefined roles.

For more information about BigQuery IAM, see Introduction to IAM in BigQuery.

Prepare a dataset for materialized view replicas

Before creating a materialized view replica, you must complete the following tasks:

  1. Create a dataset in a region that supports Amazon S3
  2. Create a source table in the dataset you created in the preceding step. The source table can be any of the following table types:

Create materialized view replicas

Select one of the following options:

Console

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer pane, navigate to the project and dataset where you want to create the materialized view replica, and then click View actions > Create table.

  3. In the Source section of the Create table dialog, do the following:

    1. For Create table from, select Existing table/view.
    2. For Project, enter the project where the source table or view is located.
    3. For Dataset, enter the dataset where the source table or view is located.
    4. For View, enter the source table or view that you are replicating. If you choose a view, it must be an authorized view, or if not, all tables that are used to generate that view must be located in the view's dataset.
  4. Optional: For Local materialized view max staleness, enter a max_staleness value for your local materialized view.

  5. In the Destination section of the Create table dialog, do the following:

    1. For Project, enter the project in which you want to create the materialized view replica.
    2. For Dataset, enter the dataset in which you want to create the materialized view replica.
    3. For Replica materialized view name, enter a name for your replica.
  6. Optional: Specify tags and advanced options for your materialized view replica. If you don't specify a dataset for Local Materialized View Dataset, then one is automatically created in the same project and region as the source data and named bq_auto_generated_local_mv_dataset. If you don't specify a name for Local Materialized View Name, then one is automatically created in the same project and region as the source data and given the prefix bq_auto_generated_local_mv_.

  7. Click Create table.

A new local materialized view is created (if it wasn't specified) and authorized in the source dataset. Then the materialized view replica is created in the destination dataset.

SQL

  1. Create a materialized view over the base table in the dataset that you created. You can also create the materialized view in a different dataset that is in an Amazon S3 region.
  2. Authorize the materialized view on the datasets that contain the source tables used in the query that created the materialized view.
  3. If you configured manual metadata cache refreshing for the source table, run the BQ.REFRESH_EXTERNAL_METADATA_CACHE system procedure to refresh the metadata cache.
  4. Run the BQ.REFRESH_MATERIALIZED_VIEW system procedure to refresh the materialized view.
  5. Create materialized view replicas by using the CREATE MATERIALIZED VIEW AS REPLICA OF statement:

    CREATE MATERIALIZED VIEW PROJECT_ID.BQ_DATASET.REPLICA_NAME
    OPTIONS(replication_interval_seconds=REPLICATION_INTERVAL)
    AS REPLICA OF PROJECT_ID.S3_DATASET.MATERIALIZED_VIEW_NAME;

    Replace the following:

    • PROJECT_ID: the name of your project in which you want to create the materialized view replica—for example, myproject.
    • BQ_DATASET: the name of the BigQuery dataset that you want to create the materialized view replica in—for example, bq_dataset. The dataset must be in the BigQuery region that maps to the region of the source materialized view.
    • REPLICA_NAME: the name of the materialized view replica that you want to create—for example, my_mv_replica.
    • REPLICATION_INTERVAL: specifies how often to replicate the data from the source materialized view to the replica, in seconds. Must be a value between 60 and 3,600, inclusive. Defaults to 300 (5 minutes).
    • S3_DATASET: the name of the dataset that contains the source materialized view—for example, s3_dataset.
    • MATERIALIZED_VIEW_NAME: the name of the materialized view to replicate—for example, my_mv.

    The following example creates a materialized view replica named mv_replica in bq_dataset:

    CREATE MATERIALIZED VIEW `myproject.bq_dataset.mv_replica`
    OPTIONS(
    replication_interval_seconds=600
    )
    AS REPLICA OF `myproject.s3_dataset.my_s3_mv`

After you create the materialized view replica, the replication process polls the source materialized view for changes and replicates data to the materialized view replica, refreshing the data at the interval you specified in the replication_interval_seconds or max_staleness option. If you query the replica before the first backfill completes, you get a backfill in progress error. You can query the data in the materialized view replica after the first replication completes.