Online serving types

Vertex AI Feature Store offers the following types of online serving that you can use to serve features for online predictions:

Bigtable online serving

Bigtable online serving is suitable for large data volumes (on the order of terabytes of data) where high data durability is required. It's comparable to online serving in Vertex AI Feature Store (Legacy), but it isn't optimized to rapidly adjust to sudden bursts of traffic.

Generally, Bigtable online serving has higher latency than Optimized online serving, but is more cost-efficient. Bigtable online serving supports both scheduled and continuous data sync for its feature views.

Note that Bigtable online serving doesn't support embeddings management. If you want to manage and serve embeddings, use Optimized online serving.

To use Bigtable online serving, you need to perform the following steps:

  1. Create an online store for Bigtable online serving.

  2. Create a feature view instance.

  3. Fetch feature values using Bigtable online serving.
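The first step above amounts to a `featureOnlineStores.create` call whose payload selects Bigtable serving. The following is a minimal sketch of that request, assuming the REST API's `bigtable.autoScaling` field names; the project, region, and store IDs are placeholders.

```python
# Hypothetical identifiers -- replace with your own values.
PROJECT_ID = "my-project"
REGION = "us-central1"
STORE_ID = "my_bigtable_store"

# Regional endpoint for the create call (assumed URL shape).
create_url = (
    f"https://{REGION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}"
    f"/locations/{REGION}/featureOnlineStores"
    f"?feature_online_store_id={STORE_ID}"
)

# The `bigtable` field selects Bigtable online serving; the autoscaling
# bounds control the node pool that backs the store.
online_store_body = {
    "bigtable": {
        "autoScaling": {
            "minNodeCount": 1,
            "maxNodeCount": 3,
            "cpuUtilizationTarget": 50,
        }
    }
}
```

POSTing this body to the create URL (with suitable credentials) would provision the store; the feature view and fetch steps follow the same request-body pattern.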

Optimized online serving

Optimized online serving lets you serve features at latencies that are significantly lower than Bigtable online serving. It provides an online serving architecture that's faster, more scalable, and more responsive to increased data volumes. Optimized online serving is suitable in scenarios where it's critical to serve features at ultra-low latencies.

With Optimized online serving, you can serve feature values from either a public endpoint or a Private Service Connect endpoint.

All online store instances created for Optimized online serving support embeddings management.

Optimized online serving supports scheduled data sync, but doesn't support continuous data sync for feature views. If you want to use continuous data sync to sync data from the BigQuery data source to your feature views in near real-time, use Bigtable online serving.
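The scheduled-versus-continuous distinction shows up as two variants of the feature view's sync configuration. This is a sketch assuming the `syncConfig` field names from the feature view API; the cron expression is just an example.

```python
# Scheduled sync: supported by both Bigtable and Optimized online serving.
# The cron schedule controls how often data is synced from BigQuery.
scheduled_sync = {"syncConfig": {"cron": "0 * * * *"}}  # example: hourly

# Continuous sync: supported by Bigtable online serving only.
continuous_sync = {"syncConfig": {"continuous": True}}
```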

Optimized online serving with public endpoint

By default, an online store created for Optimized online serving lets you serve features with a public endpoint. To use Optimized online serving with a public endpoint, you need to perform the following steps:

  1. Create an online store for Optimized online serving with a public endpoint.

  2. Create a feature view instance.

  3. Fetch feature values using Optimized online serving from a public endpoint.
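The steps above can be sketched as two request bodies, assuming the REST API's field names: an empty `optimized` object selects Optimized online serving (a public endpoint is the default when no dedicated endpoint is configured), and `fetchFeatureValues` takes the entity key to look up. The entity key is a placeholder.

```python
# Selecting Optimized online serving: an empty `optimized` object.
# With no dedicatedServingEndpoint config, the store gets a public endpoint.
optimized_store_body = {"optimized": {}}

# After the feature view syncs, feature values are fetched per entity key
# via the feature view's fetchFeatureValues method.
fetch_body = {"dataKey": {"key": "user_123"}}  # hypothetical entity ID
```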

Optimized online serving with Private Service Connect endpoint

A Private Service Connect endpoint is a dedicated serving endpoint. Use a Private Service Connect endpoint if you want to serve features within a VPC network at lower latencies than a public endpoint. To use Optimized online serving with a Private Service Connect endpoint, you need to perform the following steps:

  1. Create an online store for Optimized online serving with a Private Service Connect endpoint.

  2. Create a feature view instance.

  3. Fetch feature values using Optimized online serving from the Private Service Connect endpoint.
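Compared with the public-endpoint setup, the only change at store-creation time is a dedicated serving endpoint block. This sketch assumes the `dedicatedServingEndpoint.privateServiceConnectConfig` field names; the allowlisted project is a placeholder.

```python
# Optimized online serving with a Private Service Connect endpoint.
psc_store_body = {
    "optimized": {},
    "dedicatedServingEndpoint": {
        "privateServiceConnectConfig": {
            "enablePrivateServiceConnect": True,
            # Projects whose VPC networks are allowed to connect
            # to the dedicated endpoint (hypothetical project ID).
            "projectAllowlist": ["consumer-project-1"],
        }
    },
}
```

After creation, feature values are fetched from the endpoint inside the allowlisted VPC networks rather than over the public internet.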

Set up online serving to serve null feature values or only non-null feature values

During online serving, if you want to serve the latest feature values exactly as they are, including null values, you must use the following setup:

  1. Register your feature data source by creating a feature group with the dense parameter set to true.

  2. Choose Bigtable online serving when you create the online store instance.

  3. Use the cron parameter to set up scheduled data sync when you create your feature views.

If you use any other configuration while setting up your feature data source and online serving, Vertex AI Feature Store serves only the latest non-null feature values. If the latest value of a feature is null, then Vertex AI Feature Store serves the most recent non-null historical value for the feature. If a non-null historical value isn't available, then Vertex AI Feature Store serves null as the feature value.
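The three settings in the null-value setup above can be sketched as the following request-body fragments, assuming the REST API's field names (`bigQuery.dense` on the feature group, `bigtable` on the online store, and `syncConfig.cron` on the feature view); all resource IDs and the BigQuery URI are placeholders.

```python
# 1. Feature group registered with dense set to true.
feature_group_body = {
    "bigQuery": {
        # Hypothetical BigQuery source table.
        "bigQuerySource": {"inputUri": "bq://my-project.my_dataset.my_table"},
        "entityIdColumns": ["entity_id"],
        "dense": True,
    }
}

# 2. Bigtable online serving (not Optimized) for the online store.
online_store_body = {
    "bigtable": {
        "autoScaling": {
            "minNodeCount": 1,
            "maxNodeCount": 3,
            "cpuUtilizationTarget": 50,
        }
    }
}

# 3. Feature view with scheduled (cron) data sync.
feature_view_body = {
    "featureRegistrySource": {
        "featureGroups": [
            # Hypothetical feature group and feature IDs.
            {"featureGroupId": "my_group", "featureIds": ["f1", "f2"]}
        ]
    },
    "syncConfig": {"cron": "0 */6 * * *"},  # example: every 6 hours
}
```

If any of the three fragments differs, for example `dense` is omitted or the store uses `optimized`, serving falls back to the latest non-null behavior described above.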

What's next