
Professional Machine Learning Engineer Exam - Question 262


You recently deployed a model to a Vertex AI endpoint and set up online serving in Vertex AI Feature Store. You have configured a daily batch ingestion job to update your featurestore. During the batch ingestion jobs, you discover that CPU utilization is high in your featurestore's online serving nodes and that feature retrieval latency is high. You need to improve online serving performance during the daily batch ingestion. What should you do?

A. Schedule an increase in the number of online serving nodes in your featurestore prior to the batch ingestion jobs.
B. Enable autoscaling of the online serving nodes in your featurestore.
C. Enable autoscaling for the prediction nodes of your DeployedModel in the Vertex AI endpoint.
D. Increase the worker_count in the batch ingestion job.

Correct Answer: A

To address the high CPU utilization and high feature retrieval latency during batch ingestion jobs, the best solution is to schedule an increase in the number of online serving nodes in your featurestore before the batch ingestion jobs. This approach ensures that the necessary resources are available to handle the increased load, thus improving online serving performance. Enabling autoscaling might not react quickly enough to sudden spikes in traffic during batch ingestion, and increasing the worker_count in the batch ingestion job could further strain the online serving nodes.
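As a rough sketch of how such a scheduled increase could be automated with the Vertex AI Python SDK (the legacy Feature Store surface; the project, region, featurestore name, and node counts below are placeholder assumptions):

    from google.cloud import aiplatform

    # Placeholder project, region, and featurestore values.
    aiplatform.init(project="my-project", location="us-central1")
    fs = aiplatform.Featurestore(featurestore_name="my_featurestore")

    # Run this from a scheduler (e.g., Cloud Scheduler triggering a
    # Cloud Function) shortly before the daily batch ingestion starts.
    fs.update_online_store(fixed_node_count=10)

    # ... and scale back down after ingestion completes to control cost.
    fs.update_online_store(fixed_node_count=2)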

Discussion

6 comments
b1a8fae (Option: B)
Jan 22, 2024

Vertex AI Feature Store provides two options for online serving: Bigtable and optimized online serving. Both options support autoscaling, which means that the number of online serving nodes can automatically adjust to the traffic demand. By enabling autoscaling, you can improve the online serving performance and reduce the feature retrieval latency during the daily batch ingestion. Autoscaling also helps you optimize the cost and resource utilization of your featurestore.
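For reference, a hedged sketch of what enabling autoscaling on a legacy featurestore might look like via the lower-level aiplatform_v1 client (the resource name and node bounds are placeholders):

    from google.cloud import aiplatform_v1
    from google.protobuf import field_mask_pb2

    client = aiplatform_v1.FeaturestoreServiceClient(
        client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
    )

    # Replace a fixed node count with an autoscaling range.
    featurestore = aiplatform_v1.Featurestore(
        name="projects/my-project/locations/us-central1/featurestores/my_featurestore",
        online_serving_config=aiplatform_v1.Featurestore.OnlineServingConfig(
            scaling=aiplatform_v1.Featurestore.OnlineServingConfig.Scaling(
                min_node_count=2,
                max_node_count=10,
            )
        ),
    )

    operation = client.update_featurestore(
        featurestore=featurestore,
        update_mask=field_mask_pb2.FieldMask(paths=["online_serving_config.scaling"]),
    )
    operation.result()  # update_featurestore returns a long-running operation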

daidai75 (Option: B)
Jan 23, 2024

https://cloud.google.com/vertex-ai/docs/featurestore/managing-featurestores#online_serving_nodes

pikachu007 (Option: B)
Jan 13, 2024

Option A: Manually scheduling node increases requires prior knowledge of batch ingestion times and might not be as responsive to unexpected workload spikes.
Option C: Autoscaling prediction nodes in the Vertex AI endpoint might help with model prediction latency but doesn't directly address feature retrieval latency from the featurestore.
Option D: Increasing worker_count in the batch ingestion job could speed up ingestion but might further strain online serving nodes, potentially worsening latency.
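To make Option D concrete, a minimal sketch of a legacy Feature Store batch ingestion call where worker_count appears (the entity type, feature IDs, and GCS path are hypothetical placeholders):

    from google.cloud import aiplatform

    aiplatform.init(project="my-project", location="us-central1")
    fs = aiplatform.Featurestore(featurestore_name="my_featurestore")
    entity_type = fs.get_entity_type(entity_type_id="users")

    # worker_count controls ingestion parallelism: raising it speeds up
    # the import but adds load, which can hurt online serving latency.
    entity_type.ingest_from_gcs(
        feature_ids=["age", "country"],
        feature_time="update_time",  # timestamp column in the source file
        gcs_source_uris=["gs://my-bucket/features/users.csv"],
        gcs_source_type="csv",
        worker_count=1,
    )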

iieva
Jan 16, 2024

Hey Pikachu, did you pass the exam or are you still preparing? I am preparing as well, and I have noticed that on many questions you chose the same answer I would choose, but it is not the indicated answer in my Udemy course exam preparation. Thanks and best

pikachu007
Jan 18, 2024

Yes, passed.

Sunny_M
Jan 26, 2024

Hi @pikachu007, may I ask when you passed the exam? Was it after they updated the new questions? I need to take the exam ASAP and just want to make sure the new questions are valid.

cruise93 (Option: D)
Apr 24, 2024

This question applies to the legacy Feature Store. https://cloud.google.com/vertex-ai/docs/featurestore/ingesting-batch#import_job_performance

pinimichele01
Apr 26, 2024

"CPU utilization is high in your featurestore’s online serving nodes"

bobjr (Option: A)
Jun 4, 2024

Gemini, Perplexity AI, and ChatGPT all vote A. Reasoning: option B (enable autoscaling) might not react quickly enough to sudden spikes in traffic during batch ingestion, whereas scheduling the increase ensures the resources are available when needed.

Prakzz (Option: A)
Jul 3, 2024

https://cloud.google.com/vertex-ai/docs/featurestore/managing-featurestores specifically mentions: "If CPU utilization is consistently high, consider increasing the number of online serving nodes for your featurestore."