Professional Machine Learning Engineer Exam - Question 219


Your company manages an ecommerce website. You developed an ML model that recommends additional products to users in near real time based on items currently in the user’s cart. The workflow will include the following processes:

1. The website will send a Pub/Sub message with the relevant data and then receive a message with the prediction from Pub/Sub.

2. Predictions will be stored in BigQuery.

3. The model will be stored in a Cloud Storage bucket and will be updated frequently.

You want to minimize prediction latency and the effort required to update the model. How should you reconfigure the architecture?

Correct Answer: D

To minimize prediction latency and reduce the effort required to update the model, use the RunInference API with WatchFilePattern in a Dataflow streaming job. RunInference serves the model directly inside the stream-processing pipeline, giving low-latency predictions, while WatchFilePattern watches the Cloud Storage bucket and automatically swaps in updated models, so no endpoint redeployment or manual update step is needed.
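For illustration, below is a minimal sketch of this architecture as an Apache Beam streaming pipeline, following the automatic-model-refresh pattern from the Dataflow documentation cited in the discussion. The project, bucket, subscription, topic, and table names are placeholders, and a pickled scikit-learn model is assumed; the official example uses a TensorFlow model handler, but the wiring is the same.

```python
# Sketch only: all resource names below are placeholders, and a pickled
# scikit-learn model is assumed rather than any particular real model.
import json

import apache_beam as beam
import numpy as np
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy
from apache_beam.ml.inference.utils import WatchFilePattern
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)  # streaming Dataflow job

# Initial model; RunInference loads it once per worker and keeps it in memory.
model_handler = SklearnModelHandlerNumpy(
    model_uri='gs://example-bucket/models/model.pkl')

with beam.Pipeline(options=options) as p:
    # Emits ModelMetadata whenever a new file matches the pattern, so
    # RunInference hot-swaps the model without stopping the pipeline.
    model_updates = p | 'WatchModel' >> WatchFilePattern(
        file_pattern='gs://example-bucket/models/*.pkl', interval=60)

    predictions = (
        p
        | 'ReadCarts' >> beam.io.ReadFromPubSub(
            subscription='projects/example/subscriptions/cart-events')
        | 'ToFeatures' >> beam.Map(
            lambda msg: np.array(json.loads(msg)['features']))
        | 'Predict' >> RunInference(
            model_handler=model_handler,
            model_metadata_pcoll=model_updates))

    # Send the prediction back to the website via Pub/Sub...
    _ = (predictions
         | 'Encode' >> beam.Map(lambda result: json.dumps(
             {'prediction': np.asarray(result.inference).tolist()}
             ).encode('utf-8'))
         | 'Publish' >> beam.io.WriteToPubSub(
             topic='projects/example/topics/recommendations'))

    # ...and archive it in BigQuery.
    _ = (predictions
         | 'ToRow' >> beam.Map(
             lambda result: {'prediction': str(result.inference)})
         | 'Store' >> beam.io.WriteToBigQuery(
             'example:ecommerce.predictions', schema='prediction:STRING'))
```

The key design point is that the model lives in worker memory and is refreshed through the WatchFilePattern side input, so updating the model amounts to uploading a new file to the bucket; there is no endpoint to redeploy and no per-request model loading.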

Discussion

guilhermebutzke (Option: D)
Feb 14, 2024

My answer: D. This Google documentation explains: “Instead of deploying the model to an endpoint, you can use the RunInference API to serve machine learning models in your Apache Beam pipeline. This approach has several advantages, including flexibility and portability.” https://cloud.google.com/blog/products/ai-machine-learning/streaming-prediction-with-dataflow-and-vertex This documentation uses RunInference and WatchFilePattern “to automatically update the ML model without stopping the Apache Beam pipeline”: https://cloud.google.com/dataflow/docs/notebooks/automatic_model_refresh So, for “minimize prediction latency” RunInference is the suggested approach, while for “effort required to update the model” WatchFilePattern is the best approach. I think D is the best option.

ddogg (Option: D)
Feb 5, 2024

Automatic Model Updates: WatchFilePattern automatically detects model changes in Cloud Storage, leading to seamless updates without managing endpoint deployments.

pikachu007 (Option: A)
Jan 13, 2024

Low latency through serverless execution: Cloud Functions start up almost instantly, reducing prediction latency compared to alternatives that require longer setup or deployment times. In-memory model: loading the model into memory eliminates disk I/O overhead, further contributing to rapid predictions.

CHARLIE2108
Feb 6, 2024

Cloud Functions offer low latency, but they might not scale well.

Yan_X (Option: A)
Mar 10, 2024

A for me.

pinimichele01 (Option: D)
Apr 16, 2024

Agree with guilhermebutzke.

PhilipKoku (Option: C)
Jun 10, 2024

C) Expose the model as a Vertex AI endpoint.