Professional Machine Learning Engineer Exam - Question 185


You have developed a BigQuery ML model that predicts customer churn, and deployed the model to Vertex AI Endpoints. You want to automate the retraining of your model by using minimal additional code when model feature values change. You also want to minimize the number of times that your model is retrained to reduce training costs. What should you do?

Correct Answer: D

To automate the retraining of your BigQuery ML model efficiently, create a Vertex AI Model Monitoring job configured to monitor training/serving skew. This setup will detect discrepancies between the training data and the data your model processes in production, ensuring that your model is retrained only when necessary. Configuring alert monitoring to publish messages to a Pub/Sub queue when a skew alert is detected, and using a Cloud Function to monitor the queue and trigger retraining in BigQuery, helps minimize additional code and reduces training costs by focusing on relevancy and need-based retraining.
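For concreteness, the monitoring half of this setup can be sketched with the Vertex AI Python SDK. This is a minimal illustration, not code from the exam: the project, endpoint ID, feature names, and BigQuery training table are placeholders, and routing alerts to a Pub/Sub queue is assumed to go through a Cloud Logging sink on the anomaly logs that `enable_logging` produces.

```python
# Sketch: create a Vertex AI Model Monitoring job that watches a deployed
# endpoint for training/serving skew. PROJECT, ENDPOINT_ID, the feature
# names, and the BigQuery table are placeholders.
from google.cloud import aiplatform
from google.cloud.aiplatform import model_monitoring

aiplatform.init(project="PROJECT", location="us-central1")

# Baseline: the BigQuery table the churn model was trained on.
skew_config = model_monitoring.SkewDetectionConfig(
    data_source="bq://PROJECT.dataset.churn_training_data",
    target_field="churned",
    skew_thresholds={"tenure_months": 0.3, "monthly_charges": 0.3},
)

objective_config = model_monitoring.ObjectiveConfig(
    skew_detection_config=skew_config,
)

monitoring_job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="churn-skew-monitor",
    endpoint="projects/PROJECT/locations/us-central1/endpoints/ENDPOINT_ID",
    logging_sampling_strategy=model_monitoring.RandomSampleConfig(sample_rate=0.8),
    schedule_config=model_monitoring.ScheduleConfig(monitor_interval=24),  # hours
    alert_config=model_monitoring.EmailAlertConfig(
        user_emails=["ml-team@example.com"],
        enable_logging=True,  # also writes anomalies to Cloud Logging,
                              # from which a log sink can publish to Pub/Sub
    ),
    objective_configs=objective_config,
)
```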

Discussion

15 comments
guilhermebutzke (Option: D)
Feb 18, 2024

My answer: D. Given the emphasis on "model feature values change" in the question, the most suitable option is D. Although option C involves monitoring prediction drift, which may indirectly capture changes in feature values, option D directly addresses the need by monitoring training/serving skew: detecting discrepancies between the training and serving data distributions is exactly what "retrain when model feature values change" calls for.

b1a8fae (Option: D)
Jan 15, 2024

I would avoid using TensorFlow Data Validation to minimize the code written. That leaves us with options C and D. Now, since it is the values of the features that we want to flag and not the values of the predictions, this sounds more like a training/serving skew situation than prediction drift. Hence, I would go for D.

CHARLIE2108 (Option: D)
Feb 16, 2024

Changed my mind: it's D.

bobjr (Option: C)
Jun 6, 2024

Skew should be detected at the beginning of productionizing the model: a skew test compares the training data against the real input data, and a skew indicates you trained on a dataset that is not aligned with the data you receive in production. Drift applies when the model works well at the beginning, but the world changes and the input data changes with it; drift is the longer-term concern. Here it is a drift issue.

Prakzz
Jun 30, 2024

Agreed

vale_76_na_xxx (Option: C)
Jan 8, 2024

I go with C. 1. Create a Vertex AI Model Monitoring job configured to monitor prediction drift -> if the model is already in production we have to consider prediction drift. 2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected -> set up Pub/Sub notification channels. 3. Use a Cloud Function to monitor the Pub/Sub queue and trigger retraining in BigQuery -> to retrain on the new data in BQ (see the sketch below).
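A hypothetical sketch of step 3, assuming a 2nd-gen, Pub/Sub-triggered Cloud Function; the dataset, model, and table names are illustrative. The retraining itself is just a CREATE OR REPLACE MODEL statement run against BigQuery:

```python
# Sketch: Cloud Function triggered by the monitoring-alert Pub/Sub topic.
# Re-runs the BigQuery ML training query when an alert message arrives.
import base64

import functions_framework
from google.cloud import bigquery

# Illustrative BigQuery ML retraining statement.
RETRAIN_QUERY = """
CREATE OR REPLACE MODEL `my_dataset.churn_model`
OPTIONS (model_type = 'LOGISTIC_REG', input_label_cols = ['churned']) AS
SELECT * FROM `my_dataset.churn_training_data`
"""


@functions_framework.cloud_event
def retrain_on_alert(cloud_event):
    # The Pub/Sub payload (the monitoring alert) arrives base64-encoded.
    payload = base64.b64decode(cloud_event.data["message"]["data"]).decode("utf-8")
    print(f"Monitoring alert received: {payload}")

    client = bigquery.Client()
    job = client.query(RETRAIN_QUERY)  # kicks off retraining in BigQuery
    job.result()                       # wait for the CREATE MODEL job to finish
    print(f"Retraining job {job.job_id} finished.")
```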

36bdc1e (Option: C)
Jan 13, 2024

C. The best option for automating retraining with minimal additional code when feature values change, while minimizing how often the model is retrained, is to create a Vertex AI Model Monitoring job configured to monitor prediction drift, configure alert monitoring to publish a message to a Pub/Sub queue when an alert is detected, and use a Cloud Function to monitor the Pub/Sub queue and trigger retraining in BigQuery. This leverages the power and simplicity of Vertex AI, Pub/Sub, and Cloud Functions to monitor model performance and retrain only when needed. Vertex AI is a unified platform for building and deploying machine learning solutions on Google Cloud.

b1a8fae (Option: C)
Jan 15, 2024

After reconsidering, I think it is C:
- No need to use TensorFlow to enable model monitoring, as stated here: https://cloud.google.com/vertex-ai/docs/model-monitoring/using-model-monitoring (even if it uses it under the hood: https://cloud.google.com/vertex-ai/docs/model-monitoring/overview#calculating-skew-and-drift).
- The problem speaks about alerting on model feature changes, which happen over time and are measured against a baseline of historical production data -> prediction drift. (If the problem specified changes compared to the training data, then it would be training/serving skew.) (https://cloud.google.com/vertex-ai/docs/model-monitoring/monitor-explainable-ai#feature_attribution_training-serving_skew_and_prediction_drift)

ddogg (Option: C)
Jan 31, 2024

Option C directly addresses the requirements:
- Vertex AI Model Monitoring: efficiently monitors prediction drift via distribution-distance metrics.
- Pub/Sub alerts: a notification is published only when significant drift is detected, minimizing unnecessary retraining.
- Cloud Function: reacts to Pub/Sub messages and triggers retraining in BigQuery using minimal additional code.

CHARLIE2108 (Option: C)
Feb 9, 2024

I go with C, but D is pretty similar. C -> prediction drift (when the distribution of the serving data changes significantly over time in production). D -> training/serving skew (when the distribution of specific features in the serving data differs significantly from the training data).

CHARLIE2108
Feb 16, 2024

It's D
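To make the C/D contrast concrete, here is how the two objectives differ in the Vertex AI SDK (a sketch; the thresholds, feature name, and baseline table are illustrative). Skew needs the training-data baseline; drift does not.

```python
from google.cloud.aiplatform import model_monitoring

# Option D: training/serving skew - compares serving features against the
# training-data baseline, so it needs the training data source.
skew = model_monitoring.ObjectiveConfig(
    skew_detection_config=model_monitoring.SkewDetectionConfig(
        data_source="bq://PROJECT.dataset.churn_training_data",
        skew_thresholds={"tenure_months": 0.3},
    )
)

# Option C: prediction drift - compares serving features against recent
# production traffic; no training baseline is required.
drift = model_monitoring.ObjectiveConfig(
    drift_detection_config=model_monitoring.DriftDetectionConfig(
        drift_thresholds={"tenure_months": 0.3},
    )
)
```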

pikachu007 (Option: C)
Jan 11, 2024

A and B: TensorFlow Data Validation jobs require more setup and maintenance, and they might not integrate as seamlessly with Vertex AI Endpoints for automated retraining. D: Monitoring training/serving skew focuses on differences between training and deployment environments, which might not directly address feature value changes.

BlehMaks (Option: D)
Jan 14, 2024

We might need to retrain if the feature data distributions in production and training are significantly different (training/serving skew). Prediction drift occurs when the feature data distribution in production changes significantly over time. Should we retrain our model every time we see prediction drift? I don't think so; it is better to analyze why the drift happens. https://cloud.google.com/vertex-ai/docs/model-monitoring/overview#considerations

BlehMaks (Option: D)
Jan 14, 2024

I've changed my mind: it's D. https://www.evidentlyai.com/blog/machine-learning-monitoring-data-and-concept-drift

pinimichele01 (Option: D)
Apr 7, 2024

It's D

pinimichele01
Apr 23, 2024

see guilhermebutzke

gscharly (Option: D)
Apr 17, 2024

I go with D

Shno (Option: D)
Apr 30, 2024

If the model training is done through BigQuery ML, we don't have access to the training data after export, so I don't understand how training/serving skew can be applied. Can someone voting in favour of D clarify?