Professional Machine Learning Engineer Exam - Question 173

Question

You received a training-serving skew alert from a Vertex AI Model Monitoring job running in production. You retrained the model with more recent training data, and deployed it back to the Vertex AI endpoint, but you are still receiving the same alert. What should you do?

Examice · Accepted Answer

You should update the model monitoring job to use the more recent training data that was used to retrain the model. This ensures that the model monitoring job reflects the characteristics of the latest training data and aligns the baseline distribution used for monitoring with the current distribution of production data. This alignment can help prevent false positive alerts related to training-serving skew.
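To see why retraining alone does not clear the alert, here is a minimal sketch in plain Python. It assumes a single categorical feature and an L-infinity distance (the largest per-category frequency gap), which is the kind of distance Vertex AI documents for categorical features. The feature values, the `0.3` threshold, and the helper names are hypothetical and are not part of any Vertex AI API.

```python
from collections import Counter


def category_frequencies(values):
    """Return the relative frequency of each category in `values`."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def l_infinity_distance(baseline, serving):
    """Largest absolute frequency difference across all categories."""
    base_freq = category_frequencies(baseline)
    serve_freq = category_frequencies(serving)
    categories = set(base_freq) | set(serve_freq)
    return max(
        abs(base_freq.get(c, 0.0) - serve_freq.get(c, 0.0)) for c in categories
    )


# Hypothetical values of one categorical feature (illustrative only).
old_training = ["web"] * 80 + ["mobile"] * 20   # baseline the job still uses
new_training = ["web"] * 35 + ["mobile"] * 65   # data used for retraining
serving = ["web"] * 30 + ["mobile"] * 70        # current production traffic

SKEW_THRESHOLD = 0.3  # assumed alert threshold

# Distance against the stale baseline vs. the refreshed baseline.
stale_distance = l_infinity_distance(old_training, serving)
fresh_distance = l_infinity_distance(new_training, serving)

print("stale baseline alerts:", stale_distance > SKEW_THRESHOLD)
print("fresh baseline alerts:", fresh_distance > SKEW_THRESHOLD)
```

In this sketch the serving data is identical in both comparisons; only the baseline changes, and that alone decides whether the threshold is crossed.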

BlehMaks · Answer

The cause could be that the developer forgot to switch the monitoring job to the latest training dataset. The job is still comparing production data with the old training dataset, so of course it reports skew.

36bdc1e · Answer

B
This option can help align the baseline distribution of the model monitoring job with the current distribution of the production data, and eliminate the false positive alerts.

b1a8fae · Answer

A. Changing the sampling rate affects cost efficiency, not training-serving skew: https://cloud.google.com/vertex-ai/docs/model-monitoring/overview#considerations
B. The model monitoring job is already using the most recent data to detect skew.
C and D are the same, except that D is more specific, so I would lean towards D.

pikachu007 · Answer

B. Update the model monitoring job to use the more recent training data that was used to retrain the model:

This option directly aligns the model monitoring with the recently retrained model and ensures that the monitoring job reflects the characteristics of the latest training data.

pinimichele01 · Answer

This option can help align the baseline distribution of the model monitoring job with the current distribution of the production data, and eliminate the false positive alerts.

SahandJ · Answer

Is B actually the correct answer? According to the documentation, training-serving skew detection can only be enabled if the original training data is available. Furthermore, the baseline is automatically recalculated when the training data is updated.

So does this question imply that the model was retrained on new data without updating the original training dataset? If so, then B is clearly correct. If they updated the training dataset with new data and then retrained the model, the model monitoring job's baseline should have been recalculated automatically. I see no other valid answers in that case?

info_appsatori · Answer

The baseline is calculated when you create a Vertex AI Model Monitoring job, and is only recalculated if you update the training dataset for the job.
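A toy model of that lifecycle, as a sketch only: the `ToyMonitoringJob` class below is a hypothetical stand-in, not the Vertex AI SDK. It assumes the baseline is a snapshot taken at job creation that changes only when the job itself is given new training data. The feature values and the `0.3` threshold are also made up for illustration.

```python
from collections import Counter


class ToyMonitoringJob:
    """Hypothetical stand-in for a monitoring job; NOT the Vertex AI SDK."""

    def __init__(self, training_values):
        # Baseline is computed once, when the job is created.
        self.baseline = self._frequencies(training_values)

    @staticmethod
    def _frequencies(values):
        counts = Counter(values)
        return {key: count / len(values) for key, count in counts.items()}

    def update_training_data(self, training_values):
        # Baseline is recomputed only when the job's dataset is updated.
        self.baseline = self._frequencies(training_values)

    def skew(self, serving_values):
        # Largest per-category frequency gap between baseline and serving.
        serving = self._frequencies(serving_values)
        keys = set(self.baseline) | set(serving)
        return max(
            abs(self.baseline.get(k, 0.0) - serving.get(k, 0.0)) for k in keys
        )


# Hypothetical payment-method feature (illustrative values only).
old_training = ["card"] * 90 + ["wallet"] * 10
new_training = ["card"] * 55 + ["wallet"] * 45
serving = ["card"] * 50 + ["wallet"] * 50

THRESHOLD = 0.3  # assumed alert threshold

job = ToyMonitoringJob(old_training)

# Retraining and redeploying the model happens elsewhere and never
# touches the job, so the skew it reports is unchanged.
skew_after_retrain = job.skew(serving)

# Only updating the job's own training data refreshes the baseline.
job.update_training_data(new_training)
skew_after_job_update = job.skew(serving)

print(skew_after_retrain > THRESHOLD, skew_after_job_update > THRESHOLD)
```

The point of the sketch is that the job holds its own state: nothing about redeploying a model reaches `self.baseline` unless the job is explicitly updated.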

AzureDP900 · Answer

C. Temporarily disable the alert. Enable the alert again after a sufficient amount of new production traffic has passed through the Vertex AI endpoint.

Here's why:

You've already retrained the model with more recent training data and deployed it back to the Vertex AI endpoint, but the alert persists.
This suggests that the model is still adapting to the changing data distribution in production.
Temporarily disabling the alert will give the model a chance to adjust to the new data distribution before the monitoring job starts firing alerts again.
Once enough new traffic has passed through, you can re-enable the alert and continue monitoring the model's performance.
