You have deployed a model on Vertex AI for real-time inference. During an online prediction request, you get an “Out of Memory” error. What should you do?
An "Out of Memory" error during an online prediction request suggests that the data being sent in each request is too large and exceeds the memory available on the prediction node. Sending the request again with a smaller batch of instances reduces the amount of data processed at a time, which can avoid the out-of-memory error and let the prediction request complete.
B is the answer: 429 - Out of Memory. https://cloud.google.com/ai-platform/training/docs/troubleshooting
Upvote this comment, it's the right answer!
https://cloud.google.com/ai-platform/training/docs/troubleshooting
B. Send the request again with a smaller batch of instances. An "Out of Memory" error during an online prediction request suggests that the amount of data you are sending in each request is too large and exceeds the available memory. To resolve this, try sending the request again with a smaller batch of instances; this reduces the amount of data sent in each request and helps avoid the out-of-memory error. If the problem persists, you can also try increasing the machine type or the number of instances to provide more resources for the prediction service.
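As an illustration (not part of the question itself), here is a minimal Python sketch of splitting one large online prediction request into smaller batches using the google-cloud-aiplatform SDK. The project, region, endpoint ID, and batch size are placeholders, not values from the question:

```python
# Hypothetical sketch: split one large online prediction request into
# smaller batches so each call stays within the prediction node's memory.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholder project/region
endpoint = aiplatform.Endpoint("1234567890")                    # placeholder endpoint ID

def predict_in_batches(instances, batch_size=32):
    """Send instances in chunks of `batch_size` instead of one big request."""
    predictions = []
    for start in range(0, len(instances), batch_size):
        chunk = instances[start:start + batch_size]
        response = endpoint.predict(instances=chunk)
        predictions.extend(response.predictions)
    return predictions
```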
https://cloud.google.com/ai-platform/training/docs/troubleshooting#http_status_codes
answer B as reported here: https://cloud.google.com/ai-platform/training/docs/troubleshooting
The correct answer is B.
This question is about prediction, not training - and specifically it's about _online_ prediction (a.k.a. real-time serving). All the answers are about batch workloads apart from C.
Okay, option D is also about online serving, but the error message indicates a problem for individual predictions, which will not be fixed by increasing the number of predictions per second.
@BenMS this feels like a trick question... it makes one zone in on the word "batch". https://cloud.google.com/ai-platform/training/docs/troubleshooting states that when an error occurs with an online prediction request, you usually get an HTTP status code back from the service. These are some commonly encountered codes and their meaning in the context of online prediction: 429 - Out of Memory. The processing node ran out of memory while running your model. There is no way to increase the memory allocated to prediction nodes at this time. You can try these things to get your model to run: reduce your model size by (1) using less precise variables, (2) quantizing your continuous data, or (3) reducing the size of other input features (using smaller vocab sizes, for example); or send the request again with a smaller batch of instances.
Went with B
By reducing the batch size of instances sent for prediction, you decrease the memory footprint of each request, potentially alleviating the out-of-memory issue. However, be mindful that excessively reducing the batch size might impact the efficiency of your prediction process.
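One hedged way to balance that trade-off is to start with a larger batch and shrink it only when the service reports an out-of-memory style error, rather than always sending tiny batches. The sketch below assumes the error surfaces as an HTTP 429 / TooManyRequests exception from the client library; the endpoint ID and batch sizes are placeholders:

```python
# Hypothetical sketch: shrink the batch only when the service reports an
# out-of-memory / 429-style error, so normal requests keep their efficiency.
from google.api_core import exceptions as gexc
from google.cloud import aiplatform

endpoint = aiplatform.Endpoint("1234567890")  # placeholder endpoint ID

def predict_with_backoff(instances, batch_size=64, min_batch_size=1):
    predictions = []
    start = 0
    while start < len(instances):
        chunk = instances[start:start + batch_size]
        try:
            response = endpoint.predict(instances=chunk)
        except gexc.TooManyRequests:  # assumption: OOM surfaces as HTTP 429
            if batch_size <= min_batch_size:
                raise  # even a single instance does not fit, so give up
            batch_size = max(min_batch_size, batch_size // 2)
            continue  # retry the same offset with the smaller batch
        predictions.extend(response.predictions)
        start += len(chunk)
    return predictions
```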
B) Use a smaller set of tokens.