Professional Machine Learning Engineer Exam Questions

Professional Machine Learning Engineer Exam - Question 244


You work for a textile manufacturing company. Your company has hundreds of machines, and each machine has many sensors. Your team used the sensor data to build hundreds of ML models that detect machine anomalies. Models are retrained daily, and you need to deploy these models in a cost-effective way. The models must operate 24/7 without downtime and make sub-millisecond predictions. What should you do?

Correct Answer: D

Deploying a Dataflow streaming pipeline with the RunInference API and using automatic model refresh is the most suitable approach for this scenario. This solution ensures continuous real-time processing of sensor data, which is essential for making sub-millisecond predictions and detecting machine anomalies promptly. The RunInference API allows the models to be invoked directly within the pipeline, minimizing latency and eliminating the need for separate prediction endpoints, which can be more cost-effective. Automatic model refresh ensures that the latest retrained models are always in use without downtime, maintaining the accuracy and effectiveness of anomaly detection.
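
For illustration, here is a minimal sketch of such a pipeline in Apache Beam (Python). The Pub/Sub topic, GCS model path, and JSON parsing are hypothetical placeholders, and it assumes Beam 2.46+ with the TensorFlow extras installed:

```python
import json

import apache_beam as beam
import numpy as np
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.tensorflow_inference import TFModelHandlerNumpy
from apache_beam.options.pipeline_options import PipelineOptions

# Streaming mode keeps the pipeline running 24/7 on Dataflow.
options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    _ = (
        pipeline
        # Hypothetical Pub/Sub topic carrying raw sensor readings as JSON.
        | "ReadSensors" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/machine-sensors")
        # Turn each message into the feature vector the model expects.
        | "Parse" >> beam.Map(
            lambda msg: np.array(json.loads(msg)["readings"], dtype=np.float32))
        # Run the anomaly model inside the pipeline itself -- no separate
        # prediction endpoint to provision or pay for.
        | "DetectAnomalies" >> RunInference(
            TFModelHandlerNumpy(model_uri="gs://my-bucket/models/anomaly/v1"))
        # Stand-in for a real sink (BigQuery, a Pub/Sub alert topic, ...).
        | "Log" >> beam.Map(print)
    )
```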

Discussion

6 comments
fitri001 (Option: D)
Apr 17, 2024

Why D?

Real-time predictions: Dataflow streaming pipelines continuously process sensor data, enabling real-time anomaly detection with sub-millisecond predictions. This is crucial for immediate response to potential machine issues.

RunInference API: this API invokes the TensorFlow models directly within the Dataflow pipeline for on-the-fly inference, which eliminates the need for separate prediction endpoints and reduces latency.

Automatic model refresh: since the models are retrained daily, automatic refresh ensures the pipeline uses the latest version without downtime. This is essential for maintaining model accuracy and anomaly-detection effectiveness.

Why not C? While autoscaling can handle varying workloads, Vertex AI Prediction endpoints may incur higher costs for real-time, high-volume predictions than invoking the models directly within the pipeline using RunInference.

b1a8fae (Option: D)
Jan 20, 2024

Needs to be active 24/7 -> streaming. RunInference API seems like the way to go here, using automatic model refresh on a daily basis. https://beam.apache.org/documentation/ml/about-ml/

guilhermebutzke (Option: C)
Feb 19, 2024

My answer: C. The phrase "The models must operate 24/7 without downtime and make sub-millisecond predictions" describes a case of online prediction (option B or C). Given "Models are retrained daily, and you need to deploy these models in a cost-effective way", a Vertex AI Prediction endpoint with autoscaling looks better to me than the RunInference API with automatic model refresh, because it always serves the latest retrained models and it scales. https://cloud.google.com/blog/products/ai-machine-learning/streaming-prediction-with-dataflow-and-vertex
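
For contrast, a minimal sketch of option C using the google-cloud-aiplatform SDK; the project, model ID, and machine type are hypothetical, and it assumes the retrained model has already been uploaded to the Vertex AI Model Registry:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Hypothetical model resource from the Vertex AI Model Registry.
model = aiplatform.Model(
    "projects/my-project/locations/us-central1/models/1234567890")

# Deploy with autoscaling: min_replica_count=1 keeps the endpoint
# serving 24/7; max_replica_count lets it scale with traffic.
endpoint = model.deploy(
    machine_type="n1-standard-4",
    min_replica_count=1,
    max_replica_count=5,
)

# Online prediction against the managed endpoint.
response = endpoint.predict(instances=[[0.12, 0.48, 0.33]])
print(response.predictions)
```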

sonicclasps (Option: C)
Jan 31, 2024

Low latency -> streaming. C and D could both work, but C is the GCP solution, so I chose C.

vaibavi
Feb 11, 2024

I think autoscaling will lead to downtime, at least while the replicas are updating.

pinimichele01
Apr 28, 2024

I agree, D is better.

asmgi
Jul 14, 2024

I don't think autoscaling is relevant to this task, since we have the same number of sensors at any time.

pinimichele01 (Option: D)
Apr 8, 2024

With the automatic model refresh feature, when the underlying model changes, your pipeline updates to use the new model. Because the RunInference transform automatically updates the model handler, you don't need to redeploy the pipeline. With this feature, you can update your model in real time, even while the Apache Beam pipeline is running.
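
A hedged sketch of how that wiring might look, with WatchFilePattern feeding RunInference as a side input (the bucket paths and polling interval are hypothetical placeholders):

```python
import json

import apache_beam as beam
import numpy as np
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.tensorflow_inference import TFModelHandlerNumpy
from apache_beam.ml.inference.utils import WatchFilePattern
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    # Poll GCS for newly retrained models. Emits ModelMetadata that
    # RunInference consumes as a side input, so the model is swapped
    # in place while the pipeline keeps running -- no redeploy.
    model_updates = pipeline | "WatchModel" >> WatchFilePattern(
        file_pattern="gs://my-bucket/models/anomaly/*.h5",  # hypothetical
        interval=3600)  # seconds between polls; models retrain daily

    _ = (
        pipeline
        | "ReadSensors" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/machine-sensors")
        | "Parse" >> beam.Map(
            lambda msg: np.array(json.loads(msg)["readings"], dtype=np.float32))
        | "Infer" >> RunInference(
            TFModelHandlerNumpy(model_uri="gs://my-bucket/models/anomaly/v1.h5"),
            model_metadata_pcoll=model_updates)
    )
```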

pinimichele01
Apr 13, 2024

Also, a Vertex AI endpoint is not a good fit for sub-millisecond online inference.

gscharly (Option: D)
Apr 20, 2024

Agree with fitri001.