Professional Machine Learning Engineer Exam Questions

Professional Machine Learning Engineer Exam - Question 255


You have recently used TensorFlow to train a classification model on tabular data. You have created a Dataflow pipeline that can transform several terabytes of data into training or prediction datasets consisting of TFRecords. You now need to productionize the model, and you want the predictions to be automatically uploaded to a BigQuery table on a weekly schedule. What should you do?

Correct Answer: AC

To productionize the model and ensure that predictions are automatically uploaded to a BigQuery table on a weekly schedule, you should import the model into Vertex AI and deploy it to a Vertex AI endpoint. Additionally, create a Dataflow pipeline that can reuse the data processing logic to send requests to the endpoint and then upload the predictions to a BigQuery table. This approach leverages the existing Dataflow pipeline and ensures efficient processing and integration with BigQuery.
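As a rough illustration of that approach, the sketch below shows what such a Beam/Dataflow step might look like, assuming a model already deployed to a Vertex AI endpoint. The project, bucket, table, endpoint ID, and the transform_row() preprocessing helper are hypothetical placeholders, not details from the question.

```python
# Minimal sketch: reuse a Beam/Dataflow pipeline to call a Vertex AI endpoint
# and write the predictions to BigQuery. All names below are placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

PROJECT = "my-project"                                  # placeholder
ENDPOINT_ID = "1234567890"                              # placeholder endpoint ID
BQ_TABLE = "my-project:ml_serving.weekly_predictions"   # placeholder table


def transform_row(line):
    """Placeholder standing in for the existing pipeline's preprocessing logic."""
    values = line.split(",")
    return {"feature_1": float(values[0]), "feature_2": float(values[1])}


class PredictViaEndpoint(beam.DoFn):
    """Sends each preprocessed instance to the deployed Vertex AI endpoint."""

    def setup(self):
        from google.cloud import aiplatform  # imported on the Dataflow worker
        aiplatform.init(project=PROJECT, location="us-central1")
        self.endpoint = aiplatform.Endpoint(ENDPOINT_ID)

    def process(self, instance):
        response = self.endpoint.predict(instances=[instance])
        yield {"instance": str(instance), "prediction": str(response.predictions[0])}


def run():
    options = PipelineOptions(
        runner="DataflowRunner",
        project=PROJECT,
        region="us-central1",
        temp_location="gs://my-bucket/tmp",   # placeholder bucket
    )
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadSource" >> beam.io.ReadFromText("gs://my-bucket/raw/*.csv")
            | "Preprocess" >> beam.Map(transform_row)
            | "Predict" >> beam.ParDo(PredictViaEndpoint())
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                BQ_TABLE,
                schema="instance:STRING,prediction:STRING",
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```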

Discussion

14 comments
BlehMaks | Option: C
Jan 27, 2024

The DataflowPythonJobOp operator lets you create a Vertex AI Pipelines component that prepares data by submitting a Python-based Apache Beam job to Dataflow for execution (https://cloud.google.com/vertex-ai/docs/pipelines/dataflow-component#dataflowpythonjobop). Using ModelBatchPredictOp, we can specify an output location for Vertex AI to store the prediction results (https://cloud.google.com/vertex-ai/docs/pipelines/batchprediction-component). A is incorrect since we don't need an endpoint for batch predictions. B is incorrect because creating a new Dataflow pipeline is redundant.
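For illustration, here is a minimal sketch of the pipeline this comment describes, assuming the google_cloud_pipeline_components package (DataflowPythonJobOp, WaitGcpResourcesOp, ModelBatchPredictOp) and a KFP importer for a model already uploaded to Vertex AI. The project, bucket paths, model resource name, and BigQuery destination are hypothetical placeholders.

```python
# Minimal sketch: Vertex AI pipeline that reuses the Beam preprocessing code via
# Dataflow and runs batch prediction straight into BigQuery. Names are placeholders.
from kfp import dsl
from google_cloud_pipeline_components.types import artifact_types
from google_cloud_pipeline_components.v1.batch_predict_job import ModelBatchPredictOp
from google_cloud_pipeline_components.v1.dataflow import DataflowPythonJobOp
from google_cloud_pipeline_components.v1.wait_gcp_resources import WaitGcpResourcesOp

PROJECT = "my-project"                                             # placeholder
REGION = "us-central1"                                             # placeholder
MODEL = "projects/my-project/locations/us-central1/models/123"     # placeholder model resource name
BQ_DESTINATION = "bq://my-project.ml_serving.weekly_predictions"   # placeholder output table


@dsl.pipeline(name="weekly-batch-prediction")
def weekly_batch_prediction():
    # Reuse the existing Beam preprocessing code by submitting it as a Dataflow job.
    prep = DataflowPythonJobOp(
        project=PROJECT,
        location=REGION,
        python_module_path="gs://my-bucket/src/preprocess.py",   # placeholder: existing Beam script
        temp_location="gs://my-bucket/tmp",
        args=["--output", "gs://my-bucket/prediction_input/"],
    )
    # DataflowPythonJobOp returns once the job is submitted, so block until it finishes.
    wait = WaitGcpResourcesOp(gcp_resources=prep.outputs["gcp_resources"])

    # Reference the model that was already imported into Vertex AI (no endpoint needed).
    model = dsl.importer(
        artifact_uri=f"https://{REGION}-aiplatform.googleapis.com/v1/{MODEL}",
        artifact_class=artifact_types.VertexModel,
        metadata={"resourceName": MODEL},
    )

    # Run batch prediction on the TFRecords and write the results straight to BigQuery.
    batch_predict = ModelBatchPredictOp(
        project=PROJECT,
        location=REGION,
        job_display_name="weekly-batch-predict",
        model=model.output,
        gcs_source_uris=["gs://my-bucket/prediction_input/*"],
        instances_format="tf-record",
        predictions_format="bigquery",
        bigquery_destination_output_uri=BQ_DESTINATION,
    )
    batch_predict.after(wait)
```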

pertoise | Option: C
Feb 29, 2024

The answer is C. No need for an endpoint here: simply specify the BigQuery table URI in the ModelBatchPredictOp parameter, and the predictions are automatically uploaded to BigQuery.

gscharly | Option: C
Apr 19, 2024

No need to deploy to an endpoint, since we need batch predictions. ModelBatchPredictOp can upload data to BigQuery, and the Dataflow pipeline logic can be implemented with DataflowPythonJobOp.

guilhermebutzke | Option: B
Feb 16, 2024

My answer: B. It is the most complete answer and reuses the pipeline that has already been created. It doesn't make sense to use DataflowPythonJobOp when you have already created a Dataflow pipeline that does the same thing.

pinimichele01 | Option: C
Apr 14, 2024

ModelBatchPredictOp -> uploads automatically to BigQuery. No need for an endpoint -> C.

fitri001 | Option: B
Apr 17, 2024

Option A: Vertex AI Pipelines' ModelBatchPredictOp is designed for batch prediction within pipelines, not for serving models through an endpoint.
Option C: Importing the model directly into BigQuery is not feasible for TensorFlow models.
Option D: Vertex AI Pipelines' BigqueryPredictModelJobOp assumes the model is already trained and hosted in BigQuery ML, which isn't the case here.

pinimichele01
Apr 17, 2024

"Importing the model directly into BigQuery is not feasible for TensorFlow models" -> not true.

rcapj | Option: B
Jun 21, 2024

B.
Vertex AI deployment: Vertex AI provides a managed environment for deploying machine learning models. It simplifies the process and ensures scalability.
Dataflow pipeline reuse: Reusing the existing Dataflow pipeline for data processing leverages your existing code and avoids redundant logic.
Model endpoint predictions: Sending requests to the deployed model endpoint allows for efficient prediction generation.
BigQuery upload: Uploading predictions directly to BigQuery from the Dataflow pipeline integrates seamlessly with your data storage.

pikachu007 | Option: B
Jan 13, 2024

Option A: Vertex AI Pipelines are excellent for orchestrating ML workflows but might not be as efficient as Dataflow for large-scale data processing, especially with existing Dataflow logic.
Option C: While Vertex AI Pipelines can handle model loading and prediction, Dataflow is better suited for large-scale data processing and BigQuery integration.
Option D: BigQuery ML is primarily for in-database model training and prediction, not ideal for external models or large-scale data processing.

daidai75 | Option: B
Jan 23, 2024

The answer is B; options A and C don't mention how to import the prediction results into BigQuery.

tavva_prudhvi | Option: B
Feb 10, 2024

Not A or C, as they do not explicitly mention how the predictions will be uploaded to BigQuery.

pinimichele01 | Option: C
Apr 8, 2024

agree with BlehMaks

fitri001 | Option: B
Apr 17, 2024

TFRecord is a file format designed by TensorFlow for storing data in a way that is efficient for the machine learning framework to read.

Prakzz | Option: B
Jul 4, 2024

Only option B talks about loading the data into BigQuery.

AzureDP900 | Option: B
Jul 5, 2024

B is right because: 1) You've already trained a classification model using TensorFlow, so you need to productionize it by deploying it to a Vertex AI endpoint. 2) To automate the prediction process on a weekly schedule, you can create a Dataflow pipeline that reuses your existing data processing logic. This pipeline will send requests to the deployed model for inference and then upload the predicted results to BigQuery.
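Whichever option is picked, the weekly trigger still has to come from somewhere. For the pipeline-based approach, one possibility is Vertex AI's built-in pipeline schedules; below is a minimal sketch assuming the Vertex AI SDK's PipelineJob.create_schedule method and the weekly_batch_prediction pipeline sketched earlier, with placeholder project and bucket names. For the Dataflow-only approach, a Cloud Scheduler job that launches a Dataflow template on the same cron would play the analogous role.

```python
# Minimal sketch: compile the pipeline from the earlier sketch and run it weekly.
# Assumes the weekly_batch_prediction function defined above; names are placeholders.
from google.cloud import aiplatform
from kfp import compiler

compiler.Compiler().compile(
    pipeline_func=weekly_batch_prediction,
    package_path="weekly_batch_prediction.json",
)

aiplatform.init(project="my-project", location="us-central1")  # placeholders

job = aiplatform.PipelineJob(
    display_name="weekly-batch-prediction",
    template_path="weekly_batch_prediction.json",
    pipeline_root="gs://my-bucket/pipeline_root",  # placeholder bucket
)

# Run every Monday at 06:00 UTC.
job.create_schedule(
    display_name="weekly-batch-prediction-schedule",
    cron="0 6 * * 1",
)
```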