Professional Data Engineer Exam Questions

Professional Data Engineer Exam - Question 204


You want to create a machine learning model using BigQuery ML and create an endpoint for hosting the model using Vertex AI. This will enable the processing of continuous streaming data in near-real time from multiple vendors. The data may contain invalid values. What should you do?

Correct Answer: D

To process continuous streaming data in near-real time from multiple vendors while handling potentially invalid values, the best approach is to create a Pub/Sub topic and send all vendor data to it, then use Dataflow to process and sanitize that data before streaming it into BigQuery. This pipeline allows scalable, real-time processing and ensures the data is cleaned before it is used by the BigQuery ML model, whose endpoint is hosted on Vertex AI.
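The "sanitize" step in this pipeline can be illustrated with a minimal sketch of the validation logic a Dataflow ParDo would apply to each Pub/Sub message before writing to BigQuery. The field names (`vendor_id`, `reading`) and validity rules are assumptions for illustration only, not part of the exam question:

```python
from typing import Optional

def sanitize(record: dict) -> Optional[dict]:
    """Return a cleaned record, or None if the record is invalid.

    Hypothetical rules: a record must carry a non-empty vendor_id
    and a numeric reading; anything else is dropped.
    """
    vendor_id = record.get("vendor_id")
    if not vendor_id:
        return None  # drop records missing a vendor identifier
    try:
        value = float(record.get("reading"))
    except (TypeError, ValueError):
        return None  # drop records with non-numeric readings
    return {"vendor_id": vendor_id, "reading": value}

# Simulated messages arriving from the Pub/Sub topic:
messages = [
    {"vendor_id": "v1", "reading": "42.5"},
    {"vendor_id": "", "reading": "7"},        # invalid: no vendor
    {"vendor_id": "v2", "reading": "oops"},   # invalid: non-numeric
]

clean = [r for m in messages if (r := sanitize(m)) is not None]
print(clean)  # only the valid, normalized record survives
```

In an actual Dataflow job, this function would sit inside a `beam.DoFn` (or a `beam.Map` followed by a filter), with the surviving records streamed into a BigQuery table for the BigQuery ML model to consume.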

Discussion

9 comments
Atnafu (Option: D)
Nov 30, 2022

Answer is D

vidts (Option: D)
Dec 1, 2022

It's D

jkhong (Option: D)
Dec 4, 2022

Better to use Pub/Sub for streaming and reading the message data; a Dataflow ParDo can then filter out the invalid values.

odacir (Option: D)
Dec 9, 2022

D is the best option because it sanitizes the data, so it's D.

vamgcp (Option: D)
Jul 23, 2023

Option D. Dataflow provides a scalable and flexible way to process and clean the incoming data in real time before loading it into BigQuery.

zellck (Option: D)
Dec 3, 2022

D is the answer.

AzureDP900 (Option: D)
Jan 2, 2023

D. Create a Pub/Sub topic and send all vendor data to it. Use Dataflow to process and sanitize the Pub/Sub data and stream it to BigQuery.

Matt_108 (Option: D)
Jan 13, 2024

Option D

anyone_99 (Option: A)
Jul 9, 2024

Why is the answer A? After paying $44 I am getting wrong answers.