Associate Data Practitioner Exam Questions

Associate Data Practitioner Exam - Question 3


Your company is building a near real-time streaming pipeline to process JSON telemetry data from small appliances. You need to process messages arriving at a Pub/Sub topic, capitalize letters in the serial number field, and write results to BigQuery. You want to use a managed service and write a minimal amount of code for underlying transformations. What should you do?

Correct Answer:

Discussion

6 comments
trashbox (Option: C)
Jan 22, 2025

A UDF in a Dataflow template is a simpler coding option than a Cloud Run service.
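
As a rough illustration of how little logic such a UDF needs, here is a minimal sketch of the transformation itself. It is written in Python only to show the string-in, string-out shape of a UDF; the Dataflow "Pub/Sub to BigQuery" template is commonly used with a JavaScript UDF, and the exact function signature the template expects should be checked against its documentation. The field name `serial_number` and the function name `process` are assumptions, since the question does not name them.

```python
import json

# Hypothetical field name: the question only says "serial number field".
SERIAL_FIELD = "serial_number"


def process(message: str) -> str:
    """Take one JSON message as a string and return the transformed JSON string."""
    record = json.loads(message)
    serial = record.get(SERIAL_FIELD)
    # Only touch the field when it is present and is a string.
    if isinstance(serial, str):
        record[SERIAL_FIELD] = serial.upper()
    return json.dumps(record)
```

For example, `process('{"serial_number": "ab-12cd"}')` returns `'{"serial_number": "AB-12CD"}'`. Reading from Pub/Sub, scaling, and writing to BigQuery are left to the template.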

rich_maverick (Option: C)
Feb 26, 2025

I agree that C is the best answer. However, answer A is also doable, is low/no code, and is also considered acceptable.

bc3f222 (Option: A)
Feb 28, 2025

Pub/Sub to BigQuery is now the recommended solution; Dataflow is no longer needed.

n2183712847 (Option: C)
Mar 6, 2025

The best option is C: use the “Pub/Sub to BigQuery” Dataflow template with a UDF. Dataflow templates are managed, serverless, and designed for streaming Pub/Sub data to BigQuery, and a UDF keeps the transformation code inside the pipeline to a minimum.
Option A (Pub/Sub to BigQuery + scheduled query) is incorrect because scheduled queries are not real-time transformations.
Option B (Pub/Sub to Cloud Storage + Cloud Run) is incorrect because it adds unnecessary complexity with Cloud Storage as an intermediary and is not truly streaming.
Option D (Pub/Sub push + Cloud Run) is incorrect because, while real-time, it requires more code in Cloud Run than a Dataflow UDF does and is less purpose-built for data pipelines than Dataflow.
Therefore Option C, the Dataflow template with a UDF, offers the best balance of managed service, minimal code, and near real-time streaming.
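
To make the "more code in Cloud Run" point concrete, here is a hedged sketch of just the message-handling part a push-based Cloud Run service would have to own. It assumes the standard Pub/Sub push envelope, in which the payload arrives base64-encoded under `message.data`. The field name `serial_number` and the function name `handle_push` are hypothetical.

```python
import base64
import json

# Hypothetical field name: the question only says "serial number field".
SERIAL_FIELD = "serial_number"


def handle_push(envelope: dict) -> dict:
    """Decode a Pub/Sub push envelope and return the transformed record."""
    message = envelope["message"]
    # Push deliveries carry the published bytes base64-encoded in "data".
    payload = base64.b64decode(message["data"]).decode("utf-8")
    record = json.loads(payload)
    serial = record.get(SERIAL_FIELD)
    if isinstance(serial, str):
        record[SERIAL_FIELD] = serial.upper()
    return record
```

This covers only decoding and the transformation. A real service would still need an HTTP endpoint, the BigQuery write, and error handling that returns the right status codes so Pub/Sub retries failed messages. A managed template takes that work off your hands.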

JAGLees (Option: C)
Mar 29, 2025

Cloud Run is not minimal code (or recommended for data pipelines). A scheduled job is not "near real-time". So the answer is Dataflow with a UDF, which gives a scalable, managed solution with minimal code.

NishantRanjanKhaware (Option: C)
May 3, 2025

Pub/Sub to BigQuery with a UDF has the least coding overhead.