Exam: Professional Data Engineer
Question 289

You have data located in BigQuery that is used to generate reports for your company. You have noticed that some fields in the weekly executive report do not conform to company formatting standards. For example, report errors include inconsistent telephone formats and inconsistent country code identifiers. This is a frequent issue, so you need to create a recurring job to normalize the data. You want a quick solution that requires no coding. What should you do?

    Correct Answer: A

    Given the requirement for a no-coding solution, Cloud Data Fusion and Wrangler provide a visual, no-code interface specifically designed for data transformation tasks. This allows users to design data workflows and normalization tasks without writing any code. Additionally, Cloud Data Fusion supports scheduling recurring jobs, which aligns perfectly with the need to automate the normalization process on a weekly basis.

Discussion
Matt_108 (Option: A)

Definitely A: Cloud Data Fusion and Wrangler to set up the cleanup pipeline with no coding required.

Sofiia98 (Option: A)

Cloud Data Fusion and Wrangler

scaenruy (Option: A)

A. Use Cloud Data Fusion and Wrangler to normalize the data, and set up a recurring job.

987af6b (Option: A)

A. Use Cloud Data Fusion and Wrangler to normalize the data, and set up a recurring job.

Explanation:

* No Coding Required: Cloud Data Fusion's Wrangler offers a no-code interface for data transformation tasks. You can visually design data normalization workflows without writing any code.
* Recurring Jobs: Cloud Data Fusion allows you to schedule these data normalization tasks to run on a recurring basis, meeting your need for automation.

carmltekai (Option: D)

The best solution here is D. Use BigQuery and GoogleSQL to normalize the data, and schedule recurring queries in BigQuery. Here's why:

* No-code solution: BigQuery's built-in capabilities and GoogleSQL offer a no-code way to transform and standardize data. You can leverage functions like REGEXP_REPLACE to normalize phone numbers and FORMAT to ensure consistent formatting across fields.
* Recurring jobs: BigQuery allows you to schedule queries to run regularly, which is perfect for maintaining data consistency over time.
* Quick and efficient: BigQuery is designed for large-scale data processing, making it fast and efficient for normalization tasks.
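To make the REGEXP_REPLACE idea above concrete, here is a minimal Python sketch of the same kind of regex-based phone normalization. It is only an illustration, not the query from the question: the function name `normalize_phone`, the `+<digits>` target format, the default country code `1`, and the 10-digit rule for national numbers are all assumptions made for the example. Whether writing logic like this (or the equivalent SQL) counts as "coding" is exactly what the thread is debating.

```python
import re

# Assumption for illustration: numbers without a country code are US/Canada.
DEFAULT_COUNTRY_CODE = "1"


def normalize_phone(raw: str, default_cc: str = DEFAULT_COUNTRY_CODE) -> str:
    """Return the number as '+' followed by digits only (hypothetical standard)."""
    has_plus = raw.strip().startswith("+")
    # Strip every non-digit; same idea as REGEXP_REPLACE(phone, r'\D', '') in SQL.
    digits = re.sub(r"\D", "", raw)
    if has_plus:
        return "+" + digits
    if digits.startswith("00"):
        # "00" international prefix is treated as equivalent to "+".
        return "+" + digits[2:]
    if len(digits) == 10:
        # Assumption: a bare 10-digit number is national, so prepend the default code.
        return "+" + default_cc + digits
    # Otherwise assume the country code is already included.
    return "+" + digits


if __name__ == "__main__":
    for sample in ["(415) 555-0100", "+1 415.555.0100", "0044 20 7946 0958"]:
        print(sample, "->", normalize_phone(sample))
```

In BigQuery the same cleanup would be expressed as a SQL expression inside a scheduled query; the sketch only shows the shape of the transformation.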

carmltekai

Why other options aren't as suitable:

A. Cloud Data Fusion and Wrangler: While powerful, these tools might be overkill for a simple normalization task and could involve a steeper learning curve.
B. Dataflow SQL: Dataflow is primarily for stream processing and might not be the most efficient for batch transformations on data already in BigQuery.
C. Dataproc Serverless: This involves using a Spark job, which requires coding and might be more complex than necessary for this task.

fitri001 (Option: A)

https://cloud.google.com/data-fusion/docs

SohiniV (Option: D)

As per ChatGPT, Option D allows you to utilize BigQuery's SQL capabilities to write queries that normalize the data according to company standards. You can then schedule these queries to run on a recurring basis using BigQuery's scheduled queries feature. This feature allows you to specify a schedule (e.g., weekly) for executing SQL queries automatically. This approach requires no additional setup or coding outside of BigQuery, making it a quick and straightforward solution to address the issue of data normalization.

SohiniV

Any views on this?

RenePetersen

Wouldn't writing the SQL transformation be considered coding? The question specifically states that a solution requiring no coding is needed.

jreale64

While Cloud Data Fusion with Wrangler offers a visual interface for data wrangling, it requires setting up the environment and potentially writing code for transformations, so it is not appropriate. I think D.

JyoGCP (Option: A)

Option A