
Certified Data Engineer Professional Exam - Question 130


The business intelligence team has a dashboard configured to track various summary metrics for retail stores. This includes total sales for the previous day alongside totals and averages for a variety of time periods. The fields required to populate this dashboard have the following schema:

For demand forecasting, the Lakehouse contains a validated table of all itemized sales updated incrementally in near real-time. This table, named products_per_order, includes the following fields:

Because reporting on long-term sales trends is less volatile, analysts using the new dashboard only require data to be refreshed once daily. Because the dashboard will be queried interactively by many users throughout a normal business day, it should return results quickly and reduce total compute associated with each materialization.

Which solution meets the expectations of the end users while controlling and limiting possible costs?

Correct Answer: A

Because the dashboard only needs to be refreshed once daily, a nightly batch job that extracts and saves the values the dashboard requires is the most efficient solution. The expensive aggregation over the itemized sales table runs once per day, and every interactive dashboard query then reads the small precomputed result, so results return quickly and the compute spent per query stays low. Options involving live dashboards, views, or in-memory tables would recompute the aggregates more often than the analysts need, adding compute cost and complexity without any benefit for a dashboard that does not require real-time updates.
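To make the pattern concrete, here is a minimal sketch of "aggregate once per night, read many times per day". It is an illustration only: it uses Python's built-in `sqlite3` so it can run anywhere, whereas on the Lakehouse this would be a scheduled job writing a table. The column names (`store_id`, `order_date`, `sales`), the summary table name `store_sales_summary`, and the function names are all assumptions, since the question's schemas are not shown here.

```python
import sqlite3


def setup_tables(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) source table and the small summary table."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS products_per_order (
            store_id   TEXT,
            order_date TEXT,   -- ISO date string, e.g. '2024-05-28'
            sales      REAL
        );
        CREATE TABLE IF NOT EXISTS store_sales_summary (
            store_id             TEXT PRIMARY KEY,
            total_sales_prev_day REAL,
            total_sales_to_date  REAL
        );
        """
    )


def refresh_store_summary(conn: sqlite3.Connection, as_of_date: str) -> None:
    """Nightly batch step: recompute the summary once from the itemized table."""
    conn.execute("DELETE FROM store_sales_summary")
    conn.execute(
        """
        INSERT INTO store_sales_summary
        SELECT store_id,
               SUM(CASE WHEN order_date = date(?, '-1 day')
                        THEN sales ELSE 0.0 END),
               SUM(sales)
        FROM products_per_order
        WHERE order_date < ?        -- only complete days before the run date
        GROUP BY store_id
        """,
        (as_of_date, as_of_date),
    )
    conn.commit()


def read_dashboard(conn: sqlite3.Connection) -> list:
    """Dashboard query: a cheap read of the precomputed rows, no aggregation."""
    cur = conn.execute(
        "SELECT store_id, total_sales_prev_day, total_sales_to_date "
        "FROM store_sales_summary ORDER BY store_id"
    )
    return cur.fetchall()
```

In this sketch, `refresh_store_summary` would be called once by the nightly schedule, while `read_dashboard` is what each interactive dashboard request runs. The cost of scanning `products_per_order` is paid once per day regardless of how many users open the dashboard.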

Discussion

1 comment
MDWPartners - Option: A
May 29, 2024

Seems correct