Certified Data Engineer Professional Exam - Question 35

Question

To reduce storage and compute costs, the data engineering team has been tasked with curating a series of aggregate tables leveraged by business intelligence dashboards, customer-facing applications, production machine learning models, and ad hoc analytical queries.

The data engineering team has been made aware of new requirements from a customer-facing application, which is the only downstream workload they manage entirely. As a result, an aggregate table used by numerous teams across the organization will need to have a number of fields renamed, and additional fields will also be added.

Which of the solutions addresses the situation while minimally interrupting other teams in the organization without increasing the number of tables that need to be managed?

Examice · Accepted Answer

To accommodate schema changes while minimally disrupting other teams, the best approach is to configure a new table that includes the necessary fields and new names for the customer-facing application. Then, create a view that aliases select fields from the new table to maintain the original data schema and table name. This allows the data engineering team to meet the specific needs of the customer-facing application without altering the table structure that other teams rely on, thereby avoiding interruptions. Additionally, this method doesn't increase the number of tables that need to be managed, as the view serves to provide compatibility rather than adding another table requiring maintenance.
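The new-table-plus-view pattern described above can be sketched as follows. This is a minimal illustration, not the exam's reference solution: it uses Python's built-in sqlite3 module as a stand-in for the actual platform, and the table and column names (sales_agg, sales_agg_v2, region, total, sales_region, total_revenue, order_count) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical original aggregate table that other teams already query.
conn.execute("CREATE TABLE sales_agg (region TEXT, total REAL)")
conn.execute("INSERT INTO sales_agg VALUES ('EMEA', 120.0), ('APAC', 80.0)")

# 1. Build the new table with the renamed fields plus one added field.
conn.execute(
    "CREATE TABLE sales_agg_v2 "
    "(sales_region TEXT, total_revenue REAL, order_count INTEGER)"
)
conn.execute(
    "INSERT INTO sales_agg_v2 (sales_region, total_revenue, order_count) "
    "SELECT region, total, NULL FROM sales_agg"
)

# 2. Retire the old table and reuse its name for a view that aliases
#    the new column names back to the original ones.
conn.execute("DROP TABLE sales_agg")
conn.execute(
    "CREATE VIEW sales_agg AS "
    "SELECT sales_region AS region, total_revenue AS total FROM sales_agg_v2"
)

# Existing queries against the old name and old columns keep working.
legacy_rows = conn.execute(
    "SELECT region, total FROM sales_agg ORDER BY region"
).fetchall()
print(legacy_rows)

# Only one physical table remains to be managed.
table_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
).fetchone()[0]
print(table_count)
```

Other teams keep querying sales_agg with the old column names, while the customer-facing application reads sales_agg_v2 directly. On the real platform the same idea would presumably be written as CREATE TABLE and CREATE VIEW statements in its own SQL dialect.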

aksand13 · Answer

D. Option B creates both a new table and a view.

alexvno · Answer

Create a view. It can't be B, because the question says "without increasing the number of tables that need to be managed".

IWantCerts · Answer

I think it's B. D replaces the original table definition with a view, which will run up compute costs for every query that uses it.

guillesd · Answer

B makes way more sense. The number of tables managed does not increase, since the old table won't be used anymore. The view on top of the new table is not another table to manage; it just maintains the "original API" of the table to avoid breaking changes in downstream applications.

TheGhost21 · Answer

Answer is D

sturcu · Answer

B is correct.

chokthewa · Answer

B fits the facts: it doesn't interrupt end users, and the change is handled purely on the technical side. The technical team will create a view that maps the old field names to the new ones.

Quadronoid · Answer

B is definitely the best option

hal2401me · Answer

In my exam today I chose D.

ThoBustos · Answer

To me it's B, because by creating a new table plus a view that substitutes for the previous table, we still have one table. It seems to be the most efficient way to solve this. Not 100% sure though.

pravieee · Answer

I would go for B.

With option B you will run the aggregations once and store them in a table, then present these aggregations in the old schema through a view.

With D the aggregations will be done twice, for the old schema view and for the new table.
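
The compute-cost argument rests on views being evaluated at query time rather than stored. A rough sketch of that behavior, assuming (as the comments above describe) that option D puts the aggregation logic into the view itself; it again uses sqlite3 as a stand-in, and the names orders, agg_table and agg_view are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw data that the aggregate is built from.
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EMEA", 100.0), ("EMEA", 20.0), ("APAC", 80.0)],
)

# Stored aggregate: the GROUP BY runs once, when the table is written.
conn.execute(
    "CREATE TABLE agg_table AS "
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
)

# View carrying the aggregation logic: the GROUP BY runs on every query.
conn.execute(
    "CREATE VIEW agg_view AS "
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
)

# A new raw row arrives after both objects were created.
conn.execute("INSERT INTO orders VALUES ('APAC', 5.0)")

stored = conn.execute(
    "SELECT total FROM agg_table WHERE region = 'APAC'"
).fetchone()[0]
live = conn.execute(
    "SELECT total FROM agg_view WHERE region = 'APAC'"
).fetchone()[0]
print(stored, live)  # 80.0 85.0
```

The view picks up the new row because it re-aggregates the raw data each time it is queried, which is the recurring cost being pointed out. A view that only renames columns of an already aggregated table, as in B, does no such re-aggregation.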
