Certified Data Engineer Professional Exam Questions

Certified Data Engineer Professional Exam - Question 128


An upstream system is emitting change data capture (CDC) logs that are being written to a cloud object storage directory. Each record in the log indicates the change type (insert, update, or delete) and the values for each field after the change. The source table has a primary key identified by the field pk_id.

For auditing purposes, the data governance team wishes to maintain a full record of all values that have ever been valid in the source system. For analytical purposes, only the most recent value for each record needs to be recorded. The Databricks job to ingest these records occurs once per hour, but each individual record may have changed multiple times over the course of an hour.

Which solution meets these requirements?

Correct Answer: D

The stated solution for maintaining both a full historical log and the most recent state for analytics relies on Delta Lake’s change data feed. Note that change data feed records row-level changes (inserts, updates, and deletes) made to a Delta table and exposes them to downstream readers; it does not itself parse CDC logs emitted by an external system. The raw log therefore has to be ingested into a Delta table first, which preserves every value for auditing, while the most recent change for each pk_id within the hourly batch is applied to a separate table that holds the current state for analytics.
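As a rough, engine-independent illustration of the two requirements in the question (keep every change for auditing, keep only the latest state per pk_id for analytics), here is a minimal plain-Python sketch. It is not Delta Lake code. The function name `apply_hourly_batch` and the field names `change_type`, `seq`, and `values` are assumptions made for the example; in particular, the question does not name an ordering field, so `seq` is hypothetical.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def apply_hourly_batch(
    history: List[Record],
    current: Dict[Any, Dict[str, Any]],
    batch: List[Record],
) -> None:
    """Append every change to the audit history, then apply only the
    latest change per pk_id to the current-state table."""
    # Audit requirement: keep every record exactly as received.
    history.extend(batch)

    # A key may change several times within one batch, so pick the
    # record with the highest (assumed) sequence number per pk_id.
    latest: Dict[Any, Record] = {}
    for rec in batch:
        key = rec["pk_id"]
        if key not in latest or rec["seq"] > latest[key]["seq"]:
            latest[key] = rec

    # Analytics requirement: upsert or delete based on that latest record.
    for key, rec in latest.items():
        if rec["change_type"] == "delete":
            current.pop(key, None)
        else:  # insert or update
            current[key] = rec["values"]


# Hypothetical hourly batch: pk_id 1 is inserted then updated,
# pk_id 2 is inserted then deleted.
batch = [
    {"pk_id": 1, "change_type": "insert", "seq": 1, "values": {"name": "a"}},
    {"pk_id": 1, "change_type": "update", "seq": 2, "values": {"name": "b"}},
    {"pk_id": 2, "change_type": "insert", "seq": 3, "values": {"name": "c"}},
    {"pk_id": 2, "change_type": "delete", "seq": 4, "values": {"name": "c"}},
]

history: List[Record] = []
current: Dict[Any, Dict[str, Any]] = {}
apply_hourly_batch(history, current, batch)
```

After this run, `history` holds all four change records, while `current` holds a single row for pk_id 1. Deduplicating within the batch before applying changes matters because several changes to the same key would otherwise compete for one row in the current-state table.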

Discussion

3 comments
Freyr (Option: D)
Jun 1, 2024

Correct Answer: D. Delta Lake’s change data feed feature is specifically designed to handle CDC scenarios. It processes data from external systems, tracking all changes (inserts, updates, deletes) and maintaining a detailed history of these changes. This feature allows for keeping a comprehensive log while also ensuring the most recent state is correctly reflected in analytical tables.

BrianNguyen95 (Option: D)
Jun 5, 2024

Delta Lake provides built-in change data feed functionality. It captures changes (inserts, updates, deletes) and propagates them to dependent tables. By using Delta Lake, you can maintain historical records and propagate changes efficiently.

Ati1362 (Option: D)
Jun 24, 2024

agree with D