Certified Data Engineer Professional Exam Questions

Certified Data Engineer Professional Exam - Question 28


A junior data engineer seeks to leverage Delta Lake's Change Data Feed functionality to create a Type 1 table representing all of the values that have ever been valid for all rows in a bronze table created with the property delta.enableChangeDataFeed = true. They plan to execute the following code as a daily job:

Which statement describes the execution and results of running the above query multiple times?

Correct Answer: B

The provided code uses Delta Lake's Change Data Feed (CDF) to read the table's changes as a batch, with startingVersion set to 0, every time it runs. It keeps only rows whose _change_type is update_postimage or insert and appends them to the target table. Because each run starts again from version 0, the full history of inserted and updated records is re-read and re-appended on every execution, so the target table accumulates many duplicate entries.
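To make the duplication concrete, here is a toy pure-Python model of the behavior described above. It is not the question's Spark code: the change feed is modelled as a list of dicts, the sample rows are made up, and `run_daily_job` is a hypothetical helper. Only the field names `_change_type` and `_commit_version` mirror the real CDF metadata columns.

```python
# Toy model of a Delta change feed: one dict per change row.
# The rows and helper names are illustrative assumptions, not the exam's code.
change_feed = [
    {"id": 1, "value": "a", "_change_type": "insert", "_commit_version": 0},
    {"id": 1, "value": "a", "_change_type": "update_preimage", "_commit_version": 1},
    {"id": 1, "value": "b", "_change_type": "update_postimage", "_commit_version": 1},
    {"id": 2, "value": "c", "_change_type": "insert", "_commit_version": 2},
]

KEPT_TYPES = {"update_postimage", "insert"}


def run_daily_job(feed, target, starting_version=0):
    """Mimic a batch CDF read from starting_version, a filter, then an append."""
    batch = [
        row for row in feed
        if row["_commit_version"] >= starting_version
        and row["_change_type"] in KEPT_TYPES
    ]
    target.extend(batch)  # append mode: nothing is overwritten or deduplicated
    return len(batch)


target_table = []
first_run = run_daily_job(change_feed, target_table)
second_run = run_daily_job(change_feed, target_table)  # no new commits in between

print(first_run, second_run, len(target_table))  # prints: 3 3 6
```

The second run appends the same three rows again even though the source has not changed, which is the duplication the explanation refers to.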

Discussion

12 comments
asmayassineg Option: A
Aug 2, 2023

Answer is A, since the DataFrame is filtered to updated records by the update_postimage filter.

taif12340
Aug 23, 2023

It's B: reading the table's changes captured by CDF with spark.read means you are reading them as a static source. So each time you run the query, all of the table's changes (starting from the specified startingVersion) will be read.
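One way to picture the point about static sources: a batch read has no memory of what earlier runs consumed, so an incremental job has to track its own progress. Below is a minimal plain-Python sketch, assuming a hypothetical `state["last_processed"]` value that the job persists between runs. In Spark, a streaming read with a checkpoint would play a comparable role, but that is not what the question's code does, and the rows here are made up.

```python
# Toy change feed; rows and names are illustrative assumptions.
feed = [
    {"id": 1, "_change_type": "insert", "_commit_version": 0},
    {"id": 1, "_change_type": "update_postimage", "_commit_version": 1},
]


def read_changes(feed, starting_version):
    """Static read: returns every change at or after starting_version."""
    return [r for r in feed if r["_commit_version"] >= starting_version]


def incremental_run(feed, target, state):
    """Read only versions after the last one processed, then record progress."""
    batch = read_changes(feed, state["last_processed"] + 1)
    target.extend(batch)
    if batch:
        state["last_processed"] = max(r["_commit_version"] for r in batch)
    return len(batch)


target, state = [], {"last_processed": -1}
run1 = incremental_run(feed, target, state)  # picks up versions 0 and 1
run2 = incremental_run(feed, target, state)  # nothing new, appends nothing
feed.append({"id": 2, "_change_type": "insert", "_commit_version": 2})
run3 = incremental_run(feed, target, state)  # picks up only version 2

print(run1, run2, run3, len(target))  # prints: 2 0 1 3
```

With the starting point fixed at 0 instead of `last_processed + 1`, every run would return the whole feed again.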

mht3336
Jan 25, 2024

there is also insert in the filter.

asmayassineg Option: B
Aug 2, 2023

Sorry, the correct answer is B.

azurearch Option: B
Sep 10, 2023

B is the right answer, sorry.

kz_data Option: B
Jan 10, 2024

B is correct

PrashantTiwari Option: B
Feb 9, 2024

B is correct

imatheushenrique Option: B
Jun 4, 2024

("startingVersion", 0) means the entire history of the table will be read, so B.

azurearch Option: A
Sep 8, 2023

Answer is A, because there is a filter, as asmayassineg said. The filter keeps only existing records from the change feed.

sturcu Option: B
Oct 11, 2023

correct

jyothsna12496 Option: B
Oct 18, 2023

Why is it not E? It gets newly inserted or updated records.

[Removed]
Dec 4, 2023

I'm with you, follow the reference: https://docs.delta.io/latest/delta-change-data-feed.html#:~:text=_change_type,update_preimage%20%2C%20update_postimage

5ffcd04
Jan 1, 2024

Notice .option("startingVersion", 0), which will bring in all changes from the beginning. Hence the answer is B.

[Removed] Option: E
Dec 4, 2023

Considering that we are talking about Change Data Feed and the code filters the "_change_type" column by ["update_postimage", "insert"], I would go with option E. Reference: https://docs.delta.io/latest/delta-change-data-feed.html#:~:text=_change_type,update_preimage%20%2C%20update_postimage
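For what it's worth, the `_change_type` filter only selects which kinds of change rows are kept; it says nothing about which commits get read. A small plain-Python sketch with made-up change rows for a single key (the tuples and variable names are illustrative assumptions):

```python
# Change rows for one key across three commits: (change type, version, value).
# Illustrative data only; the four type names are the CDF _change_type values.
changes = [
    ("insert", 0, "a"),
    ("update_preimage", 1, "a"),
    ("update_postimage", 1, "b"),
    ("delete", 2, "b"),
]

wanted = ("update_postimage", "insert")
kept = [c for c in changes if c[0] in wanted]
dropped_types = sorted({c[0] for c in changes if c[0] not in wanted})

print([value for _, _, value in kept])  # prints: ['a', 'b']
print(dropped_types)                    # prints: ['delete', 'update_preimage']
```

Both kept rows come from old commits (versions 0 and 1), so the filter alone does not limit a run to records that are new since the previous run.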

5ffcd04
Jan 1, 2024

Notice .option("startingVersion", 0), which will bring in all changes from the beginning. Hence the answer is B.

azurelearn2020 Option: B
Dec 9, 2023

correct answer is B.

5ffcd04 Option: B
Jan 1, 2024

Correct B