Exam: Certified Data Engineer Professional
Question 11

The data engineering team has configured a job to process customer requests to be forgotten (have their data deleted). All user data that needs to be deleted is stored in Delta Lake tables using default table settings.

The team has decided to process all deletions from the previous week as a batch job at 1am each Sunday. The total duration of this job is less than one hour. Every Monday at 3am, a batch job executes a series of VACUUM commands on all Delta Lake tables throughout the organization.

The compliance officer has recently learned about Delta Lake's time travel functionality. They are concerned that this might allow continued access to deleted data.

Assuming all delete logic is correctly implemented, which statement correctly addresses this concern?

    Correct Answer: E

    Because the default data retention threshold is 7 days, data files containing deleted records will be retained until the VACUUM job is run 8 days later. Delta Lake retains a 7-day history to support operations like time travel. The Monday VACUUM that runs about 26 hours after the Sunday delete leaves those files in place, because they are still younger than the threshold; only the following Monday's run, roughly 8 days after the delete, removes them. Until then, the deleted data remains accessible via time travel.
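The timing can be checked with a little date arithmetic. The sketch below assumes a hypothetical calendar date for the Sunday delete (7 January 2024) and treats the 1am job start as the moment the files are logically removed. The helper name `first_effective_vacuum` is illustrative only and is not part of Delta Lake.

```python
from datetime import datetime, timedelta

# Delta Lake's default retention threshold for VACUUM (per the question).
RETENTION = timedelta(days=7)


def first_effective_vacuum(delete_time, first_vacuum,
                           interval=timedelta(days=7), retention=RETENTION):
    """Return the first scheduled VACUUM run at which files removed at
    delete_time are at least `retention` old and so eligible for cleanup."""
    run = first_vacuum
    while run - delete_time < retention:
        run += interval  # files too young: wait for the next weekly run
    return run


# Hypothetical dates: 2024-01-07 is a Sunday, 2024-01-08 the following Monday.
delete_time = datetime(2024, 1, 7, 1, 0)    # Sunday 1am delete job
first_vacuum = datetime(2024, 1, 8, 3, 0)   # Monday 3am VACUUM job

purge_time = first_effective_vacuum(delete_time, first_vacuum)
print(purge_time)                # 2024-01-15 03:00:00
print(purge_time - delete_time)  # 8 days, 2:00:00
```

Under these assumptions the first Monday run sees files that are only 26 hours old and skips them, which is why the answer is about 8 days rather than about 24 hours.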

Discussion
asmayassineg Option: E

The answer is E; the default retention period is 7 days: https://learn.microsoft.com/en-us/azure/databricks/delta/vacuum

mardigras Option: A

The answer has to be A. The deletion is done on Sunday at 1am, and VACUUM is initiated the next day, Monday at 3am, so one can only time travel for about 24 hours.

Eertyy Option: E

E is the right answer.

aragorn_brego Option: E

Delta Lake's time travel feature allows you to query an older snapshot of a table. By default, Delta Lake retains a 7-day history for the table to support operations like time travel. When data is deleted from a Delta table, the actual data files are not immediately removed from the storage layer; they are only marked as removed in the transaction log. The VACUUM command cleans up files that are no longer part of the current table state, but it will not remove any files that fall within the retention period unless it is run with an override option to reduce that period. Thus, if the deletions are processed on Sunday and the VACUUM command is run on Monday without overriding the default retention period, the deleted records would still be accessible via time travel for approximately 8 days (until the next run of the VACUUM command after the data has aged past the 7-day retention period).

juliom6 Option: A

Although the data is deleted (DELETE) on Sunday, it can still be recovered through time travel. Only on the following day (Monday) is that possibility removed, because that is when VACUUM runs. Consequently, the data can be recovered during that window of roughly 24 hours.

BIKRAM063 Option: E

Answer is E

03355a2 Option: A

They expect the records from the previous week to be deleted on Sunday between 1am and 2am. Then the next day (Monday) at 3am, approximately 24 hours later, the VACUUM command is run. This means the records from the previous week are only around for 24 or so hours before they are removed by the VACUUM command. They aren't waiting 8 days to run the command, therefore E is wrong.

imatheushenrique Option: E

E. Because the default data retention threshold is 7 days, data files containing deleted records will be retained until the VACUUM job is run 8 days later.

coercion Option: E

The default retention period is 7 days, so data deleted on Sunday will remain available for the next 7 days, even though VACUUM runs on Monday: that run removes only files older than 7 days, not the files for data deleted the day before (Sunday).

Tayari Option: E

The default retention threshold that VACUUM applies to data files is 7 days.

hedbergare Option: E

Answer is E

RiktRikt007 Option: E

If I run v0: create table, v1: insert 2 records, v2: insert 2 records, v3: delete 2 records, and then run the VACUUM command (with the default 7-day retention), the deleted records will still be there and you can access them using SELECT * FROM delta_table VERSION AS OF 2;
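The v0 to v3 scenario above can be imitated with a toy in-memory model. This is only a sketch of the retention idea, not Delta Lake's implementation or API: the `ToyTable` class, its method names, the file names `f1`/`f2`, and the timestamps are all hypothetical, and it assumes the deleted rows sit in their own data file.

```python
from datetime import datetime, timedelta


class ToyTable:
    """Toy stand-in for a versioned table: each version is a set of data files."""

    def __init__(self):
        self.files = {}       # file name -> rows physically present in "storage"
        self.versions = []    # index = version number, value = set of file names
        self.removed_at = {}  # file name -> time it left the current table state

    def commit(self, add=None, remove=(), now=None):
        current = set(self.versions[-1]) if self.versions else set()
        for name in remove:
            current.discard(name)        # logical removal only
            self.removed_at[name] = now
        for name, rows in (add or {}).items():
            self.files[name] = rows
            current.add(name)
        self.versions.append(current)

    def read(self, version):
        rows = []
        for name in sorted(self.versions[version]):
            if name not in self.files:
                raise FileNotFoundError(name)  # file was physically vacuumed
            rows.extend(self.files[name])
        return rows

    def vacuum(self, now, retention=timedelta(days=7)):
        # Physically drop only files removed at least `retention` ago.
        for name, ts in list(self.removed_at.items()):
            if now - ts >= retention:
                self.files.pop(name, None)
                del self.removed_at[name]


t0 = datetime(2024, 1, 7, 1, 0)           # hypothetical delete time
table = ToyTable()
table.commit(now=t0)                      # v0: create table
table.commit(add={"f1": [1, 2]}, now=t0)  # v1: insert 2 records
table.commit(add={"f2": [3, 4]}, now=t0)  # v2: insert 2 records
table.commit(remove=["f2"], now=t0)       # v3: delete 2 records

table.vacuum(now=t0 + timedelta(days=1, hours=2))  # next-day VACUUM
rows_after_first_vacuum = table.read(2)            # still [1, 2, 3, 4]

table.vacuum(now=t0 + timedelta(days=8, hours=2))  # VACUUM a week later
try:
    table.read(2)
    old_version_readable = True
except FileNotFoundError:
    old_version_readable = False
```

In this model the first VACUUM leaves version 2 fully readable, and only the run after the 7-day threshold makes the old version unreadable, while the current version (3) is unaffected.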

spaceexplorer Option: E

Answer is E

kz_data Option: E

Answer is E

kz_data Option: E

Answer is E as the default retention period is 7 days

RafaelCFC Option: E

Correct according to the documentation: https://docs.databricks.com/en/sql/language-manual/delta-vacuum.html

hamzaKhribi Option: E

The correct answer is E. In this question the tables use default settings, and given that the Delta retention period is 7 days, the deleted data will still be accessible for 7 days.