Certified Data Engineer Associate Exam Questions

Certified Data Engineer Associate Exam - Question 32


A dataset has been defined using Delta Live Tables and includes an expectations clause:

CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION DROP ROW

What is the expected behavior when a batch of data containing data that violates these constraints is processed?

Correct Answer: C

When a batch of data contains records that violate the expectation, the records are dropped from the target dataset and recorded as invalid in the event log. This ensures data integrity by excluding invalid records from the dataset and keeping a log for tracking data quality issues.
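The drop-and-log behavior described above can be sketched in plain Python. This is a simulation of the semantics only, not the Delta Live Tables API: the function name `apply_drop_row`, the sample batch, and the metric keys are assumptions made for illustration.

```python
from datetime import datetime

# Threshold taken from the constraint: timestamp > '2020-01-01'
CUTOFF = datetime(2020, 1, 1)


def apply_drop_row(rows):
    """Simulate ON VIOLATION DROP ROW for the valid_timestamp expectation.

    Returns the rows that would reach the target dataset, plus a
    hypothetical event-log entry summarizing the data quality result.
    """
    kept = [r for r in rows if r["timestamp"] > CUTOFF]
    event = {
        "expectation": "valid_timestamp",
        "passed_records": len(kept),
        "failed_records": len(rows) - len(kept),
    }
    return kept, event


# Hypothetical batch: one valid row and two violations.
batch = [
    {"id": 1, "timestamp": datetime(2021, 6, 1)},
    {"id": 2, "timestamp": datetime(2019, 12, 31)},
    {"id": 3, "timestamp": datetime(2020, 1, 1)},  # not strictly greater
]

target, event = apply_drop_row(batch)
print([r["id"] for r in target])
print(event)
```

Only the row with `id` 1 reaches the target; the other two are counted as failures in the simulated log entry, and the update itself does not fail.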

Discussion

14 comments
XiltroX (Option: C)
Apr 2, 2023

I am simply appalled by the number of wrong answers in this series of questions. The statement in the question already says "ON VIOLATION DROP ROW", which means that if the condition is violated, nothing is saved to a quarantine table and the invalid entries are recorded in the log. All data that doesn't meet the condition is dropped. So C is the correct answer.

rafahb (Option: C)
Apr 4, 2023

C is correct

surrabhi_4 (Option: C)
Apr 3, 2023

Option C

SHINGX (Option: B)
Apr 13, 2023

B is correct. This is question 35 on the practice test on the Databricks Partner Academy (https://partner-academy.databricks.com/). The correct answer there is "Records that violate the expectation are added to the target dataset and recorded as invalid in the event log".

SHINGX
Apr 13, 2023

Sorry, D

SHINGX
Apr 14, 2023

I was wrong; the ON VIOLATION DROP ROW clause makes C the correct answer.

AndreFR (Option: C)
Aug 19, 2023

https://docs.databricks.com/en/delta-live-tables/expectations.html

vctrhugo (Option: C)
Sep 4, 2023

C. Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log. With the defined constraint and expectation clause, when a batch of data is processed, any records that violate the expectation (in this case, where the timestamp is not greater than '2020-01-01') will be dropped from the target dataset. These dropped records will also be recorded as invalid in the event log, allowing for auditing and tracking of the data quality issues without causing the entire job to fail.

Garyn (Option: C)
Dec 30, 2023

C. Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log. Explanation: The defined expectation specifies that if the timestamp is not greater than '2020-01-01', the row will be considered in violation of the constraint. The ON VIOLATION DROP ROW clause states that rows that violate the constraint will be dropped from the target dataset. Additionally, the expectation clause will log these violations in the event log, indicating which records did not meet the specified constraint criteria. This behavior ensures that the rows failing the defined constraint are not included in the target dataset and are logged as invalid in the event log for reference or further investigation, maintaining data integrity within the dataset based on the specified constraints.

mehroosali (Option: C)
Jul 7, 2023

C is correct

Atnafu (Option: C)
Jul 8, 2023

C. When a batch of data processed in Delta Live Tables contains data that violates the defined expectations, the records violating the expectation are dropped from the target dataset. Additionally, these records are recorded as invalid in the event log.

DavidRou (Option: C)
Oct 31, 2023

Right answer: C. Invalid rows will be dropped, as requested by the constraint, and flagged as such in the log. If you need a quarantine table, you'll have to write more code.
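As a rough illustration of the "more code" a quarantine table would need, the sketch below routes each row to one of two outputs by applying the rule and its inverse. It is plain Python rather than Delta Live Tables code, and the names `is_valid` and `split_with_quarantine` are hypothetical.

```python
from datetime import datetime

# Threshold taken from the constraint: timestamp > '2020-01-01'
CUTOFF = datetime(2020, 1, 1)


def is_valid(row):
    """The same rule as the valid_timestamp expectation."""
    return row["timestamp"] > CUTOFF


def split_with_quarantine(rows):
    """Send valid rows to the target and violating rows to a quarantine list.

    DROP ROW alone discards the violating rows; keeping them means
    evaluating the inverse rule into a second output.
    """
    valid = [r for r in rows if is_valid(r)]
    quarantined = [r for r in rows if not is_valid(r)]
    return valid, quarantined


# Hypothetical batch: one valid row and one violation.
rows = [
    {"id": 1, "timestamp": datetime(2022, 3, 15)},
    {"id": 2, "timestamp": datetime(2018, 7, 4)},
]

valid, quarantined = split_with_quarantine(rows)
print([r["id"] for r in valid], [r["id"] for r in quarantined])
```

Every input row ends up in exactly one of the two lists, so nothing is lost the way it is with DROP ROW on its own.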

Huroye (Option: C)
Nov 15, 2023

Who chooses these answers? The correct answer is C. The record is dropped. This is not about the default behavior. It is explicit.
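To make the default-versus-explicit distinction concrete, here is a small plain-Python simulation of the three violation policies an expectation can have: no ON VIOLATION clause (invalid rows are retained), DROP ROW, and FAIL UPDATE. It is only a sketch; the `process` function, the `ExpectationFailed` exception, and the sample batch are assumptions for illustration, not part of any Databricks API.

```python
from datetime import datetime

# Threshold taken from the constraint: timestamp > '2020-01-01'
CUTOFF = datetime(2020, 1, 1)


class ExpectationFailed(Exception):
    """Hypothetical error standing in for a failed pipeline update."""


def process(rows, on_violation=None):
    """Return (rows written to target, number of violations).

    on_violation is None (default: retain invalid rows),
    "DROP ROW", or "FAIL UPDATE".
    """
    passed = [r for r in rows if r["timestamp"] > CUTOFF]
    failed_count = len(rows) - len(passed)

    if on_violation == "FAIL UPDATE" and failed_count > 0:
        raise ExpectationFailed(f"{failed_count} record(s) violated valid_timestamp")
    if on_violation == "DROP ROW":
        return passed, failed_count
    # Default: violations are counted, but the rows are still written.
    return list(rows), failed_count


# Hypothetical batch: one valid row and one violation.
sample = [
    {"id": 1, "timestamp": datetime(2023, 1, 1)},
    {"id": 2, "timestamp": datetime(2019, 1, 1)},
]

default_rows, default_failed = process(sample)
dropped_rows, dropped_failed = process(sample, "DROP ROW")
print(len(default_rows), len(dropped_rows), dropped_failed)
```

The question's clause corresponds to the `"DROP ROW"` branch: the violation is still counted, but the row is not written.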

SerGrey (Option: C)
Jan 8, 2024

C is correct

benni_ale (Option: C)
Apr 28, 2024

C is correct

3fbc31b (Option: C)
Jul 8, 2024

C is the correct answer. The DROP ROW clause will cause them to NOT be added to the destination; only marked in the log.