Exam Certified Data Engineer Associate
Question 36

A data engineer has three tables in a Delta Live Tables (DLT) pipeline. They have configured the pipeline to drop invalid records at each table. They notice that some data is being dropped due to quality concerns at some point in the DLT pipeline. They would like to determine at which table in their pipeline the data is being dropped.

Which of the following approaches can the data engineer take to identify the table that is dropping the records?

    Correct Answer: D

    The data engineer can navigate to the Delta Live Tables pipeline page, click on each table, and view the data quality statistics. This will show information about records dropped, violations of expectations, and other data quality metrics specific to each table. By examining these statistics, the data engineer can determine at which table the data is being dropped.
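The per-table numbers shown in the UI are also recorded in the pipeline's event log, so the same check can be done programmatically. Below is a minimal sketch in plain Python, assuming event rows with an `event_type` field and a JSON `details` field containing `flow_progress.data_quality.expectations` entries (`name`, `dataset`, `passed_records`, `failed_records`). The field names are modeled on the documented event log layout, but treat them as an assumption and check your workspace. The sample rows and table names are hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical event-log rows (made-up table names and counts).
# The nested field names are an assumption modeled on the DLT event log.
SAMPLE_EVENTS = [
    {"event_type": "flow_progress", "details": json.dumps({"flow_progress": {"data_quality": {
        "expectations": [{"name": "valid_id", "dataset": "bronze_orders",
                          "passed_records": 1000, "failed_records": 0}]}}})},
    {"event_type": "flow_progress", "details": json.dumps({"flow_progress": {"data_quality": {
        "expectations": [{"name": "valid_amount", "dataset": "silver_orders",
                          "passed_records": 958, "failed_records": 42}]}}})},
    {"event_type": "flow_progress", "details": json.dumps({"flow_progress": {"data_quality": {
        "expectations": [{"name": "valid_date", "dataset": "gold_orders",
                          "passed_records": 958, "failed_records": 0}]}}})},
    # Events of other types carry no data quality metrics and are skipped.
    {"event_type": "update_progress", "details": json.dumps({})},
]


def dropped_per_table(events):
    """Sum failed_records per dataset across flow_progress events.

    With a drop policy on every expectation, a failed record is a dropped record.
    """
    totals = defaultdict(int)
    for event in events:
        if event.get("event_type") != "flow_progress":
            continue
        details = json.loads(event["details"])
        quality = details.get("flow_progress", {}).get("data_quality") or {}
        for expectation in quality.get("expectations", []):
            totals[expectation["dataset"]] += expectation["failed_records"]
    return dict(totals)


if __name__ == "__main__":
    for table, dropped in dropped_per_table(SAMPLE_EVENTS).items():
        print(table, dropped)
```

In this made-up sample, only `silver_orders` has a non-zero count, which is the same conclusion you would reach by clicking through each table in the pipeline page.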

Discussion
vctrhugo Option: D

D. They can navigate to the DLT pipeline page, click on each table, and view the data quality statistics. These statistics often include information about records dropped, violations of expectations, and other data quality metrics. By examining them for each table in the pipeline, the data engineer can determine at which table the data is being dropped.

prasioso Option: D

I think the answer is D. The pipeline is configured to drop invalid records, i.e. the SQL equivalent is a query with an ON VIOLATION DROP ROW clause. This will not result in a failed pipeline execution because there are no errors. Instead, you'd have to go to each table and review the quality characteristics.
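To illustrate why a drop policy produces no error and why the per-table counts are what you need, here is a rough simulation in plain Python. It is not the DLT API: `run_stage`, `run_pipeline`, the stage names, and the sample rows are all hypothetical, and each stage simply filters out rows that fail its rule while counting how many it removed.

```python
def run_stage(name, rows, is_valid):
    """Keep valid rows, silently drop the rest, and report the drop count."""
    kept = [row for row in rows if is_valid(row)]
    return kept, {"table": name, "dropped": len(rows) - len(kept)}


def run_pipeline(rows, stages):
    """Run the stages in order; no stage raises, so the run never fails."""
    stats = []
    for name, is_valid in stages:
        rows, stage_stats = run_stage(name, rows, is_valid)
        stats.append(stage_stats)
    return rows, stats


# Hypothetical input rows and per-table rules.
ROWS = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},   # fails the silver rule
    {"id": 3, "amount": -2.0},     # fails the gold rule
    {"id": 4, "amount": 7.5},
]

STAGES = [
    ("bronze", lambda row: True),
    ("silver", lambda row: row["id"] is not None),
    ("gold", lambda row: row["amount"] >= 0),
]

if __name__ == "__main__":
    final_rows, stats = run_pipeline(ROWS, STAGES)
    for stage_stats in stats:
        print(stage_stats["table"], "dropped", stage_stats["dropped"])
```

The run completes without any exception even though rows disappear along the way, so the only way to see where they went is to look at each stage's own drop count.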

Atnafu

Option D is incorrect because viewing the data quality statistics for each table will not help the data engineer identify which table is dropping the records. The data quality statistics will show the overall quality of the data in each table, but they will not show which table is dropping the records. For example, if the data quality statistics for a table show that 10% of the records are invalid, this does not mean that 10% of the records are being dropped. The invalid records could be being updated, inserted, or deleted.

XiltroX Option: D

The correct answer is D

[Removed]

Is this for v2 or v3

Diewrine Option: D

E is for when an error occurs. But the pipeline is defined to drop some records, which will not result in an error.

Atnafu Option: E

E. When records are dropped due to quality concerns in a DLT pipeline, the errors are logged in the event log. The data engineer can navigate to the DLT pipeline page and click on the “Error” button to view the present errors. The errors will show the table where the records were dropped.
Option A: Setting up separate expectations for each table will not help the data engineer determine which table is dropping the records.
Option B: The data engineer cannot determine which table is dropping the records without looking at the event log.
Option C: Setting up DLT to notify the data engineer via email when records are dropped will not help the data engineer determine which table is dropping the records.
Option D: Viewing the data quality statistics for each table will not help the data engineer determine which table is dropping the records.

DavidRou

Don't you have to select a table generated in a single step of the pipeline to access the errors through the button, though? D is probably the right one here.

3fbc31b Option: D

Correct answer is "D".

benni_ale Option: D

I would say D, but I have never really tested it. Still, the other options smell wrong.

agAshish Option: D

D is correct. By clicking on each table in the DLT pipeline page, the data engineer may be able to access data quality statistics, error logs, or other information related to dropped records. This can help them pinpoint at which table in the pipeline the data is being dropped.

awofalus Option: D

D is correct