Certified Data Engineer Professional Exam - Question 51

Question

Where in the Spark UI can one diagnose a performance problem induced by not leveraging predicate push-down?

Examice · Accepted Answer

Predicate push-down reduces the amount of data read from the data source by applying filters at the source, before the rows reach Spark. To diagnose a performance problem caused by not leveraging predicate push-down, one should look at the size of the data read. In the Stage’s Detail screen, the Completed Stages table reports the size of data read in the Input column; an unexpectedly large value there would reveal that too much data is being read because predicate push-down is not being used.

P1314 · Answer

Query plan. The correct answer is E.
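
As a rough illustration of reading the query plan: for file-based sources such as Parquet, the scan node of the physical plan (printed by `df.explain()` and shown on the SQL tab's query detail page) lists a `PushedFilters: [...]` entry. The sketch below only parses plan text with the standard library; the sample plan, column name, and path are made up for illustration, and the helper names `pushed_filters` and `lacks_pushdown` are hypothetical.

```python
import re

# Illustrative plan text in the shape Spark prints for a Parquet scan.
# The column name and location are assumptions made for this sketch.
SAMPLE_PLAN = """\
== Physical Plan ==
*(1) Filter (isnotnull(id#0L) AND (id#0L > 5))
+- *(1) ColumnarToRow
   +- FileScan parquet [id#0L] Batched: true, DataFilters: [isnotnull(id#0L), (id#0L > 5)], Format: Parquet, Location: InMemoryFileIndex[file:/tmp/example], PartitionFilters: [], PushedFilters: [IsNotNull(id), GreaterThan(id,5)], ReadSchema: struct<id:bigint>
"""


def pushed_filters(plan_text):
    """Return the raw PushedFilters text for each scan node found.

    Limitation: the non-greedy match stops at the first ']', so filter
    lists containing nested brackets come back truncated. That is enough
    to tell an empty list from a non-empty one.
    """
    pattern = r"PushedFilters: \[(.*?)\]"
    return [m.group(1).strip() for m in re.finditer(pattern, plan_text)]


def lacks_pushdown(plan_text):
    """True if at least one scan node lists no pushed filters."""
    return any(entry == "" for entry in pushed_filters(plan_text))


if __name__ == "__main__":
    print(pushed_filters(SAMPLE_PLAN))
    print(lacks_pushdown(SAMPLE_PLAN))
```

An empty `PushedFilters: []` is only a warning sign when the query actually filters on that source; a scan with no filter at all shows the same empty list.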
