Where in the Spark UI can one diagnose a performance problem induced by not leveraging predicate push-down?
Predicate push-down reduces the amount of data read from the data source by applying filters at the scan, before rows are loaded into Spark. To diagnose a performance problem caused by not leveraging predicate push-down, check how much data each stage reads: in the stage's Detail screen, the Completed Stages table reports the amount of data read in the Input column, and an unexpectedly large value there indicates that filters are not being pushed down and too much data is being scanned.
Answer: E (the query plan).
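As a minimal sketch of how push-down can be verified outside the UI, the snippet below filters a Parquet source and prints the query plan with `explain()`; when push-down works, the `FileScan` node lists the predicate under `PushedFilters`. The dataset path and the `event_type` column are hypothetical placeholders, not from the original question.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("pushdown-demo").getOrCreate()

# Hypothetical Parquet dataset; Parquet supports filter push-down.
df = spark.read.parquet("/data/events")

# The filter references a source column, so Spark can push it to the scan.
filtered = df.filter(df.event_type == "click")

# Prints parsed/analyzed/optimized/physical plans; in the physical plan the
# FileScan node should show something like:
#   PushedFilters: [IsNotNull(event_type), EqualTo(event_type,click)]
filtered.explain(True)
```

If the predicate does not appear under `PushedFilters` (for example because it wraps the column in a non-pushable expression), the scan reads the full dataset, which is what shows up as an inflated Input size in the Spark UI.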