Certified Associate Developer for Apache Spark Exam - Question 148

Question

Which of the following operations is least likely to result in a shuffle?

Examice · Accepted Answer

The operation DataFrame.filter() is least likely to result in a shuffle because it simply applies a condition to filter the rows within each partition. This operation doesn't require redistributing data across the partitions, unlike operations like join, orderBy, distinct, or intersect, which typically require data movement across partitions.

Sowwy1 · Answer

B. DataFrame.fliter()

Certified Associate Developer for Apache Spark Exam - Question 148

Discussion