Certified Associate Developer for Apache Spark Exam - Question 148


Which of the following operations is least likely to result in a shuffle?

Correct Answer: B

DataFrame.filter() is least likely to result in a shuffle because it is a narrow transformation: it evaluates a condition on each row inside the partition where that row already lives, so no data has to be redistributed across partitions. By contrast, operations such as join, orderBy, distinct, and intersect are wide transformations that typically require moving data between partitions.
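To make the narrow-versus-wide distinction concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: a "DataFrame" is assumed to be a list of partitions, and the helper names `filter_rows` and `distinct_rows` are hypothetical. The hash-based re-bucketing only loosely imitates what a shuffle does.

```python
# Toy model (an assumption, not Spark itself): a dataset is a list of
# partitions, and each partition is a list of rows.
partitions = [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]]


def filter_rows(parts, predicate):
    """Narrow operation: each output partition is built from exactly one
    input partition, so no row ever changes partition."""
    return [[row for row in part if predicate(row)] for part in parts]


def distinct_rows(parts, num_partitions):
    """Wide operation: equal rows may sit in different partitions, so rows
    are re-bucketed by hash before duplicates can be dropped."""
    buckets = [[] for _ in range(num_partitions)]
    moved = 0  # rows that end up in a different partition than they started in
    for src, part in enumerate(parts):
        for row in part:
            dest = hash(row) % num_partitions
            if dest != src:
                moved += 1
            buckets[dest].append(row)
    deduped = [sorted(set(bucket)) for bucket in buckets]
    return deduped, moved


filtered = filter_rows(partitions, lambda x: x % 2 == 0)
deduped, rows_moved = distinct_rows(partitions, num_partitions=3)

print("filtered:", filtered)
print("deduped:", deduped, "rows moved:", rows_moved)
```

In this sketch the filter touches each partition independently, while the distinct step has to move rows between partitions before it can remove duplicates. That movement is the part a real shuffle pays for.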

Discussion

1 comment
Sowwy1 (Option: B)
Apr 10, 2024

B. DataFrame.filter()