Exam: Certified Associate Developer for Apache Spark
Question 128

Which of the following identifies multiple narrow operations that are executed in sequence?

    Correct Answer: C

    In Spark, a stage is a set of narrow transformations that run back to back without a shuffle. In a narrow transformation, each partition of the parent RDD feeds at most one partition of the child RDD, so no data has to move between partitions. Spark pipelines as many consecutive narrow transformations as it can into a single stage; a wide transformation, which requires a shuffle, ends the stage.
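The grouping rule can be sketched in plain Python. This is a toy model, not Spark's scheduler: the function name `split_into_stages` and the two operation sets are assumptions made for illustration, and a real job's dependency graph is not always a linear chain.

```python
# Toy model only: this is NOT Spark's DAGScheduler, just the grouping rule.
# The sets below list a few well-known RDD operations; they are illustrative,
# not exhaustive.
NARROW = {"map", "filter", "flatMap", "mapPartitions"}
WIDE = {"groupByKey", "reduceByKey", "sortByKey", "repartition"}


def split_into_stages(ops):
    """Group a linear chain of operation names into stages.

    Narrow operations stay in the current stage. A wide operation needs a
    shuffle, so it closes the current stage; as a simplification, the wide
    operation itself is listed as the first entry of the next stage (the
    side that reads the shuffled data).
    """
    stages = [[]]
    for op in ops:
        if op in WIDE:
            stages.append([op])       # shuffle boundary: start a new stage
        elif op in NARROW:
            stages[-1].append(op)     # no shuffle: extend the current stage
        else:
            raise ValueError(f"unknown operation: {op}")
    # Drop the empty leading stage if the chain started with a wide operation.
    return [stage for stage in stages if stage]


if __name__ == "__main__":
    chain = ["map", "filter", "reduceByKey", "map", "sortByKey", "map"]
    for i, stage in enumerate(split_into_stages(chain)):
        print(f"stage {i}: {stage}")
```

In this model the chain above falls into three stages, with each shuffle operation opening a new one.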

Discussion
Sowwy1 (Option: C)

C. Stage. In Spark, a stage is a sequence of narrow transformations that can run without shuffling data across partitions. Narrow transformations are those where each partition of the parent RDD contributes to at most one partition of the child RDD. Spark groups as many consecutive narrow transformations as possible into a single stage and pipelines them to optimize performance. A shuffle usually marks the boundary between stages, and a stage must finish before any stage that depends on its output can start.
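To make "pipelined" concrete, here is a minimal sketch in plain Python, assuming each partition is just a list and a stage is a list of `("map" | "filter", function)` pairs. The helper `run_stage` is hypothetical and only mimics the idea: each record flows through the whole chain within its own partition, and no intermediate dataset is built between operations.

```python
def run_stage(partition, narrow_ops):
    """Apply a chain of narrow operations to ONE partition, lazily.

    map() and filter() return iterators, so chaining them builds a pipeline:
    a record passes through every operation before the next record is read.
    Nothing here looks at any other partition, which is what makes the
    operations narrow.
    """
    records = iter(partition)
    for kind, fn in narrow_ops:
        if kind == "map":
            records = map(fn, records)
        elif kind == "filter":
            records = filter(fn, records)
        else:
            raise ValueError(f"not a narrow operation in this toy model: {kind}")
    return list(records)  # materialize only at the end of the stage


if __name__ == "__main__":
    partitions = [[1, 2, 3], [4, 5, 6]]  # two partitions of a small dataset
    ops = [("map", lambda x: x * 10), ("filter", lambda x: x > 20)]
    # Each partition is processed independently (one task per partition).
    print([run_stage(p, ops) for p in partitions])
```

Because no partition needs data from another, the two calls to `run_stage` could run in parallel; a shuffle is exactly the point where that independence stops holding.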