Which of the following types of processes induces a stage boundary?
In a distributed computing engine such as Apache Spark, a stage boundary is induced when a shuffle occurs. A shuffle redistributes data across partitions, which generally means moving it between executors and often between physical nodes. Because the data must be reorganized before the next set of operations can run, the job is split into separate stages at that point.
A. Shuffle
A stage boundary in Spark is induced when data needs to be shuffled across the executors. A shuffle occurs when the current operation requires data to be redistributed, so that new partitions are formed on which the downstream operations are then computed.
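
To make the idea concrete, here is a minimal sketch in plain Python of how a linear chain of operations could be cut into stages at each shuffle. It is a toy model, not Spark's actual scheduler: the function `split_into_stages` and the set `SHUFFLE_OPS` are hypothetical names introduced for illustration, and the set lists only a few transformations that typically require a shuffle. Placing the shuffle operation at the start of the new stage is also a simplification.

```python
# Toy model of stage splitting. This is NOT Spark's scheduler; the names
# below are hypothetical and exist only for this illustration.

# Assumption: a small sample of transformations that typically need a shuffle.
SHUFFLE_OPS = {"groupByKey", "reduceByKey", "repartition", "sortByKey", "distinct"}


def split_into_stages(ops):
    """Group a linear chain of operation names into stages.

    A new stage starts whenever an operation that needs a shuffle is met,
    because its input must be redistributed before it can run.
    """
    stages = []
    current = []
    for op in ops:
        # Close the running stage before a shuffle operation (if non-empty).
        if op in SHUFFLE_OPS and current:
            stages.append(current)
            current = []
        current.append(op)
    if current:
        stages.append(current)
    return stages


pipeline = ["map", "filter", "reduceByKey", "mapValues", "sortByKey"]
stages = split_into_stages(pipeline)
for number, stage in enumerate(stages):
    print(f"stage {number}: {stage}")
```

In this sketch, `map` and `filter` stay in one stage because each output partition depends on a single input partition, while `reduceByKey` and `sortByKey` each open a new stage.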