Which of the following DataFrame operations is classified as a wide transformation?
A wide transformation in the context of DataFrame operations involves shuffling or redistributing data across partitions, typically requiring data movement across the network. DataFrame.join() is classified as a wide transformation because it involves combining two DataFrames based on a common key column, which often necessitates shuffling and redistributing the data between partitions.
B. DataFrame.join() is classified as a wide transformation, as it shuffles the data across the network to perform the join operation.
None of the other options are wide transformations; they are narrow, meaning each output partition can be computed from a single input partition without moving rows between partitions. Among the options listed, only the join can force a shuffle of data between horizontally scaled partitions.
The DataFrame operation classified as a wide transformation is B. DataFrame.join(). Explanation: In Spark, transformations are operations on DataFrames that create a new DataFrame from an existing one. Wide transformations involve shuffling or redistributing data across partitions and typically require data movement across the network. Among the options provided, DataFrame.join() is a wide transformation because it involves combining two DataFrames based on a common key column, which often requires shuffling and redistributing the data across partitions.
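To make the narrow vs. wide distinction from the answers above concrete, here is a minimal pure-Python sketch. It is a toy model, not Spark's actual implementation: partitions are plain lists, and the helper names (narrow_filter, shuffle_by_key, hash_join) are hypothetical, invented for illustration. It assumes a simple hash-partitioned join; real Spark may pick other strategies.

```python
from collections import defaultdict


def narrow_filter(partitions, predicate):
    """Narrow: each output partition is built from exactly one input partition."""
    return [[row for row in part if predicate(row)] for part in partitions]


def shuffle_by_key(partitions, num_partitions):
    """Wide: any input partition may send rows to any output partition.

    Returns the repartitioned data and the number of rows that changed partition.
    """
    out = [[] for _ in range(num_partitions)]
    moved = 0
    for src, part in enumerate(partitions):
        for key, value in part:
            dest = hash(key) % num_partitions  # integer keys hash deterministically
            out[dest].append((key, value))
            if dest != src:
                moved += 1
    return out, moved


def hash_join(left, right, num_partitions):
    """Toy inner join: co-locate matching keys by shuffling both sides, then join locally."""
    left_parts, moved_left = shuffle_by_key(left, num_partitions)
    right_parts, moved_right = shuffle_by_key(right, num_partitions)
    joined = []
    for lpart, rpart in zip(left_parts, right_parts):
        lookup = defaultdict(list)
        for key, value in rpart:
            lookup[key].append(value)
        joined.append(
            [(key, lv, rv) for key, lv in lpart for rv in lookup.get(key, [])]
        )
    return joined, moved_left + moved_right


# Two "DataFrames", each split into two partitions of (key, value) rows.
left = [[(1, "a"), (2, "b")], [(3, "c"), (4, "d")]]
right = [[(4, "w"), (3, "x")], [(2, "y"), (1, "z")]]

filtered = narrow_filter(left, lambda row: row[0] % 2 == 0)  # no rows cross partitions
joined, rows_moved = hash_join(left, right, num_partitions=2)  # rows cross partitions
```

In this model the filter never looks outside its own partition, while the join has to move rows so that equal keys end up together. That movement is what the answers above call a shuffle.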