Which of the following Spark properties is used to configure the maximum size of an automatically broadcasted DataFrame when performing a join?
The property 'spark.sql.autoBroadcastJoinThreshold' sets the maximum size, in bytes, of a DataFrame that Spark will automatically broadcast when performing a join. A DataFrame whose estimated size is at or below this threshold is copied to every worker node, so each executor can perform the join locally and the larger DataFrame does not have to be shuffled between nodes.
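Since the property takes a value in bytes, a small helper can make the unit conversion explicit. This is only a sketch: `broadcast_threshold_setting` and `BROADCAST_THRESHOLD_KEY` are hypothetical names introduced for illustration and are not part of Spark.

```python
# Name of the Spark SQL property discussed above.
BROADCAST_THRESHOLD_KEY = "spark.sql.autoBroadcastJoinThreshold"


def broadcast_threshold_setting(max_megabytes):
    """Return a (key, value) pair for the threshold (hypothetical helper).

    The value is a string holding a size in bytes, which is the unit the
    property expects. Passing None yields "-1", which disables automatic
    broadcasting.
    """
    if max_megabytes is None:
        return BROADCAST_THRESHOLD_KEY, "-1"
    return BROADCAST_THRESHOLD_KEY, str(int(max_megabytes * 1024 * 1024))


key, value = broadcast_threshold_setting(50)
print(key, value)  # spark.sql.autoBroadcastJoinThreshold 52428800
```

Assuming an active `SparkSession` named `spark`, the pair can then be applied at runtime with `spark.conf.set(key, value)`.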
The correct answer is B. spark.sql.autoBroadcastJoinThreshold. This property in Apache Spark configures the maximum size (in bytes) of a table that will be broadcast to all worker nodes when performing a join. If the estimated size of the table is at or below this threshold, Spark broadcasts it, which can significantly speed up the join by avoiding a shuffle. The default is 10 MB (10485760 bytes), and setting the property to -1 disables automatic broadcasting.
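The size check described above can be modeled in a few lines. This is a simplified sketch, not Spark's planner: `would_auto_broadcast` is a hypothetical function, and in a real job the planner also weighs join type, hints, and the quality of its size estimates.

```python
# Spark's documented default for spark.sql.autoBroadcastJoinThreshold (10 MB).
DEFAULT_THRESHOLD_BYTES = 10 * 1024 * 1024


def would_auto_broadcast(estimated_size_bytes, threshold_bytes=DEFAULT_THRESHOLD_BYTES):
    """Simplified model of the size check (hypothetical helper).

    A table qualifies when its estimated size is non-negative and does not
    exceed the threshold. A threshold of -1 can never be met, which is why
    that value disables automatic broadcasting.
    """
    return 0 <= estimated_size_bytes <= threshold_bytes


print(would_auto_broadcast(5 * 1024 * 1024))      # 5 MB table, default threshold: True
print(would_auto_broadcast(50 * 1024 * 1024))     # 50 MB table, default threshold: False
print(would_auto_broadcast(5 * 1024 * 1024, -1))  # broadcasting disabled: False
```

With the default threshold, only the 5 MB table would be a candidate; raising the threshold above 50 MB would make the second table eligible as well.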