
Certified Associate Developer for Apache Spark Exam - Question 62


Which of the following Spark properties is used to configure the maximum size of an automatically broadcasted DataFrame when performing a join?

Correct Answer: B

The property 'spark.sql.autoBroadcastJoinThreshold' sets the maximum size, in bytes, of a table or DataFrame that Spark will automatically broadcast to all worker nodes when performing a join. When Spark's size estimate for one side of the join falls within this threshold, that side is copied to every executor, so the larger side can be joined in place instead of being shuffled between nodes. The default is 10485760 bytes (10 MB), and setting the value to -1 disables automatic broadcasting.
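As a rough illustration of how this property is typically set, here is a minimal PySpark sketch. It assumes a local Spark installation with the pyspark package available. The application name and the 50 MB and 20 MB values are arbitrary choices for the example, not recommendations.

```python
from pyspark.sql import SparkSession

# Set the threshold when the session is built.
# "broadcast-threshold-demo" is a hypothetical app name; 50 MB is an arbitrary example value.
spark = (
    SparkSession.builder
    .appName("broadcast-threshold-demo")
    .master("local[*]")
    .config("spark.sql.autoBroadcastJoinThreshold", str(50 * 1024 * 1024))
    .getOrCreate()
)

# The property is a runtime SQL setting, so it can also be changed on a live session.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", str(20 * 1024 * 1024))
print(spark.conf.get("spark.sql.autoBroadcastJoinThreshold"))

# A value of -1 turns automatic broadcast joins off.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")

spark.stop()
```

The value is given in bytes, which is why the sizes above are written as multiples of 1024 * 1024.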

Discussion

1 comment
thanab (Option: B)
Sep 16, 2023

The correct answer is B, spark.sql.autoBroadcastJoinThreshold. This property in Apache Spark configures the maximum size (in bytes) of a table that will be broadcast to all worker nodes when performing a join. If the table's estimated size is below this threshold, the table is broadcast, which can significantly speed up join operations.
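The threshold rule described here can be modeled in a few lines of plain Python. This is a simplified sketch of the decision, not Spark's actual planner code. The function name is hypothetical, and treating a size exactly equal to the threshold as eligible is an assumption of this sketch. The real optimizer also takes join type, hints, and available statistics into account.

```python
from typing import Optional

# Spark's documented default for spark.sql.autoBroadcastJoinThreshold: 10 MB in bytes.
DEFAULT_THRESHOLD_BYTES = 10 * 1024 * 1024


def is_auto_broadcast_candidate(
    estimated_size_bytes: Optional[int],
    threshold_bytes: int = DEFAULT_THRESHOLD_BYTES,
) -> bool:
    """Hypothetical helper: would a table of this estimated size qualify
    for automatic broadcast under the given threshold?"""
    # A negative threshold (such as -1) disables automatic broadcasting.
    if threshold_bytes < 0:
        return False
    # Without a usable size estimate, do not assume the table is small.
    if estimated_size_bytes is None or estimated_size_bytes < 0:
        return False
    # Assumption of this sketch: a size equal to the threshold still qualifies.
    return estimated_size_bytes <= threshold_bytes


if __name__ == "__main__":
    print(is_auto_broadcast_candidate(5 * 1024 * 1024))       # 5 MB table, default threshold
    print(is_auto_broadcast_candidate(50 * 1024 * 1024))      # 50 MB table, default threshold
    print(is_auto_broadcast_candidate(1024, threshold_bytes=-1))  # broadcasting disabled
```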