Certified Associate Developer for Apache Spark Exam - Question 63


Which of the following Spark properties is used to configure whether skewed partitions are automatically detected and subdivided into smaller partitions when joining two DataFrames together?

Correct Answer: A

The Spark property that configures whether skewed partitions are automatically detected and subdivided into smaller partitions when joining two DataFrames is `spark.sql.adaptive.skewJoin.enabled` (note the spelling: `skewJoin`, not `skewedJoin`). It dynamically handles skew in sort-merge joins by splitting skewed tasks into smaller, more evenly sized tasks, and it takes effect only when `spark.sql.adaptive.enabled` is also enabled.
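
As a rough illustration of how these settings might be applied from PySpark, here is a minimal sketch. The two property keys are the real Spark SQL names (`skewJoin`, not `skewedJoin`); the helper `enable_skew_join` and the `SKEW_JOIN_CONF` dictionary are hypothetical names introduced only for this example, and the helper assumes it receives an existing `SparkSession`.

```python
# Adaptive Query Execution (AQE) settings related to skew-join handling.
# The keys are Spark SQL properties; the names SKEW_JOIN_CONF and
# enable_skew_join are hypothetical and exist only for this sketch.
SKEW_JOIN_CONF = {
    "spark.sql.adaptive.enabled": "true",           # AQE master switch
    "spark.sql.adaptive.skewJoin.enabled": "true",  # spelled skewJoin, not skewedJoin
}


def enable_skew_join(spark):
    """Apply the skew-join settings to an existing SparkSession.

    `spark` is assumed to expose `spark.conf.set(key, value)`,
    as a SparkSession does through its runtime configuration.
    """
    for key, value in SKEW_JOIN_CONF.items():
        spark.conf.set(key, value)
    return spark
```

With a live session, one would call `enable_skew_join(spark)` before running the join and could then read the value back with `spark.conf.get("spark.sql.adaptive.skewJoin.enabled")`. Setting both keys together reflects the point that the skew-join setting has no effect unless adaptive query execution itself is enabled.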

Discussion

3 comments
Larrave (Option: A)
Jun 22, 2023

The answer should be A, but the config is `skewJoin`, not `skewedJoin`.

thanab (Option: A)
Sep 14, 2023

A. The Spark property used to configure whether skewed partitions are automatically detected and subdivided into smaller partitions when joining two DataFrames is `spark.sql.adaptive.skewJoin.enabled`. This feature dynamically handles skew in sort-merge joins by splitting (and replicating if needed) skewed tasks into roughly evenly sized tasks. It takes effect when both `spark.sql.adaptive.enabled` and `spark.sql.adaptive.skewJoin.enabled` are enabled.

amirshaz (Option: A)
Jan 24, 2024

A is correct.