Certified Associate Developer for Apache Spark Exam - Question 63

Question

Which of the following Spark properties is used to configure whether skewed partitions are automatically detected and subdivided into smaller partitions when joining two DataFrames together?

Examice · Accepted Answer

The Spark property that configures whether skewed partitions are automatically detected and subdivided into smaller partitions when joining two DataFrames is spark.sql.adaptive.skewJoin.enabled. It dynamically handles skew in sort-merge joins by splitting skewed tasks into smaller, more evenly sized tasks.

Larrave · Answer

The answer should be A, but the config is skewJoin, not skewedJoin.

thanab · Answer

A

The Spark property used to configure whether skewed partitions are automatically detected and subdivided into smaller partitions when joining two DataFrames together is `spark.sql.adaptive.skewJoin.enabled`. This feature dynamically handles skew in sort-merge join by splitting (and replicating if needed) skewed tasks into roughly evenly sized tasks. It takes effect when both `spark.sql.adaptive.enabled` and `spark.sql.adaptive.skewJoin.enabled` configurations are enabled.
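
A minimal sketch of switching on the two settings named in this answer, assuming an existing PySpark session. The helper names `enable_skew_join` and `skew_join_active` are hypothetical and only for illustration; the only Spark calls used are `spark.conf.set` and `spark.conf.get`.

```python
# Both flags must be on for skew-join handling to take effect (per the answer above).
AQE_SKEW_JOIN_CONF = {
    "spark.sql.adaptive.enabled": "true",
    "spark.sql.adaptive.skewJoin.enabled": "true",
}


def enable_skew_join(spark):
    """Hypothetical helper: apply both settings to a SparkSession-like object."""
    for key, value in AQE_SKEW_JOIN_CONF.items():
        spark.conf.set(key, value)
    return spark


def skew_join_active(spark):
    """Hypothetical helper: True only if both settings currently read as 'true'."""
    return all(
        str(spark.conf.get(key, "false")).lower() == "true"
        for key in AQE_SKEW_JOIN_CONF
    )


# Usage with a real session (assumes pyspark is installed):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.getOrCreate()
# enable_skew_join(spark)
# print(skew_join_active(spark))
```

Keeping the two keys together in one mapping makes it harder to enable the skew-join flag while leaving adaptive execution off, in which case the skew-join setting has no effect.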

amirshaz · Answer

A is correct
