Certified Associate Developer for Apache Spark Exam - Question 48


Which of the following Spark properties is used to configure whether DataFrame partitions that do not meet a minimum size threshold are automatically coalesced into larger partitions during a shuffle?

Correct Answer: E

The Spark property 'spark.sql.adaptive.coalescePartitions.enabled' controls whether small shuffle partitions are automatically coalesced into larger ones after a shuffle. It takes effect only when adaptive query execution ('spark.sql.adaptive.enabled') is also true. With both set to true, Spark merges contiguous shuffle partitions toward the target size given by 'spark.sql.adaptive.advisoryPartitionSizeInBytes', which avoids scheduling many small tasks.
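As a rough illustration of how this property might be set in practice, here is a minimal PySpark sketch. The helper names `aqe_coalesce_conf` and `build_session`, the app name, and the "64MB" advisory size are assumptions made for this example and do not come from the exam question. According to the Spark tuning docs, the coalescing flag only has an effect when `spark.sql.adaptive.enabled` is also true, so the sketch sets both.

```python
from typing import Dict


def aqe_coalesce_conf(advisory_size: str = "64MB") -> Dict[str, str]:
    """Return the settings that together enable shuffle-partition coalescing.

    The advisory size is an illustrative value chosen for this sketch.
    """
    return {
        # Adaptive query execution must be on for coalescing to apply.
        "spark.sql.adaptive.enabled": "true",
        # The property the exam question asks about.
        "spark.sql.adaptive.coalescePartitions.enabled": "true",
        # Target size Spark aims for when merging contiguous partitions.
        "spark.sql.adaptive.advisoryPartitionSizeInBytes": advisory_size,
    }


def build_session(conf: Dict[str, str]):
    """Create (or reuse) a SparkSession with the given settings applied."""
    # Imported lazily so aqe_coalesce_conf works without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("aqe-coalesce-demo")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


conf = aqe_coalesce_conf()
```

With PySpark installed, `spark = build_session(conf)` would start a session with these settings. On an existing session, the same keys can also be set at runtime with `spark.conf.set(key, value)`.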

Discussion

2 comments
4be8126 (Option: E)
May 3, 2023

The answer is E. spark.sql.adaptive.coalescePartitions.enabled is the Spark property used to configure whether DataFrame partitions that do not meet a minimum size threshold are automatically coalesced into larger partitions during a shuffle. When set to true, Spark automatically coalesces partitions that are smaller than the configured minimum size into larger partitions to optimize shuffles.

juliom6 (Option: E)
Nov 6, 2023

Source: https://spark.apache.org/docs/latest/sql-performance-tuning.html

spark.sql.adaptive.coalescePartitions.enabled: When true and spark.sql.adaptive.enabled is true, Spark will coalesce contiguous shuffle partitions according to the target size (specified by spark.sql.adaptive.advisoryPartitionSizeInBytes), to avoid too many small tasks.
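
To make "coalesce contiguous shuffle partitions according to the target size" more concrete, here is a simplified pure-Python model of the idea. It is not Spark's actual implementation: the function name `coalesce_contiguous`, the greedy rule, and the sample sizes are assumptions for illustration only. Adjacent partitions are merged into one group until adding the next one would push the group past the target size.

```python
from typing import List


def coalesce_contiguous(sizes: List[int], target: int) -> List[List[int]]:
    """Greedily group adjacent partition indices by a target byte size.

    A partition that alone exceeds the target stays in its own group.
    This is a toy model, not Spark's real coalescing logic.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for idx, size in enumerate(sizes):
        # Close the current group if adding this partition would overshoot.
        if current and current_size + size > target:
            groups.append(current)
            current, current_size = [], 0
        current.append(idx)
        current_size += size
    if current:
        groups.append(current)
    return groups


MB = 1024 * 1024
# Hypothetical post-shuffle partition sizes.
sizes = [10 * MB, 20 * MB, 30 * MB, 70 * MB, 5 * MB, 5 * MB]
groups = coalesce_contiguous(sizes, target=64 * MB)
print(groups)  # [[0, 1, 2], [3], [4, 5]]
```

In this toy run, six shuffle partitions collapse into three groups, which would correspond to three tasks instead of six. Only neighbouring partitions are merged, which matches the "contiguous" wording in the docs.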