Which of the following Spark properties is used to configure whether DataFrame partitions that do not meet a minimum size threshold are automatically coalesced into larger partitions during a shuffle?
The Spark property 'spark.sql.adaptive.coalescePartitions.enabled' controls whether DataFrame partitions that do not meet a minimum size threshold are automatically coalesced into larger partitions during a shuffle. When it is set to true and adaptive query execution is on ('spark.sql.adaptive.enabled' is true), Spark merges small shuffle partitions into larger ones so the shuffle does not produce many tiny tasks.
The answer is E. spark.sql.adaptive.coalescePartitions.enabled is the property that controls whether DataFrame partitions below a minimum size threshold are automatically coalesced into larger partitions during a shuffle. When it is set to true, Spark combines the undersized shuffle partitions into larger ones, which reduces the number of small tasks that run after the shuffle.
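For anyone who wants to try this, here is a minimal sketch of turning the setting on from Python. The property names are the ones from the Spark docs; the helper name apply_aqe_coalescing, the dict name, and the "64MB" value are just illustrative choices of mine, and spark is assumed to be an already-created SparkSession.

```python
# Settings related to coalescing small shuffle partitions.
# The keys are real Spark SQL properties; the values are illustrative.
AQE_COALESCE_SETTINGS = {
    # Coalescing only takes effect when adaptive query execution is on.
    "spark.sql.adaptive.enabled": "true",
    # The property the question asks about.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    # Target size Spark aims for when merging partitions (example value).
    "spark.sql.adaptive.advisoryPartitionSizeInBytes": "64MB",
}


def apply_aqe_coalescing(spark, settings=AQE_COALESCE_SETTINGS):
    """Apply each setting through the session's runtime config.

    `spark` is assumed to be a SparkSession, or any object that
    exposes a `conf.set(key, value)` method.
    """
    for key, value in settings.items():
        spark.conf.set(key, value)
    return spark
```

With a real session you would call apply_aqe_coalescing(spark) before running the query that shuffles. The same keys can also be passed with --conf on spark-submit.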
https://spark.apache.org/docs/latest/sql-performance-tuning.html

spark.sql.adaptive.coalescePartitions.enabled: When true and spark.sql.adaptive.enabled is true, Spark will coalesce contiguous shuffle partitions according to the target size (specified by spark.sql.adaptive.advisoryPartitionSizeInBytes), to avoid too many small tasks.
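To make "coalesce contiguous shuffle partitions according to the target size" a bit more concrete, here is a toy model in plain Python. It is not Spark's actual implementation, just a simplified greedy sketch I wrote to illustrate the idea; the function name coalesce_contiguous and the example sizes are made up.

```python
def coalesce_contiguous(sizes, target):
    """Greedily group neighbouring partitions up to a target size.

    `sizes` lists partition sizes in shuffle order; `target` plays the
    role of the advisory partition size. Returns lists of partition
    indices, one list per coalesced partition. Only adjacent partitions
    are ever merged, and a partition larger than the target stays alone.
    """
    groups = []
    current = []
    current_size = 0
    for index, size in enumerate(sizes):
        # Close the current group if adding this partition would overshoot.
        if current and current_size + size > target:
            groups.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups


# Six small shuffle partitions (sizes in MB) with a 64 MB target.
print(coalesce_contiguous([10, 20, 30, 70, 5, 5], 64))
# → [[0, 1, 2], [3], [4, 5]]
```

In this model six partitions become three tasks: the first three are merged, the oversized one is left alone, and the two trailing small ones are merged.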