Exam Certified Associate Developer for Apache Spark
Question 16

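Which of the following operations can be used to create a new DataFrame that has 12 partitions from an original DataFrame df that has 8 partitions?

A. df.repartition(12)
B. df.cache()
C. df.partitionBy(1.5)
D. df.coalesce(12)
E. df.partitionBy(12)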

    Correct Answer: A

    The repartition operation changes the number of partitions in a DataFrame and can either increase or decrease it. df.repartition(12) performs a full shuffle and returns a new DataFrame with 12 partitions in place of the original 8. The other options do not do this: df.cache() marks the DataFrame for caching and leaves its partitioning unchanged; partitionBy() is not a DataFrame method at all (it belongs to DataFrameWriter, as in df.write.partitionBy(...), and takes column names that control the output directory layout); and df.coalesce() can only reduce the number of partitions, so df.coalesce(12) on an 8-partition DataFrame still has 8 partitions.

Discussion
4be8126 (Option: A)

The answer is A. The repartition operation can increase or decrease the number of partitions in a DataFrame. Here the count goes up from 8 to 12, so df.repartition(12) does the job. Option B, df.cache(), caches the DataFrame for faster access but does not change the number of partitions. Option C, df.partitionBy(1.5), is not valid: DataFrame has no partitionBy method, and a fractional value makes no sense as a partition count anyway. Option D, df.coalesce(12), can only reduce the number of partitions; asked for more than the current 8, it leaves the DataFrame at 8. Option E, df.partitionBy(12), is also invalid: partitionBy belongs to DataFrameWriter (df.write.partitionBy(...)) and takes column names that determine how output files are laid out, so it neither accepts a partition count nor changes the number of partitions of df.

NuclearGandhi (Option: A)

nice explanation @4be8126

TmData (Option: A)

The operation that can be used to create a new DataFrame with 12 partitions from an original DataFrame df that has 8 partitions is A. df.repartition(12), not D. df.coalesce(12). Explanation: The coalesce() operation in Spark can only decrease the number of partitions in a DataFrame. Calling df.coalesce(12) on the original DataFrame df with 8 partitions leaves it at 8 partitions, whereas df.repartition(12) shuffles the data into a new DataFrame with 12 partitions.

SonicBoom10C9 (Option: A)

Comprehensive explanation by 4be8126; I'm only using this comment to vote A.