Exam Certified Associate Developer for Apache Spark
Question 99

The code block shown below contains an error. The code block is intended to return the exact number of distinct values in column division in DataFrame storesDF. Identify the error.

Code block:

storesDF.agg(approx_count_distinct(col("division")).alias("divisionDistinct"))

    Correct Answer: E

    The approx_count_distinct() operation returns an estimate of the number of distinct values, not a guaranteed exact count. It therefore cannot be relied on to return the exact number of distinct values in a column, which is what the code block is meant to do.

Discussion
Ram459 (Option: E)

You cannot get an exact distinct count using an approx function.

newusername

Agree, it should be E.

azure_bimonster (Option: E)

storesDF.agg(countDistinct("division").alias("divisionDistinct")) gives an exact distinct count, unlike the approx_count_distinct() call that option E identifies as the error.

thanab (Option: E)

E. The error in the code block is that the approx_count_distinct() operation cannot determine an exact number of distinct values in a column.