Certified Associate Developer for Apache Spark Exam - Question 99


The code block shown below contains an error. The code block is intended to return the exact number of distinct values in column division in DataFrame storesDF. Identify the error.

Code block:

storesDF.agg(approx_count_distinct(col("division")).alias("divisionDistinct"))

Correct Answer: E

The approx_count_distinct() operation returns an estimate of the number of distinct values, accurate only to within a configurable relative error, rather than an exact count. It therefore cannot be relied on to return the exact number of distinct values in a column; an exact count requires countDistinct() instead.

Discussion

3 comments
Ram459 (Option: E)
Aug 16, 2023

You cannot get an exact distinct count using the approx function.

newusername
Nov 9, 2023

agree, should be E

thanab (Option: E)
Sep 14, 2023

E. The error in the code block is that the approx_count_distinct() operation cannot determine an exact number of distinct values in a column.

azure_bimonster (Option: E)
Feb 9, 2024

storesDF.agg(countDistinct("division").alias("divisionDistinct")) gives an exact distinct count, which approx_count_distinct() cannot do, as option E states.