Exam: Certified Associate Developer for Apache Spark
Question 74

QUESTION NO: 75

Which of the following code blocks returns a DataFrame where column divisionDistinct is the approximate number of distinct values in column division from DataFrame storesDF?

    Correct Answer: C

    The correct code block is storesDF.agg(approx_count_distinct(col('division')).alias('divisionDistinct')). Calling agg directly on storesDF, with no preceding groupBy, aggregates over the entire DataFrame and returns a single-row DataFrame. approx_count_distinct estimates the number of distinct values in column division, and alias names the resulting column divisionDistinct.

Discussion
Sowwy1 (Option: C)

I think it's C. See https://spark.apache.org/docs/3.1.2/api/python/reference/api/pyspark.sql.functions.approx_count_distinct.html