Exam: Certified Associate Developer for Apache Spark
Question 137

The code block shown below should return a collection of summary statistics for column sqft in DataFrame storesDF. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.

Code block:

storesDF.__1__(__2__)

    Correct Answer: D

    In PySpark, the DataFrame method that returns summary statistics (count, mean, stddev, min, max) for a column is `describe`. It takes column names as strings rather than Column objects, so the correct completion of the code block is storesDF.describe("sqft").

Discussion
arull (Option: B)

Isn't it option B? storesDF.describe(col("sqft"))

saryu

D is correct.