Certified Associate Developer for Apache Spark Exam - Question 137


The code block shown below should return a collection of summary statistics for column sqft in DataFrame storesDF. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.

Code block:

storesDF.__1__(__2__)

Correct Answer: D

In PySpark, the DataFrame method for obtaining summary statistics (count, mean, standard deviation, minimum, and maximum) is `describe`. It takes column names as strings, so the correct completion of the code block is `storesDF.describe("sqft")`.

Discussion

arull (Option: B)
Jan 27, 2024

Isn't it option B? `storesDF.describe(col("sqft"))`

saryu
Feb 2, 2024

D is correct.