Certified Associate Developer for Apache Spark Exam Questions

Certified Associate Developer for Apache Spark Exam - Question 35


Which of the following code blocks returns a collection of summary statistics for all columns in DataFrame storesDF?

Correct Answer: E

The describe() method returns a DataFrame of summary statistics for the numeric and string columns of the input DataFrame: count, mean, standard deviation, minimum, and maximum. Called with no arguments, storesDF.describe() covers all such columns, so no additional parameters are needed.

Discussion

8 comments
zozoshanky Option: E
Jul 30, 2023

E is correct, it's giving the output.

4be8126 Option: B
Apr 26, 2023

The answer is B. Explanation: The describe() method in DataFrame returns a DataFrame with summary statistics for all numeric columns in the input DataFrame. By default, only the count, mean, standard deviation, minimum, and maximum values are returned, but additional statistics can be specified with the percentiles parameter. Setting the all parameter to True will include non-numeric columns in the output as well. Therefore, option B is the correct answer. Option A is not correct, as the summary() method only returns summary statistics for the specified column(s) and is not a valid option for returning summary statistics for all columns in the DataFrame. Option C is not correct, as the describe() method does not have an "all" option. Option D is also not correct, as the summary() method only returns summary statistics for the specified column(s) and does not have an "all" option. Option E is not incorrect, but it does not specify whether to include non-numeric columns in the output. Therefore, option B is a better answer.

ZSun
Jun 6, 2023

Did you actually try this in PySpark, or look it up in the documentation? TypeError: describe() got an unexpected keyword argument 'all'

8605246
Jun 30, 2023

describe() is correct

Deuterium
Jul 7, 2023

Is your answer from ChatGPT?

cookiemonster42
Aug 3, 2023

Even ChatGPT says E is the correct one :)

juadaves
Oct 19, 2023

TypeError                                 Traceback (most recent call last)
<ipython-input-34-5077330dead7> in <cell line: 1>()
----> 1 storesDF.describe(all = True)
TypeError: DataFrame.describe() got an unexpected keyword argument 'all'

zozoshanky Option: B
Jul 23, 2023

B is correct. Running the last option gives an error. TypeError: describe() got an unexpected keyword argument 'all'

cookiemonster42
Aug 3, 2023

checked it, it gave me the right result, so E is the one

cookiemonster42 Option: E
Aug 3, 2023

check the documentation, mates. both methods receive names of columns as arguments, so E is correct!

souha_axa Option: E
Aug 16, 2023

E is the correct answer

mahmoud_salah30 Option: E
Dec 31, 2023

Tested; E is the right answer.

azure_bimonster Option: E
Feb 8, 2024

E would be correct here

dbdantas Option: E
Apr 9, 2024

E is the correct one