Which of the following code blocks fails to return the number of rows in DataFrame storesDF for each distinct combination of values in column division and column storeCategory?
The code block that fails is the one that groups by column division and then calls groupBy again with storeCategory on the result. DataFrame.groupBy returns a GroupedData object, not a DataFrame, and GroupedData has no groupBy method, so the second call raises an AttributeError before count() is ever reached. Therefore, the answer to the question (the block that fails) is storesDF.groupBy('division').groupBy('storeCategory').count().
B is the right choice. I tested it with my own DataFrame, and option B threw this error: AttributeError: 'GroupedData' object has no attribute 'groupBy'
D is correct!
This runs fine in PySpark: storesDF.groupBy("division", "storeCategory").count(). The correct answer is B.
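For anyone who wants to check this themselves, here is a minimal sketch of the two behaviours discussed above. The helper names (count_per_combination, chained_group_by_error, demo) and the sample rows are made up for illustration; only groupBy, count, show, createDataFrame and SparkSession come from PySpark. demo() assumes a working local PySpark installation and is not called automatically.

```python
def count_per_combination(df):
    """Row count for each distinct (division, storeCategory) pair."""
    # A single groupBy call with both columns returns a GroupedData object;
    # count() turns it back into a DataFrame with an extra "count" column.
    return df.groupBy("division", "storeCategory").count()


def chained_group_by_error(df):
    """Return the message of the error raised by chaining groupBy, or None."""
    try:
        # The first groupBy returns GroupedData, which has no groupBy method,
        # so the second call fails before count() is reached.
        df.groupBy("division").groupBy("storeCategory").count()
    except AttributeError as exc:
        return str(exc)
    return None


def demo():
    # Hypothetical demo: requires PySpark; the rows below are invented.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("groupby-demo")
        .getOrCreate()
    )
    storesDF = spark.createDataFrame(
        [("east", "A"), ("east", "A"), ("east", "B"), ("west", "A")],
        ["division", "storeCategory"],
    )
    count_per_combination(storesDF).show()
    print(chained_group_by_error(storesDF))
    spark.stop()
```

Calling demo() in a PySpark environment should show one row per (division, storeCategory) pair and then print the same AttributeError message quoted in the comment above.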