Exam: Certified Associate Developer for Apache Spark
Question 19

The code block shown below contains an error. The code block is intended to return a DataFrame containing all columns from DataFrame storesDF except for column sqft and column customerSatisfaction. Identify the error.

Code block:

storesDF.drop(sqft, customerSatisfaction)

    Correct Answer: D

    The code block is intended to drop two columns from the DataFrame storesDF. Here, drop must be given the column names as strings. Written without quotes, sqft and customerSatisfaction are treated as variable names rather than column names, and no such variables are defined. The column names should therefore be quoted as "sqft" and "customerSatisfaction". The corrected code block is: storesDF.drop("sqft", "customerSatisfaction").
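To illustrate why the unquoted version fails, here is a minimal sketch in plain Python. It uses a hypothetical stand-in class, FakeDataFrame, rather than PySpark itself, and the extra column names storeId and city are assumptions made up for the example. It only shows how Python evaluates the arguments before the call; it does not reproduce the real drop implementation.

```python
# Hypothetical stand-in for a DataFrame; this is NOT PySpark, only an
# illustration of how Python evaluates the arguments passed to drop().
class FakeDataFrame:
    def __init__(self, columns):
        self.columns = list(columns)

    def drop(self, *cols):
        # Mimics a varargs drop: return a new object without the named columns.
        return FakeDataFrame(c for c in self.columns if c not in cols)


# Column names other than sqft and customerSatisfaction are assumed.
storesDF = FakeDataFrame(["storeId", "sqft", "customerSatisfaction", "city"])

# Unquoted names are looked up as Python variables, so this fails with a
# NameError before drop() is ever called.
try:
    storesDF.drop(sqft, customerSatisfaction)
    unquoted_error = None
except NameError as exc:
    unquoted_error = type(exc).__name__

# Quoted names are plain strings and are passed through as intended.
quoted = storesDF.drop("sqft", "customerSatisfaction")

# A list of names can be unpacked into the same varargs call.
cols_to_drop = ["sqft", "customerSatisfaction"]
unpacked = storesDF.drop(*cols_to_drop)

print(unquoted_error)    # NameError
print(quoted.columns)    # ['storeId', 'city']
print(unpacked.columns)  # ['storeId', 'city']
```

The same reasoning applies to the real API: the quoted form and the unpacked-list form both pass strings to drop, while the unquoted form never reaches it.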

Discussion
4be8126 Option: D

The error in the code block is that the column names sqft and customerSatisfaction should be quoted, like "sqft" and "customerSatisfaction", since they are strings. The correct code block should be: storesDF.drop("sqft", "customerSatisfaction"). Option D correctly identifies this error.

ZSun

The correct one is B: storesDF.drop("sqft").drop("customerSatisfaction"). For D, it should be a list of column names: storesDF.drop(["sqft", "customerSatisfaction"]).

ZSun

The correct one is D, but my explanation is correct

azurearch Option: D

Sorry, Option D is correct.

azurearch Option: A

https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.drop.html Option A is correct: drop expects only one argument. If there is more than one, you would have to use a list, such as listofcols=['col1','col2'], and call drop(*listofcols).

zozoshanky Option: D

D is correct. Tested code: df.drop('id','firstname').show()

TmData Option: D

When using the drop() operation on a Spark DataFrame, the column names should be specified as strings and enclosed in quotes. In the given code block, the column names sqft and customerSatisfaction are not quoted, so they are read as undefined variables. In PySpark this raises a NameError rather than a syntax error.