
Certified Associate Developer for Apache Spark Exam - Question 22


Which of the following code blocks returns a new DataFrame from DataFrame storesDF where column storeId is of the type string?

Correct Answer: B

To return a new DataFrame from storesDF in which column storeId is of type string, combine the withColumn method with the col function and the Column cast method: storesDF.withColumn('storeId', col('storeId').cast(StringType())). Because the name passed to withColumn already exists, the cast column replaces the original storeId in the returned DataFrame, and storesDF itself is left unchanged. Option B matches this syntax apart from a minor typo, so B is the intended correct option.

Discussion

4 comments
dduque10 (Option: B)
May 15, 2023

All answers are wrong because the first argument does not have the closing quotes :D. Apart from that, it is B.

4be8126 (Option: B)
Apr 26, 2023

The correct code block to return a new DataFrame from DataFrame storesDF where column storeId is of the type string is:

storesDF.withColumn("storeId", col("storeId").cast(StringType()))

Option A has an extra quotation mark after "storeId" and is missing a closing parenthesis for the cast() function.
Option B correctly uses the cast() function to change the data type, but has a typo where "storeId" is repeated inside the string argument for the withColumn() function.
Option C is missing the col() function to reference the storeId column, and also has a typo with the closing parentheses for the cast() function.
Option D correctly references the storeId column using col(), but has a typo with the quotation marks and parentheses.
Option E has a syntax error where the cast() function is inside the quotation marks, and is also missing the col() function to reference the storeId column.

Therefore, the correct answer is B: storesDF.withColumn("storeId", col("storeId").cast(StringType()))

ZSun (Option: B)
Jun 6, 2023

cast is a method that belongs to the class pyspark.sql.Column; therefore A, C, and E are wrong. It should be dataframe.column.cast() or col('col_name').cast(). B is correct, with a small typo.

DataEngine (Option: B)
Oct 29, 2023

Answer is B, but it has a typo.