Exam: Certified Associate Developer for Apache Spark
Question 150

The code block shown below should return a new DataFrame where column productCategories only has one word per row, resulting in a DataFrame with many more rows than DataFrame storesDF. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.

A sample of storesDF is displayed below:

Code block:

storesDF.__1__(__2__, __3__(__4__(__5__)))

    Correct Answer: C

    The 'productCategories' column holds an array of categories for each store. To turn each array element into its own row while retaining the corresponding 'storeId', combine the withColumn method with the explode function: explode emits one output row per element of the array and repeats the values of the other columns on each of those rows. The code block should therefore use withColumn to create a new column 'productCategory' and pass it explode applied to col('productCategories').
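The row-multiplying behaviour described above can be modelled in plain Python. This is only a sketch of the semantics, not Spark code: the helper name `explode_rows` and the sample rows are invented for illustration, and rows are represented as dictionaries. It also shows the difference between reusing the column name and introducing a new one.

```python
# Plain-Python model of explode semantics; this is not Spark code.
# The helper name and the sample rows are invented for illustration.

def explode_rows(rows, source, target):
    """Emit one output row per element of row[source], stored under target.

    Rows whose array is None or empty produce no output rows, which mirrors
    explode (explode_outer would keep such rows with a null instead).
    """
    out = []
    for row in rows:
        for item in row[source] or []:
            new_row = dict(row)      # copy every existing column, e.g. storeId
            new_row[target] = item   # overwrite or add the target column
            out.append(new_row)
    return out


stores = [
    {"storeId": 0, "productCategories": ["netus", "pellentesque"]},
    {"storeId": 1, "productCategories": ["fusce"]},
    {"storeId": 2, "productCategories": []},
]

# Target name equals source name: the array column is replaced by single values.
replaced = explode_rows(stores, "productCategories", "productCategories")

# New singular name: the original array column is kept next to the new column.
added = explode_rows(stores, "productCategories", "productCategory")

for row in replaced:
    print(row)
```

In this model, reusing the name leaves a single `productCategories` column with one value per row, whereas a new name such as `productCategory` leaves the original array column in place as well.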

Discussion
5effea7 Option: E

The answer is E. If you feel like getting in an argument with the question on the proper use of plural field names, then pick C.

deadbeef38 Option: C

new column should be "productCategory" singular

deadbeef38

ok, that wasn't in the spec, but it should have been. I guess E is ok then.

Sowwy1 Option: E

E. 1. withColumn 2. "productCategories" 3. explode 4. col 5. "productCategories"
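Filled in, those five blanks give the PySpark call below. This is a minimal sketch that assumes storesDF has an array column named productCategories; the wrapper function `explode_product_categories` is a hypothetical name added only so the snippet is self-contained.

```python
def explode_product_categories(stores_df):
    """Return stores_df with productCategories exploded to one value per row."""
    # Imported inside the function so the sketch can be loaded without a
    # Spark installation; calling it does require PySpark.
    from pyspark.sql.functions import col, explode

    # Blanks: 1 = withColumn, 2 = "productCategories", 3 = explode,
    #         4 = col,        5 = "productCategories"
    return stores_df.withColumn(
        "productCategories", explode(col("productCategories"))
    )
```

Calling `explode_product_categories(storesDF)` is equivalent to the one-line code block in the question with the blanks filled in. Because the column name is reused, the array column is overwritten and no extra column is added.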