Certified Associate Developer for Apache Spark Exam - Question 140


Which of the following code blocks uses SQL to return a new DataFrame containing column storeId and column managerName from a table created from DataFrame storesDF?

Correct Answer: D

To return a DataFrame containing specific columns using SQL in PySpark, first register the original DataFrame as a temporary view, then run a SQL query against that view. The code block in option D does exactly this: it calls 'createOrReplaceTempView' on 'storesDF' to create a temporary view named 'stores', then passes the query 'SELECT storeId, managerName FROM stores' to 'spark.sql'. The call to 'spark.sql' returns a new DataFrame containing only the 'storeId' and 'managerName' columns; 'storesDF' itself is not modified.

Discussion

1 comment
Sowwy1 (Option: D)
Apr 10, 2024

D is correct.