Exam: Certified Associate Developer for Apache Spark
Question 43

The code block shown below contains an error. The code block is intended to use SQL to return a new DataFrame containing column storeId and column managerName from a table created from DataFrame storesDF. Identify the error.

Code block:

storesDF.createOrReplaceTempView("stores")

storesDF.sql("SELECT storeId, managerName FROM stores")

    Correct Answer: B

    The sql() function is not a method of the DataFrame object. It is a method of the SparkSession object spark. The correct way to execute a SQL statement using Spark SQL is to call sql() on the SparkSession object as follows: spark.sql('SELECT storeId, managerName FROM stores').

Discussion
4be8126 Option: B

Option B is correct because sql() is not a method of a DataFrame object; it is a method of the SparkSession object spark. The correct way to execute a SQL statement with Spark SQL is therefore to call sql() on the SparkSession: spark.sql("SELECT storeId, managerName FROM stores"). In the code block from the question, sql() is called on the DataFrame storesDF, so the call fails instead of running the query, because DataFrame provides no sql() method. Option B therefore correctly identifies the error.

juliom6 Option: B

B is correct:

storesDF = spark.createDataFrame([('1', 'juan'), ('2', 'perez')], ['storeId', 'managerName'])
storesDF.createOrReplaceTempView("stores")
spark.sql("SELECT storeId, managerName FROM stores").show()