Certified Associate Developer for Apache Spark Exam Questions

Certified Associate Developer for Apache Spark Exam - Question 60


In what order should the lines of code below be run to read a JSON file at the file path filePath into a DataFrame with the specified schema schema?

Lines of code:

1. .json(filePath, schema = schema)

2. .storesDF

3. .spark \

4. .read() \

5. .read \

6. .json(filePath, format = schema)

Correct Answer: C

To read a JSON file into a DataFrame with a specified schema, assign the result to storesDF, access the Spark session's read attribute to obtain a DataFrameReader, and call its json method with the schema. Note that read is a property, not a method, so line 4 (.read()) is incorrect; line 5 (.read) is the right one. Likewise, json takes a schema parameter, not a format parameter, so line 1 is correct and line 6 is not. The lines therefore run in the order 2. storesDF, 3. spark, 5. read, 1. json(filePath, schema = schema), i.e. storesDF = spark.read.json(filePath, schema = schema).

Discussion

4 comments
ZSun (Option: C)
Jun 7, 2023

storesDF = spark.read.json(filePath, schema = schema) C

juliom6 (Option: C)
Nov 14, 2023

C is correct: https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrameReader.json.html json function does not have a "format" parameter.

Jtic (Option: B)
Jun 1, 2023

2. .storesDF: This line is unrelated to reading the JSON file and can be disregarded.
4. .read(): This line invokes the DataFrameReader's read() method to create a DataFrameReader object.
1. .json(filePath, schema=schema): This line uses the DataFrameReader object to read the JSON file at the specified filePath into a DataFrame with the provided schema.

pnev
Jan 2, 2024

This is so wrong... in order to read a table you need to use spark.read.json / parquet / table.

azure_bimonster (Option: C)
Feb 8, 2024

We use the following structure: spark.read.json(filePath, schema=schemaName)