The code block shown below contains an error. The code block is intended to create a single-column DataFrame from the Scala List years, which is made up of integers. Identify the error.
Code block:
spark.createDataset(years)
To create a DataFrame from a Scala List of integers, the operation to use is createDataFrame, not createDataset. The code spark.createDataset(years) returns a Dataset[Int] rather than a DataFrame, so it does not produce what the question asks for. Note that in Scala, createDataFrame expects a Seq of Product values (tuples or case classes), so the integers would have to be wrapped first.
C is the answer
Since this is a Scala question, the correct syntax would be spark.createDataset(years).toDF("year"), but that isn't one of the options.
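To make the Dataset versus DataFrame distinction concrete, here is a minimal sketch of the createDataset followed by toDF approach. The object name, the helper toYearsDF, the local SparkSession settings, and the sample values are all assumptions made for illustration, not part of the exam question.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object YearsToDataFrame {
  // Hypothetical helper: converts a List[Int] into a single-column DataFrame.
  def toYearsDF(spark: SparkSession, years: List[Int]): DataFrame = {
    import spark.implicits._ // supplies the Encoder[Int] that createDataset needs

    // createDataset gives a typed Dataset[Int], not a DataFrame.
    val ds: Dataset[Int] = spark.createDataset(years)

    // toDF converts it to a DataFrame and names the single column.
    ds.toDF("year")
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("years-example")
      .master("local[*]")
      .getOrCreate()

    val years = List(2019, 2020, 2021) // hypothetical sample data
    toYearsDF(spark, years).printSchema()

    spark.stop()
  }
}
```

Calling toDF with no arguments, as in option A of the practice test, should also yield a DataFrame; the column then keeps Spark's default name instead of "year".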
Official Databricks tests (where the answer is A), Question 44:
Which of the following code blocks creates a single-column DataFrame from Scala List years which is made up of integers?
A. spark.createDataset(years).toDF
B. spark.createDataFrame(years, IntegerType)
C. spark.createDataset(years)
D. spark.DataFrame(years, IntegerType)
E. spark.createDataFrame(years)
Hence I'll go for D.
C. There is no operation createDataset – the createDataFrame operation should be used instead. Strictly speaking, createDataset does exist in the Scala API; it returns a Dataset[Int] rather than a DataFrame. One way to build the DataFrame with createDataFrame would be:
val df = spark.createDataFrame(years.map(Tuple1.apply)).toDF("columnName")
This assumes that years is a List of integers; the resulting DataFrame will have a single column named "columnName".
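The createDataFrame route above can be sketched as a small runnable program. The object name, the helper viaTuple1, the session settings, and the sample values are assumptions for illustration. The wrapping in Tuple1 is needed because the Scala overload createDataFrame(Seq[A]) requires A to be a Product, which a bare Int is not.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object YearsViaCreateDataFrame {
  // Hypothetical helper: builds a single-column DataFrame with createDataFrame.
  def viaTuple1(spark: SparkSession, years: List[Int]): DataFrame = {
    // Wrap each Int in Tuple1 so the element type is a Product,
    // then rename the default column to "year".
    spark.createDataFrame(years.map(Tuple1.apply)).toDF("year")
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("years-tuple-example")
      .master("local[*]")
      .getOrCreate()

    val years = List(2019, 2020, 2021) // hypothetical sample data
    viaTuple1(spark, years).show()

    spark.stop()
  }
}
```

Both this and the createDataset(years).toDF form should give a one-column DataFrame of integers; the Tuple1 version simply avoids needing the implicit Encoder from spark.implicits.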