Exam: Certified Associate Developer for Apache Spark
Question 44

The code block shown below should create a single-column DataFrame from Python list years which is made up of integers. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.

Code block:

_1_._2_(_3_, _4_)

    Correct Answer: E

    The correct way to create a single-column DataFrame from a Python list made up of integers in PySpark is by using the DataFrame API with the function spark.createDataFrame. Since 'years' is already a list, it should be passed directly along with the specified data type IntegerType() to ensure each element in the list is interpreted as an integer. The correct method call is spark.createDataFrame(years, IntegerType()).

Discussion
peekaboo15 (Option: E)

The answer should be E because years is already a Python list.

Indiee (Option: E)

Two points:
1. D raises an error. E splits the list into rows, one element per row.
2. spark.createDataFrame([arrayVar_name], ArrayType(IntegerType())) stores the whole array as a single row.

znets (Option: E)

E is the most suitable, but it also contains an error. In PySpark, the correct class name for the integer data type is IntegerType (not "IntegertType").

mahmoud_salah30 (Option: E)

E is the right answer.

juliom6 (Option: E)

E is correct:

from pyspark.sql.types import IntegerType
years = [2023, 2024]
print(type(years))
storesDF = spark.createDataFrame(years, IntegerType())
storesDF.show()

<class 'list'>
+-----+
|value|
+-----+
| 2023|
| 2024|
+-----+

juadaves (Option: D)

D

from pyspark.sql.types import IntegerType
spark.createDataFrame([1991,2023],IntegerType()).show()

+-----+
|value|
+-----+
| 1991|
| 2023|
+-----+

carlosmps

It's E. years is already a list.

thanab (Option: E)

1. spark
2. createDataFrame
3. years
4. IntegertType()

cookiemonster42 (Option: E)

If years is a variable, it works; I just tested it:

years = [1, 3, 4, 5, 9]
df7 = spark.createDataFrame(years, IntegerType())
df7.show()

This works as well:

df7 = spark.createDataFrame([1, 3, 4, 5, 9], IntegerType())
df7.show()

This won't work:

df7 = spark.createDataFrame([years], IntegerType())
df7.show()

So the answer is E.

singh100 (Option: E)

E. D gives an error.

zozoshanky (Option: E)

D throws a big error.

/usr/local/spark/python/pyspark/sql/types.py in verify_acceptable_types(obj)
   1291 # subclass of them can not be fromInternal in JVM
   1292 if type(obj) not in _acceptable_types[_type]:
-> 1293 raise TypeError(new_msg("%s can not accept object %r in type %s"
   1294 % (dataType, obj, type(obj))))
   1295
TypeError: field value: IntegerType can not accept object [1, 2, 3, 4, 5] in type <class 'list'>

E is the correct answer:

from pyspark.sql.types import IntegerType
a = [1,2,3,4,5]
spark.createDataFrame(a, IntegerType()).show()

Indiee

Agreed