Exam: Certified Associate Developer for Apache Spark
Question 134

Which of the following code blocks returns a new DataFrame where column division is the first two characters of column division in DataFrame storesDF?

    Correct Answer: B

    To extract the first two characters of a column in a DataFrame, use the substr function. Positions in substr are 1-based in PySpark, so the first two characters are obtained with a start position of 1 and a length of 2. The code block 'storesDF.withColumn("division", substr(col("division"), 1, 2))' is therefore correct: it specifies a start position of 1 and a length of 2.
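    The position rule described above can be illustrated with a small plain-Python model. This is only a sketch: the helper name spark_style_substr and the sample value "Northeast" are hypothetical and do not come from the question. Treating position 0 like position 1 is an assumption about Spark's SQL-style substring semantics, and negative positions are left out. The commented lines show the corresponding PySpark call, Column.substr(startPos, length), which needs a running SparkSession and is not executed here.

```python
def spark_style_substr(value: str, pos: int, length: int) -> str:
    """Hypothetical helper modelling 1-based substring positions."""
    if pos < 0:
        # Negative positions are deliberately outside the scope of this sketch.
        raise ValueError("negative positions are not modelled here")
    # Position 1 is the first character; this sketch assumes 0 behaves like 1.
    start = max(pos - 1, 0)
    return value[start:start + max(length, 0)]


# PySpark counterpart (requires a SparkSession, not run here):
#   from pyspark.sql.functions import col
#   storesDF.withColumn("division", col("division").substr(1, 2))

print(spark_style_substr("Northeast", 1, 2))  # first two characters
print(spark_style_substr("Northeast", 2, 2))  # starts at the second character
```

    Under these assumptions, a start position of 1 with a length of 2 yields the first two characters, while a start position of 2 skips the first character.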

Discussion
JuanitoFM (Option: D)

it should be D, the first two characters should be from 0-2

iamadoctor (Option: E)

E is right

amirshaz (Option: D)

it should be D, the first two characters should be from 0-2

5cf7ace (Option: D)

D is right

azure_bimonster (Option: D)

D would be correct, as the question asks for the first two characters. substr(startIndex: Int, length: Int) takes two arguments: startIndex, the starting index of the substring to extract (indexing starts from 0), and length, the length of the substring to extract.