Which of the following code blocks returns a new DataFrame where column division is the first two characters of column division in DataFrame storesDF?
To extract the first two characters of a column in a DataFrame, use the substr function. The start index for substr in PySpark is 1-based, so getting the first two characters takes a start index of 1 and a length of 2. 'storesDF.withColumn("division", substr(col("division"), 1, 2))' is therefore correct: it specifies the start index as 1 and the length as 2.
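For anyone who wants to check the 1-based reasoning, here is a rough sketch. `substr_1_based` is a hypothetical plain-Python helper, not a Spark API, and it only models start positions of 1 or more. `first_two_chars_spark` uses the method form `col("division").substr(1, 2)` on a made-up two-row `storesDF`; it needs pyspark installed, and the sample rows are an assumption for illustration only.

```python
def substr_1_based(text: str, pos: int, length: int) -> str:
    """Plain-Python model of a 1-based substring.

    Hypothetical helper for illustration; it is not part of PySpark and
    only models start positions >= 1.
    """
    if pos < 1:
        raise ValueError("this sketch only models start positions >= 1")
    if length <= 0:
        return ""
    start = pos - 1  # convert the 1-based position to Python's 0-based index
    return text[start:start + length]


def first_two_chars_spark():
    """Apply Column.substr(1, 2) to a small, made-up storesDF.

    Requires pyspark; the import is local so the helper above still works
    without it. The sample rows are assumed data, not from the question.
    """
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.master("local[1]").appName("substr-demo").getOrCreate()
    storesDF = spark.createDataFrame([("northeast",), ("southwest",)], ["division"])

    # Start position 1, length 2: overwrite division with its first two characters.
    result = storesDF.withColumn("division", col("division").substr(1, 2))
    rows = [row["division"] for row in result.collect()]
    spark.stop()
    return rows


if __name__ == "__main__":
    print(substr_1_based("northeast", 1, 2))
```

Running the file only exercises the plain-Python helper; call `first_two_chars_spark()` separately on a machine with Spark available to try the DataFrame version.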
It should be D; the first two characters should be from 0-2.
E is right
D is right
D would be correct, as the question asks for the first two characters. substr(startIndex: Int, length: Int) takes two arguments: startIndex, the starting index of the substring to extract (indexing starts from 0), and length, the length of the substring to extract.