Exam Certified Associate Developer for Apache Spark
Question 157

Which of the following code blocks creates a Python UDF assessPerformanceUDF() using the integer-returning Python function assessPerformance() and applies it to Column customerSatisfaction in DataFrame storesDF?

    Correct Answer: B

    To create a Python UDF, use the udf function from the pyspark.sql.functions module: udf(assessPerformance, IntegerType()), where assessPerformance is the Python function and IntegerType() declares the UDF's return type. If the return type is omitted, udf defaults to StringType, which would not match an integer-returning function. Once the UDF is defined, apply it to the column with the DataFrame method withColumn and the col function. The correct code block is therefore assessPerformanceUDF = udf(assessPerformance, IntegerType()) followed by storesDF.withColumn('result', assessPerformanceUDF(col('customerSatisfaction'))).
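The two statements in option B can be tried end to end with a small sketch like the one below. The body of assessPerformance, the sample rows, the storeId column, and the run_demo helper are all hypothetical; the question only states that the function returns an integer and that storesDF has a customerSatisfaction column. It also assumes a local PySpark installation.

```python
def assessPerformance(customerSatisfaction):
    """Hypothetical stand-in: the question only says this returns an integer."""
    if customerSatisfaction is None:
        return None  # null column values reach a Python UDF as None
    return 1 if customerSatisfaction >= 50 else 0


def run_demo():
    # PySpark imports live inside the helper so that assessPerformance
    # can still be imported and tested without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import IntegerType

    spark = (
        SparkSession.builder.master("local[1]").appName("udf-sketch").getOrCreate()
    )

    # Made-up rows standing in for storesDF.
    storesDF = spark.createDataFrame(
        [(1, 80), (2, 35), (3, None)],
        ["storeId", "customerSatisfaction"],
    )

    # Statement 1: wrap the Python function and declare its return type.
    assessPerformanceUDF = udf(assessPerformance, IntegerType())

    # Statement 2: apply the UDF to the column and store the output in 'result'.
    resultDF = storesDF.withColumn(
        "result", assessPerformanceUDF(col("customerSatisfaction"))
    )
    resultDF.show()

    spark.stop()
```

Calling run_demo() starts a local session, prints the DataFrame with the added result column, and stops the session. withColumn returns a new DataFrame rather than modifying storesDF, so the result has to be assigned or used directly.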

Discussion
Sowwy1 (Option: B)

B. assessPerformanceUDF = udf(assessPerformance, IntegerType())
   storesDF.withColumn("result", assessPerformanceUDF(col("customerSatisfaction")))