Exam: Certified Associate Developer for Apache Spark
Question 114

Which of the following code blocks returns a DataFrame containing only the rows from DataFrame storesDF where the value in column sqft is less than or equal to 25,000 AND the value in column customerSatisfaction is greater than or equal to 30?

    Correct Answer: D

    The correct code block is 'storesDF.filter((col("sqft") <= 25000) & (col("customerSatisfaction") >= 30))'. In PySpark, '&' is the logical AND operator for Column expressions; Python's 'and' keyword does not work on Columns. Note that each comparison must be wrapped in parentheses, because '&' binds more tightly than the comparison operators.
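The row-filtering logic can be sketched in plain Python (the sample rows below are invented for illustration; in real PySpark you would use '&' on Column expressions rather than the 'and' keyword used here):

```python
# Hypothetical sample data mirroring storesDF's two columns from the question.
stores = [
    {"sqft": 20000, "customerSatisfaction": 35},  # passes both conditions
    {"sqft": 30000, "customerSatisfaction": 40},  # sqft too large
    {"sqft": 24000, "customerSatisfaction": 10},  # satisfaction too low
]

# Keep rows where sqft <= 25,000 AND customerSatisfaction >= 30.
kept = [
    row for row in stores
    if row["sqft"] <= 25000 and row["customerSatisfaction"] >= 30
]
# kept contains only the first row.
```

The same predicate in PySpark would be written with Column objects, where '&' replaces 'and' and each comparison is parenthesized.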

Discussion
deadbeef38 (Option: A)

A is right

Sowwy1 (Option: D)

It's D (see https://sparkbyexamples.com/spark/spark-and-or-not-operators/). PySpark logical operations use the bitwise operators: '&' for AND, '|' for OR, and '~' for NOT.

sionita (Option: E)

The answer should be E. With multiple conditions, Spark requires parentheses around each one, e.g. df.filter((cond1) & (cond2)).

MSH_6 (Option: A)

A is the right answer.

newusername

No, in PySpark you use '&', not 'and'. D is correct.