Exam: Certified Associate Developer for Apache Spark
Question 66

Which of the following code blocks returns a DataFrame containing only the rows from DataFrame storesDF where the value in column sqft is less than or equal to 25,000 OR the value in column customerSatisfaction is greater than or equal to 30?

    Correct Answer: B

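    To filter a DataFrame on multiple conditions in PySpark, use the `filter` method with column expressions built from `col`, and combine them with the `|` operator for 'or' (Python's `or` keyword does not work on columns). Because `|` binds more tightly than the comparison operators in Python, each comparison must be wrapped in its own parentheses: `storesDF.filter((col('sqft') <= 25000) | (col('customerSatisfaction') >= 30))`. The statement as quoted for option B, `storesDF.filter(col('sqft') <= 25000 | col('customerSatisfaction') >= 30)`, uses the right method and operator, but without the parentheses Python evaluates `25000 | col('customerSatisfaction')` first and treats the rest as a chained comparison, which PySpark rejects with an error instead of applying the intended filter.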
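Whether the two comparisons need their own parentheses comes down to Python operator precedence, and that can be checked without a Spark session. The sketch below is a minimal illustration using only the standard `ast` module: it parses the filter condition with and without parentheses and reports the root of each expression tree. Nothing is evaluated, so `col` does not need to be imported. The helper name `top_level_node` and the two constant names are hypothetical, chosen for this example only.

```python
import ast

# The two filter conditions, kept as strings so they are parsed but never run.
UNPARENTHESIZED = "col('sqft') <= 25000 | col('customerSatisfaction') >= 30"
PARENTHESIZED = "(col('sqft') <= 25000) | (col('customerSatisfaction') >= 30)"


def top_level_node(expr: str) -> ast.expr:
    """Parse a Python expression without evaluating it and return its root node."""
    return ast.parse(expr, mode="eval").body


loose = top_level_node(UNPARENTHESIZED)
tight = top_level_node(PARENTHESIZED)

# Without parentheses, the root is a chained comparison (a <= X >= 30)
# whose middle operand X is the bitwise-or `25000 | col(...)`.
print(type(loose).__name__)                 # Compare
print(type(loose.comparators[0]).__name__)  # BinOp

# With parentheses, the root is the `|` operation joining two comparisons.
print(type(tight).__name__)                 # BinOp
print(type(tight.op).__name__)              # BitOr
```

Only the parenthesized form has `|` at the root with a comparison on each side, which is the shape the intended 'or' filter requires.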

Discussion
Akash567890978 (Option: B)

I don't think even B is correct; the conditions should be inside parentheses as well.