Exam: Certified Associate Developer for Apache Spark
Question 110

Which of the following pairs of arguments cannot be used in DataFrame.join() to perform an inner join on two DataFrames, named and aliased with "a" and "b" respectively, to specify two key columns column1 and column2?

    Correct Answer: B

    DataFrame.join() accepts the join keys in two forms: a join expression built from columns, or a sequence of column names. The 'usingColumns' argument takes the second form, so it must be a sequence of column-name strings, not column expressions. Specifying 'usingColumns = Seq(col("column1"), col("column2"))' is therefore invalid; the correct usage is 'usingColumns = Seq("column1", "column2")'. Option B is the pair that cannot be used, which makes it the answer.
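
The difference between the two forms can be sketched in Scala as follows. This is a minimal illustration, not the exam's own code: the sample rows, the `valueA`/`valueB` columns, the object name and the local SparkSession settings are all assumptions made for the example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object JoinOnTwoKeys {
  // Local session for this sketch only (assumed settings).
  val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("join-on-two-keys")
    .getOrCreate()

  import spark.implicits._

  // Hypothetical sample data; only the row (1, "x") exists on both sides.
  val a: DataFrame = Seq((1, "x", "left-1"), (2, "y", "left-2"))
    .toDF("column1", "column2", "valueA").as("a")
  val b: DataFrame = Seq((1, "x", "right-1"), (3, "z", "right-3"))
    .toDF("column1", "column2", "valueB").as("b")

  // Form 1: usingColumns as a sequence of column NAMES (inner join by default).
  val byNames: DataFrame = a.join(b, Seq("column1", "column2"))

  // Form 2: a join expression built from columns referenced through the aliases.
  val byExpr: DataFrame = a.join(
    b,
    col("a.column1") === col("b.column1") && col("a.column2") === col("b.column2")
  )

  // Form 2 again, referencing columns through the DataFrame objects themselves.
  val byApply: DataFrame = a.join(
    b,
    a("column1") === b("column1") && a("column2") === b("column2")
  )

  // Not valid: usingColumns expects Seq[String], not Seq[Column].
  // a.join(b, Seq(col("column1"), col("column2")))

  def main(args: Array[String]): Unit = {
    byNames.show()
    byExpr.show()
    byApply.show()
    spark.stop()
  }
}
```

With the name-based form the key columns appear once in the result, while the expression-based forms keep both copies of column1 and column2, so the choice also affects the output schema.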

Discussion
Alucard069 (Option: D)

Will option D work? I have never seen this way of accessing columns in a DataFrame: df("column"). It should be either df.column or col("a.column") (considering a as an alias of df).

Sowwy1 (Option: B)

I think it's B.