Certified Associate Developer for Apache Spark Exam Questions

Certified Associate Developer for Apache Spark Exam - Question 52


Which of the following operations can perform an outer join on two DataFrames?

Correct Answer: D

An outer join on two Spark DataFrames is performed with the DataFrame.join() method, by passing 'outer' (or an equivalent such as 'full' or 'full_outer') as the 'how' parameter. DataFrame.merge() is a pandas method and is not a method of pyspark.sql.DataFrame, and Spark has no DataFrame.outerJoin() method.
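To make concrete what how='outer' asks for, here is a minimal plain-Python sketch of full outer join semantics on a single key. It is an illustration only, not Spark's implementation: the helper full_outer_join and the sample rows are hypothetical, rows are modelled as dicts, and null-key handling and column-name collisions are ignored.

```python
def full_outer_join(left, right, key):
    """Sketch of a full outer join of two lists of dict rows on `key`.

    Hypothetical helper for illustration. In PySpark the corresponding call
    would be df1.join(df2, on="id", how="outer").
    """
    # Non-key column names on each side, used to pad unmatched rows with None.
    left_cols = sorted({c for row in left for c in row} - {key})
    right_cols = sorted({c for row in right for c in row} - {key})

    result = []
    matched_right = set()  # indexes of right rows that found a partner
    for l_row in left:
        partners = [i for i, r_row in enumerate(right) if r_row[key] == l_row[key]]
        if partners:
            for i in partners:
                matched_right.add(i)
                result.append({**l_row, **right[i]})
        else:
            # Left row with no match: keep it, fill right-side columns with None.
            result.append({**l_row, **{c: None for c in right_cols}})

    for i, r_row in enumerate(right):
        if i not in matched_right:
            # Right row with no match: keep it, fill left-side columns with None.
            result.append({**{c: None for c in left_cols}, **r_row})
    return result


# Sample data (made up for this example).
df1_rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
df2_rows = [{"id": 2, "score": 20}, {"id": 3, "score": 30}]
joined = full_outer_join(df1_rows, df2_rows, "id")
```

Rows with id 1 and id 3 appear in only one input, yet both survive in the result with the missing side set to None. An inner join would drop them.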

Discussion

3 comments
cookiemonster42 Option: D
Jul 25, 2023

D. result_df = df1.join(df2, on="id", how="outer")

juliom6 Option: D
Nov 14, 2023

D is correct. There is no outerJoin() operation in PySpark.

4be8126 Option: C
May 3, 2023

The correct answer is C - DataFrame.outerJoin(). The outer join operation can be performed by specifying the join type as "outer" when calling the outerJoin() function on a DataFrame. The join() function in Spark only performs an inner join, while the merge() function is not a valid function in Spark SQL. The crossJoin() function performs a Cartesian product between two DataFrames, which is not an outer join.

ZSun
Jun 6, 2023

There is no outerJoin(), bro! Only DataFrame.join(how='outer').

Seeker_thunder
Nov 19, 2023

This guy always posts wrong answers, sometimes GPT's as well. Ignore his comments.

65bd33e
Apr 24, 2024

Wrong answer, check documentation