Certified Associate Developer for Apache Spark Exam - Question 93


Which of the following operations performs a cross join on two DataFrames?

Correct Answer: D

The correct option is DataFrame.crossJoin(). In Apache Spark (including PySpark), the crossJoin() method of the DataFrame class is the dedicated way to perform a cross join, that is, the Cartesian product of two DataFrames: every row of the first DataFrame is paired with every row of the second, so inputs of m and n rows produce m × n rows. It takes only the other DataFrame as an argument and no join condition. Of the other options, DataFrame.join() is the general-purpose join method, normally used with a join condition for inner, left, and similar joins (its how parameter does accept "cross", but crossJoin() is the operation the question is asking for); a bare join() is not a standalone Spark function; and DataFrame.merge() comes from the pandas API rather than from pyspark.sql.DataFrame.

Discussion

2 comments
Sowwy1 (Option: D)
Apr 1, 2024

D. DataFrame.crossJoin()

f728f7f (Option: D)
Jul 7, 2024

D - https://api-docs.databricks.com/python/pyspark/latest/pyspark.sql/api/pyspark.sql.DataFrame.crossJoin.html