Question 6 of 35
You have an Apache Spark cluster in Azure HDInsight.
You plan to join a large table and a lookup table.
You need to minimize data transfers during the join operation.
What should you do?
    Correct Answer: B
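The lettered answer choices are not reproduced here, but joining a large table against a small lookup table while minimizing data transfer is the textbook case for a broadcast (map-side) join: the small table is shipped once to every node and the join runs locally, so the large table is never shuffled across the network. A minimal plain-Python sketch of the idea (the tables and column names are invented for illustration):

```python
# Illustration of the broadcast (map-side) join idea: the small lookup
# table lives in memory on every worker, so the large table is joined
# locally and never moves across the network.

lookup = {1: "widget", 2: "gadget"}              # small lookup table (broadcast)
large_table = [(1, 9.99), (2, 4.50), (1, 3.25)]  # (product_id, price) rows

# Each worker joins its own partition against the in-memory lookup dict.
joined = [(pid, lookup[pid], price)
          for pid, price in large_table
          if pid in lookup]
print(joined)
```

In Spark itself this corresponds to broadcasting the lookup table, for example with the `broadcast()` function in `org.apache.spark.sql.functions` or by raising the `spark.sql.autoBroadcastJoinThreshold` setting.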

Question 7 of 35
DRAG DROP -
You have an Apache Hive cluster in Azure HDInsight.
You need to tune a Hive query to meet the following requirements:
✑ Use the Tez engine.
✑ Process 1,024 rows in a batch.
How should you complete the query? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all.
You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:
Exam 70-775: Question 7 - Image 1
    Correct Answer:
    Exam 70-775: Question 7 - Image 2
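The drag-and-drop image is not reproduced here, but the two stated requirements map onto well-known Hive settings. A hedged sketch of how the query is likely prefixed (these are standard Hive configuration keys; vectorized execution processes rows in batches of 1,024 by default):

```sql
-- Run the query on the Tez execution engine.
SET hive.execution.engine = tez;
-- Enable vectorized query execution, which processes 1,024 rows per batch.
SET hive.vectorized.execution.enabled = true;
```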
Question 8 of 35
You have an Apache Spark cluster in Azure HDInsight.
You execute the following command.
Exam 70-775: Question 8 - Image 1
What is the result of running the command?
    Correct Answer: C

Question 9 of 35
You use YARN to manage the resources for a Spark Thrift Server running on a Linux-based Apache Spark cluster in Azure HDInsight.
You discover that the cluster does not fully utilize its resources and want to increase the resource allocation.
You need to increase the number of executors and the allocation of memory to the Spark Thrift Server driver.
Which two parameters should you modify? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
    Correct Answer: A, C
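The lettered choices are not reproduced here, but the two requirements correspond to standard Spark configuration properties. A plausible sketch of the change (the property names are standard Spark settings; the values are illustrative, and on HDInsight they are typically applied to the Thrift Server's Spark configuration):

```
# Increase the number of executors available to the Spark Thrift Server.
spark.executor.instances  8
# Increase the memory allocated to the Spark Thrift Server driver.
spark.driver.memory       4g
```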

Question 10 of 35
DRAG DROP -
You have a text file named Data/examples/product.txt that contains product information.
You need to create a new Apache Hive table, import the product information to the table, and then read the top 100 rows of the table.
Which four code segments should you use in sequence? To answer, move the appropriate code segments from the list of code segments to the answer area and arrange them in the correct order.
Select and Place:
Exam 70-775: Question 10 - Image 1
    Correct Answer:
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
    sqlContext.sql("CREATE TABLE IF NOT EXISTS product (productid INT, productname STRING)")
    sqlContext.sql("LOAD DATA LOCAL INPATH 'Data/examples/product.txt' INTO TABLE product")
    sqlContext.sql("SELECT productid, productname FROM product LIMIT 100").collect().foreach(println)
    References: https://www.tutorialspoint.com/spark_sql/spark_sql_hive_tables.htm
    Exam 70-775: Question 10 - Image 2