DP-200 Exam - Question 1

Question

You are a data engineer implementing a lambda architecture on Microsoft Azure. You use an open-source big data solution to collect, process, and maintain data.

The analytical data store performs poorly.

You must implement a solution that meets the following requirements:

✑ Provide data warehousing

✑ Reduce ongoing management activities

✑ Deliver SQL query responses in less than one second

You need to create an HDInsight cluster to meet the requirements.

Which type of cluster should you create?

Examice · Accepted Answer

D
Lambda Architecture with Azure:
Azure offers a combination of the following technologies to accelerate real-time big data analytics:
1. Azure Cosmos DB, a globally distributed, multi-model database service.
2. Apache Spark for Azure HDInsight, a processing framework that runs large-scale data analytics applications.
3. Azure Cosmos DB change feed, which streams new data to the batch layer for HDInsight to process.
4. The Spark to Azure Cosmos DB Connector.

Note: Lambda architecture is a data-processing architecture designed to handle massive quantities of data by taking advantage of both batch processing and stream processing methods, and by minimizing the latency involved in querying big data.

References:
https://sqlwithmanoj.com/2018/02/16/what-is-lambda-architecture-and-what-azure-offers-with-its-new-cosmos-db/
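
To make the batch/stream split in the note above concrete, here is a minimal pure-Python sketch of the three lambda layers. It does not use Spark, HDInsight, or Cosmos DB; the function names, the page-view events, and the `Counter`-based views are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def build_batch_view(master_dataset):
    """Batch layer: recompute a complete view from all historical events."""
    return Counter(event["page"] for event in master_dataset)

def update_speed_view(speed_view, event):
    """Speed layer: fold one newly arrived event into the real-time view."""
    speed_view[event["page"]] += 1
    return speed_view

def query(batch_view, speed_view, page):
    """Serving layer: answer a query by combining both views."""
    return batch_view[page] + speed_view[page]

# Hypothetical page-view events (illustrative data, not from the question).
historical = [{"page": "home"}, {"page": "home"}, {"page": "docs"}]
recent = [{"page": "home"}, {"page": "pricing"}]

batch_view = build_batch_view(historical)   # slow, complete, recomputed periodically
speed_view = Counter()                      # fast, covers only events since the last batch run
for event in recent:
    update_speed_view(speed_view, event)

print(query(batch_view, speed_view, "home"))     # 2 from batch + 1 from speed = 3
print(query(batch_view, speed_view, "pricing"))  # only seen by the speed layer = 1
```

In a real deployment the batch view would be recomputed over the full data set by a batch engine and the speed view fed by a stream; the point of the sketch is only that queries read from both views, which is how the architecture keeps query latency low.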

dangal95 · Answer

Could the right answer be D because Spark has:
1) Interactive queries through spark-sql
2) Data warehousing capabilities through Delta Lake (spark-sql also creates in-memory tables)
3) Less management, because these are out-of-the-box features?

jedi01 · Answer

I think the answer should be A, Interactive Query.
Here I am implementing a lambda architecture using an open-source technology, which could be Apache Spark and is already in use. The prevailing issue is that analytical processing is very slow; in other words, queries are slow. So I would create an HDInsight cluster of type "Interactive Query" to support analytical processing, fast query access, data warehousing, etc. We can use HiveQL on Interactive Query. Refer to https://docs.microsoft.com/en-us/azure/hdinsight/interactive-query/apache-interactive-query-get-started

J_i_L_L · Answer

The exam was updated on Nov 24, 2020. I didn't see many questions from the test on ExamTopics, maybe 20-30% of the test questions. I suggest waiting a bit before taking the test so that all the exam prep questions are updated. The exam definitely requires hands-on knowledge of the products. There were a lot of questions on Cosmos DB consistency settings, encryption/security, and monitoring/metrics.

dumpsm42 · Answer

hi to all,

answer:

https://azure.microsoft.com/pt-pt/blog/general-availability-of-hdinsight-interactive-query-blazing-fast-data-warehouse-style-queries-on-hyper-scale-data-2/
Sub-second!

Summary
This week at Ignite, we are pleased to announce general availability of Azure HDInsight Interactive Query. Backed by our enterprise-grade SLA, HDInsight Interactive Query brings sub-second speed to data warehouse style SQL queries to the hyper-scale data stored in commodity cloud storage.

regards

dfrp92 · Answer

How does Spark meet the requirements? Spark does not provide data warehousing by itself; it is not a data store.

Mittun · Answer

Apache Spark is the correct answer!
https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-overview

uomer · Answer

I also vote for Interactive Query, as "An Interactive Query cluster is different from an Apache Hadoop cluster. It contains only the Hive service."
Requirements:
✑ Provide data warehousing (Yes)
✑ Reduce ongoing management activities (Not sure)
✑ Deliver SQL query responses in less than one second (Yes)

Leonido · Answer

I would suggest using the original link from MS as better background documentation: https://docs.microsoft.com/en-us/azure/cosmos-db/lambda-architecture

AAJ · Answer

https://docs.microsoft.com/en-us/azure/cosmos-db/lambda-architecture

agmadeira · Answer

A - Interactive Query  - "Deliver SQL query responses in less than one second"
https://docs.microsoft.com/en-us/azure/hdinsight/interactive-query/apache-interactive-query-get-started

r8d1 · Answer

I think the logic for answering this question is:
Lambda architecture:
https://databricks.com/glossary/lambda-architecture
Azure implementation:
https://azure.microsoft.com/en-us/services/databricks/
Azure Databricks = Fast, easy, and collaborative Apache Spark™-based analytics service

nehab0101 · Answer

https://azure.microsoft.com/en-in/blog/lambda-architecture-using-azure-cosmosdb-faster-performance-low-tco-low-devops/

sunil08 · Answer

D: Apache spark

Trivender · Answer

The correct answer is Spark because it is in-memory.

Satya217 · Answer

https://docs.microsoft.com/en-us/azure/cosmos-db/lambda-architectur

sandeep1111 · Answer

correct

Hinzzz · Answer

D is correct based on the data warehousing requirement.
