AZ-305 Exam - Question 249


You have an Azure subscription that contains an Azure Cosmos DB for NoSQL account named account1 and an Azure Synapse Analytics workspace named Workspace1. The account1 account contains a container named Contained that has the analytical store enabled.

You need to recommend a solution that will process the data stored in Contained in near-real-time (NRT) and output the results to a data warehouse in Workspace1 by using a runtime engine in the workspace. The solution must minimize data movement.

Which pool in Workspace1 should you use?

A. Apache Spark
B. Serverless SQL
C. Dedicated SQL
D. Data Explorer

Correct Answer: A

To process the data stored in the Cosmos DB for NoSQL container in near-real-time (NRT) and output the results to a data warehouse in Workspace1, use the Apache Spark pool in Azure Synapse Analytics. Apache Spark is a distributed processing framework that can handle streaming data efficiently, which matches the near-real-time requirement. It can also read directly from the Azure Cosmos DB analytical store, which minimizes data movement. Results can then be written to the data warehouse with low latency, which makes Spark the most suitable choice for the stated requirements.
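
As a rough illustration of the read path described above, the following PySpark sketch builds the reader options for the Cosmos DB analytical store and loads it through the `cosmos.olap` format. It assumes a Synapse notebook session (`spark`) and a linked service for account1; the linked service name `CosmosDbLink` and both helper function names are hypothetical and do not come from the question.

```python
def analytical_store_options(linked_service: str, container: str) -> dict:
    """Reader options for the Synapse Link 'cosmos.olap' Spark format."""
    return {
        "spark.synapse.linkedService": linked_service,
        "spark.cosmos.container": container,
    }


def read_analytical_store(spark, linked_service: str, container: str):
    """Load the container's analytical store as a DataFrame.

    `spark` is the SparkSession that a Synapse notebook provides. The data
    is read in place; no export or copy step runs first.
    """
    options = analytical_store_options(linked_service, container)
    return spark.read.format("cosmos.olap").options(**options).load()


# Assumption: "CosmosDbLink" is a linked service that points at account1.
opts = analytical_store_options("CosmosDbLink", "Contained")
```

In a Synapse notebook, `read_analytical_store(spark, "CosmosDbLink", "Contained")` would return a DataFrame over the analytical store, provided the linked service exists under that name.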

Discussion

14 comments
KeyMan (Option: B)
Jan 6, 2024

B. Serverless SQL pool
Reasoning: Serverless SQL pool in Azure Synapse Analytics is designed to handle on-demand queries against large datasets, which suits the stated NRT processing requirement.
Minimal data movement: Serverless SQL pool queries data in place without moving it into the pool, which aligns with the need to minimize data movement. It can directly query the Cosmos DB analytical store.
Integration with Cosmos DB analytical store: Serverless SQL pool has built-in integration with Azure Cosmos DB's analytical store, allowing efficient and performant processing of the data.
Apache Spark could also process the data, but it would involve more data movement compared to serverless SQL. Dedicated SQL pool requires pre-provisioned resources and wouldn't be as cost-effective for NRT scenarios. Data Explorer is not a compute pool within Azure Synapse Analytics.
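
For readers who want to see what "querying in place" looks like, here is a minimal Python sketch that assembles the kind of T-SQL `OPENROWSET` statement a serverless SQL pool accepts for a Cosmos DB analytical store. The database name `db1`, the key placeholder, and the helper name are assumptions made for illustration; the question does not state them.

```python
def openrowset_query(account: str, database: str, container: str,
                     key: str, top: int = 10) -> str:
    """Build a T-SQL query that reads a Cosmos DB analytical store in place.

    Illustrative only: the values are interpolated directly, so pass trusted
    literals, and keep real account keys out of source code.
    """
    connection = f"Account={account};Database={database};Key={key}"
    return (
        f"SELECT TOP {top} *\n"
        f"FROM OPENROWSET('CosmosDB', '{connection}', {container}) AS docs"
    )


# Assumptions: database "db1" and the key placeholder are hypothetical.
query = openrowset_query("account1", "db1", "Contained", "<account-key>")
print(query)
```

The statement only reads from the analytical store; it does not by itself write results into a data warehouse.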

deegadaze1
Jan 13, 2024

NO! When to use Azure Synapse Data Explorer? Use Data Explorer as a data platform for building near real-time log analytics and IoT analytics solutions to:
- Consolidate and correlate your logs and events data across on-premises, cloud, and third-party data sources.
- Accelerate your AI Ops journey (pattern recognition, anomaly detection, forecasting, and more).
- Replace infrastructure-based log search solutions to save cost and increase productivity.
- Build IoT analytics solutions for your IoT data.
- Build analytics SaaS solutions to offer services to your internal and external customers.
Azure Data Explorer is a fully managed, high-performance, big data analytics platform that makes it easy to analyze high volumes of data in near real time. The Azure Data Explorer toolbox gives you an end-to-end solution for data ingestion, query, visualization, and management.

deegadaze1
Jan 13, 2024

https://learn.microsoft.com/en-us/azure/data-explorer/data-explorer-overview
https://learn.microsoft.com/en-us/azure/synapse-analytics/data-explorer/data-explorer-overview

Fidel_104
Mar 7, 2024

Further supporting B (serverless SQL pool): the Azure Synapse Link guide for Cosmos DB also recommends serverless pools for real-time operational reporting use cases. Source: https://learn.microsoft.com/en-us/azure/cosmos-db/synapse-link-use-cases

masetromain (Option: A)
Mar 28, 2024

Apache Spark is a distributed processing framework that can handle near-real-time processing and is well-integrated with Azure Synapse Analytics. It can directly access data stored in Azure Cosmos DB analytical store without needing to move the data around. This minimizes data movement and provides efficient processing capabilities. So, the correct answer is: A. Apache Spark

MohsenSic (Option: A)
Mar 24, 2024

I go with A, for two reasons: Synapse has Apache Spark, and Data Explorer is mainly for logs. Refer to the flowchart at the bottom of this link: https://learn.microsoft.com/en-us/azure/data-explorer/data-explorer-overview

TarasSheva (Option: A)
May 13, 2024

A. Apache Spark
Near-Real-Time (NRT) Processing: Apache Spark provides capabilities for real-time stream processing, which aligns with the requirement for near-real-time processing.
Integration with Azure Cosmos DB: Apache Spark has built-in connectors and libraries for integrating with Azure Cosmos DB, allowing for seamless data ingestion and processing without significant data movement.
Output to Data Warehouse: Apache Spark can easily output processed data to various destinations, including data warehouses like Azure Synapse Analytics. It can write directly into dedicated SQL pools or serverless SQL pools within the Synapse workspace.
Minimizing Data Movement: Since Apache Spark can directly access data in Azure Cosmos DB and write results to the data warehouse within the same Azure environment, it minimizes data movement, thus optimizing performance and reducing costs.
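
The output step mentioned above can be sketched roughly as follows for a dedicated SQL pool, using the `synapsesql` writer that the Synapse dedicated SQL pool connector adds to Spark DataFrames. This is an assumption-laden sketch: the pool name `sqlpool1`, the table `dbo.ContainedResults`, and the helper names are hypothetical, and the connector import only resolves inside a Synapse Spark pool.

```python
def three_part_name(database: str, schema: str, table: str) -> str:
    """Dedicated SQL pool tables are addressed as <database>.<schema>.<table>."""
    return f"{database}.{schema}.{table}"


def write_to_dedicated_pool(df, database: str, schema: str, table: str,
                            mode: str = "append") -> None:
    """Write a Spark DataFrame to a dedicated SQL pool table.

    Only callable on a Synapse Spark pool: the import below registers
    `synapsesql` on the DataFrame writer and is not available elsewhere.
    Connector defaults for server and staging folder are assumed.
    """
    import com.microsoft.spark.sqlanalytics  # noqa: F401  (Synapse runtime only)

    df.write.mode(mode).synapsesql(three_part_name(database, schema, table))


# Hypothetical target: pool "sqlpool1", table "dbo.ContainedResults".
target = three_part_name("sqlpool1", "dbo", "ContainedResults")
```

Combined with a read from the analytical store, the whole pipeline stays inside Workspace1, which is the sense in which the comment speaks of minimal data movement.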

Appon (Option: D)
Feb 23, 2024

because of "near-real-time"

azureworm (Option: A)
Mar 25, 2024

A is the correct answer https://learn.microsoft.com/en-us/azure/cosmos-db/synapse-link-use-cases

LGWJ12 (Option: A)
Apr 4, 2024

A: Apache Spark. In Azure Synapse Analytics, it is an analytics engine that facilitates large-scale data processing. It can read data from Cosmos DB in near-real-time, process it, and then output the results to a data warehouse in the same Azure Synapse Analytics workspace. This minimizes data movement because the data processing and storage happen within the same service (Azure Synapse Analytics).

23169fd (Option: A)
Jun 26, 2024

Apache Spark: Spark pools in Azure Synapse Analytics provide a distributed data processing framework capable of processing large volumes of data in near-real-time. Spark is highly efficient in handling streaming data and can directly read from Azure Cosmos DB's analytical store with minimal data movement, making it an ideal choice for near-real-time processing.
Serverless SQL and Dedicated SQL: While these can be used for querying and processing data, they are not as optimized for near-real-time processing as Apache Spark. Additionally, they typically involve more data movement compared to Spark's direct processing capabilities.
Data Explorer: This is typically used for fast ad-hoc data exploration and querying, particularly for log and telemetry data, rather than for continuous near-real-time data processing and transformation.

Frank_2022 (Option: C)
Mar 5, 2024

I recommend using a dedicated SQL pool.
Near-real-time processing: Dedicated SQL pools are specifically designed for low-latency analytical workloads, making them ideal for processing data in near-real-time.
Data minimization: Dedicated SQL pools are integrated with Workspace1, allowing for seamless data movement between the Cosmos DB analytical store and the data warehouse within the same workspace. This minimizes data movement and avoids the need for external data transfer processes.
Runtime engine: Dedicated SQL pools provide a T-SQL compatible query engine that can be used to interact with data stored in the data warehouse. This allows you to leverage familiar SQL syntax for data transformation and analysis.

rumino (Option: D)
Mar 8, 2024

Azure Data Explorer is a fully managed, high-performance, big data analytics platform that makes it easy to analyze high volumes of data in near real time. The Azure Data Explorer toolbox gives you an end-to-end solution for data ingestion, query, visualization, and management.
https://learn.microsoft.com/en-us/azure/data-explorer/data-explorer-overview
https://learn.microsoft.com/en-us/azure/synapse-analytics/data-explorer/data-explorer-overview

Frank_2022
Mar 10, 2024

Data Explorer is a powerful tool for querying data in a Synapse workspace, but I believe it's not designed for real-time data processing.

Felas
Mar 23, 2024

Azure Data Explorer is a fast, fully managed data analytics service for analyzing large volumes of streaming data from applications, websites, IoT devices, etc. in real time. https://azure.microsoft.com/es-es/products/data-explorer

Frank_2022 (Option: C)
Mar 13, 2024

Dedicated SQL pools are specifically designed for low-latency analytical workloads, making them ideal for processing data in near-real-time.

varinder82 (Option: D)
Mar 17, 2024

Final Answer : D

varinder82 (Option: D)
Apr 7, 2024

Final Answer : D

moadabdou (Option: A)
Jul 14, 2024

For processing data stored in the 'Contained' container of Cosmos DB in near-real-time (NRT) and outputting results to a data warehouse in Workspace1, leveraging an Apache Spark pool within Azure Synapse Analytics is highly recommended. This approach is particularly effective due to Apache Spark's robust in-memory processing capabilities, which can handle large volumes of data swiftly. Additionally, by utilizing Azure Synapse Link for seamless integration with Cosmos DB's analytical store, this solution ensures minimal data movement. This not only enhances performance by enabling direct real-time data access but also optimizes resource utilization and reduces latency, making it an ideal setup for real-time data analytics.