What should you recommend as a batch processing solution for Health Interface?
Azure Data Factory is an ideal solution for batch processing, particularly when dealing with a variety of data sources and formats. It can ingest, prepare, and transform data on a large scale. Azure Data Factory integrates with the Azure Cosmos DB bulk executor library to offer high performance when writing to Azure Cosmos DB. This makes it well-suited to support a scalable batch processing solution, aligning with the requirements to efficiently add data from new hospitals.
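For context, the bulk write that ADF's Cosmos DB sink performs can be approximated in plain code. Below is a minimal, purely illustrative sketch using the azure-cosmos Python SDK; the account endpoint, key, database, and container names are placeholders, and ADF itself does this at scale internally via the bulk executor library rather than item by item.

```python
# Illustrative only: upserting a batch of hospital messages into Cosmos DB
# with the azure-cosmos Python SDK. ADF's Cosmos DB sink performs the
# equivalent at scale via the bulk executor library. All names are placeholders.
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<account-key>")
container = client.get_database_client("healthinterface").get_container_client("messages")

batch = [
    {"id": "msg-001", "patientId": "p-42", "status": "admitted"},
    {"id": "msg-002", "patientId": "p-43", "status": "discharged"},
]

for item in batch:
    # upsert_item inserts the document, or replaces it if the id already exists
    container.upsert_item(item)
```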
Why would batch processing use Azure Stream Analytics and not Azure Databricks? This seems like the wrong answer; it should be D.
It's actually ADF per their own explanation; they marked it wrong. Databricks would also do, I guess; there's little that ADF can do that Databricks can't, if anything.
OK, ADF can use Copy Data from an on-premises source; Spark, which is what ADF Data Flows and Databricks run on, can't do that.
Technology choices for batch processing are: 1. Azure Synapse Analytics, 2. Azure HDInsight, 3. Azure Data Lake Analytics, 4. Azure Databricks. https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing
ADF has Data Flows, so why isn't ADF listed as a batch processing option? Secondly, changing the units will scale ADF as well. Sending data from on-premises can't be done via Databricks; Databricks can act on the data once it is in Azure. ADF seems to be the option.
Most of the questions and discussions in DP-201 are so confusing. It's hard to tell which answer is correct unless you have subject knowledge.
ADF should be used for batch processing. The answer should be C.
If you check this link, https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing, ADF is not an option; the answer is D (Azure Databricks).
Should be D.
Stream Analytics is a streaming solution, not a batch processing solution. Data Factory is an orchestration solution with data copy capabilities. I have no idea what the Azure Cycle option is. So Databricks is the only option here that qualifies as a batch processing solution.
It requires "Support a more scalable batch processing solution in Azure", so Databricks is the only autoscaling option. (https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing)
The comment below the 'answer' suggests the answer should be ADF, not the highlighted answer B. But ADF is not really a batch processing solution, per the MS docs (as Luke97 clearly references).
"Reduce the amount of time it takes to add data" = Real-Time that means the answer is Azure Stream Analytics, so the answer is correct B
I don't think 'reduce the time it takes to add data' means make it real time... it just means speed it up! (The data load was getting slower, per the case study.) I think the answer should be Databricks.
The more reactions I read, the more confused I get. My two cents: in this case, the hospitals send the data in batches. That means not message by message, but as a file containing several messages or records. Most of the discussion here is about "batch processing", which is a different story to do with analysing big data stored in files. To me, batch processing is not the correct context for this case. What we need is to ingest files coming from the hospitals from time to time. Azure Data Factory seems right to me. The answer's comment also seems to point to this solution, so the answer itself might be a typo.
I also agree that it should be Databricks
If the input is Cosmos DB, it should be Data Factory, as Azure Stream Analytics only supports Event Hubs, IoT Hub, and Blob Storage as inputs. The provided explanation also cites a link, https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-cosmos-db, that uses Data Factory to connect to Cosmos DB.
ADF vs. Databricks: which would be ideal? As it's batch, I feel it should be Databricks.
With ADF you can add a notebook from ADB. With ADF you can do batch processing, and moreover both ADF Data Flows and ADB run on an underlying Apache Spark architecture. Performance-wise both are almost the same; the requirement can be achieved by either, but ADF involves less coding compared to ADB.
Azure Databricks
They mentioned that the Health Interface application receives data in batches (groups of messages sent as a batch from the existing C# application). If ADF is the answer, how is the solution expected to receive data (an HTTP source? JSON files on Blob Storage?) with varying schemas and perform bulk inserts into Cosmos DB? It has to be ADB receiving the messages/batches on a stream and ingesting them into Cosmos DB.
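For what it's worth, a minimal sketch of that Databricks ingestion, assuming JSON message batches landing in Blob Storage and the Azure Cosmos DB Spark connector installed on the cluster; the endpoint, key, database, container, and storage paths are all hypothetical:

```python
# Databricks (PySpark) sketch: ingest a batch of hospital message files into
# Cosmos DB. Assumes the azure-cosmos-spark connector is on the cluster;
# account and path names below are placeholders, not from the case study.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

cosmos_cfg = {
    "spark.cosmos.accountEndpoint": "https://<account>.documents.azure.com:443/",
    "spark.cosmos.accountKey": "<account-key>",
    "spark.cosmos.database": "healthinterface",
    "spark.cosmos.container": "messages",
}

# Read the JSON message files a hospital dropped into storage as one batch.
batch_df = spark.read.json("wasbs://batches@<storage>.blob.core.windows.net/hospital-a/")

# Bulk-write the whole batch to Cosmos DB through the Spark connector.
(batch_df.write
    .format("cosmos.oltp")
    .options(**cosmos_cfg)
    .mode("append")
    .save())
```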
The answer should be D: Databricks, purely because of the scalability factor. ADF can be used, but Databricks is better when it comes to scaling.
ADF can call a Databricks notebook in its pipeline.
Which product would provide the best performance?
Don't go by the word "batch". Read this: "Health Interface - ADatum has a critical application named Health Interface that receives hospital messages related to patient care and status updates." So Stream Analytics seems to be correct.
Correct Answer: B
Explanation/Reference:
Scenario: ADatum identifies the following requirements for the Health Interface application:
- Support a more scalable batch processing solution in Azure.
- Reduce the amount of time it takes to add data from new hospitals to Health Interface.
Data Factory integrates with the Azure Cosmos DB bulk executor library to provide the best performance when you write to Azure Cosmos DB.
Reference: https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-cosmos-db
Why is nobody choosing Azure Stream Analytics, given that the input to the processing solution is the messages generated by the website?
Databricks doesn't support C#; Stream Analytics is correct.
The C# application is deprecated and will be removed.
Clearly Azure Databricks: https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing
ADF makes more sense here, as the requirement is to load the data from the branches into the target DB (most likely Cosmos DB). Databricks is more for big data analytics processing.
According to the given info, Cosmos DB is the appropriate storage solution for Health Interface. It's been stated that the messages are sent in batches; in that case, Stream Analytics is the best bet here, as it can stream messages directly to a Cosmos DB sink. The given answer is right.
Databricks is more for big data analytics. In this case batch processing is needed to load data into Cosmos DB. So ADF makes more sense.
I think the answer should be ADF. Even though it is not a batch processing solution per se, if you have a look at the documentation link, it also refers to ADF. However, Databricks would also sound plausible here in my opinion, since it is the only "real" designated batch processing solution. I would stick with ADF but also think Databricks would be plausible. Azure Stream Analytics just does not make any sense at all here.
The question says "more scalable batch processing", so if you refer to the link, only Azure Databricks from that list is scalable. So this should be the answer: https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing
"Minimize the number of services required to perform data processing, development, scheduling, monitoring, and the operationalizing of pipelines." I would pick Data Factory as the answer
Disregard this; Databricks for batch processing
It shows B as the answer, but the description underneath implies C, since it talks about Data Factory and Cosmos DB. Data Factory is scalable.
Can I use ADF alone as the solution for both Health Insights and Health Interface?
Per https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing, it seems to be Databricks.
Not sure if Databricks can access an on-premises data source. If yes, then no question: D. If not, then you have to use an ADF Copy Data activity to copy from on-premises to staging. But as different hospitals have different data formats, you then have to transform them into a common format. ADF can use a Mapping Data Flow or call a Databricks notebook to do that (but only on staged data already in Azure). Data Flows unfortunately are not autoscaling; you have to redefine how many cores you want to use. So I would call a Databricks notebook from ADF after the Copy Data step. The closest answer seems to be C: ADF.
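As a rough illustration of that notebook step, here is a sketch of normalizing differently-shaped hospital files into one common schema after the ADF copy; the storage paths, hospital names, and column mappings are made up for the example:

```python
# Databricks notebook sketch: normalize staged hospital files (each with its
# own schema) into a common shape before loading. All names are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Per-hospital mapping from source column names to the common schema.
COLUMN_MAP = {
    "hospital-a": {"pat_id": "patient_id", "msg": "message", "ts": "event_time"},
    "hospital-b": {"PatientId": "patient_id", "Body": "message", "Time": "event_time"},
}

frames = []
for hospital, mapping in COLUMN_MAP.items():
    df = spark.read.json(f"wasbs://staging@<storage>.blob.core.windows.net/{hospital}/")
    for source_col, target_col in mapping.items():
        df = df.withColumnRenamed(source_col, target_col)
    frames.append(
        df.select("patient_id", "message", "event_time")
          .withColumn("hospital", F.lit(hospital))
    )

# Union everything into a single, consistently-shaped DataFrame.
combined = frames[0]
for df in frames[1:]:
    combined = combined.unionByName(df)

combined.write.mode("overwrite").parquet(
    "wasbs://staging@<storage>.blob.core.windows.net/common/"
)
```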
D. Azure Databricks
I would choose ADF. https://devblogs.microsoft.com/cosmosdb/migrating-relational-data-into-cosmos-db-using-azure-data-factory-and-azure-databricks/
For batch processing, it's Databricks.