
DP-201 Exam - Question 206


What should you recommend as a batch processing solution for Health Interface?

Correct Answer: C

Azure Data Factory is an ideal solution for batch processing, particularly when dealing with a variety of data sources and formats. It can ingest, prepare, and transform data on a large scale. Azure Data Factory integrates with the Azure Cosmos DB bulk executor library to offer high performance when writing to Azure Cosmos DB. This makes it well-suited to support a scalable batch processing solution, aligning with the requirements to efficiently add data from new hospitals.
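As a rough illustration of the bulk-executor write path this explanation describes, an ADF copy activity with a Cosmos DB (SQL API) sink might look like the sketch below. This is a minimal, hedged example: the activity and dataset names are hypothetical, and values like the batch size would depend on the actual workload.

```json
{
  "name": "CopyHospitalBatchToCosmos",
  "type": "Copy",
  "inputs": [
    { "referenceName": "HospitalBatchDataset", "type": "DatasetReference" }
  ],
  "outputs": [
    { "referenceName": "HealthInterfaceCosmosDataset", "type": "DatasetReference" }
  ],
  "typeProperties": {
    "source": { "type": "JsonSource" },
    "sink": {
      "type": "CosmosDbSqlApiSink",
      "writeBehavior": "upsert",
      "writeBatchSize": 10000
    }
  }
}
```

The `CosmosDbSqlApiSink` is the sink type that uses the bulk executor library under the hood; `writeBehavior` controls insert vs. upsert semantics when loading batches.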

Discussion

36 comments
Amitkhanna
Mar 16, 2020

Why recommend Azure Stream Analytics for batch processing and not Azure Databricks? This seems like a wrong answer; it should be D.

maciejt
Apr 13, 2021

It's actually ADF per their own explanation; they marked it wrong. Databricks would also do, I guess; there's little that ADF can do that Databricks can't, if anything.

maciejt
Apr 13, 2021

OK, ADF can copy data from an on-premises source, which Spark (the engine used by ADF data flows and by Databricks) can't do.

Luke97
Apr 9, 2020

Technology choices for batch processing are 1. Azure Synapse Analytics 2. Azure HDInsight 3. Azure Data Lake Analytics 4. Azure Databricks https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing

rajneesharora
Mar 7, 2021

ADF has Data Flows, so why is ADF not listed as a batch processing option? Secondly, changing the units will scale ADF as well... Sending data from on-premises can't be done via Databricks; Databricks can act on it once the data is in Azure. ADF seems to be the option.

krisspark
Aug 2, 2020

Most of the questions and discussions in DP-201 are so confusing... it's hard to tell which answer is correct without subject knowledge.

bansal_vikrant
Apr 13, 2020

ADF should be used for batch processing. The answer should be C.

Kashan_Ali
Sep 12, 2020

If you check this link, https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing, ADF is not an option; the answer is D (Azure Databricks).

Shailbakshi
Mar 18, 2020

should be D

Leonido
May 2, 2020

Stream Analytics is a streaming solution, not a batch processing solution. Data Factory is an orchestration solution with data-copy capabilities. I have no idea what the Azure Cycle thingy is. So Databricks is the only option here that qualifies as a batch processing solution.

Luke97
Apr 30, 2020

It requires "Support a more scalable batch processing solution in Azure", so Databricks is the only autoscaling option. (https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing)

runningman
May 1, 2020

The comment below the 'answer' suggests the answer should be ADF, not the highlighted answer 'B'. But ADF is not really a batch processing solution, per the MS docs (as Luke97 clearly references).

Loai
Apr 24, 2020

"Reduce the amount of time it takes to add data" = Real-Time that means the answer is Azure Stream Analytics, so the answer is correct B

runningman
May 1, 2020

I don't think 'reduce the time it takes to add data' means make it real time... it just means speed it up! (The data load was getting slower, per the case study.) I think the answer should be Databricks.

mohowzeh
Jan 14, 2021

The more reactions I read, the more confused I get. My 2 cents: in this case, the hospitals send the data in batches. This means not message-by-message, but a file containing several messages or records. Most of the discussion here looks at "batch processing", which is a different story, to do with analysing big data stored in files. To me, batch processing is not the correct context for this case. What we need is to ingest files coming from the hospitals from time to time. Azure Data Factory seems right to me. The answer's comment also seems to point to this solution, so the answer itself might be a typo.

Carmina
May 12, 2020

I also agree that it should be Databricks

envy
Jul 13, 2020

If the input is Cosmos DB, it should be Data Factory, as Azure Stream Analytics only supports Event Hubs, IoT Hub, and Blob storage as inputs. And the provided explanation also mentions a link, https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-cosmos-db, that uses Data Factory to connect to Cosmos DB.

Porus
Aug 26, 2020

ADF vs. Databricks, which would be ideal? As it's batch, I feel it should be Databricks.

avix
Sep 5, 2020

With ADF you can add a notebook from ADB. With ADF you can do batch processing, and moreover both ADF and ADB have the underlying architecture of Apache Spark. Performance-wise both are almost the same... the requirement can be achieved by both, but ADF needs less coding compared to ADB.

groy
Oct 1, 2020

Azure Databricks

Shrikant_Kulkarni
Nov 2, 2020

They mentioned the Health Interface application receives data in batches (groups of messages sent as a batch from the existing C# application). If ADF is the answer, how is the solution expected to receive data (an HTTP source? JSON files on Blob storage?) with varying schemas and perform bulk inserts into Cosmos DB? It has to be ADB receiving messages/batches on a stream and ingesting them into Cosmos DB.

sandGrain
Nov 5, 2020

The answer should be D: Databricks. Purely because of the scalability factor. ADF can be used, but Databricks is better when it comes to scaling.

maciejt
Apr 13, 2021

ADF can call a Databricks notebook in its pipeline

Johnnien
Dec 27, 2020

Which product would provide the best performance?

AmolRajmane
Feb 18, 2021

Don't go by the word "batch". Read this: Health Interface - ADatum has a critical application named Health Interface that receives hospital messages related to patient care and status updates. So Stream Analytics seems to be correct.

Qrm_1972
May 24, 2021

Correct Answer: B
Explanation/Reference:
Scenario: ADatum identifies the following requirements for the Health Interface application:
- Support a more scalable batch processing solution in Azure.
- Reduce the amount of time it takes to add data from new hospitals to Health Interface.
Data Factory integrates with the Azure Cosmos DB bulk executor library to provide the best performance when you write to Azure Cosmos DB.
Reference: https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-cosmos-db

PowerBIRangerGuru
May 26, 2021

Why is nobody choosing Azure Stream Analytics, given that the input of the processing solution is messages generated by the website?

remz
Jun 8, 2020

Databricks doesn't support C#; Stream Analytics is correct.

MLCL
Jul 4, 2020

The C# application is deprecated and will be removed.

hecaci8196
Jun 13, 2020

Clearly Azure Databricks: https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing

SidN
Jun 17, 2020

ADF makes more sense here, as the requirement is to load the data from branches into the target DB (most likely Cosmos DB). Databricks is more for big-data analytics processing.

Abhilvs
Jun 22, 2020

According to the given info, Cosmos DB is the appropriate storage solution for the Health Interface. It's been stated that the messages are sent in batches; in that case, Stream Analytics is the best bet here, as it can stream messages directly to a Cosmos DB sink. The given answer is right.

SidN
Jun 24, 2020

Databricks is more for big data analytics. In this case batch processing is needed to load data into Cosmos DB. So ADF makes more sense.

MLurgi
Aug 6, 2020

I think the answer should be ADF. Even though it is not a batch processing solution per se, if you have a look at the documentation link, it also refers to ADF. However, Databricks would also sound plausible here in my opinion, since it is the only "real" designated batch processing solution. I would stick with ADF but also think Databricks would be plausible. Azure Stream Analytics just does not make any sense at all here.

Vj57
Sep 14, 2020

The question says "more scalable batch processing", so if you refer to the link, only Azure Databricks from the list is scalable. So this should be the answer: https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing

syu31svc
Dec 8, 2020

"Minimize the number of services required to perform data processing, development, scheduling, monitoring, and the operationalizing of pipelines." I would pick Data Factory as the answer

syu31svc
Dec 14, 2020

Disregard this; Databricks for batch processing

BungyTex
Dec 10, 2020

It has B showing as the answer, but the description underneath implies C, where it talks about Data Factory and Cosmos DB. Data Factory is scalable.

Johnnien
Dec 27, 2020

Can I use ADF alone as the solution for both Health Insights and Health Interface?

AlexD332
Mar 10, 2021

https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing it seems Databricks

maciejt
Apr 13, 2021

Not sure if Databricks can access an on-prem data source. If yes, then no question, D. If not, then you have to use an ADF Copy Data activity to copy from on-prem to staging. But as different hospitals have different data formats, you then have to transform it to a common format. ADF can use a mapping data flow or call a Databricks notebook to do that (but only on data already staged in Azure). The data flow unfortunately is not auto-scalable; you have to redefine how many cores you want to use, so I would call a Databricks notebook from ADF after the Copy Data activity. Closest answer seems to be C - ADF.
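The copy-then-transform pattern described above could be sketched as an ADF pipeline with a Copy activity followed by a Databricks Notebook activity. This is only an illustration: the pipeline, activity, dataset, linked-service, and notebook names are all hypothetical, and the real source/sink types would depend on each hospital's system.

```json
{
  "name": "HospitalBatchIngestPipeline",
  "properties": {
    "activities": [
      {
        "name": "CopyFromOnPremToStaging",
        "type": "Copy",
        "typeProperties": {
          "source": { "type": "SqlServerSource" },
          "sink": { "type": "AzureBlobFSSink" }
        }
      },
      {
        "name": "NormalizeWithDatabricks",
        "type": "DatabricksNotebook",
        "dependsOn": [
          {
            "activity": "CopyFromOnPremToStaging",
            "dependencyConditions": [ "Succeeded" ]
          }
        ],
        "linkedServiceName": {
          "referenceName": "AzureDatabricksLinkedService",
          "type": "LinkedServiceReference"
        },
        "typeProperties": {
          "notebookPath": "/Shared/normalize-hospital-batch"
        }
      }
    ]
  }
}
```

The `dependsOn` block with `"Succeeded"` is what chains the notebook transform after the on-prem copy, so the Databricks step only ever runs against data already staged in Azure.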

davita8
Apr 30, 2021

D. Azure Databricks

dbdev
May 16, 2021

I would choose ADF. https://devblogs.microsoft.com/cosmosdb/migrating-relational-data-into-cosmos-db-using-azure-data-factory-and-azure-databricks/

massnonn
Nov 16, 2021

For batch processing it's Databricks.