
DP-201 Exam - Question 94


HOTSPOT -

You design data engineering solutions for a company.

You must integrate on-premises SQL Server data into an Azure solution that performs Extract-Transform-Load (ETL) operations. The solution has the following requirements:

✑ Develop a pipeline that can integrate data and run notebooks.

✑ Develop notebooks to transform the data.

✑ Load the data into a massively parallel processing database for later analysis.

You need to recommend a solution.

What should you recommend? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Hot Area:

[Answer area image]

Correct Answer:

[Correct answer image]

Discussion

11 comments
Needium
Mar 17, 2021

I would rather have:
Integrate on-premises data to cloud: ADF
Develop notebooks to transform data: Databricks
Run notebooks: ADF (Azure Databricks notebooks can be run within an ADF pipeline)
Load the data: use ADF to load the data
Store the transformed data: Azure Synapse Analytics
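
A minimal sketch of that flow using the azure-mgmt-datafactory Python SDK: an ADF pipeline that runs a Databricks notebook and then loads the result into Synapse (SQL DW) with PolyBase. The initial on-premises copy (through a self-hosted integration runtime) is omitted for brevity, and every resource, linked-service, dataset, and path name below is a placeholder assumption, not part of the question.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    ActivityDependency, CopyActivity, DatabricksNotebookActivity,
    DatasetReference, LinkedServiceReference, PipelineResource,
    ParquetSource, SqlDWSink,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Run the transformation notebook on an existing Databricks linked service.
transform = DatabricksNotebookActivity(
    name="TransformData",
    notebook_path="/etl/transform",  # hypothetical notebook path
    linked_service_name=LinkedServiceReference(
        type="LinkedServiceReference", reference_name="DatabricksLS"),
)

# Then load the transformed output into Azure Synapse (SQL DW) via PolyBase.
load = CopyActivity(
    name="LoadToSynapse",
    inputs=[DatasetReference(type="DatasetReference", reference_name="TransformedData")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="SynapseTable")],
    source=ParquetSource(),
    sink=SqlDWSink(allow_poly_base=True),
    depends_on=[ActivityDependency(activity="TransformData",
                                   dependency_conditions=["Succeeded"])],
)

client.pipelines.create_or_update(
    "<resource-group>", "<factory>", "EtlPipeline",
    PipelineResource(activities=[transform, load]))
```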

maciejt
Apr 9, 2021

That was exactly my take before seeing the solution.

cadio30
May 26, 2021

Azure Databricks can handle loading the data from the notebook into Azure Synapse external tables. Unless the requirement is explicitly to export the file to another storage, in which case ADF is the appropriate choice.
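
For reference, this is roughly what that looks like inside a Databricks notebook using the documented Azure Synapse connector (com.databricks.spark.sqldw); the server, storage account, source table, and target table names are placeholder assumptions:

```python
# Hypothetical result of the transform step earlier in the notebook.
transformed_df = spark.table("staged_sales")

# Write straight from Spark into a Synapse (SQL DW) table; the connector
# stages data in Blob storage and loads it with PolyBase under the hood.
(transformed_df.write
    .format("com.databricks.spark.sqldw")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<dw>")
    .option("forwardSparkAzureStorageCredentials", "true")
    .option("dbTable", "dbo.SalesFacts")
    .option("tempDir", "wasbs://tmp@<account>.blob.core.windows.net/stage")
    .mode("overwrite")
    .save())
```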

Wendy_DK
May 14, 2021

The given answer is right. Remember the requirement: load the data into a massively parallel processing database for later analysis. ADF and Batch can work together. Ref: https://docs.microsoft.com/en-us/azure/data-factory/v1/data-factory-data-processing-using-batch
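
For anyone unfamiliar with that pairing: in ADF, an Azure Batch pool is attached as a linked service and driven by a Custom activity. A minimal sketch with the Python SDK; the script and linked-service names are placeholder assumptions:

```python
from azure.mgmt.datafactory.models import CustomActivity, LinkedServiceReference

# A Custom activity whose command runs on nodes of an Azure Batch pool
# referenced by the "AzureBatchLS" linked service (hypothetical name).
load_step = CustomActivity(
    name="LoadWithBatch",
    command="python load_to_dw.py",  # hypothetical loader script
    linked_service_name=LinkedServiceReference(
        type="LinkedServiceReference", reference_name="AzureBatchLS"),
)
```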

BobFar
Jun 6, 2021

I agree with you.

VG2007
May 3, 2021

The given solution is correct, no confusion. Why would anyone use ADB to develop notebooks and then use ADF to run them, unless that is specifically required?

Larrave
Nov 27, 2021

Because they were asking for a data engineering solution, and having everything handled within one orchestration/ETL tool definitely makes sense.

H_S
Mar 15, 2021

Azure Data Factory could be used to load the data too.

Geo_Barros
Mar 15, 2021

Regarding loading the data, I think Azure Data Factory could also be an appropriate answer.

davita8
Apr 29, 2021

Load the data: Azure Data Factory. Store the transformed data: Azure SQL Data Warehouse.

Ous01
May 27, 2021

Why not use Databricks to load the data? When the notebook finishes processing, it can also load the data into Synapse. Databricks can easily upload results to Synapse, Azure SQL, and Azure Cosmos DB.
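
As a sketch of that last point, the same notebook can push its result onward over plain JDBC, e.g. to Azure SQL Database; connection details and table names are placeholder assumptions:

```python
# transformed_df is the DataFrame produced by the transform step above.
(transformed_df.write
    .format("jdbc")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
    .option("dbtable", "dbo.Results")
    .option("user", "<user>")
    .option("password", "<password>")
    .mode("append")
    .save())
```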

aditya_064
Apr 25, 2021

Shouldn't Load the data (Box 4) be Azure Synapse Analytics? It's the only option with an MPP engine, which is exactly what is mentioned in the question.

Bhagya123456
Aug 19, 2021

The given solution is 100% correct. Do not confuse people with absurd arguments. I could do all the activities through Synapse Analytics too; that doesn't mean I will choose Synapse Analytics five times.

maciejt
Apr 9, 2021

Why is Azure Batch better than ADF to load the data? ADF could be used to integrate from on-premises to Azure, invoke the notebook (developed in Databricks), then load the data into the warehouse, all within one pipeline.

BobFar
Jun 6, 2021

I guess for loading the data into a massively parallel processing database, Azure Batch is the better solution. https://docs.microsoft.com/en-us/azure/data-factory/v1/data-factory-data-processing-using-batch

Anonymous
Jun 27, 2021

Just one change: running the notebook is better done from ADF, as we can orchestrate the sequence better. When run from Databricks, it may not know the timing of the data retrieval or the next step; also, Azure Batch cannot be called from ADB.