DP-203 Exam - Question 201


Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:

✑ A workload for data engineers who will use Python and SQL.

✑ A workload for jobs that will run notebooks that use Python, Scala, and SQL.

✑ A workload that data scientists will use to perform ad hoc analysis in Scala and R.

The enterprise architecture team at your company identifies the following standards for Databricks environments:

✑ The data engineers must share a cluster.

✑ The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.

✑ All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.

You need to create the Databricks clusters for the workloads.

Solution: You create a Standard cluster for each data scientist, a High Concurrency cluster for the data engineers, and a Standard cluster for the jobs.

Does this meet the goal?

Correct Answer: A

To meet the requirements, the data engineers need a High Concurrency cluster so that they can share resources efficiently. Each data scientist needs an individual Standard cluster with auto-termination to support ad hoc analysis. For the job workload, a Standard cluster is appropriate because it can run notebooks written in Python, Scala, and SQL, which covers every language the jobs use. Therefore, creating a Standard cluster for each data scientist, a High Concurrency cluster for the data engineers, and a Standard cluster for the jobs meets the stated goals.
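The language reasoning above can be checked mechanically. The following is a minimal sketch, not Databricks code: it assumes the legacy language-support matrix quoted from the documentation in the discussion (Standard: Python, SQL, R, Scala; High Concurrency: SQL, Python, R). The names `SUPPORTED_LANGUAGES`, `Workload`, `unsupported_languages`, and `meets_goal` are hypothetical helpers introduced for illustration, and the check covers only language support, not cluster sharing or auto-termination.

```python
from dataclasses import dataclass

# Assumption: legacy cluster-mode language support as quoted in the discussion.
SUPPORTED_LANGUAGES = {
    "Standard": {"Python", "SQL", "R", "Scala"},
    "High Concurrency": {"Python", "SQL", "R"},
}


@dataclass(frozen=True)
class Workload:
    """Hypothetical model of one workload from the scenario."""
    name: str
    languages: frozenset


# The three workloads and their languages, taken from the question text.
WORKLOADS = [
    Workload("data engineers", frozenset({"Python", "SQL"})),
    Workload("jobs", frozenset({"Python", "Scala", "SQL"})),
    Workload("data scientists", frozenset({"Scala", "R"})),
]


def unsupported_languages(workload, mode):
    """Return the workload's languages that the given cluster mode cannot run."""
    return workload.languages - SUPPORTED_LANGUAGES[mode]


def meets_goal(assignment):
    """True if every workload's languages are supported by its assigned mode."""
    return all(
        not unsupported_languages(w, assignment[w.name]) for w in WORKLOADS
    )


# The solution proposed in this question.
proposed = {
    "data engineers": "High Concurrency",
    "jobs": "Standard",
    "data scientists": "Standard",
}

print(meets_goal(proposed))
print(sorted(unsupported_languages(WORKLOADS[1], "High Concurrency")))
```

Under the assumed matrix, moving the jobs workload to High Concurrency would leave Scala unsupported, which is the argument most commenters make for answer A.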

Discussion

17 comments
Amalbenrebai (Option: B)
Aug 30, 2021

- data engineers: High Concurrency cluster
- jobs: Standard cluster
- data scientists: Standard cluster

Julius7000
Sep 17, 2021

Tell me one thing: is this answer (jobs) based on the text: "A Single Node cluster has no workers and runs Spark jobs on the driver node. In contrast, a Standard cluster requires at least one Spark worker node in addition to the driver node to execute Spark jobs."? I don't understand the connection between worker nodes and the requirements given in the question about the jobs workload.

Aditya0891
Jun 7, 2022

Single Node clusters and Standard clusters are different. A Single Node cluster has only one node, which acts as both driver and worker, while a Standard cluster can have separate driver and worker nodes. For jobs you can use either a Standard or a High Concurrency cluster, so the requirements are satisfied here.

Egocentric
Apr 17, 2022

agreed

supriyako
Sep 19, 2022

Correct, because the jobs could run Scala notebooks, which Standard cluster mode supports.

gogosgh
May 5, 2023

The issue is that the jobs are going to be run by multiple users, i.e. engineers and scientists. So doesn't it need to be a High Concurrency cluster?

auwia
Jun 23, 2023

If you enable High Concurrency, Scala scripts won't work, so the scientists won't be able to work. A Standard cluster is scalable and will support all jobs and users! ;-)

gangstfear (Option: A)
Aug 31, 2021

The answer must be A!

Hanse (Option: B)
Mar 10, 2022

As per https://docs.azuredatabricks.net/clusters/configure.html:

- Standard and Single Node clusters terminate automatically after 120 minutes by default. --> Data Scientists
- High Concurrency clusters do not terminate automatically by default.
- A Standard cluster is recommended for a single user. --> Standard for Data Scientists & High Concurrency for Data Engineers
- Standard clusters can run workloads developed in any language: Python, SQL, R, and Scala. High Concurrency clusters can run workloads developed in SQL, Python, and R. The performance and security of High Concurrency clusters is provided by running user code in separate processes, which is not possible in Scala. --> Jobs need Standard

Ast999 (Option: A)
Mar 3, 2023

SCALA = STANDARD

Eyepatch993 (Option: B)
Mar 27, 2022

Standard clusters do not have fault tolerance. Both the data scientist and data engineers will be using the job cluster for processing their notebooks, so if a standard cluster is chosen and a fault occurs in the notebook of any one user, there is a chance that other notebooks might also fail. Due to this a high concurrency cluster is recommended for running jobs.

Boompiee
May 10, 2022

It may not be best practice, but the question asked is whether the solution meets the stated requirements, and it does.

Aditya0891
Jun 11, 2022

Read the question properly. It states that each data scientist will have a Standard cluster, with a separate Standard cluster for running jobs. So there is no question of faults caused by other users. The answer is A.

auwia (Option: A)
Jun 23, 2023

High Concurrency doesn't support Scala.

allagowf (Option: A)
Oct 29, 2022

Data scientists and jobs --> Scala --> Standard cluster.

PallaviPatel (Option: A)
Jan 28, 2022

correct

ovokpus (Option: A)
Feb 24, 2022

Yes it seems to be!

sethuramansp (Option: B)
Jun 27, 2022

The answer should be "No". The scenario states: "The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster." Because the job cluster is Standard, it will not allow data scientists and data engineers to collectively deploy their notebooks; that requires a High Concurrency cluster.

wanchihh
Sep 14, 2023

High Concurrency Cluster does not support Scala.

AccountHatz
Feb 20, 2024

The Shared access mode clusters, formerly known as High Concurrency clusters, now support Scala; they do not support R (https://learn.microsoft.com/en-us/azure/databricks/archive/compute/cluster-ui-preview)

mav2000
Feb 21, 2024

Exactly! Before, the cluster should have been Standard because High Concurrency wasn't able to support Scala, but now that it can, the best cluster is High Concurrency for all the people executing jobs there.

dakku987 (Option: B)
Jan 1, 2024

We need a High Concurrency cluster for the data engineers, the data scientists, and the jobs.

Deeksha1234 (Option: B)
Jul 30, 2022

The answer should be No.

Deeksha1234
Aug 15, 2022

sorry 'A' should be correct

anks84 (Option: A)
Sep 7, 2022

We would need a Standard cluster for the jobs to support Scala. A High Concurrency cluster does not support Scala. Hence, the answer is A!

greenlever (Option: A)
Oct 14, 2022

Correct

auwia (Option: A)
Jun 23, 2023

True, correct.

kkk5566 (Option: A)
Aug 30, 2023

Correct:

- data engineers: High Concurrency cluster
- jobs: Standard cluster
- data scientists: Standard cluster

Danweo (Option: A)
Jul 16, 2024

A is correct