Exam: Certified Data Engineer Professional
Question 50

A production cluster has 3 executor nodes and uses the same virtual machine type for the driver and the executors.

When evaluating the Ganglia Metrics for this cluster, which indicator would signal a bottleneck caused by code executing on the driver?

    Correct Answer: E

    A driver-side bottleneck would show up as overall cluster CPU utilization of around 25%. The cluster has four identically sized nodes (one driver plus three executors), so a driver running at full CPU while the executors sit idle accounts for roughly one quarter of total capacity. When work is distributed properly, CPU load is spread across the executors; when code runs only on the driver, the executors wait for tasks and overall utilization stays low.
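The 25% figure follows from simple averaging over four equally sized nodes. A minimal sketch of that arithmetic, assuming the overall figure is the mean of per-node CPU utilization; the node names and readings below are hypothetical, not taken from a real Ganglia export:

```python
def cluster_cpu_utilization(per_node_percent):
    """Mean CPU utilization across identically sized nodes (assumed model)."""
    return sum(per_node_percent) / len(per_node_percent)


# Hypothetical readings: the driver is saturated, the three executors are idle.
driver_bound = {
    "driver": 100.0,
    "executor-1": 0.0,
    "executor-2": 0.0,
    "executor-3": 0.0,
}

overall = cluster_cpu_utilization(list(driver_bound.values()))
print(overall)  # 25.0
```

In practice the executors are rarely at exactly 0%, so the observed value would sit somewhat above 25%.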

Discussion
BrianNguyen95 (Option: E)

Option E: In a Spark cluster, the driver node is responsible for managing the execution of the Spark application, including scheduling tasks, managing the execution plan, and interacting with the cluster manager. If overall cluster CPU utilization is low (e.g., around 25%), it may indicate that the driver is doing the work itself while the executors sit largely idle, which makes the driver the bottleneck.

guillesd

Overall CPU utilization can be misleading. The 25% utilization could be caused by the workload not requiring more than that rather than everything being executed in the driver node.
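This caveat can be illustrated with a small sketch: two hypothetical sets of per-node readings give the same 25% average, and only the per-node view tells them apart. The node names, numbers, and helper functions here are assumptions made for illustration, not Ganglia output or a Ganglia API:

```python
def average(readings):
    """Mean CPU utilization across identically sized nodes."""
    return sum(readings.values()) / len(readings)


def busiest_node(readings):
    """Return (node name, utilization) for the most heavily loaded node."""
    name = max(readings, key=readings.get)
    return name, readings[name]


# Hypothetical case 1: driver saturated, executors idle.
driver_bound = {"driver": 100.0, "executor-1": 0.0, "executor-2": 0.0, "executor-3": 0.0}
# Hypothetical case 2: a light workload spread evenly over all nodes.
light_load = {"driver": 25.0, "executor-1": 25.0, "executor-2": 25.0, "executor-3": 25.0}

for label, readings in [("driver_bound", driver_bound), ("light_load", light_load)]:
    print(label, average(readings), busiest_node(readings))
```

Both cases average 25%, but only the first shows a single node pinned at 100%, so the per-node graphs are worth checking before concluding that the driver is the bottleneck.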

sturcu (Option: D)

If the overall cluster CPU utilization is around 25%, it means that only one of the four nodes (driver + 3 executors) is using its full CPU capacity, while the other three nodes are idle or underutilized.

sturcu

The correct answer is E.

sturcu (Option: E)

If the overall cluster CPU utilization is around 25%, it means that only one of the four nodes (driver + 3 executors) is using its full CPU capacity, while the other three nodes are idle or underutilized.

Patito (Option: D)

D seems to be right

azurelearn2020 (Option: E)

25% indicates that the cluster CPU is under-utilized.

Def21

Not correct. 25% could (in theory) mean the driver is using 100% of its CPU.

lophonos (Option: E)

E is correct

guillesd (Option: D)

If there's no I/O between the driver and the executor nodes, then the executor nodes are not working.

rok21 (Option: E)

E is correct