Exam: Certified Data Engineer Professional
Question 94

Spill occurs as a result of executing various wide transformations. However, diagnosing spill requires one to proactively look for key indicators.

Where in the Spark UI are two of the primary indicators that a partition is spilling to disk?

    Correct Answer: B

    The Stage’s detail screen shows per-stage and per-task metrics, including how much data was spilled. Nonzero values in the “Spill (Memory)” or “Spill (Disk)” columns indicate spill: “Spill (Memory)” is the size the spilled data occupied in memory, and “Spill (Disk)” is its size once written to disk. The Executor’s log files are the second indicator. They contain messages along the lines of “Spilling UnsafeExternalSorter to disk” or “Task memory spill” (the exact wording varies by Spark version and operator), which show that a task ran out of memory and had to write part of a partition to disk.
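The same spill metrics shown on the Stage's detail screen are also exposed by Spark's monitoring REST API, where each stage entry carries `memoryBytesSpilled` and `diskBytesSpilled` fields. The following is a minimal sketch, not an official tool: the helper names `stages_with_spill` and `fetch_stages`, the base URL, and the sample data are all assumptions made for illustration.

```python
import json
from urllib.request import urlopen


def stages_with_spill(stages):
    """Return (stageId, attemptId, memory_spilled, disk_spilled) for stages that spilled.

    `stages` is a list of dicts shaped like the entries returned by
    /api/v1/applications/<app-id>/stages.
    """
    flagged = []
    for stage in stages:
        mem = stage.get("memoryBytesSpilled", 0)
        disk = stage.get("diskBytesSpilled", 0)
        if mem > 0 or disk > 0:
            flagged.append((stage["stageId"], stage.get("attemptId", 0), mem, disk))
    return flagged


def fetch_stages(base_url, app_id):
    """Fetch stage data from a running driver or history server.

    base_url (for example "http://localhost:4040") is an assumption;
    use whatever address your Spark UI is served from.
    """
    with urlopen(f"{base_url}/api/v1/applications/{app_id}/stages") as resp:
        return json.load(resp)


# Made-up sample data, trimmed to the fields used above.
sample_stages = [
    {"stageId": 2, "attemptId": 0, "memoryBytesSpilled": 0, "diskBytesSpilled": 0},
    {"stageId": 3, "attemptId": 0, "memoryBytesSpilled": 2147483648, "diskBytesSpilled": 536870912},
]

spilled = stages_with_spill(sample_stages)
for stage_id, attempt, mem, disk in spilled:
    print(f"stage {stage_id} (attempt {attempt}): spill memory={mem} B, disk={disk} B")
```

Against a live application, the same check would be `stages_with_spill(fetch_stages(base_url, app_id))`, with both arguments supplied by you.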

Discussion
60ties (Option: B)

B is correct

vctrhugo (Option: B)

In the Spark UI, the Stage’s detail screen provides key metrics about each stage of a job, including the amount of data that has been spilled to disk. If you see nonzero values in the “Spill (Memory)” or “Spill (Disk)” columns, it’s an indication that a partition is spilling to disk. The Executor’s log files can also provide valuable information about spill. If a task is spilling a lot of data, you’ll see messages in the logs along the lines of “Spilling UnsafeExternalSorter to disk” or “Task memory spill” (the exact wording depends on the Spark version). These messages indicate that the task ran out of memory and had to spill data to disk.
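For the executor-log side, a rough sketch of scanning log lines for spill messages might look like the following. The pattern is deliberately loose because the message wording differs across Spark versions and operators; the helper name `find_spill_lines` and the sample log lines are illustrative assumptions, not verbatim Spark output.

```python
import re

# Loose pattern: any line mentioning "spill", "spilling" or "spilled" followed by "to disk".
SPILL_PATTERN = re.compile(r"spill(ing|ed)?\b.*\bto disk", re.IGNORECASE)


def find_spill_lines(lines):
    """Return the log lines that look like spill messages."""
    return [line.rstrip("\n") for line in lines if SPILL_PATTERN.search(line)]


# Illustrative log lines only; real executor logs will differ in format and wording.
sample_log = [
    "INFO Executor: Running task 4.0 in stage 3.0 (TID 17)",
    "INFO UnsafeExternalSorter: Thread 61 spilling sort data of 512.0 MB to disk (0 times so far)",
    "INFO Executor: Finished task 4.0 in stage 3.0 (TID 17)",
]

hits = find_spill_lines(sample_log)
for line in hits:
    print(line)
```

To scan a real log, pass an open file object instead of the list, for example `find_spill_lines(open(path))`, where `path` points at an executor's stderr file.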

jin1991 (Option: E)

E is correct

jin1991

My bad, looking again, B is correct.