Exam: Certified Data Engineer Professional
Question 90

Which statement regarding Spark configuration on the Databricks platform is true?

    Correct Answer: D

On the Databricks platform, Spark configuration properties set through the Clusters UI apply to the entire interactive cluster and affect every notebook attached to it. Because the configuration is applied at the cluster level, all SparkSessions on the cluster share the same settings, giving every running notebook a uniform environment.
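
As an illustration (not part of the official answer), a notebook attached to such a cluster can read a cluster-level property through its SparkSession; the property name below is only an example:

```python
# Minimal sketch: reading a property that was set in the cluster's Spark
# config (Clusters UI > Advanced options > Spark) from an attached notebook.
# The property name is illustrative, not mandated by the question.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Returns the cluster-level value, since the notebook inherits the
# cluster's SparkSession configuration.
print(spark.conf.get("spark.sql.shuffle.partitions"))
```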

Discussion
alexvno (Option: D)

A is wrong: editing the cluster configuration restarts the cluster -> D

hamzaKhribi (Option: D)

I tried it myself: setting a Spark conf in the Clusters UI affects all notebooks attached to that cluster. For example, I set the number of shuffle partitions to 4, and in every notebook, when I inspect the number of partitions, I find 4.
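
A hedged sketch of the experiment described above, assuming spark.sql.shuffle.partitions was set to 4 in the Clusters UI; exact post-shuffle partition counts can differ if Adaptive Query Execution coalesces shuffle partitions:

```python
# Sketch of the experiment: verify the cluster-level shuffle partition
# setting from a notebook and observe it after a shuffle.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.range(1000)
shuffled = df.groupBy((df.id % 10).alias("bucket")).count()  # triggers a shuffle

print(spark.conf.get("spark.sql.shuffle.partitions"))  # expected: 4
# Expected 4 as well, though AQE may coalesce shuffle partitions to fewer.
print(shuffled.rdd.getNumPartitions())
```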

vctrhugo (Option: D)

These settings are applied at the cluster level and affect all SparkSessions on the cluster.

Curious76 (Option: B)

A. Incorrect: Modifying configurations through the Databricks REST API while jobs are running can lead to unexpected behavior or disruption; it is generally not recommended.
C. Incorrect: While global init scripts can be used, they are not the only way; configurations can also be set within notebooks.
D. Incorrect: Configurations set through the Clusters UI apply to the entire cluster, but they do not necessarily override configurations set within notebooks attached to the cluster.
E. Incorrect: Notebook configurations can take precedence over cluster-level configurations for the same property, offering finer-grained control at the notebook level (see the sketch below).
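
A minimal sketch of the precedence point in E, assuming a notebook attached to an interactive cluster whose cluster-level Spark config sets spark.sql.shuffle.partitions; the property name and values are illustrative only, and session-isolation behavior can vary by cluster mode:

```python
# A notebook can override a session-scoped SQL property for its own
# SparkSession without changing the cluster-wide default.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

print(spark.conf.get("spark.sql.shuffle.partitions"))  # cluster-level value

spark.conf.set("spark.sql.shuffle.partitions", "8")    # notebook-level override
print(spark.conf.get("spark.sql.shuffle.partitions"))  # now 8 in this session
```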

petrv (Option: A)

In Databricks, you can use the Databricks REST API to modify Spark configuration properties for an interactive cluster without interrupting currently running jobs. This allows you to dynamically adjust Spark configurations to optimize performance or meet specific requirements without the need to restart the cluster.
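
For reference, a hedged sketch of what this approach would look like against the Clusters API (clusters/edit). The host, token, and cluster values are placeholders, and the endpoint expects the full cluster specification, of which only a minimal subset is shown; note that the follow-up comments point out that editing a running cluster restarts it:

```python
# Sketch: updating a cluster's spark_conf via the Databricks Clusters API.
# Placeholders must be replaced with real workspace values, and editing a
# RUNNING cluster restarts it to apply the new configuration.
import requests

host = "https://<your-workspace>.cloud.databricks.com"  # placeholder
token = "<personal-access-token>"                       # placeholder

payload = {
    "cluster_id": "<cluster-id>",          # placeholder
    "spark_version": "<runtime-version>",  # placeholder
    "node_type_id": "<node-type>",         # placeholder
    "num_workers": 2,
    "spark_conf": {"spark.sql.shuffle.partitions": "4"},
}

resp = requests.post(
    f"{host}/api/2.0/clusters/edit",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
resp.raise_for_status()
```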

petrv

If you update the configuration of a cluster using the Databricks REST API or the Clusters UI while the cluster is in a RUNNING state, the cluster will be restarted to apply the new configuration. However, Databricks typically handles this situation in a way that minimizes disruption to running jobs.

alexvno

Wrong: the cluster will restart when its configuration is edited.