Exam: Certified Data Engineer Associate
Question 82

A data engineering team has noticed that their Databricks SQL queries are running too slowly when they are submitted to a non-running SQL endpoint. The data engineering team wants this issue to be resolved.

Which of the following approaches can the team use to reduce the time it takes to return results in this scenario?

    Correct Answer: D

    When SQL queries are submitted to a non-running SQL endpoint, the primary delay comes from the time it takes for the endpoint to start up. Enabling the Serverless feature for the SQL endpoint significantly reduces this start-up time from minutes to seconds, thereby ensuring quicker query execution. This approach directly addresses the issue of slow query performance due to initial start-up delays.
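    As a rough illustration of what "enabling Serverless" can look like outside the UI, the sketch below builds a request for the Databricks SQL Warehouses REST API (`POST /api/2.0/sql/warehouses`). The workspace host, token, warehouse name, size, and auto-stop value are placeholders assumed for the example, and the helper function names are hypothetical. The request is only prepared, never sent. Check the field names against the current API reference before relying on them.

```python
import json
import urllib.request


def build_serverless_warehouse_payload(name, cluster_size="Small", auto_stop_mins=10):
    """Build a JSON body that asks for a serverless SQL warehouse.

    Assumption: serverless is requested by setting warehouse_type to "PRO"
    together with enable_serverless_compute = True.
    """
    return {
        "name": name,                       # placeholder warehouse name
        "cluster_size": cluster_size,       # affects execution speed, not start-up
        "auto_stop_mins": auto_stop_mins,   # idle minutes before the warehouse stops
        "warehouse_type": "PRO",
        "enable_serverless_compute": True,  # the setting that shortens start-up
    }


def build_create_request(host, token, payload):
    """Prepare (but do not send) the POST request that creates the warehouse."""
    return urllib.request.Request(
        url=f"{host}/api/2.0/sql/warehouses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token; nothing is sent over the network here.
payload = build_serverless_warehouse_payload("demo-warehouse")
request = build_create_request("https://example.cloud.databricks.com", "TOKEN", payload)
print(request.get_method(), request.full_url)
```

    Passing the prepared request to `urllib.request.urlopen` would submit it. That step is left out so the sketch runs without a workspace.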

Discussion
carpa_jo (Option: D)

The important point of this scenario is "when they are submitted to a non-running SQL endpoint". So it's not about increasing the instance size or the number of instances to improve query performance; it's about reducing the start-up time.

A: Not possible. Serverless can't be combined with spot instance policies, see https://docs.databricks.com/en/compute/sql-warehouse/serverless.html#limitations
B: Auto Stop is about terminating a SQL warehouse after x minutes of being idle.
C: Increasing the cluster size provides more capacity for running queries, but doesn't reduce start-up time.
D: Serverless reduces start-up time from minutes to seconds. Jackpot!
E: Increasing the maximum bound of the SQL endpoint's scaling range will help with lots of sequential queries, which is not the case here.
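The reasoning above can be sketched as a toy latency model: the wait for the first result is the start-up delay (paid only when the endpoint is not running) plus the query execution time. All numbers below are made up for illustration, not measurements. The only claim taken from the thread is that serverless moves start-up from minutes to seconds. The assumption that a larger cluster halves execution time is also hypothetical.

```python
def time_to_first_result(running, startup_s, execution_s):
    """Seconds until results arrive: cold-start delay (if stopped) plus execution."""
    startup_delay = 0 if running else startup_s
    return startup_delay + execution_s


# Hypothetical figures, chosen only to show the shape of the trade-off.
CLASSIC_STARTUP_S = 240     # "minutes" to start a stopped classic endpoint
SERVERLESS_STARTUP_S = 8    # "seconds" to start a stopped serverless endpoint
EXECUTION_S = 20            # query execution time on the current cluster size

# Baseline: stopped classic endpoint, current size.
baseline = time_to_first_result(False, CLASSIC_STARTUP_S, EXECUTION_S)

# Option C: bigger cluster (assumed to halve execution), start-up unchanged.
bigger_cluster = time_to_first_result(False, CLASSIC_STARTUP_S, EXECUTION_S / 2)

# Option D: serverless, same execution time, much shorter start-up.
serverless = time_to_first_result(False, SERVERLESS_STARTUP_S, EXECUTION_S)

print(f"baseline:       {baseline} s")
print(f"bigger cluster: {bigger_cluster} s")
print(f"serverless:     {serverless} s")
```

Under these assumed numbers the larger cluster only trims the execution term, while serverless removes most of the dominant start-up term.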

AndreFR (Option: D)

The key phrase is “non-running SQL endpoint”, which implies that the query is slow because the cluster needs time to start. I suggest answer D because:

A: Serverless and spot instances cannot be mixed?
B: Auto Stop means that jobs are submitted to non-running SQL endpoints.
C: Increasing the cluster size can compensate for slow start-up time.
D: Serverless is able to start and scale faster than non-running SQL endpoints (seconds instead of minutes).
E: Increasing the maximum bound will help only if there are simultaneous queries.

https://docs.gcp.databricks.com/en/lakehouse-architecture/cost-optimization/best-practices.html#use-serverless-for-your-workloads

olaru (Option: E)

maximum bound of the SQL endpoint's scaling range

nedlo (Option: C)

D is wrong. It's already Serverless (non-running SQL endpoint), so how would turning Serverless ON help? They also say C here: https://community.databricks.com/t5/data-engineering/when-to-increase-maximum-bound-vs-when-to-increase-cluster-size/td-p/27880. E is only true for autoscaling clusters.

msengupta (Option: C)

https://community.databricks.com/t5/data-engineering/sql-query-takes-too-long-to-run/td-p/21884

Syd (Option: E)

Answer E: https://www.databricks.com/blog/2022/03/10/top-5-databricks-performance-tips.html

Syd

I mean answer C

azure_bimonster (Option: D)

D is correct. The key phrase is "submitted to a non-running SQL endpoint". Increasing the cluster size is not going to help if the endpoint is in a non-running state.

bartfto (Option: D)

"when they are submitted to a non-running SQL endpoint" ANSWER D

Garyn (Option: C)

C. They can increase the cluster size of the SQL endpoint. Explanation: Increasing the cluster size of the SQL endpoint can enhance query performance by providing more computational resources to execute queries. This can potentially speed up query processing by allowing more parallelism, handling larger workloads, and reducing the time taken for query execution.

meow_akk (Option: E)

Answer E, you're welcome :) https://community.databricks.com/t5/data-engineering/when-to-increase-maximum-bound-vs-when-to-increase-cluster-size/td-p/27880

mike_stewart

I don't agree. Your answer is only valid when 'sequential' is mentioned, which is not the case here.