What are the recommended steps to address poor SQL query performance due to data spilling? (Choose two.)
To address poor SQL query performance caused by data spilling, optimize the query itself: fetching only the required attributes (columns) reduces the amount of data processed, which can mitigate spilling. Additionally, using a larger virtual warehouse provides more memory and local disk space, which alleviates the effects of spilling. Both steps directly address the root causes of data spilling and improve query performance.
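As a rough sketch of what "fetching only the required attributes" looks like in practice (the table and column names here are hypothetical, purely for illustration):

```sql
-- Hypothetical example: project only the columns the output needs instead of
-- SELECT *, so less intermediate data has to be held in memory during the join.
-- Spill-prone version:
-- SELECT * FROM sales s JOIN customers c ON s.customer_id = c.customer_id;

-- Leaner version that reduces the data processed:
SELECT
    s.order_id,
    s.order_date,
    c.customer_name
FROM sales s
JOIN customers c
    ON s.customer_id = c.customer_id
WHERE s.order_date >= '2023-01-01';  -- a selective filter also helps partition pruning
```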
C and D are correct: https://community.snowflake.com/s/article/Performance-impact-from-local-and-remote-disk-spilling
It's B and D. Snowflake does not encourage increasing the warehouse size if something can be done with the existing query. The link you gave also talks about "projecting only the columns that are needed in the output".
The spilling can't always be avoided, especially for large batches of data, but it can be decreased by:
- Reviewing the query for query optimization, especially if it is a new query.
- Reducing the amount of data processed, for example by trying to improve partition pruning, or projecting only the columns that are needed in the output.
- Decreasing the number of parallel queries running in the warehouse.
- Trying to split the processing into several steps (for example by replacing the CTEs with temporary tables), as sketched below.
- Using a larger warehouse. This effectively means more memory and more local disk space.
https://community.snowflake.com/s/article/Performance-impact-from-local-and-remote-disk-spilling
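The "split the processing into several steps" suggestion could look roughly like the following; the object names are made up for illustration, not taken from the article:

```sql
-- Hypothetical sketch: materialize an intermediate result as a temporary table
-- instead of a CTE, so the heavy aggregation and the final join run as
-- separate, smaller steps with smaller intermediate results.
CREATE TEMPORARY TABLE tmp_daily_totals AS
SELECT customer_id, order_date, SUM(amount) AS daily_total
FROM sales
GROUP BY customer_id, order_date;

SELECT c.customer_name, t.order_date, t.daily_total
FROM tmp_daily_totals t
JOIN customers c
    ON t.customer_id = c.customer_id;
```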
Answer is C and D. https://docs.snowflake.com/en/user-guide/ui-query-profile
"For some operations (e.g. duplicate elimination for a huge data set), the amount of memory available for the compute resources used to execute the operation might not be sufficient to hold intermediate results. As a result, the query processing engine will start spilling the data to local disk. If the local disk space is not sufficient, the spilled data is then saved to remote disks. This spilling can have a profound effect on query performance (especially if remote disk is used for spilling). To alleviate this, we recommend:
- Using a larger warehouse (effectively increasing the available memory/local disk space for the operation), and/or
- Processing data in smaller batches."
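For the "use a larger warehouse" recommendation, resizing is done with ALTER WAREHOUSE; the warehouse name and sizes below are just an assumed example:

```sql
-- Temporarily scale a warehouse up so spill-heavy queries get more memory and
-- local disk, then scale it back down once the heavy workload is finished.
ALTER WAREHOUSE analytics_wh SET WAREHOUSE_SIZE = 'LARGE';

-- ... run the spill-heavy query here ...

ALTER WAREHOUSE analytics_wh SET WAREHOUSE_SIZE = 'MEDIUM';
```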
It's B and C
Actually, I think there are three possible options. However, only B and C are explicitly mentioned in the docs (as shown by basdas's comment). Moreover, processing data in smaller batches can be inconvenient and require more work than the other solutions, so I eventually vote for B and C.