SnowPro Core Exam QuestionsBrowse all questions from this exam

SnowPro Core Exam - Question 1095


How should clustering be used to optimize the performance of queries that run on a very large table?

Show Answer
Correct Answer: C

To optimize the performance of queries on a very large table, you should use the column that is most-frequently used in the query's select clauses as the clustering key. This approach enhances the efficiency of data retrieval by ensuring that the data is organized in a manner that aligns with the query patterns, leading to faster filtering and more efficient use of indexes.

Discussion

10 comments
Sign in to comment
simus90Option: D
Apr 2, 2024

it s D

08c95ebOption: D
Jun 4, 2024

selective filters is different than select clause

joshguy40Option: D
Aug 30, 2024

its D select clause is the column you choose to select. We dont care about that. We care about the columns being filtered in the WHERE clause.

NachoPrendesOption: C
Apr 3, 2024

C https://docs.snowflake.com/en/user-guide/tables-clustering-keys#:~:text=Cluster%20columns%20that%20are%20most%20actively%20used%20in%20selective%20filters

induna
Apr 5, 2024

I think it is D, per the doc you listed: The number of distinct values (i.e. cardinality) in a column/expression is a critical aspect of selecting it as a clustering key. It is important to choose a clustering key that has: A large enough number of distinct values to enable effective pruning on the table. A small enough number of distinct values to allow Snowflake to effectively group rows in the same micro-partitions.

yaho5Option: C
Apr 10, 2024

C Snowflake recommends prioritizing keys in the order below: Cluster columns that are most actively used in selective filters. For many fact tables involved in date-based queries (for example “WHERE invoice_date > x AND invoice date <= y”), choosing the date column is a good idea. For event tables, event type might be a good choice, if there are a large number of different event types. (If your table has only a small number of different event types, then see the comments on cardinality below before choosing an event column as a clustering key.) If there is room for additional cluster keys, then consider columns frequently used in join predicates, for example “FROM table1 JOIN table2 ON table2.column_A = table1.column_B”.

Lematthew31Option: C
May 3, 2024

It's C : https://docs.snowflake.com/en/user-guide/tables-clustering-keys#strategies-for-selecting-clustering-keys "Selecting the right columns/expressions for a clustering key can dramatically impact query performance. Analysis of your workload will usually yield good clustering key candidates. Snowflake recommends prioritizing keys in the order below: Cluster columns that are most actively used in selective filters"

d22770a
Oct 18, 2024

SELECTIVE FILTER means WHERE clause, Option D talks SELECT column. So that is wrong

Jacobr5000Option: C
May 19, 2024

"Snowflake recommends prioritizing keys in the order below: Cluster columns that are most actively used in selective filters."

d22770aOption: D
Oct 18, 2024

D is correct

MatthieuDNOption: D
Jan 23, 2025

D is correct, it would've been C if the answer was WHERE clauses instead of SELECT

bor4unOption: D
Jan 25, 2025

Snowflake provides the SYSTEM$CLUSTERING_INFORMATION function to help you assess the effectiveness of clustering by evaluating the average depth of micro-partitions. The average depth indicates how well the data is clustered for the specified clustering key. If the depth is high, it suggests that data is not well-clustered, and you may need to refine the clustering key or re-cluster the table using the RECLUSTER operation. https://docs.snowflake.com/en/user-guide/tables-clustering-keys#what-is-a-clustering-key https://docs.snowflake.com/en/user-guide/tables-clustering-keys#calculating-the-clustering-information-for-a-table