SnowPro Advanced Architect Exam - Question 14

Question

The table contains five columns and it has millions of records. The cardinality distribution of the columns is shown below:

Column C4 and C5 are mostly used by SELECT queries in the GROUP BY and ORDER BY clauses. Whereas columns C1, C2 and C3 are heavily used in filter and join conditions of SELECT queries.

The Architect must design a clustering key for this table to improve the query performance.

Based on Snowflake recommendations, how should the clustering key columns be ordered while defining the multi-column clustering key?

Examice · Accepted Answer

Based on the Snowflake recommendations, when defining a multi-column clustering key, columns should be ordered from the lowest cardinality to the highest cardinality. This is because clustering with lower cardinality columns first will make the clustering more effective. Columns C1, C2, and C3 are used heavily in filter and join conditions, so these should be included in the clustering key, and they should be ordered by cardinality. Therefore, the order should be C2 (which has a cardinality of 108), C1 (with a cardinality of 10,790), and C3 (with a cardinality of 302,605). Thus, option D correctly orders the columns from lowest to highest cardinality.

Jay_98_11 · Answer

correct

hillcat111 · Answer

Answer is D and is validated

jlad26 · Answer

If you are defining a multi-column clustering key for a table, the order in which the columns are specified in the CLUSTER BY clause is important. As a general rule, Snowflake recommends ordering the columns from lowest cardinality to highest cardinality. Putting a higher cardinality column before a lower cardinality column will generally reduce the effectiveness of clustering on the latter column.

Balakt · Answer

C4 and C5 are ruled out. C2, C1 and C3 are correct (Based on the increasing cardinality)

caio_tertuliano · Answer

Question D. Saw this exactly on the actual exam.

SnowPro Advanced Architect Exam - Question 14

Discussion