DP-203 Exam - Question 78

Question

You have an Azure Synapse Analytics dedicated SQL pool.

You plan to create a fact table named Table1 that will contain a clustered columnstore index.

You need to optimize data compression and query performance for Table1.

What is the minimum number of rows that Table1 should contain before you create partitions?

Examice · Accepted Answer

To optimize data compression and query performance for a clustered columnstore index in Azure Synapse Analytics, the recommended approach is to have at least 1 million rows per partition. Given that Azure Synapse automatically uses 60 distributions for a table, this translates to a minimum of 60 million rows before creating additional partitions. Partitioning helps in achieving better data compression and query performance.

Ankit_Az · Answer

Clustered Column Store will by default have 60 partitions. And to achieve best compression we need at least 1 Million rows per partition, hence Option D 60 Millions (1M per partition)

Deeksha1234 · Answer

should be D

58d2382 · Answer

Question says "What is the minimum number of rows that Table1 should contain before you create (add/new/extra) partitions?"
As per microsoft documentation, each partition will contain 1Million records. So, if there atleast 1million records, we can go for partitioning.

Here is the link for documentation
https://learn.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-overview?view=sql-server-ver16

s_unsworth · Answer

Cluster columnstore tables begin to achieve optimal compression once there is more than 60 million rows. For small lookup tables, less than 60 million rows, consider using HEAP or clustered index for faster query performance. -- Microsoft

Lukis92 · Answer

To achieve optimal data compression and query performance with clustered columnstore tables in Azure Synapse Analytics, it is recommended to have a minimum of 1 million rows per distribution and partition.
As Synapse Analytics automatically creates 60 distributions per table, to fulfill the 1 million rows per distribution recommendation, the table should ideally contain 60 million rows if no additional partitions are created.
However, the question is asking about the threshold for creating partitions, not necessarily a table of full 60 million rows. Therefore, you would want to ensure you have at least 1 million rows in each partition to maintain the optimal performance and compression. If the number of rows is less than 1 million, it's better to consider fewer partitions in order to increase the number of rows per partition.

akhil5432 · Answer

1 MILLION

akhil5432 · Answer

WHY People mentioned option D..please explain how?

kkk5566 · Answer

is correct

hassexat · Answer

60 million

6d954df · Answer

60m, see https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition

Charley92 · Answer

To optimize data compression and query performance for Table in Azure Synapse Analytics dedicated SQL pool, you should create partitions when the table contains at least 60 million rows.

Partitioning tables can improve query performance by reducing the amount of data that needs to be scanned. It can also improve data compression by allowing each partition to be compressed separately.

In general, you should consider partitioning a table when it contains a large amount of data and queries frequently filter on a specific column or set of columns

Siva_Jsn_23 · Answer

1 Million (Option C) is correct. You need the minimum number of rows to create a optimized partition. A single optimal partition requires 1 Million rows.

blazy001 · Answer

A  100mil  is correct >>
From the answers to this Q, I see that MS has done a bad job because people don't understand what distributions or partitions are. My explanation: Each table with column store index is auto divided into 60 distributions, on each of these distributions there is auto 1 partition. For good performance (with column store) each partition must have at least 1Mil rows.
The question was: "What is the minimum number of rows that Table1 should contain before you create (add/new/extra) partitions?"
So there is no point in creating partitions with 60M rows,
because then you divide this into 0.5Mil per partition. At least 120Mil would be ideal, but 100Mil already starts.

MBRSDG · Answer

even without knowing how many partitions you're going to create, you know that each partition should contain at least 1million rows. 60million rows are the only case enabling to use partitions.

KarlGardnerDataEngineering · Answer

I think the correct answer should be 120 million rows. Since splitting the data up into 2 partitions would result in 1 million rows per distribution and partition for 120 million rows

AA9292 · Answer

it is 1 Million. the question is clearly asking for minimum number of rows before creating partitions. and in Microsoft document they stated that 1 million rows is the minimum number of rows before partitioning a table

jaco_0930 · Answer

1M * 60 = 60M

DP-203 Exam - Question 78

Discussion