DP-203 Exam QuestionsBrowse all questions from this exam

DP-203 Exam - Question 78


You have an Azure Synapse Analytics dedicated SQL pool.

You plan to create a fact table named Table1 that will contain a clustered columnstore index.

You need to optimize data compression and query performance for Table1.

What is the minimum number of rows that Table1 should contain before you create partitions?

Show Answer
Correct Answer: A,D

To optimize data compression and query performance for a clustered columnstore index in Azure Synapse Analytics, the recommended approach is to have at least 1 million rows per partition. Given that Azure Synapse automatically uses 60 distributions for a table, this translates to a minimum of 60 million rows before creating additional partitions. Partitioning helps in achieving better data compression and query performance.

Discussion

17 comments
Sign in to comment
Ankit_AzOption: D
May 27, 2023

Clustered Column Store will by default have 60 partitions. And to achieve best compression we need at least 1 Million rows per partition, hence Option D 60 Millions (1M per partition)

Vanq69
Oct 4, 2023

You mean the dedicated SQL pool has 60 distributions "by default"?

Lscranio
Dec 5, 2023

60 Million is correct

AkosL
Mar 25, 2024

Not by default, but always

Deeksha1234Option: D
Aug 15, 2023

should be D

58d2382Option: A
Dec 29, 2023

Question says "What is the minimum number of rows that Table1 should contain before you create (add/new/extra) partitions?" As per microsoft documentation, each partition will contain 1Million records. So, if there atleast 1million records, we can go for partitioning. Here is the link for documentation https://learn.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-overview?view=sql-server-ver16

s_unsworthOption: D
Feb 28, 2024

Cluster columnstore tables begin to achieve optimal compression once there is more than 60 million rows. For small lookup tables, less than 60 million rows, consider using HEAP or clustered index for faster query performance. -- Microsoft

Lukis92Option: C
Jul 8, 2023

To achieve optimal data compression and query performance with clustered columnstore tables in Azure Synapse Analytics, it is recommended to have a minimum of 1 million rows per distribution and partition. As Synapse Analytics automatically creates 60 distributions per table, to fulfill the 1 million rows per distribution recommendation, the table should ideally contain 60 million rows if no additional partitions are created. However, the question is asking about the threshold for creating partitions, not necessarily a table of full 60 million rows. Therefore, you would want to ensure you have at least 1 million rows in each partition to maintain the optimal performance and compression. If the number of rows is less than 1 million, it's better to consider fewer partitions in order to increase the number of rows per partition.

tankwayep
Sep 5, 2023

Question is about the minimum number of rows in the table not in a partition. And according to what you explained, which correct, the answer is D.

akhil5432Option: C
Aug 7, 2023

1 MILLION

akhil5432Option: C
Aug 7, 2023

WHY People mentioned option D..please explain how?

kkk5566Option: D
Sep 4, 2023

is correct

hassexatOption: D
Sep 7, 2023

60 million

6d954dfOption: D
Dec 27, 2023

60m, see https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition

Charley92Option: D
Jan 28, 2024

To optimize data compression and query performance for Table in Azure Synapse Analytics dedicated SQL pool, you should create partitions when the table contains at least 60 million rows. Partitioning tables can improve query performance by reducing the amount of data that needs to be scanned. It can also improve data compression by allowing each partition to be compressed separately. In general, you should consider partitioning a table when it contains a large amount of data and queries frequently filter on a specific column or set of columns

Siva_Jsn_23Option: C
Jun 14, 2024

1 Million (Option C) is correct. You need the minimum number of rows to create a optimized partition. A single optimal partition requires 1 Million rows.

blazy001Option: A
Dec 12, 2023

A 100mil is correct >> From the answers to this Q, I see that MS has done a bad job because people don't understand what distributions or partitions are. My explanation: Each table with column store index is auto divided into 60 distributions, on each of these distributions there is auto 1 partition. For good performance (with column store) each partition must have at least 1Mil rows. The question was: "What is the minimum number of rows that Table1 should contain before you create (add/new/extra) partitions?" So there is no point in creating partitions with 60M rows, because then you divide this into 0.5Mil per partition. At least 120Mil would be ideal, but 100Mil already starts.

MBRSDGOption: D
Apr 4, 2024

even without knowing how many partitions you're going to create, you know that each partition should contain at least 1million rows. 60million rows are the only case enabling to use partitions.

KarlGardnerDataEngineering
May 15, 2024

I think the correct answer should be 120 million rows. Since splitting the data up into 2 partitions would result in 1 million rows per distribution and partition for 120 million rows

AA9292Option: C
Jul 1, 2024

it is 1 Million. the question is clearly asking for minimum number of rows before creating partitions. and in Microsoft document they stated that 1 million rows is the minimum number of rows before partitioning a table

jaco_0930Option: D
Jul 22, 2024

1M * 60 = 60M