Exam DEA-C01
Question 94

A retail company stores transactions, store locations, and customer information tables in four reserved ra3.4xlarge Amazon Redshift cluster nodes. All three tables use even table distribution.

The company updates the store location table only once or twice every few years.

A data engineer notices that Redshift queues are slowing down because the whole store location table is constantly being broadcast to all four compute nodes for most queries. The data engineer wants to speed up the query performance by minimizing the broadcasting of the store location table.

Which solution will meet these requirements in the MOST cost-effective way?

    Correct Answer: A

    The most cost-effective solution is to change the distribution style of the store location table from EVEN distribution to ALL distribution. With ALL distribution, a full copy of the table is stored on every compute node in the Amazon Redshift cluster, so each node can join against its local copy and the table no longer has to be broadcast at query time. The trade-off is that every load or update must be applied to each copy, but because the store location table is updated only once or twice every few years, that maintenance overhead is minimal. The change therefore improves performance without adding nodes or incurring significant additional cost.
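As a minimal sketch of what that change could look like in practice, the snippet below builds the two SQL statements involved: one that switches the distribution style and one that checks the result in the `SVV_TABLE_INFO` system view. The table name `store_location` and the helper `diststyle_all_statements` are hypothetical, since the question does not name the table; the snippet only prints the SQL and does not connect to a cluster.

```python
from typing import List


def diststyle_all_statements(table_name: str) -> List[str]:
    """Build Redshift SQL to switch a table to ALL distribution and verify it.

    `table_name` is a hypothetical identifier; the question does not give one.
    """
    # Allow only simple identifiers so the name can be interpolated safely.
    if not table_name.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table_name!r}")
    return [
        # Change the distribution style of the existing table in place.
        f"ALTER TABLE {table_name} ALTER DISTSTYLE ALL;",
        # Confirm the new style; SVV_TABLE_INFO exposes a diststyle column.
        f"SELECT \"table\", diststyle FROM svv_table_info WHERE \"table\" = '{table_name}';",
    ]


statements = diststyle_all_statements("store_location")
for sql in statements:
    print(sql)
```

The printed statements would be run through whatever SQL client the team already uses against the cluster.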

Discussion
PGGuy (Option: A)

Changing the distribution style of the store location table to ALL distribution (A) is the most cost-effective solution. It directly addresses the issue of broadcasting by ensuring the entire table is available on each node, significantly improving join performance without incurring substantial additional costs.

andrologin (Option: A)

ALL distribution works best for dimension tables that change slowly and are generally small, because every node can then join against its own local copy.

bakarys (Option: A)

The most cost-effective solution to speed up the query performance by minimizing the broadcasting of the store location table would be: A. Change the distribution style of the store location table from EVEN distribution to ALL distribution. In Amazon Redshift, the ALL distribution style replicates the entire table to all nodes in the cluster, which eliminates the need to redistribute the data when executing a query. This can significantly improve query performance. Given that the store location table is updated only once or twice every few years, the overhead of maintaining the replicated data would be minimal. This makes it a cost-effective solution for improving the query performance.

tgv (Option: A)

Using ALL distribution means the table is replicated to all nodes, eliminating the need for broadcasting during queries. Since the store location table is updated infrequently, this will significantly speed up queries without incurring frequent update costs.
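
The trade-off the comments describe can be illustrated with a rough back-of-the-envelope model. This is a simplified sketch, not how Redshift actually plans or costs queries: it ignores slices, compression, and the optimizer's choices, and the row count of 2,000 is a made-up figure. It only compares how many rows would have to be copied between nodes for one broadcast join against how many rows are stored in total.

```python
def rows_moved_per_join(table_rows: int, nodes: int, diststyle: str) -> int:
    """Rough count of rows copied between nodes to broadcast the table once."""
    if diststyle == "ALL":
        # Every node already holds a full copy, so nothing is sent.
        return 0
    if diststyle == "EVEN":
        # Each node holds about 1/nodes of the rows and must receive the rest.
        missing_per_node = table_rows - table_rows // nodes
        return missing_per_node * nodes
    raise ValueError(f"unsupported diststyle: {diststyle!r}")


def stored_rows(table_rows: int, nodes: int, diststyle: str) -> int:
    """Total rows kept on disk across the cluster."""
    return table_rows * nodes if diststyle == "ALL" else table_rows


# Hypothetical figures: 2,000 store locations on the four-node cluster.
for style in ("EVEN", "ALL"):
    moved = rows_moved_per_join(2000, 4, style)
    stored = stored_rows(2000, 4, style)
    print(f"{style}: {moved} rows moved per join, {stored} rows stored")
```

Under these assumptions, ALL distribution trades a fourfold increase in stored rows for zero rows moved at query time, which is a good exchange for a small table that is read by most queries and written almost never.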