Certified Machine Learning Professional Exam - Question 39

Question

A machine learning engineering team has written predictions computed in a batch job to a Delta table for querying. However, the team has noticed that the querying is running slowly. The team has already tuned the size of the data files. Upon investigating, the team has concluded that the rows meeting the query condition are sparsely located throughout each of the data files.

Based on the scenario, which of the following optimization techniques could speed up the query by colocating similar records while considering values in multiple columns?

Examice · Accepted Answer

Z-Ordering is a Delta Lake technique that optimizes data layout by colocating similar records in the same set of files. Because rows meeting the query condition end up close together instead of scattered across every file, the engine can skip files that contain no matching rows, which speeds up the query. It is the fitting choice here because the clustering takes values in multiple columns into account.
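To make "colocating while considering multiple columns" concrete, here is a minimal sketch of the Z-order (Morton) idea: the bits of two column values are interleaved into one sort key, so rows that are close in both columns get nearby keys. The function name `interleave_bits` and the 4x4 grid are assumptions for illustration only; Delta Lake's actual implementation differs in detail.

```python
def interleave_bits(x: int, y: int, bits: int = 8) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return key


# Hypothetical data: every (x, y) pair on a 4x4 grid.
points = [(x, y) for x in range(4) for y in range(4)]

# Sorting by the interleaved key walks the grid in small 2x2 blocks,
# so neighbors in BOTH columns stay next to each other in the output.
z_sorted = sorted(points, key=lambda p: interleave_bits(p[0], p[1]))

print(z_sorted[:4])  # the first 2x2 block of the grid
```

In Delta Lake itself this layout is requested with `OPTIMIZE <table> ZORDER BY (<col1>, <col2>)`, where the table and column names depend on the team's schema.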

BokNinja · Answer

The correct answer is A. Z-Ordering.

Z-Ordering is a technique used in Delta Lake to optimize the layout of data to improve query performance. It's a multi-dimensional clustering technique that colocates related information in the same set of files. This colocation can significantly improve the speed of queries and analytics, especially when queries filter on more than one column. By using Z-Ordering, the team can ensure that rows meeting the query condition are located close together, thereby speeding up the query.

hugodscarvalho · Answer

Z-Ordering is a technique used in Delta Lake to colocate similar records together based on the values of multiple columns. This optimization improves query performance by reducing the amount of data that needs to be scanned to satisfy a query, particularly when filtering on multiple columns.
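The "less data scanned" effect can be sketched with a toy simulation. It assumes a hypothetical table of 256 `(x, y)` rows split into 16 files of 16 rows, and a reader that skips any file whose per-file min/max values cannot overlap the filter. All names here (`morton_key`, `split_into_files`, `files_scanned`) are illustrative, not Delta Lake APIs, and the real engine is more involved.

```python
def morton_key(x, y, bits=4):
    """Interleave the bits of x and y into a single Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def split_into_files(rows, rows_per_file):
    """Cut an ordered list of rows into fixed-size 'files'."""
    return [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]


def files_scanned(files, x_range, y_range):
    """Count files whose min/max stats overlap both filter ranges (inclusive)."""
    count = 0
    for f in files:
        xs = [r[0] for r in f]
        ys = [r[1] for r in f]
        x_overlaps = min(xs) <= x_range[1] and max(xs) >= x_range[0]
        y_overlaps = min(ys) <= y_range[1] and max(ys) >= y_range[0]
        if x_overlaps and y_overlaps:
            count += 1
    return count


rows = [(x, y) for x in range(16) for y in range(16)]

# Layout 1: plain sort by x, then y. Each file holds one x value and every y.
linear_files = split_into_files(sorted(rows), 16)
# Layout 2: sort by the interleaved key. Each file holds a 4x4 block of (x, y).
zorder_files = split_into_files(sorted(rows, key=lambda r: morton_key(*r)), 16)

# Filter on both columns: 0 <= x <= 3 AND 0 <= y <= 3
both_linear = files_scanned(linear_files, (0, 3), (0, 3))
both_zorder = files_scanned(zorder_files, (0, 3), (0, 3))

# Filter on y only: 0 <= y <= 3 (x unrestricted)
y_linear = files_scanned(linear_files, (0, 15), (0, 3))
y_zorder = files_scanned(zorder_files, (0, 15), (0, 3))

print(both_linear, both_zorder)  # 4 1
print(y_linear, y_zorder)        # 16 4
```

In this toy setup the single-column sort only helps filters on its leading column, while the interleaved layout lets min/max statistics prune files for filters on either column.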

Joy999 · Answer

The question states that "the team has already tuned the size of the data files," so E is out.
