SnowPro Advanced Data Engineer Exam

A company built a sales reporting system with Python, connecting to Snowflake using the Python Connector. Based on the user's selections, the system generates the SQL queries needed to fetch the data for the report. First it gets the customers that meet the given query parameters (on average 1000 customer records for each report run), and then it loops the customer records sequentially. Inside that loop it runs the generated SQL clause for the current customer to get the detailed data for that customer number from the sales data table.

When the Data Engineer tested the individual SQL clauses, they were fast enough (1 second to get the customers, 0.5 second to get the sales data for one customer), but the total runtime of the report is too long.

How can this situation be improved?

Correct Answer: D

The total runtime of the report is too long because the system loops through each customer record sequentially and runs the generated SQL clause for each customer to get the detailed sales data. Given that there are on average 1000 customer records and it takes 0.5 seconds to get the sales data for one customer, the sequential processing results in a significant cumulative delay. By rewriting the report to eliminate the use of the loop construct, the system can generate a single, more complex SQL query that retrieves all the necessary sales data in one go. This will significantly reduce the runtime from a cumulative approach to a parallel processing approach, leveraging the efficiency and optimization capabilities of the database system.