Exam SnowPro Advanced Data Engineer
Question 38

A Data Engineer is working on a continuous data pipeline that receives data from Amazon Kinesis Firehose and loads it into a staging table that will later be used in the data transformation process. The average file size is 300-500 MB.

The Engineer needs to ensure that Snowpipe is performant while minimizing costs.

How can this be achieved?

    Correct Answer: D

    To keep Snowpipe performant while minimizing costs, file sizes must be managed so that they fall within Snowflake's recommended range for loading: roughly 100-250 MB compressed. The pipeline's 300-500 MB files exceed that range, so decreasing the Kinesis Firehose buffer size to trigger delivery of files between 100 MB and 250 MB produces files that Snowpipe can ingest efficiently, since files in this range parallelize well across load operations. This keeps file sizes optimal for loading while minimizing the resources and costs associated with data handling.
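    For illustration, here is a minimal sketch of setting the Firehose buffering hints at stream creation, assuming boto3 with credentials already configured; the stream name, role ARN, and bucket ARN are hypothetical placeholders, not values from the question. Note that Firehose caps the S3 buffer size at 128 MB, which still lands inside the recommended 100-250 MB range:

```python
# Minimal sketch: configure Firehose buffering so delivered files stay
# near Snowflake's recommended 100-250 MB range. All names/ARNs below
# are placeholders.
import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="snowpipe-staging-stream",  # hypothetical
    DeliveryStreamType="DirectPut",
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",  # placeholder
        "BucketARN": "arn:aws:s3:::snowpipe-staging-bucket",  # placeholder
        # Flush a file when ~128 MB accumulates (the Firehose maximum) or
        # 60 seconds elapse, whichever comes first, instead of letting
        # files grow to 300-500 MB.
        "BufferingHints": {"SizeInMBs": 128, "IntervalInSeconds": 60},
    },
)
```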

Discussion
Snow_P
Option: D

https://docs.snowflake.com/en/user-guide/data-load-considerations-prepare

stopthisnow
Option: D

Various tools can aggregate and batch data files. One convenient option is Amazon Kinesis Firehose. Firehose allows defining both the desired file size, called the buffer size, and the wait interval after which a new file is sent (to cloud storage in this case), called the buffer interval. For more information, see the Kinesis Firehose documentation. If your source application typically accumulates enough data within a minute to populate files larger than the recommended maximum for optimal parallel processing, you could decrease the buffer size to trigger delivery of smaller files. Keeping the buffer interval setting at 60 seconds (the minimum value) helps avoid creating too many files or increasing latency.
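
To make the size-versus-interval trade-off concrete, here is a rough back-of-the-envelope sketch in plain Python (not AWS or Snowflake code; the steady-ingest-rate assumption is mine) showing which buffer limit flushes a file first at a given throughput:

```python
# Back-of-the-envelope sketch: which Firehose buffer limit triggers
# delivery first for a steady ingest rate? Real traffic is bursty, so
# treat the numbers as rough estimates.

def delivery_trigger(rate_mb_per_s: float,
                     buffer_size_mb: float = 128.0,
                     buffer_interval_s: float = 60.0) -> str:
    """Report which limit flushes the buffer and the resulting file size."""
    mb_per_interval = rate_mb_per_s * buffer_interval_s
    if mb_per_interval >= buffer_size_mb:
        seconds = buffer_size_mb / rate_mb_per_s
        return f"size trigger: ~{buffer_size_mb:.0f} MB file every ~{seconds:.0f} s"
    return f"interval trigger: ~{mb_per_interval:.0f} MB file every {buffer_interval_s:.0f} s"

# The question's pipeline accumulates 300-500 MB in about a minute,
# i.e. roughly 5-8 MB/s, so the size trigger fires first and files
# shrink to ~128 MB, inside the recommended range.
print(delivery_trigger(5.0))  # size trigger: ~128 MB file every ~26 s
print(delivery_trigger(1.0))  # interval trigger: ~60 MB file every 60 s
```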