Exam DEA-C01
Question 92

A company has developed several AWS Glue extract, transform, and load (ETL) jobs to validate and transform data from Amazon S3. The ETL jobs load the data into Amazon RDS for MySQL in batches once every day. The ETL jobs use a DynamicFrame to read the S3 data.

The ETL jobs currently process all the data that is in the S3 bucket. However, the company wants the jobs to process only the daily incremental data.

Which solution will meet this requirement with the LEAST coding effort?

    Correct Answer: B

    Enabling job bookmarks for the ETL jobs is the way to process only the daily incremental data with the least coding effort. AWS Glue job bookmarks persist state about the data processed in previous runs, so each new run skips that data and handles only what has arrived since. The feature is built for incremental processing and needs little or no additional code, unlike the more complex setups in the other options.
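    As a rough sketch of what "enabling job bookmarks" can look like in practice, the snippet below starts a run with the `--job-bookmark-option` job argument set to `job-bookmark-enable`. The helper names `bookmark_arguments` and `start_daily_run` are made up for illustration, and the client is assumed to be a `boto3.client("glue")` instance passed in by the caller; the same setting can also be chosen in the job's configuration instead of per run.

```python
from typing import Any, Dict, Optional

# Glue job argument that controls bookmarks, and its "enable" value.
BOOKMARK_KEY = "--job-bookmark-option"
BOOKMARK_ENABLE = "job-bookmark-enable"


def bookmark_arguments(extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return job arguments with bookmarks enabled, merged over any extras.

    Hypothetical helper: it only builds the argument dictionary.
    """
    args = dict(extra or {})
    args[BOOKMARK_KEY] = BOOKMARK_ENABLE
    return args


def start_daily_run(glue_client: Any, job_name: str) -> str:
    """Start one job run with bookmarks enabled and return its run ID.

    glue_client is assumed to be boto3.client("glue"); it is passed in so the
    function can be exercised without AWS credentials.
    """
    response = glue_client.start_job_run(
        JobName=job_name,
        Arguments=bookmark_arguments(),
    )
    return response["JobRunId"]
```

    Passing the client in keeps the sketch free of credentials and region settings, which depend on the environment.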

Discussion
tgv (Option: B)

AWS Glue job bookmarks are designed to handle incremental data processing by automatically tracking the state of previously processed data between runs.

bakarys (Option: B)

The solution that will meet this requirement with the least coding effort is Option B: Enable job bookmarks for the ETL jobs to update the state after a run to keep track of previously processed data. AWS Glue job bookmarks help ETL jobs keep track of data that has already been processed during previous runs. By enabling job bookmarks, the ETL jobs can skip the processed data and process only the new, incremental data. This feature is designed specifically for this use case and requires minimal coding effort.

Options A, C, and D would require additional coding and operational effort. Option A would require creating a new ETL job and managing a DynamoDB table. Option C would involve setting up job metrics and CloudWatch, which doesn't directly address processing incremental data. Option D would involve deleting data from S3 after processing, which might not be desirable if the original data needs to be retained.

Therefore, Option B is the most suitable solution.
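On the script side, the "minimal coding effort" usually amounts to three things that Glue-generated scripts tend to include already: `job.init(...)`, a `transformation_ctx` on the DynamicFrame source, and `job.commit()` at the end. The sketch below shows that shape; the function names, the S3 path, the JSON format, and the `read_s3_source` context name are placeholders, not details from the question.

```python
import sys


def read_incremental(glue_context, s3_path, data_format="json"):
    """Read S3 data into a DynamicFrame.

    transformation_ctx names the bookmark state for this source; with
    bookmarks enabled, Glue uses it to skip objects handled by earlier runs.
    """
    return glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={"paths": [s3_path]},
        format=data_format,
        transformation_ctx="read_s3_source",  # placeholder name
    )


def main():
    # Glue-only imports are kept here so the helper above can be used
    # (and tested) outside a Glue environment.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)  # loads the existing bookmark state

    frame = read_incremental(glue_context, "s3://example-bucket/daily/")
    # ... existing validation, transformation, and load into RDS for MySQL ...

    job.commit()  # persists the bookmark so the next run starts after this data
```

In an actual Glue script, `main()` would be called at the bottom of the file. If `job.commit()` is never reached, the bookmark is not advanced, so the next run would pick up the same objects again.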

andrologin (Option: B)

AWS Glue job bookmarks record where data processing last stopped, which is what makes incremental processing possible.

HunkyBunky (Option: B)

B: bookmarks are the key.