DEA-C01 Exam - Question 84


A company uses Amazon EMR as an extract, transform, and load (ETL) pipeline to transform data that comes from multiple sources. A data engineer must orchestrate the pipeline to maximize performance.

Which AWS service will meet this requirement MOST cost effectively?

Correct Answer: C

AWS Step Functions is the best fit for orchestrating an ETL pipeline that runs on Amazon EMR. It has a native service integration with EMR, so a state machine can create a cluster, submit steps, wait for them to finish, and terminate the cluster, and it can coordinate many other AWS services in the same workflow. Because it is serverless, there is no orchestration environment to keep running, which makes it a cost-effective choice for this scenario. AWS Glue Workflows, while powerful, orchestrate AWS Glue crawlers, jobs, and triggers and are not designed to manage an EMR-based ETL pipeline.
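To make the orchestration idea concrete, here is a minimal sketch of a Step Functions state machine definition (Amazon States Language) built as a Python dictionary. It assumes a simple three-state flow: create an EMR cluster, run one Spark step, then terminate the cluster. The `elasticmapreduce` resource ARNs are the Step Functions EMR service integrations; the function name, cluster name, release label, instance types, role names, and script URI are placeholder assumptions, not values from the question.

```python
import json


def build_emr_etl_definition(script_uri: str) -> dict:
    """Build a hypothetical state machine: create cluster -> run step -> terminate."""
    return {
        "Comment": "Illustrative EMR ETL pipeline (all names are placeholders)",
        "StartAt": "CreateCluster",
        "States": {
            "CreateCluster": {
                "Type": "Task",
                # .sync makes Step Functions wait until the cluster is ready.
                "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
                "Parameters": {
                    "Name": "etl-cluster",  # assumed name
                    "ReleaseLabel": "emr-6.15.0",  # assumed release
                    "Applications": [{"Name": "Spark"}],
                    "ServiceRole": "EMR_DefaultRole",  # assumed role
                    "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed role
                    "Instances": {
                        "KeepJobFlowAliveWhenNoSteps": True,
                        "InstanceGroups": [
                            {
                                "InstanceRole": "MASTER",
                                "InstanceType": "m5.xlarge",
                                "InstanceCount": 1,
                            },
                            {
                                "InstanceRole": "CORE",
                                "InstanceType": "m5.xlarge",
                                "InstanceCount": 2,
                            },
                        ],
                    },
                },
                # Keep the cluster details so later states can read ClusterId.
                "ResultPath": "$.cluster",
                "Next": "RunTransform",
            },
            "RunTransform": {
                "Type": "Task",
                # .sync waits for the EMR step to complete before moving on.
                "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
                "Parameters": {
                    "ClusterId.$": "$.cluster.ClusterId",
                    "Step": {
                        "Name": "transform",
                        "ActionOnFailure": "CONTINUE",
                        "HadoopJarStep": {
                            "Jar": "command-runner.jar",
                            "Args": [
                                "spark-submit",
                                "--deploy-mode",
                                "cluster",
                                script_uri,
                            ],
                        },
                    },
                },
                "ResultPath": "$.step",
                # On any error, still terminate the cluster to avoid idle cost.
                "Catch": [
                    {
                        "ErrorEquals": ["States.ALL"],
                        "ResultPath": "$.error",
                        "Next": "TerminateCluster",
                    }
                ],
                "Next": "TerminateCluster",
            },
            "TerminateCluster": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:terminateCluster",
                "Parameters": {"ClusterId.$": "$.cluster.ClusterId"},
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    definition = build_emr_etl_definition("s3://example-bucket/scripts/transform.py")
    print(json.dumps(definition, indent=2))
```

The resulting JSON could be supplied as the definition when creating a state machine. Terminating the cluster in both the success and failure paths is one way to keep a transient EMR cluster from running, and being billed, after the pipeline ends.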

Discussion

6 comments
artworkad (Option: C)
Jun 14, 2024

Glue Workflows is for Glue job orchestration. C is for orchestration with different AWS services.

tgv (Option: C)
Jun 15, 2024

While AWS Glue Workflows are excellent for orchestrating Glue-specific ETL tasks, AWS Step Functions is more suitable for orchestrating an Amazon EMR-based ETL pipeline due to its greater flexibility, broader integration capabilities, and effective cost management. Therefore, the correct choice remains C.

HunkyBunky (Option: C)
Jun 20, 2024

C - because AWS Glue can be used only for Glue-based ETL jobs.

bakarys (Option: D)
Jul 2, 2024

The most cost-effective AWS service for orchestrating an ETL pipeline that maximizes performance is D. AWS Glue Workflows. AWS Glue is a fully managed ETL service that makes it easy to move data between your data stores. AWS Glue simplifies and automates the difficult and time-consuming tasks of data discovery, conversion mapping, and job scheduling. AWS Glue Workflows allows you to orchestrate complex ETL jobs involving multiple crawlers, jobs, and triggers. While the other services mentioned (Amazon EventBridge, Amazon MWAA, and AWS Step Functions) can be used for workflow orchestration, they are not specifically designed for ETL workloads and may not be as cost-effective for this use case. AWS Glue is designed for ETL workloads, and its workflows feature is specifically designed for orchestrating ETL jobs, making it the most suitable and cost-effective choice.

LR2023 (Option: B)
Jul 16, 2024

https://aws.amazon.com/blogs/big-data/build-a-concurrent-data-orchestration-pipeline-using-amazon-emr-and-apache-livy/

andrologin (Option: C)
Jul 16, 2024

This is EMR, not Glue Workflows, hence Step Functions. EventBridge is best for event-driven architecture.