DEA-C01 Exam Questions

DEA-C01 Exam - Question 39


A company is migrating its database servers from Amazon EC2 instances that run Microsoft SQL Server to Amazon RDS for Microsoft SQL Server DB instances. The company's analytics team must export large data elements every day until the migration is complete. The data elements are the result of SQL joins across multiple tables. The data must be in Apache Parquet format. The analytics team must store the data in Amazon S3.

Which solution will meet these requirements in the MOST operationally efficient way?

Correct Answer: C

The most operationally efficient solution involves leveraging AWS managed services to minimize custom code and manual interventions. Creating a view in the EC2 instance-based SQL Server databases simplifies the extraction process. Using an AWS Glue crawler to read the view automates the discovery and cataloging of the data. An AWS Glue job is suitable for performing the ETL process, including data transformation to Parquet format and transferring it to an S3 bucket. Scheduling the AWS Glue job ensures regular data export without manual intervention, making this solution efficient and scalable.
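The crawl-then-transform flow described above can be sketched as a Glue job script. This is a minimal illustration, not the exam's reference implementation: the catalog database (`sqlserver_db`), table (`daily_export_view`), bucket, and prefix are assumed names, and the `awsglue`/`pyspark` imports only resolve inside a Glue job runtime.

```python
def s3_output_path(bucket: str, prefix: str, run_date: str) -> str:
    """Build a date-partitioned S3 path for the daily Parquet export."""
    return f"s3://{bucket}/{prefix}/export_date={run_date}/"

def main() -> None:
    # These imports resolve only inside an AWS Glue job runtime.
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext())

    # Read the view that the crawler cataloged from the SQL Server source.
    frame = glue_context.create_dynamic_frame.from_catalog(
        database="sqlserver_db",         # illustrative catalog database name
        table_name="daily_export_view",  # the crawled view
    )

    # Convert to Parquet and land the rows in S3 in a single write.
    glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="s3",
        connection_options={
            "path": s3_output_path("analytics-exports", "daily", "2024-06-01")
        },
        format="parquet",
    )

if __name__ == "__main__":
    main()
```

The date-partitioned path keeps each daily export separate in S3, which also plays well with later Athena or Glue catalog queries.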

Discussion

7 comments
Sign in to comment
Christina666 · Option: C
Apr 13, 2024

- Leveraging SQL views: creating a view on the source database simplifies the data extraction process and keeps your SQL logic centralized.
- Glue crawler efficiency: using a Glue crawler to automatically discover and catalog the view's metadata reduces manual setup.
- Glue job for ETL: a dedicated Glue job is well suited for the data transformation (to Parquet) and loading into S3. Glue jobs offer built-in scheduling capabilities.
- Operational efficiency: this approach minimizes custom code and leverages native AWS services for data movement and cataloging.
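The built-in scheduling mentioned here is a Glue scheduled trigger; one hedged way to define it is via boto3, as below. The job and trigger names are illustrative, and the `create_trigger` call needs real AWS credentials to succeed.

```python
def daily_trigger_definition(job_name: str,
                             schedule: str = "cron(0 5 * * ? *)") -> dict:
    """Arguments for glue.create_trigger: run job_name once per day."""
    return {
        "Name": f"{job_name}-daily",
        "Type": "SCHEDULED",
        "Schedule": schedule,  # 05:00 UTC every day
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }

if __name__ == "__main__":
    import boto3  # requires AWS credentials when actually run
    glue = boto3.client("glue")
    glue.create_trigger(**daily_trigger_definition("export-view-to-parquet"))
```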

taka5094 · Option: C
Mar 19, 2024

Choice A is almost the same approach, but it doesn't use the AWS Glue crawler, so you have to manage the view's metadata manually.

GiorgioGss · Option: C
Mar 19, 2024

Just because it decouples the whole architecture, I will go with C.

evntdrvn76
Feb 3, 2024

A. Create a view in the EC2 instance-based SQL Server databases that contains the required data elements. Create an AWS Glue job that selects the data directly from the view and transfers the data in Parquet format to an S3 bucket. Schedule the AWS Glue job to run every day. This solution is operationally efficient for exporting data in the required format.

rralucard_ · Option: A
Feb 4, 2024

Option A (Creating a view in the EC2 instance-based SQL Server databases and creating an AWS Glue job that selects data from the view, transfers it in Parquet format to S3, and schedules the job to run every day) seems to be the most operationally efficient solution. It leverages AWS Glue’s ETL capabilities for direct data extraction and transformation, minimizes manual steps, and effectively automates the process.

Felix_G
Mar 2, 2024

Option C seems to be the most operationally efficient: It leverages Glue for both schema discovery (via the crawler) and data transfer (via the Glue job). The Glue job can directly handle the Parquet format conversion. Scheduling the Glue job ensures regular data export without manual intervention.

helpaws
Mar 16, 2024

you're right: https://aws.amazon.com/blogs/big-data/extracting-multidimensional-data-from-microsoft-sql-server-analysis-services-using-aws-glue/

taka5094
Mar 19, 2024

Is this right? https://aws.amazon.com/jp/blogs/big-data/extracting-multidimensional-data-from-microsoft-sql-server-analysis-services-using-aws-glue/

bakarys · Option: A
Jul 1, 2024

Option A involves creating a view in the EC2 instance-based SQL Server databases that contains the required data elements. An AWS Glue job is then created to select the data directly from the view and transfer the data in Parquet format to an S3 bucket. This job is scheduled to run every day. This approach is operationally efficient as it leverages managed services (AWS Glue) and does not require additional transformation steps.

Option D involves creating an AWS Lambda function that queries the EC2 instance-based databases using JDBC. The Lambda function is configured to retrieve the required data, transform the data into Parquet format, and transfer the data into an S3 bucket. This approach could work, but managing and scheduling Lambda functions could add operational overhead compared to using managed services like AWS Glue.
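To make the overhead comparison concrete, here is a hedged sketch of what the option D Lambda would have to hand-roll: driver packaging, Parquet conversion, and the S3 upload. The view name, event keys, and bucket are illustrative, and `pyodbc`/`pyarrow` would need to be shipped as Lambda layers; none of this custom code is needed in the Glue-based options.

```python
def object_key(prefix: str, run_date: str) -> str:
    """S3 key for one day's Parquet export."""
    return f"{prefix}/export_{run_date}.parquet"

def handler(event, context):
    # All of the plumbing below is what the Glue-based options get for free.
    import datetime
    import io

    import boto3
    import pyarrow as pa
    import pyarrow.parquet as pq
    import pyodbc  # must be packaged as a Lambda layer

    conn = pyodbc.connect(event["connection_string"])
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM dbo.daily_export_view")  # illustrative view
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()

    # Hand-rolled Parquet conversion, column by column.
    table = pa.table({name: [row[i] for row in rows]
                      for i, name in enumerate(columns)})
    buffer = io.BytesIO()
    pq.write_table(table, buffer)

    boto3.client("s3").put_object(
        Bucket=event["bucket"],
        Key=object_key("daily", datetime.date.today().isoformat()),
        Body=buffer.getvalue(),
    )
```

On top of this code, you would still own the EventBridge schedule, the Lambda timeout and memory sizing for "large data elements", and layer maintenance, which is the operational overhead the comment refers to.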