Question 6 of 120

A company created an extract, transform, and load (ETL) data pipeline in AWS Glue. A data engineer must crawl a table that is in Microsoft SQL Server. The data engineer needs to extract, transform, and load the output of the crawl to an Amazon S3 bucket. The data engineer also must orchestrate the data pipeline.

Which AWS service or feature will meet these requirements MOST cost-effectively?

    Correct Answer: B

    AWS Glue workflows orchestrate the crawlers, jobs, and triggers that make up an ETL pipeline in AWS Glue. A Glue crawler can reach the Microsoft SQL Server table through a JDBC connection, and a Glue job can then transform the crawled data and load it into Amazon S3. Because the pipeline already runs in AWS Glue, workflows are the most cost-effective choice: they are a native feature that orchestrates the ETL steps without extra development, a separate orchestration service, or third-party tools.
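
    As a rough sketch of the orchestration described above, the Python below builds the request parameters for a workflow in which an on-demand trigger starts a crawler and a conditional trigger starts an ETL job once the crawl succeeds. The workflow, crawler, and job names are hypothetical, and the crawler and job are assumed to exist already. The parameters are built as plain dictionaries so they can be inspected without AWS credentials; `apply_workflow` assumes it is given a boto3 Glue client.

```python
def build_workflow_requests(workflow, crawler, job):
    """Build kwargs for Glue create_workflow and create_trigger calls.

    All names are hypothetical placeholders supplied by the caller.
    """
    create_workflow = {"Name": workflow}

    # Entry point of the workflow: started manually or by a start_workflow_run call.
    start_trigger = {
        "Name": f"{workflow}-start",
        "WorkflowName": workflow,
        "Type": "ON_DEMAND",
        "Actions": [{"CrawlerName": crawler}],
    }

    # Fires only after the crawler finishes successfully, then runs the ETL job.
    job_trigger = {
        "Name": f"{workflow}-after-crawl",
        "WorkflowName": workflow,
        "Type": "CONDITIONAL",
        "Predicate": {
            "Conditions": [
                {
                    "LogicalOperator": "EQUALS",
                    "CrawlerName": crawler,
                    "CrawlState": "SUCCEEDED",
                }
            ]
        },
        "Actions": [{"JobName": job}],
        "StartOnCreation": True,
    }
    return create_workflow, [start_trigger, job_trigger]


def apply_workflow(glue_client, create_workflow, triggers):
    """Send the requests; glue_client is assumed to be boto3.client("glue")."""
    glue_client.create_workflow(**create_workflow)
    for trigger in triggers:
        glue_client.create_trigger(**trigger)
```

    With a real client, `apply_workflow(boto3.client("glue"), *build_workflow_requests("sqlserver-to-s3", "sqlserver-crawler", "sqlserver-etl"))` would create the workflow. Those three names are examples only.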

Question 7 of 120

A financial services company stores financial data in Amazon Redshift. A data engineer wants to run real-time queries on the financial data to support a web-based trading application. The data engineer wants to run the queries from within the trading application.

Which solution will meet these requirements with the LEAST operational overhead?

    Correct Answer: B

    To run real-time queries on financial data stored in Amazon Redshift with the least operational overhead, the best solution is to use the Amazon Redshift Data API. The trading application calls the API over HTTPS to run SQL statements, so there are no database drivers, persistent connections, or connection pools to manage, unlike approaches built on WebSocket or JDBC connections. This keeps operational overhead low while giving the application a scalable way to query the data.
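
    A minimal sketch of how an application might call the Data API from Python follows. It assumes `client` is a boto3 `redshift-data` client and that the statement is a SELECT that returns a result set. The Data API is asynchronous, so the helper submits the statement, polls its status, and then fetches the rows. The helper name `run_query` and its polling defaults are illustrative choices, and pagination of large results is omitted.

```python
import time


def run_query(client, sql, *, poll_seconds=0.5, timeout_seconds=30.0, **target):
    """Run one SQL statement through the Redshift Data API and return rows as dicts.

    `target` carries the connection parameters, for example ClusterIdentifier,
    Database, and SecretArn for a provisioned cluster.
    """
    statement_id = client.execute_statement(Sql=sql, **target)["Id"]

    deadline = time.monotonic() + timeout_seconds
    while True:
        description = client.describe_statement(Id=statement_id)
        status = description["Status"]
        if status == "FINISHED":
            break
        if status in ("FAILED", "ABORTED"):
            raise RuntimeError(f"statement {status}: {description.get('Error')}")
        if time.monotonic() > deadline:
            raise TimeoutError(f"statement {statement_id} still {status}")
        time.sleep(poll_seconds)

    # Only the first page of results is read in this sketch.
    result = client.get_statement_result(Id=statement_id)
    names = [column["name"] for column in result["ColumnMetadata"]]
    rows = []
    for record in result["Records"]:
        # Each field is a one-key dict such as {"stringValue": "x"} or {"isNull": True}.
        values = [
            None if field.get("isNull") else next(iter(field.values()))
            for field in record
        ]
        rows.append(dict(zip(names, values)))
    return rows
```

    Polling adds some latency to every query, so the interval is a trade-off between responsiveness and the number of API calls.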

Question 8 of 120

A company uses Amazon Athena for one-time queries against data that is in Amazon S3. The company has several use cases. The company must implement permission controls to separate query processes and access to query history among users, teams, and applications that are in the same AWS account.

Which solution will meet these requirements?

    Correct Answer: B

    Creating an Athena workgroup for each use case isolates workloads from one another: each workgroup has its own query executions and query history, and access to it can be granted separately. Applying tags to the workgroups and writing IAM policies that use those tags lets you manage permissions by tag instead of listing each workgroup individually. This solution meets the requirement for permission controls that separate users, teams, and applications within the same AWS account.
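
    To illustrate the tag-based approach, the sketch below builds an IAM policy document that allows query actions only on workgroups that carry a given tag. The tag key and value are hypothetical, and the action list is an illustrative subset. A working policy would also need permissions for the underlying S3 data and the data catalog, which are left out here.

```python
import json


def workgroup_policy(tag_key, tag_value, region="*", account_id="*"):
    """Return an IAM policy dict limiting Athena query actions to tagged workgroups."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "QueryTaggedWorkgroups",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:StopQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:ListQueryExecutions",
                    "athena:GetWorkGroup",
                ],
                "Resource": f"arn:aws:athena:{region}:{account_id}:workgroup/*",
                # Only workgroups whose tag matches are covered by this statement.
                "Condition": {
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": tag_value}
                },
            }
        ],
    }


if __name__ == "__main__":
    # "use-case" and "reporting" are example tag values, not from the question.
    print(json.dumps(workgroup_policy("use-case", "reporting"), indent=2))
```

    Attaching one such policy per team, each with a different tag value, keeps each team's queries and query history confined to its own workgroups.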

Question 9 of 120

A data engineer needs to schedule a workflow that runs a set of AWS Glue jobs every day. The data engineer does not require the Glue jobs to run or finish at a specific time.

Which solution will run the Glue jobs in the MOST cost-effective way?

    Correct Answer: A

    Choosing the FLEX execution class in the Glue job properties runs the jobs on spare AWS capacity at a lower price than the STANDARD execution class. The trade-off is that start times and run times can vary. Because the data engineer does not require the jobs to run or finish at a specific time, FLEX is the most cost-effective solution.
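
    As a sketch, the execution class can be set when a job is defined or when a run is started. The helpers below build the corresponding boto3 Glue request parameters. The job name, role ARN, script location, Glue version, and worker settings are placeholder assumptions.

```python
def flex_job_definition(name, role_arn, script_s3_uri, workers=2):
    """Build kwargs for Glue create_job with the FLEX execution class."""
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {
            "Name": "glueetl",  # Spark ETL job
            "ScriptLocation": script_s3_uri,
            "PythonVersion": "3",
        },
        "GlueVersion": "4.0",
        "WorkerType": "G.1X",
        "NumberOfWorkers": workers,
        "ExecutionClass": "FLEX",
    }


def flex_job_run(name):
    """Build kwargs for Glue start_job_run that request FLEX for a single run."""
    return {"JobName": name, "ExecutionClass": "FLEX"}
```

    With a boto3 Glue client, `glue.create_job(**flex_job_definition(...))` would register the job, and a scheduled trigger could then start it each day.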

Question 10 of 120

A data engineer needs to create an AWS Lambda function that converts the format of data from .csv to Apache Parquet. The Lambda function must run only if a user uploads a .csv file to an Amazon S3 bucket.

Which solution will meet these requirements with the LEAST operational overhead?

    Correct Answer: A

    To run the Lambda function only for .csv uploads with the least operational overhead, create an S3 event notification for object-creation events (s3:ObjectCreated:*) and set the Lambda function as the destination. Add a filter rule so that the notification is generated only when the object key ends with the suffix .csv. S3 then invokes the Lambda function directly whenever a .csv file is uploaded, with no intermediate service such as Amazon SNS to configure or maintain.
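
    The notification described above can be sketched as follows. The first helper builds the `NotificationConfiguration` argument for the boto3 S3 call `put_bucket_notification_configuration`. The second shows how a handler might read the bucket and key from the event it receives. The function ARN is a placeholder, the helper names are illustrative, and the Parquet conversion itself is not shown.

```python
from urllib.parse import unquote_plus


def csv_notification(lambda_arn):
    """Build an S3 notification configuration that targets .csv uploads only."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": ".csv"}]}
                },
            }
        ]
    }


def uploaded_objects(event):
    """Return (bucket, key) pairs from an S3 event; keys arrive URL-encoded."""
    return [
        (record["s3"]["bucket"]["name"], unquote_plus(record["s3"]["object"]["key"]))
        for record in event.get("Records", [])
    ]


def handler(event, context):
    """Hypothetical Lambda entry point; the conversion step is left as a stub."""
    for bucket, key in uploaded_objects(event):
        print(f"would convert s3://{bucket}/{key} to Parquet")
```

    The Lambda function also needs a resource-based permission that allows S3 to invoke it before the notification can be saved.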