AWS Glue workflows are designed to orchestrate extract, transform, and load (ETL) activities within AWS by chaining crawlers, jobs, and triggers into a single managed pipeline. AWS Glue also connects to data sources such as Microsoft SQL Server through its built-in JDBC connections. This makes it the most cost-effective choice: it orchestrates the ETL jobs natively, with no extra development and no third-party tools.
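As a rough illustration of that orchestration, the sketch below builds the requests for a two-step workflow: an on-demand trigger starts an extract job, and a conditional trigger starts a transform job once the extract job succeeds. The workflow and job names are hypothetical, and the jobs are assumed to exist already; the `glue` argument is expected to be a boto3 Glue client.

```python
def build_workflow_plan(workflow_name, extract_job, transform_job):
    """Return (workflow_request, trigger_requests) for a two-step workflow."""
    workflow = {"Name": workflow_name}
    start_trigger = {
        "Name": f"{workflow_name}-start",
        "WorkflowName": workflow_name,
        "Type": "ON_DEMAND",  # started manually or by an API call
        "Actions": [{"JobName": extract_job}],
    }
    chain_trigger = {
        "Name": f"{workflow_name}-after-extract",
        "WorkflowName": workflow_name,
        "Type": "CONDITIONAL",  # fires when the predicate is satisfied
        "StartOnCreation": True,
        "Predicate": {
            "Conditions": [
                {
                    "LogicalOperator": "EQUALS",
                    "JobName": extract_job,
                    "State": "SUCCEEDED",
                }
            ]
        },
        "Actions": [{"JobName": transform_job}],
    }
    return workflow, [start_trigger, chain_trigger]


def create_workflow(glue, plan):
    """Send the plan to AWS Glue. `glue` is assumed to be boto3.client("glue")."""
    workflow, triggers = plan
    glue.create_workflow(**workflow)
    for trigger in triggers:
        glue.create_trigger(**trigger)


# Hypothetical names, for illustration only.
plan = build_workflow_plan("sqlserver-etl", "extract-from-sqlserver", "transform-to-parquet")
```

With credentials configured, `create_workflow(boto3.client("glue"), plan)` would submit the requests; building the plan separately keeps the structure easy to inspect before anything is created.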
To run real-time queries on financial data stored in Amazon Redshift with the least operational overhead, use the Amazon Redshift Data API. The trading application submits SQL statements directly over HTTPS, so there are no persistent connections or drivers to set up and maintain, as there would be with WebSocket or JDBC. This keeps operational overhead low and gives the application an efficient, scalable way to query the data in real time.
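The Data API is asynchronous: a statement is submitted, polled until it finishes, and then its result is fetched. A minimal sketch of that flow follows, assuming a provisioned cluster and credentials stored in AWS Secrets Manager. The function name, the polling interval, and the timeout are illustrative choices, and `client` is expected to be a boto3 `redshift-data` client.

```python
import time


def run_query(client, sql, *, cluster_id, database, secret_arn,
              poll_seconds=0.5, timeout_seconds=30.0):
    """Run one SQL statement through the Redshift Data API and return rows as dicts."""
    response = client.execute_statement(
        ClusterIdentifier=cluster_id,
        Database=database,
        SecretArn=secret_arn,
        Sql=sql,
    )
    statement_id = response["Id"]

    # Poll until the statement reaches a terminal state or the timeout expires.
    deadline = time.monotonic() + timeout_seconds
    while True:
        description = client.describe_statement(Id=statement_id)
        status = description["Status"]
        if status == "FINISHED":
            break
        if status in ("FAILED", "ABORTED"):
            raise RuntimeError(
                f"statement {statement_id} ended as {status}: "
                f"{description.get('Error', 'no error message')}"
            )
        if time.monotonic() >= deadline:
            raise TimeoutError(f"statement {statement_id} still {status}")
        time.sleep(poll_seconds)

    # Only the first page of results is read here; larger results
    # would need to follow NextToken.
    result = client.get_statement_result(Id=statement_id)
    columns = [column["name"] for column in result["ColumnMetadata"]]
    rows = []
    for record in result["Records"]:
        # Each field is a one-key dict such as {"stringValue": "ABC"} or {"isNull": True}.
        values = [
            None if field.get("isNull") else next(iter(field.values()))
            for field in record
        ]
        rows.append(dict(zip(columns, values)))
    return rows
```

A call might look like `run_query(boto3.client("redshift-data"), "SELECT ...", cluster_id=..., database=..., secret_arn=...)`, with the identifiers supplied by the application's own configuration.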
Creating a separate Athena workgroup for each use case isolates workloads, users, and permissions. Query execution and access to query history are scoped to the workgroup, so each use case stays separated from the others. Tagging the workgroups and writing IAM policies that reference those tags keeps permission management organized and simple. This solution meets the requirement for permission controls across users, teams, and applications within the same AWS account.
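One way such a tag-based policy could look is sketched below. It allows common query actions only on workgroups that carry a given tag. The tag key `team`, the tag value, and the selection of actions are assumptions made for illustration; a real policy would also need the Amazon S3 and Data Catalog permissions that the queries rely on.

```python
import json


def workgroup_policy(tag_key, tag_value, region="*", account_id="*"):
    """Build an IAM policy document scoped to Athena workgroups with a matching tag."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "QueryTaggedWorkgroups",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:StopQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:GetWorkGroup",
                ],
                "Resource": f"arn:aws:athena:{region}:{account_id}:workgroup/*",
                # Only workgroups tagged tag_key=tag_value match this statement.
                "Condition": {
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": tag_value}
                },
            }
        ],
    }


# Hypothetical tag, for illustration only.
policy_json = json.dumps(workgroup_policy("team", "risk-analytics"), indent=2)
```

Attaching one such policy per team means that adding a new workgroup only requires tagging it correctly, rather than editing the policy's resource list.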
Choosing the FLEX execution class in the Glue job properties runs the jobs on spare capacity within the AWS infrastructure at a lower price than the STANDARD execution class. Because the jobs run on spare capacity, their start times and runtimes can vary. Since the data engineer does not require the jobs to run or finish at a specific time, FLEX is the most cost-effective solution.
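The execution class can also be set per run through the API. The sketch below builds the request for a boto3 Glue client's `start_job_run` call with `ExecutionClass` set to `FLEX`; the helper name and the job name are hypothetical.

```python
def flex_job_run_request(job_name, arguments=None):
    """Build keyword arguments for glue.start_job_run using the FLEX execution class."""
    request = {"JobName": job_name, "ExecutionClass": "FLEX"}
    if arguments:
        # Job arguments are passed through unchanged, e.g. {"--source": "s3://..."}.
        request["Arguments"] = dict(arguments)
    return request


# Hypothetical job name, for illustration only.
request = flex_job_run_request("nightly-etl")
```

With credentials configured, `boto3.client("glue").start_job_run(**request)` would start the run.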
To have an AWS Lambda function process .csv files uploaded to an S3 bucket with the least operational overhead, create an S3 event notification for object-creation events (s3:ObjectCreated:*) and set the Lambda function as its destination. Add a filter rule so that the notification is generated only when the object key ends with the suffix .csv. With this setup, S3 invokes the Lambda function directly whenever a .csv file is uploaded, with no intermediate service such as Amazon SNS, which keeps operational complexity and overhead low.
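A minimal sketch of that notification configuration follows. The function ARN, bucket name, and configuration ID are hypothetical, and the `s3` argument is expected to be a boto3 S3 client. The sketch assumes the Lambda function already has a resource-based permission that lets S3 invoke it.

```python
def csv_notification_config(lambda_arn, suffix=".csv"):
    """Build an S3 notification configuration that targets a Lambda function."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": "process-csv-uploads",  # hypothetical identifier
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                # Only object keys ending with the suffix generate a notification.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


def attach_notification(s3, bucket, config):
    """Apply the configuration. This call replaces the bucket's existing notification configuration."""
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=config,
    )


# Hypothetical ARN, for illustration only.
config = csv_notification_config(
    "arn:aws:lambda:us-east-1:123456789012:function:process-csv"
)
```

With credentials configured, `attach_notification(boto3.client("s3"), "example-upload-bucket", config)` would apply it to a bucket of that name.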