An online retail company has an application that runs on Amazon EC2 instances that are in a VPC. The company wants to collect flow logs for the VPC and analyze network traffic.
Which solution will meet these requirements MOST cost-effectively?
Publishing flow logs to Amazon S3 in Apache Parquet format and using Amazon Athena for analytics is the most cost-effective solution. Apache Parquet is a compressed, columnar file format, so the same logs take less space in S3 than plain text. Athena bills by the amount of data scanned, and with a columnar layout a query reads only the columns it references, so queries are both cheaper and faster. Amazon S3 keeps storage costs low because you pay only for what you store, with no capacity to provision.
Publishing flow logs to Amazon S3 in Apache Parquet format and using Amazon Athena for analytics (D) is the most cost-effective solution. This approach minimizes storage costs due to the efficient compression of Parquet, and optimizes query performance and cost in Athena due to the reduced data size and optimized columnar storage.
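To make the cost argument concrete, here is a rough back-of-the-envelope sketch. Every number in it is an assumption for illustration only: the $5 per TB scanned price (check the Athena pricing page for your Region), the 0.3 compression ratio, and the 0.2 fraction of columns a query touches are hypothetical, not measured values.

```python
def athena_scan_cost_usd(bytes_scanned, usd_per_tb=5.0):
    """Estimate Athena query cost; usd_per_tb is an assumed price."""
    return bytes_scanned / 1e12 * usd_per_tb


def parquet_bytes_scanned(text_bytes, compression_ratio, column_fraction):
    """Estimate bytes scanned once the same logs are stored as Parquet.

    compression_ratio: Parquet size relative to plain text (hypothetical).
    column_fraction: share of the columns the query actually reads (hypothetical).
    """
    return text_bytes * compression_ratio * column_fraction


text_bytes = 1e12  # 1 TB of plain-text flow logs (hypothetical volume)

text_cost = athena_scan_cost_usd(text_bytes)
parquet_cost = athena_scan_cost_usd(
    parquet_bytes_scanned(text_bytes, compression_ratio=0.3, column_fraction=0.2)
)

print(f"plain text: ${text_cost:.2f} per full scan")
print(f"Parquet:    ${parquet_cost:.2f} per comparable query")
```

The real savings depend on your traffic and queries, but the direction holds: fewer bytes stored and fewer bytes scanned both lower the bill.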
Flow Logs can be published to S3 in Parquet format: https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs-s3.html#flow-logs-s3-path
https://aws.amazon.com/about-aws/whats-new/2021/10/amazon-vpc-flow-logs-parquet-hive-prefixes-partitioned-files/
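The Parquet delivery described in the links above is requested when the flow log is created. A minimal sketch with boto3, assuming a hypothetical VPC ID and bucket ARN, and assuming the bucket policy already allows flow log delivery:

```python
def build_flow_log_request(vpc_id, bucket_arn):
    """Return keyword arguments for the boto3 EC2 create_flow_logs call."""
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",
        "TrafficType": "ALL",
        "LogDestinationType": "s3",
        "LogDestination": bucket_arn,
        "DestinationOptions": {
            "FileFormat": "parquet",           # deliver Parquet instead of plain text
            "HiveCompatiblePartitions": True,  # Hive-style prefixes for Athena partitions
            "PerHourPartition": True,          # one partition per hour of logs
        },
    }


def create_parquet_flow_log(vpc_id, bucket_arn):
    """Create the flow log; needs boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so the request builder runs without boto3

    ec2 = boto3.client("ec2")
    return ec2.create_flow_logs(**build_flow_log_request(vpc_id, bucket_arn))


# Hypothetical identifiers, for illustration only.
request = build_flow_log_request(
    "vpc-0123456789abcdef0", "arn:aws:s3:::example-flow-logs-bucket"
)
print(request["DestinationOptions"])
```

With `FileFormat` set to `parquet`, the files arrive in S3 already in Parquet, so no separate conversion job is needed before querying them in Athena.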
Flow logs can be published to S3, but option D says "in Parquet format", and they are not automatically converted into Parquet... https://aws.amazon.com/solutions/implementations/centralized-logging-with-opensearch/
Apache Parquet and S3 = most cost-effective solution