Exam SAA-C03
Question 603

A company recently migrated to the AWS Cloud. The company wants a serverless solution for large-scale parallel on-demand processing of a semistructured dataset. The data consists of logs, media files, sales transactions, and IoT sensor data that is stored in Amazon S3. The company wants the solution to process thousands of items in the dataset in parallel.

Which solution will meet these requirements with the MOST operational efficiency?

    Correct Answer: B

    The most operationally efficient solution for large-scale parallel on-demand processing of a semistructured dataset stored in Amazon S3 is to use the AWS Step Functions Map state in Distributed mode. Distributed mode can handle a higher concurrency level compared to Inline mode, allowing up to 10,000 parallel branches, which is suitable for processing thousands of items simultaneously. This serverless solution automatically scales and manages the infrastructure, ensuring efficient processing with minimal operational overhead.
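    As a rough illustration of the explanation above, here is a minimal sketch of a state machine definition with a Map state in Distributed mode that lists objects under an S3 prefix and invokes a Lambda function for each one. The bucket name, prefix, and function name are hypothetical placeholders, and the 1 to 10,000 guard on `max_concurrency` is an assumption added for this example rather than something the question specifies.

```python
import json


def build_distributed_map_definition(bucket, prefix, function_name, max_concurrency=1000):
    """Return an Amazon States Language definition (as a dict) with one Distributed Map state.

    All argument values are illustrative; nothing here is taken from the exam question.
    """
    # Guard chosen for this sketch: stay within the 10,000 parallel branches
    # that the explanation cites for Distributed mode.
    if not 1 <= max_concurrency <= 10_000:
        raise ValueError("max_concurrency must be between 1 and 10,000")

    return {
        "Comment": "Sketch: process S3 objects in parallel with a Distributed Map",
        "StartAt": "ProcessS3Objects",
        "States": {
            "ProcessS3Objects": {
                "Type": "Map",
                # ItemReader tells the Map state to iterate over objects in S3.
                "ItemReader": {
                    "Resource": "arn:aws:states:::s3:listObjectsV2",
                    "Parameters": {"Bucket": bucket, "Prefix": prefix},
                },
                "ItemProcessor": {
                    # Mode DISTRIBUTED is what separates this from an Inline Map.
                    "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"},
                    "StartAt": "ProcessItem",
                    "States": {
                        "ProcessItem": {
                            "Type": "Task",
                            "Resource": "arn:aws:states:::lambda:invoke",
                            "Parameters": {
                                "FunctionName": function_name,  # hypothetical function
                                "Payload.$": "$",
                            },
                            "End": True,
                        }
                    },
                },
                "MaxConcurrency": max_concurrency,
                "End": True,
            }
        },
    }


definition = build_distributed_map_definition(
    bucket="example-dataset-bucket",      # hypothetical bucket
    prefix="incoming/",                   # hypothetical prefix
    function_name="example-process-item", # hypothetical Lambda function
)
print(json.dumps(definition, indent=2))
```

    The printed JSON could be used as the definition of a state machine; each item then runs as its own child workflow execution, so there is no fleet of workers to provision or scale by hand.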

Discussion
Guru4Cloud (Option: B)

AWS Step Functions allows you to orchestrate and scale distributed processing using the Map state. The Map state can process items in a large dataset in parallel by distributing the work across multiple resources. Using the Map state in Distributed mode will automatically handle the parallel processing and scaling. Step Functions will add more workers to process the data as needed. Step Functions is serverless so there are no servers to manage. It will scale up and down automatically based on demand.

Lx016

A Map in Inline mode supports a concurrency of 40 parallel branches and execution history limits of 25,000 events, or approximately 6,500 state transitions, in a workflow. With Distributed mode, you can run at a concurrency of up to 10,000 parallel branches. So I believe that if the workload has to process thousands of items in parallel, Distributed mode is more appropriate.
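
To make the comparison concrete, here is a small sketch that picks a Map mode from the concurrency figures quoted above (40 for Inline, 10,000 for Distributed). The function name and constants are hypothetical helpers for illustration, and the sketch deliberately ignores the execution history limit, which can also rule out Inline mode.

```python
# Concurrency figures as quoted in the comment above.
INLINE_MAX_CONCURRENCY = 40
DISTRIBUTED_MAX_CONCURRENCY = 10_000


def choose_map_mode(required_concurrency: int) -> str:
    """Return the Map mode whose quoted concurrency limit covers the requirement."""
    if required_concurrency < 1:
        raise ValueError("required_concurrency must be at least 1")
    if required_concurrency <= INLINE_MAX_CONCURRENCY:
        return "INLINE"
    if required_concurrency <= DISTRIBUTED_MAX_CONCURRENCY:
        return "DISTRIBUTED"
    raise ValueError("requirement exceeds the quoted Distributed mode limit")


# "Thousands of items in parallel" is well past the Inline figure.
print(choose_map_mode(5_000))
```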

awsgeek75 (Option: B)

https://aws.amazon.com/blogs/aws/step-functions-distributed-map-a-serverless-solution-for-large-scale-parallel-data-processing/
https://docs.aws.amazon.com/step-functions/latest/dg/sample-dist-map-s3data-process.html

TariqKipkemei (Option: B)

The Distributed Map has been optimized for Amazon S3, helping you more easily iterate over objects in an S3 bucket. With Distributed mode, you can run at a concurrency of up to 10,000 parallel branches. https://aws.amazon.com/step-functions/faqs/#:~:text=A%20Map%20in%20Inline%20mode,up%20to%2010%2C000%20parallel%20branches.

taustin2 (Option: B)

With Step Functions, you can orchestrate large-scale parallel workloads to perform tasks, such as on-demand processing of semi-structured data. These parallel workloads let you concurrently process large-scale data sources stored in Amazon S3. https://docs.aws.amazon.com/step-functions/latest/dg/concepts-orchestrate-large-scale-parallel-workloads.html

Sugarbear_01

After going through the link, I confirmed that the answer is B.

Sandy1254 (Option: B)

https://docs.aws.amazon.com/step-functions/latest/dg/use-dist-map-orchestrate-large-scale-parallel-workloads.html

bogdannb (Option: C)

Using Step Functions would be overkill, from my point of view. I would use Glue; it’s serverless and purpose-built for this kind of use case.

Lin878 (Option: B)

Simple: use Lambda / Complex: use Step Functions

Sugarbear_01 (Option: B)

https://docs.aws.amazon.com/step-functions/latest/dg/concepts-orchestrate-large-scale-parallel-workloads.html

[Removed]

Large Scale + Parallel = Distributed Step Function https://docs.aws.amazon.com/step-functions/latest/dg/concepts-inline-vs-distributed-map.html