Exam DEA-C01
Question 27

A company wants to implement real-time analytics capabilities. The company wants to use Amazon Kinesis Data Streams and Amazon Redshift to ingest and process streaming data at the rate of several gigabytes per second. The company wants to derive near real-time insights by using existing business intelligence (BI) and analytics tools.

Which solution will meet these requirements with the LEAST operational overhead?

    Correct Answer: C

    To achieve near real-time analytics with the least operational overhead, the best solution is to create an external schema in Amazon Redshift that maps to the Amazon Kinesis Data Streams source, then create a materialized view over that schema with auto refresh enabled. Redshift reads from the stream directly and keeps the view up to date itself, so the data never has to pass through intermediate stages or temporary storage, which minimizes operational complexity.
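As a rough illustration of the two steps in that explanation, the sketch below builds the pair of Redshift SQL statements (external schema, then auto-refreshing materialized view) as strings. The helper name `streaming_ingestion_sql`, the schema, stream, and view names, and the role ARN are all hypothetical placeholders, and the selected columns are only one plausible choice; treat it as a sketch under those assumptions rather than a reference setup.

```python
def streaming_ingestion_sql(schema: str, stream: str, view: str, role_arn: str) -> list[str]:
    """Return the two statements for Redshift streaming ingestion from Kinesis.

    All identifiers passed in are illustrative; nothing here talks to AWS.
    """
    # Step 1: an external schema that maps to Kinesis Data Streams.
    create_schema = (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM KINESIS\n"
        f"IAM_ROLE '{role_arn}';"
    )
    # Step 2: a materialized view over one stream, refreshed automatically.
    # kinesis_data holds the raw record payload; JSON_PARSE assumes JSON records.
    create_view = (
        f"CREATE MATERIALIZED VIEW {view} AUTO REFRESH YES AS\n"
        f"SELECT approximate_arrival_timestamp,\n"
        f"       JSON_PARSE(kinesis_data) AS payload\n"
        f'FROM {schema}."{stream}";'
    )
    return [create_schema, create_view]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for stmt in streaming_ingestion_sql(
        schema="kds",
        stream="example-stream",
        view="example_stream_mv",
        role_arn="arn:aws:iam::123456789012:role/example-redshift-role",
    ):
        print(stmt, end="\n\n")
```

BI tools would then query the materialized view like any other Redshift relation, which is why this path needs no separate delivery or staging component.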

Discussion
fceb2c1 (Option: C)

C is correct (KDS -> Redshift). D is wrong because it has more operational overhead (KDS -> KDF -> S3 -> Redshift). See https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion-getting-started.html

helpaws (Option: C)

The key word here is near real-time. If it involves S3 and COPY, it's not going to be near real-time.

blackgamer (Option: C)

The answer is C. It can provide near real-time insights. Refer to the AWS article: https://aws.amazon.com/blogs/big-data/real-time-analytics-with-amazon-redshift-streaming-ingestion/

Christina666 (Option: C)

Using a materialized view with auto refresh directly on a Redshift external schema mapped to Kinesis Data Streams offers the most streamlined and efficient approach for near real-time insights using existing BI tools.

bakarys (Option: A)

Option A (using Kinesis Data Streams to stage data in Amazon S3 and loading it directly into Amazon Redshift) is the most straightforward and efficient approach. It minimizes operational overhead and ensures immediate availability of data for analysis. Options B and C introduce additional complexity and may not provide the same level of efficiency.

d8945a1 (Option: C)

Materialized views in Redshift with auto refresh are the best option for near real time.

certplan

1. Amazon Kinesis Data Firehose: It's designed to reliably load streaming data into data lakes and data stores with minimal configuration and management overhead. It automatically handles tasks like buffering, scaling, and delivering data to destinations such as Amazon S3 and Amazon Redshift.
2. Amazon S3 as a staging area: Storing data in Amazon S3 provides scalable, durable storage without any infrastructure to manage. It also integrates easily with other AWS services and existing BI and analytics tools.
3. Amazon Redshift: While Redshift requires some setup and management, loading data from Amazon S3 using the COPY command is a straightforward process. Once data is loaded into Redshift, existing BI and analytics tools can query it directly, enabling near real-time insights.
4. Minimal operational overhead: This solution minimizes operational overhead because many of the management tasks, such as scaling, buffering, and delivery of data, are handled by Amazon Kinesis Data Firehose. Additionally, using Amazon S3 as a staging area simplifies data storage and integration with other services.

certplan

By considering the characteristics and capabilities of each AWS service and approach, along with insights from AWS documentation, it becomes evident that option D offers the most streamlined and operationally efficient solution for the scenario described. This idea/concept is also straight out of the Amazon Solutions Architect course material.

certplan

Point: "Which solution will meet these requirements with the LEAST operational overhead?"
C.
- This approach involves creating an external schema in Amazon Redshift to map data from Kinesis Data Streams, which adds complexity compared to directly loading data from Amazon S3 using Amazon Kinesis Data Firehose.
- While materialized views with auto-refresh can provide near real-time insights, managing them and ensuring proper synchronization with the streaming data source may require more operational effort.
- AWS documentation for Amazon Redshift primarily focuses on traditional data loading methods and querying, with limited guidance on integrating with real-time data sources like Kinesis Data Streams.

GiorgioGss (Option: D)

I think D. It could be C but because of "LEAST operational overhead" I will go with D.

Aesthet

Both ChatGPT and I think D is correct (100%).

BartoszGolebiowski24

I think this is true. I could not find any source showing that Amazon Kinesis Data Streams can deliver data directly to S3 without Amazon Kinesis Data Firehose as a middle step. Kinesis Data Firehose is a near real-time service. Anyway, I think the answer is D because the other three options are no better.

BartoszGolebiowski24

However, after some investigation, I found that Amazon Redshift can ingest data directly from Amazon Kinesis Data Streams through streaming ingestion: https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion.html A materialized view with auto refresh enabled continuously ingests new data from the stream as it arrives, keeping the view up to date in near real time. So I think the correct answer is C. The COPY command, by contrast, loads from staged sources such as Amazon S3, which is how Kinesis Data Firehose uses it when delivering to Redshift. A direct COPY from a stream is not an option for this question.

LR2023

Thank you, I was leaning towards D, but this article helps.