DEA-C01 Exam - Question 27

Question

A company wants to implement real-time analytics capabilities. The company wants to use Amazon Kinesis Data Streams and Amazon Redshift to ingest and process streaming data at the rate of several gigabytes per second. The company wants to derive near real-time insights by using existing business intelligence (BI) and analytics tools.

Which solution will meet these requirements with the LEAST operational overhead?

Examice · Accepted Answer

To achieve near real-time analytics with the least operational overhead, the best solution is to create an external schema in Amazon Redshift that maps data from Amazon Kinesis Data Streams, then create a materialized view on this schema set to auto refresh. This method directly integrates with Kinesis Data Streams and uses Redshift's capabilities to keep the data updated without having to transfer it through multiple stages or temporary storage locations, minimizing operational complexity.
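
For readers who want to see what this looks like in practice, here is a minimal sketch that builds the two Redshift statements involved. The schema, stream, view, and IAM role names are hypothetical placeholders, not values from the question, and the selected columns are only one plausible choice. The statement shapes follow the AWS streaming ingestion guide, but check them against the current documentation before use.

```python
def streaming_ingestion_sql(schema, stream, view, iam_role_arn):
    """Build the two Redshift statements for streaming ingestion from Kinesis.

    All argument values are supplied by the caller; nothing here talks to AWS.
    """
    # Step 1: an external schema that maps Kinesis Data Streams into Redshift.
    create_schema = (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM KINESIS\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )
    # Step 2: a materialized view over the stream, refreshed automatically.
    create_view = (
        f"CREATE MATERIALIZED VIEW {view} AUTO REFRESH YES AS\n"
        f"SELECT approximate_arrival_timestamp,\n"
        f"       JSON_PARSE(kinesis_data) AS payload\n"
        f'FROM {schema}."{stream}"\n'
        f"WHERE CAN_JSON_PARSE(kinesis_data);"
    )
    return [create_schema, create_view]


# Hypothetical names, for illustration only.
statements = streaming_ingestion_sql(
    schema="kds",
    stream="clickstream",
    view="clickstream_mv",
    iam_role_arn="arn:aws:iam::123456789012:role/example-streaming-role",
)
for statement in statements:
    print(statement, end="\n\n")
```

BI tools would then query the materialized view like any other Redshift relation, with no staging bucket or load job in between.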

helpaws · Answer

The key phrase here is near real-time. If it involves S3 and COPY, it's not going to be near real-time.

fceb2c1 · Answer

https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion-getting-started.html

C is correct. (KDS -> Redshift)
D is wrong as it has more operational overhead (KDS -> KDF -> S3 -> Redshift)

blackgamer · Answer

The answer is C. It can provide near real-time insights. Refer to the AWS article: https://aws.amazon.com/blogs/big-data/real-time-analytics-with-amazon-redshift-streaming-ingestion/

Christina666 · Answer

Using materialized views with auto refresh directly on a Redshift external schema mapped to Kinesis Data Streams offers the most streamlined and efficient approach for near real-time insights using existing BI tools.

Aesthet · Answer

Both ChatGPT and I are thinking D is correct (100%)

GiorgioGss · Answer

I think D. It could be C but because of "LEAST operational overhead" I will go with D.

certplan · Answer

Point: "Which solution will meet these requirements with the LEAST operational overhead?"

Option C:
   - This approach involves creating an external schema in Amazon Redshift to map data from Kinesis Data Streams, which adds complexity compared to directly loading data from Amazon S3 using Amazon Kinesis Data Firehose.
   - While materialized views with auto-refresh can provide near real-time insights, managing them and ensuring proper synchronization with the streaming data source may require more operational effort.
   - AWS documentation for Amazon Redshift primarily focuses on traditional data loading methods and querying, with limited guidance on integrating with real-time data sources like Kinesis Data Streams.

certplan · Answer

By considering the characteristics and capabilities of each AWS service and approach, along with insights from AWS documentation, it becomes evident that option D offers the most streamlined and operationally efficient solution for the scenario described.

This idea/concept is also straight out of the Amazon Solutions Architect course material.

certplan · Answer

1. Amazon Kinesis Data Firehose: It's designed to reliably load streaming data into data lakes and data stores with minimal configuration and management overhead. It handles tasks like buffering, scaling, and delivering data to destinations like Amazon S3 and Amazon Redshift automatically.

2. Amazon S3 as a staging area: Storing data in Amazon S3 provides a scalable and durable solution for data storage without needing to manage infrastructure. It also allows for easy integration with other AWS services and existing BI and analytics tools.

3. Amazon Redshift: While Redshift requires some setup and management, loading data from Amazon S3 using the COPY command is a straightforward process. Once data is loaded into Redshift, existing BI and analytics tools can query the data directly, enabling near real-time insights.

4. Minimal operational overhead: This solution minimizes operational overhead because many of the management tasks, such as scaling, buffering, and delivery of data, are handled by Amazon Kinesis Data Firehose. Additionally, using Amazon S3 as a staging area simplifies data storage and integration with other services.
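
As a rough illustration of the load step this pipeline depends on, the sketch below builds a Redshift COPY statement for data staged in S3. The table name, bucket prefix, and IAM role are hypothetical, and JSON input with automatic field mapping is an assumption about the data format. With Redshift configured as the Firehose destination, Firehose stages the data in S3 and issues the COPY itself, so this is meant to show the extra hop rather than something to run by hand.

```python
def copy_from_s3_sql(table, s3_prefix, iam_role_arn):
    """Build a Redshift COPY statement that loads staged JSON objects from S3.

    Assumes JSON input with automatic field-to-column mapping ('auto').
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS JSON 'auto';"
    )


# Hypothetical names, for illustration only.
print(copy_from_s3_sql(
    table="clickstream_events",
    s3_prefix="s3://example-staging-bucket/clickstream/",
    iam_role_arn="arn:aws:iam::123456789012:role/example-copy-role",
))
```

Each batch has to be buffered, written to S3, and then loaded before BI tools can see it, which is the latency concern raised elsewhere in this thread.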

d8945a1 · Answer

MVs in Redshift with auto refresh are the best option for near real-time.

bakarys · Answer

Option A (using Kinesis Data Streams to stage data in Amazon S3 and loading it directly into Amazon Redshift) is the most straightforward and efficient approach. It minimizes operational overhead and ensures immediate availability of data for analysis.
Options B and C introduce additional complexity and may not provide the same level of efficiency.
