MLS-C01 Exam - Question 305

Question

A company wants to forecast the daily price of newly launched products based on 3 years of data for older product prices, sales, and rebates. The time-series data has irregular timestamps and is missing some values.

A data scientist must build a dataset to replace the missing values. The data scientist needs a solution that resamples the data daily and exports the data for further modeling.

Which solution will meet these requirements with the LEAST implementation effort?

Examice · Accepted Answer

Amazon SageMaker Studio Data Wrangler is designed for visual and interactive data preparation, making it easy for data scientists to clean, transform, and analyze data with minimal code. It offers over 250 built-in transformations including resampling, which is required to achieve daily resampling of time-series data. Additionally, Data Wrangler provides the capability to handle missing values efficiently and export the prepared dataset for further modeling. This makes it the most suitable option with the least implementation effort for the given requirements.

GS_77 · Answer

While SageMaker Data Wrangler (option C) is also a strong contender, DataBrew is slightly easier to use and requires even less implementation effort, especially for users who may not be as familiar with the SageMaker ecosystem.

kyuhuck · Answer

Answer: C
Explanation:
Amazon SageMaker Studio Data Wrangler is a visual data preparation tool that enables users to clean and normalize data without writing any code. Using Data Wrangler, the data scientist can easily import the time-series data from various sources, such as Amazon S3, Amazon Athena, or Amazon Redshift. Data Wrangler can automatically generate data insights and quality reports, which can help identify and fix missing values, outliers, and anomalies in the data. Data Wrangler also provides over 250 built-in transformations, such as resampling, interpolation, aggregation, and filtering, which can be applied to the data with a point-and-click interface. Data Wrangler can also export the prepared data to different destinations, such as Amazon S3, Amazon SageMaker Feature Store, or Amazon SageMaker Pipelines, for further modeling and analysis.
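
For readers unsure what "resample daily and fill missing values" means in practice, here is a minimal plain-Python sketch of the idea. It is not how Data Wrangler or DataBrew implements the transform; the function name `resample_daily`, the sample records, and the choice of a per-day mean followed by linear interpolation are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def resample_daily(records):
    """Average irregular (timestamp, price) readings per calendar day,
    then fill days that have no readings by linear interpolation."""
    buckets = defaultdict(list)
    for ts, price in records:
        if price is not None:  # ignore readings whose value is missing
            buckets[ts.date()].append(price)
    if not buckets:
        return []

    # One value per day that had at least one usable reading
    known = {day: sum(vals) / len(vals) for day, vals in buckets.items()}
    days = sorted(known)

    result = []
    for left, right in zip(days, days[1:]):
        gap = (right - left).days
        for i in range(gap):
            # i == 0 emits the known day itself; i > 0 emits interpolated days
            value = known[left] + (i / gap) * (known[right] - known[left])
            result.append((left + timedelta(days=i), value))
    result.append((days[-1], known[days[-1]]))
    return result


# Hypothetical sample: irregular timestamps, one missing value, one skipped day
records = [
    (datetime(2024, 1, 1, 9, 30), 10.0),
    (datetime(2024, 1, 1, 17, 5), 12.0),
    (datetime(2024, 1, 2, 11, 0), None),
    (datetime(2024, 1, 4, 8, 15), 14.0),
]
daily = resample_daily(records)
for day, value in daily:
    print(day.isoformat(), round(value, 2))
```

The output has exactly one row per calendar day between the first and last reading, which is the shape a downstream forecasting model usually expects. The visual tools discussed in this thread are meant to produce an equivalent result without hand-written code, which is why the question focuses on implementation effort.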

akdavsan · Answer

This is exactly what Data Wrangler is for

Adzz · Answer

Data Wrangler is the best fit.

AIWave · Answer

Data Wrangler integrates tightly with SageMaker and is better suited to this scenario, since the resampled data is used in further modeling.
AWS Glue DataBrew is a more general-purpose data preparation service.

vkbajoria · Answer

Data Wrangler is better for ML work. DataBrew can be used as well.

Togy · Answer

There is a need for scheduling daily resampling. This can be automated in DataBrew more easily than in Data Wrangler.

youonebe · Answer

DataBrew lacks explicit time-series resampling features; it focuses on general ETL, not forecasting workflows.
