MLS-C01 Exam - Question 30

Question

When submitting Amazon SageMaker training jobs using one of the built-in algorithms, which common parameters MUST be specified? (Choose three. ).

Examice · Accepted Answer

When submitting Amazon SageMaker training jobs using built-in algorithms, the following parameters must be specified: The training channel identifying the location of training data, as SageMaker needs to access the training data during the training process; The IAM role that SageMaker can assume to perform tasks on behalf of the users, as this is necessary for permission handling and accessing resources such as S3 buckets; and the output path specifying where on an Amazon S3 bucket the trained model will persist, as SageMaker needs to know where to store the resulting model artifacts. Therefore, answers A, C, and F are correct.

DonaldCMLIN · Answer

THE ANSWER SHOUD BE CEF
IAM ROLE, INSTANCE TYPE, OUTPUT PATH

VB · Answer

From here https://docs.aws.amazon.com/zh_tw/sagemaker/latest/dg/API_CreateTrainingJob.html .. the only "Required: Yes" attributes are: 
1. AlgorithmSpecification (in this TrainingInputMode is Required - i.e. File or Pipe)
2. OutputDataConfig (in this S3OutputPath is Required - where the model artifacts are stored)
3. ResourceConfig (in this EC2 InstanceType and VolumeSizeInGB are required)
4. RoleArn (..The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf...the caller of this API must have the iam:PassRole permission.)
5. StoppingCondition
6. TrainingJobName (The name of the training job. The name must be unique within an AWS Region in an AWS account.)

From the given options in the questions.. we have 2, 3, and 4 above. so, the answer is CEF.

Neet1983 · Answer

A. The training channel identifying the location of training data on an Amazon S3 bucket: This is essential because SageMaker needs to know where to find the data for training.

C. The IAM role that Amazon SageMaker can assume to perform tasks on behalf of the users: SageMaker requires permissions to access resources on behalf of the user, and this is provided by specifying an IAM role with the necessary policies attached.

F. The output path specifying where on an Amazon S3 bucket the trained model will persist: After the model is trained, SageMaker needs to save the output, which includes the model artifacts, to a specified S3 location.

mcwithimp · Answer

https://docs.aws.amazon.com/zh_tw/sagemaker/latest/dg/API_CreateTrainingJob.html

The answer should be CEF

The only attributes strictly required("Required: Yes" ) are:
TrainingJobName
AlgorithmSpecification
OutputDataConfig
ResourceConfig
RoleArn
StoppingCondition

So, why is 'InputDataConfig' strictly required?
When 'InputDataConfig' is not needed:
Algorithm with Pre-loaded Data where  the data is already embedded or hardcoded within the training script or Docker container or pre-defined datasets available within SageMaker
Algorithm Generating (training) Data such as synthetic data generation or reinforcement learning scenarios

VR10 · Answer

E is not important, some models could simply work on the default of CPU.
A is a must and E is a must too.
C is important for permission handling on S3 etc.
It has to be A, C, F

rav009 · Answer

I open the sagemaker and tested. A C F
B is not needed for non-supervised algorithm.

rookiee1111 · Answer

The input channel and output channel are mandatory, as the training job needs to know where to get the input data from and where to publish the model artifact. IAM role is also needed, for AWS services. others are not mandatory, validation channel is not mandatory for instance in case of unsupervised learning, likewise hyper params can be be auto tuned for as well as the ec2 instance types can be default ones that will be picked

phdykd · Answer

ACF is answer

Swagata23 · Answer

Please go through the lab https://catalog.us-east-1.prod.workshops.aws/workshops/63069e26-921c-4ce1-9cc7-dd882ff62575/en-US/lab2

Alice1234 · Answer

A. The training channel identifying the location of training data on an Amazon S3 bucket: This is where SageMaker will get the input data for training the model.
C. The IAM role that Amazon SageMaker can assume to perform tasks on behalf of the users: This role provides SageMaker the necessary permissions to access AWS resources.
F. The output path specifying where on an Amazon S3 bucket the trained model will persist: After training, the model artifacts need to be saved in a specified S3 bucket location.

vkbajoria · Answer

C, E, F
The trick was the training channel, but all the data channel are passed during when actually training the model using fit method

vkbajoria · Answer

Input is required only when calling Fit method. When initializing the Estimator, we do not need input

sachin80 · Answer

InputDataConffig is optional in create_training_job.Please check thte parameters that are required.
 So answer is SEF: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrainingJob.html

sachin80 · Answer

InputDataConffig is optional in create_training_job.Please check thte parameters that are required.
So answer is CEF: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrainingJob.html

Denise123 · Answer

As they narrowed it to S3, A is incorrect BUT when submitting Amazon SageMaker training jobs using one of the built-in algorithms, it is a MUST to identify the location of training data. While Amazon S3 is commonly used for storing training data, other sources like Docker containers, DynamoDB, or local disks of training instances can also be used. Therefore, specifying the location of training data is essential for SageMaker to know where to access the data during training.

So the right answer is CEF for me for this case... However if A was saying identify the location of training data, I think option A would be included in the MUST parameter.

ninomfr64 · Answer

Based on https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrainingJob.html

Required parameters are:
- AlgorithmSpecification (registry path of the Docker image with the training algorithm)
- OutputDataConfig (path to the S3 location where you want to store model artifacts)
- ResourceConfig (resources, including the ML compute instances and ML storage volumes, to use for model training)
- RoleArn
- StoppingCondition (time limit for training job)
- TrainingJobName

Thus, the answer is: C E F
wording for option E is inaccurate "EC2 instance class specifying whether training will be run using CPU or GPU" but they do it on purpose

RathanKalluri · Answer

CEF 
https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrainingJob.html#API_CreateTrainingJob_RequestParameters

MLS-C01 Exam - Question 30

Discussion