Exam MLS-C01
Question 256

A company is creating an application to identify, count, and classify animal images that are uploaded to the company’s website. The company is using the Amazon SageMaker image classification algorithm with an ImageNetV2 convolutional neural network (CNN). The solution works well for most animal images but does not recognize many animal species that are less common.

The company obtains 10,000 labeled images of less common animal species and stores the images in Amazon S3. A machine learning (ML) engineer needs to incorporate the images into the model by using Pipe mode in SageMaker.

Which combination of steps should the ML engineer take to train the model? (Choose two.)

    Correct Answer: C, D

    To incorporate the 10,000 labeled images of less common animal species into the model using Pipe mode in SageMaker, the ML engineer should first create a .lst file containing a list of image files and their corresponding class labels, uploading this file to Amazon S3. This step ensures that the data is properly formatted for SageMaker's image classification algorithm. Next, the ML engineer should initiate transfer learning by training the model using the images of these less common species. Transfer learning will allow the model to leverage pre-trained weights and adapt to the new dataset more efficiently, improving its accuracy in recognizing the less common animal species.
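As a rough illustration of the .lst step described above, here is a minimal sketch that builds the file contents. It assumes the tab-separated three-column layout from the SageMaker image classification documentation (image index, class label index, relative image path). The file names and class indices are hypothetical, and `build_lst` is an illustrative helper, not part of any AWS SDK.

```python
from typing import Iterable, Tuple


def build_lst(items: Iterable[Tuple[str, int]]) -> str:
    """Return .lst text: one 'index<TAB>label<TAB>relative_path' row per image."""
    rows = []
    for index, (relative_path, label) in enumerate(items):
        rows.append(f"{index}\t{label}\t{relative_path}\n")
    return "".join(rows)


# Hypothetical images and class indices, for illustration only.
examples = [("okapi/img_0001.jpg", 0), ("pangolin/img_0002.jpg", 1)]
lst_text = build_lst(examples)
print(lst_text, end="")
```

The resulting text would then be saved as a .lst file and uploaded to Amazon S3 alongside the images.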

Discussion
loict (Options: CD)

A. NO - we can't change the model for transfer learning
B. NO - we can't change the model for transfer learning
C. YES - lst file is how we give input to SageMaker (https://medium.com/@texasdave2/itty-bitty-lst-file-format-converter-for-machine-learning-image-classification-on-aws-sagemaker-b3828c7ba9cc)
D. YES - obvious
E. NO - there is no extra metadata we want to provide (https://docs.aws.amazon.com/sagemaker/latest/dg/augmented-manifest.html)

giustino98 (Options: CD)

A False - no transfer learning
B False - no transfer learning
C True - "If you use the Image format for training, specify train, validation, train_lst, and validation_lst channels as values for the InputDataConfig parameter of the CreateTrainingJob request. Specify the individual image data (.jpg or .png files) for the train and validation channels. Specify one .lst file in each of the train_lst and validation_lst channels. Set the content type for all four channels to application/x-image."
D True - it uses transfer learning
E False - "To include metadata with your dataset in a training job, use an augmented manifest file." Here we don't have any metadata
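The four-channel setup quoted under C can be sketched as a plain Python structure in the shape of the `InputDataConfig` parameter of CreateTrainingJob. This is only a sketch: the bucket prefix is hypothetical, `image_format_channels` is an illustrative helper, and the `S3Prefix` / `FullyReplicated` settings are assumed defaults rather than something stated in the question.

```python
def image_format_channels(s3_prefix: str) -> list:
    """Build the four Image-format channels, all typed application/x-image."""
    channels = []
    for name in ("train", "validation", "train_lst", "validation_lst"):
        channels.append({
            "ChannelName": name,
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",  # assumed setting
                    "S3Uri": f"{s3_prefix}/{name}/",
                    "S3DataDistributionType": "FullyReplicated",  # assumed setting
                }
            },
            "ContentType": "application/x-image",
        })
    return channels


# Hypothetical bucket and prefix, for illustration only.
channels = image_format_channels("s3://example-bucket/animals")
for channel in channels:
    print(channel["ChannelName"], channel["DataSource"]["S3DataSource"]["S3Uri"])
```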

brianb08 (Options: CD)

C. Create a .lst file that contains a list of image files and corresponding class labels. Upload the .lst file to Amazon S3.
D. Initiate transfer learning. Train the model by using the images of less common species.
Details are provided in this blog post: https://aws.amazon.com/blogs/machine-learning/classify-your-own-images-using-amazon-sagemaker/

JonSno

Create a .lst File (Option C): The .lst file is a standard format used with Amazon SageMaker's image classification algorithm, which lists image files and their corresponding labels. This file is crucial for SageMaker to read and map the images correctly for training purposes. The .lst file needs to be uploaded to Amazon S3.
Initiate Transfer Learning (Option D): Transfer learning allows you to leverage pre-trained weights from existing models (like ImageNetV2) and fine-tune them using your own data. In this case, training with the 10,000 new labeled images helps the model recognize less common animal species. Transfer learning is more efficient since the model has already been trained on similar data.
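For the transfer learning step, a minimal sketch of the hyperparameters one might pass to the built-in image classification algorithm is shown below. `use_pretrained_model`, `num_classes`, `num_training_samples`, `num_layers`, `image_shape`, and `epochs` are hyperparameters of that algorithm; the specific values (50 classes, 18 layers, 5 epochs, all 10,000 images used for training) are assumptions for illustration, and `transfer_learning_hyperparameters` is a hypothetical helper.

```python
def transfer_learning_hyperparameters(num_classes: int, num_training_samples: int) -> dict:
    """Return string-valued hyperparameters with pretrained weights enabled."""
    return {
        "use_pretrained_model": "1",  # 1 turns on transfer learning
        "num_classes": str(num_classes),
        "num_training_samples": str(num_training_samples),
        "num_layers": "18",           # assumed network depth
        "image_shape": "3,224,224",   # assumed channels,height,width
        "epochs": "5",                # assumed short fine-tuning run
    }


# 50 classes is a hypothetical count; 10,000 is the image count from the question.
hyperparameters = transfer_learning_hyperparameters(num_classes=50, num_training_samples=10000)
print(hyperparameters)
```

With `use_pretrained_model` set to 1, the network starts from pretrained weights and only the new output layer is initialized from scratch, which is why far fewer images are needed than for full training.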

cloudera3

Link for more information on the .lst requirement for training and validation channels: https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html#IC-inputoutput

backbencher2022 (Options: DE)

D is obvious. The reason for option E is that they want to train the model in Pipe mode, and an augmented manifest file in JSON Lines format enables training in Pipe mode, as per this link: https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html Refer to the section "Train with Augmented Manifest Image Format" in this link for more details. Using an augmented manifest file is an alternative to preprocessing when you have labeled data. For training jobs using labeled data, you typically need to preprocess the dataset to combine input data with metadata before training. If your training dataset is large, preprocessing can be time-consuming and expensive.
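To make the JSON Lines format concrete, here is a minimal sketch that writes augmented manifest rows. It assumes the `source-ref` key for the image's S3 URI and a `class` key for the label, as in the image classification documentation's example; the S3 URIs and labels are hypothetical, and `manifest_lines` is an illustrative helper.

```python
import json
from typing import Iterable, Tuple


def manifest_lines(items: Iterable[Tuple[str, int]]) -> str:
    """Return JSON Lines text: one object per image with its S3 URI and class label."""
    return "".join(
        json.dumps({"source-ref": uri, "class": label}) + "\n"
        for uri, label in items
    )


# Hypothetical S3 URIs and class indices, for illustration only.
manifest_text = manifest_lines([
    ("s3://example-bucket/animals/okapi/img_0001.jpg", 0),
    ("s3://example-bucket/animals/pangolin/img_0002.jpg", 1),
])
print(manifest_text, end="")
```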

goku58 (Options: DE)

D is obvious. Keyword here is pipe mode. "The augmented manifest format enables you to do training in Pipe mode using image files without needing to create RecordIO files." Hence, E.
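A hedged sketch of what a Pipe mode training channel for an augmented manifest might look like in a CreateTrainingJob request follows. The field names follow the CreateTrainingJob channel schema; the manifest URI is hypothetical, the attribute names assume a manifest keyed by `source-ref` and `class`, and `augmented_manifest_channel` is an illustrative helper.

```python
def augmented_manifest_channel(manifest_uri: str,
                               attribute_names=("source-ref", "class")) -> dict:
    """Build a train channel that streams an augmented manifest in Pipe mode."""
    return {
        "ChannelName": "train",
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "AugmentedManifestFile",
                "S3Uri": manifest_uri,
                "S3DataDistributionType": "FullyReplicated",  # assumed setting
                "AttributeNames": list(attribute_names),
            }
        },
        "ContentType": "application/x-recordio",
        "RecordWrapperType": "RecordIO",  # images are wrapped as RecordIO on the fly
        "InputMode": "Pipe",
    }


# Hypothetical manifest location, for illustration only.
channel = augmented_manifest_channel("s3://example-bucket/animals/train.manifest")
print(channel["InputMode"], channel["DataSource"]["S3DataSource"]["S3DataType"])
```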

kaike_reis (Options: DE)

Options A and B fall outside the scope of the question. The correct options are D and E. To understand why C is wrong, look here: https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html TL;DR: the .lst file is only for the classification task.

awsarchitect5 (Options: DE)

Augmented manifest format enables you to do training in Pipe mode using files without needing to create RecordIO files (.rec)

AIWave (Options: DE)

A - no need for full training only transfer learning
B - no built in inception model in sagemaker and no transfer learning
C - no data augmentation could introduce inaccuracies
D - yes! Transfer learning
E - yes, augmentation improves accuracy

endeesa (Options: DE)

We need to augment the existing images so E makes sense

DimLam (Options: DE)

I would go with E, as the question hints that we need to use Pipe mode, and E is used for Pipe mode.

Mickey321 (Options: DE)

https://docs.aws.amazon.com/sagemaker/latest/dg/augmented-manifest.html

ADVIT

D+E https://docs.aws.amazon.com/sagemaker/latest/dg/augmented-manifest.html

SandeepGun (Options: DE)

Correct answer: DE