✅ Explanation:
Requirements Recap:
Real-time inference: Needs low-latency predictions.
Accelerated instances: Likely GPU-backed, costly to scale inefficiently.
No cold starts: Endpoints must always be warm and responsive.
Each model has different scaling needs: Must support independent scaling of each model.
✅ Why Option C is correct:
Inference components are a SageMaker hosting feature that allows:
Hosting multiple models on a single endpoint.
Independent scaling of each model (component).
Avoiding cold starts by keeping a minimum number of copies running.
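The points above can be sketched as request payloads for deploying two models as inference components on one endpoint. This is a minimal sketch assuming boto3's SageMaker `create_inference_component` call; the endpoint, model, and component names are hypothetical, and only the request dictionaries are built here (no AWS call is made):

```python
# Sketch: one inference component per model on a shared endpoint.
# Each dict matches the shape of boto3 sagemaker.create_inference_component(**req);
# all names ("my-endpoint", "model-a", ...) are placeholder assumptions.

def inference_component_request(name, endpoint, model, copies, accelerators):
    """Build a create_inference_component request for one model."""
    return {
        "InferenceComponentName": name,
        "EndpointName": endpoint,
        "VariantName": "AllTraffic",
        "Specification": {
            "ModelName": model,
            "ComputeResourceRequirements": {
                # Share of the endpoint's accelerators reserved for this model
                "NumberOfAcceleratorDevicesRequired": accelerators,
                "MinMemoryRequiredInMb": 4096,
            },
        },
        # CopyCount >= 1 keeps this model loaded and warm (no cold starts)
        "RuntimeConfig": {"CopyCount": copies},
    }

# Two models, same endpoint, different resource needs.
req_a = inference_component_request("ic-model-a", "my-endpoint", "model-a", 1, 1)
req_b = inference_component_request("ic-model-b", "my-endpoint", "model-b", 2, 2)
```

Because each model is its own component with its own copy count and resource requirements, scaling one model does not force the other to scale with it.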
Setting each component's minimum copy count to ≥ 1 keeps that model always loaded and warm, eliminating cold starts.
This solution meets all requirements efficiently.