Professional Cloud Architect Exam - Question 119

Question

You need to migrate Hadoop jobs for your company's Data Science team without modifying the underlying infrastructure. You want to minimize costs and infrastructure management effort. What should you do?

Examice · Accepted Answer

To migrate Hadoop jobs while minimizing costs and infrastructure management effort, the best approach would be to create a Dataproc cluster using preemptible worker instances. Dataproc is a managed service that simplifies the deployment and management of Hadoop clusters, reducing administrative overhead. Preemptible instances are cost-effective as they are cheaper than standard instances. Although they are short-lived and can be terminated by Google Cloud at any time, this cost reduction can be beneficial for non-critical, fault-tolerant data processing tasks where cost minimization is a priority.

TotoroChina · Answer

Should be B,  you want to minimize costs.
https://cloud.google.com/dataproc/docs/concepts/compute/secondary-vms#preemptible_and_non-preemptible_secondary_workers

firecloud · Answer

It's A, the primary workers can only be standard, where secondary workers can be preemtible.------In addition to using standard Compute Engine VMs as Dataproc workers (called "primary" workers), Dataproc clusters can use "secondary" workers.
There are two types of secondary workers: preemptible and non-preemptible. All secondary workers in your cluster must be of the same type, either preemptible or non-preemptible. The default is preemptible.

CyanideX · Answer

The Answer should be Spot VMs (This has been changed in actual Exam) Spot VM's have no expiration.

d0094d6 · Answer

You want to minimize costs and infrastructure management effort > B

Pime13 · Answer

minimize costs -> preemtipble

odacir · Answer

B. 
- migrate Hadoop jobs -> dataproc
- saving money -> preemptible(spot)

discuss24 · Answer

A is the correct response. Per documentation "You can gain low-cost processing power for your jobs by adding preemptible worker nodes to your cluster. These nodes use preemptible virtual machines." The focus of the question is to reduce cost, hence preempttible VM works best

Romio2023 · Answer

Answer should be B, because minizing the costs is wanted.

didek1986 · Answer

It is A

d0094d6 · Answer

"You want to minimize costs and infrastructure management effort" -> B

Amrita2012 · Answer

Using standard Compute Engine VMs as Dataproc workers (called "primary" workers), Preemptible can be only used for secondary workers hence A is valid answer

VidhyaBupesh · Answer

Using preemptible VMs does not always save costs since preemptions can cause longer job execution with resulting higher job costs

shashii82 · Answer

Dataproc: Dataproc is a fully managed Apache Spark and Hadoop service on Google Cloud Platform. It allows you to run clusters without the need to manually deploy and manage Hadoop clusters on Compute Engine.

Preemptible Worker Instances: Preemptible instances are short-lived, cost-effective virtual machine instances that are suitable for fault-tolerant and batch processing workloads. Since Hadoop jobs can often tolerate interruptions, using preemptible instances can significantly reduce costs.

Option B leverages the benefits of Dataproc for managing Hadoop clusters without the need for manual deployment and takes advantage of preemptible instances to minimize costs. This aligns well with the goal of minimizing both costs and infrastructure management efforts.

Diwz · Answer

Answer is B. 
The  secondary worker type instance for default  Dataproc cluster is preemptible VMs.
https://cloud.google.com/dataproc/docs/concepts/compute/secondary-vms

dija123 · Answer

Agree with B

Gino17m · Answer

A - only secondary workers can be preemptible and "Using preemptible VMs does not always save costs since preemptions can cause longer job execution with resulting higher job costs" according to: https://cloud.google.com/dataproc/docs/concepts/compute/secondary-vms#preemptible_and_non-preemptible_secondary_workers

afsarkhan · Answer

I will go with A , reason preemptible instances are unpredictable and there is no mention of work criticallity. So my answer is A against B

Professional Cloud Architect Exam - Question 119

Discussion