Professional Data Engineer Exam Questions

Professional Data Engineer Exam - Question 295


You are designing the architecture to process your data from Cloud Storage to BigQuery by using Dataflow. The network team provided you with the Shared VPC network and subnetwork to be used by your pipelines. You need to enable the deployment of the pipeline on the Shared VPC network. What should you do?

Correct Answer: B

To enable the deployment of a Dataflow pipeline on a Shared VPC network, you should assign the compute.networkUser role to the service account that executes the Dataflow pipeline. This role gives the service account the necessary permissions to create and manage network resources on the Shared VPC network, which is essential for deploying and running the pipeline.
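As a rough illustration of what granting this role on a Shared VPC subnetwork could look like, here is a minimal Python sketch that only assembles the relevant strings and the `gcloud compute networks subnets add-iam-policy-binding` command; it does not call any Google Cloud API. The project ID, project number, region, and subnet name are hypothetical placeholders, and the helper names are made up for this example. Which principal should receive the role (the pipeline's service account or the Dataflow service agent) is exactly what the discussion debates, so the member is left as a parameter.

```python
import shlex
from typing import List

NETWORK_USER_ROLE = "roles/compute.networkUser"


def dataflow_service_agent(project_number: str) -> str:
    """Email of the Google-managed Dataflow service agent of a project."""
    return (
        f"service-{project_number}"
        "@dataflow-service-producer-prod.iam.gserviceaccount.com"
    )


def subnetwork_url(host_project: str, region: str, subnet: str) -> str:
    """Full URL form used to reference a subnetwork in a Shared VPC host project."""
    return (
        "https://www.googleapis.com/compute/v1/"
        f"projects/{host_project}/regions/{region}/subnetworks/{subnet}"
    )


def grant_network_user_cmd(
    host_project: str, region: str, subnet: str, member_email: str
) -> List[str]:
    """Build (but do not run) the gcloud command that binds the role on one subnet."""
    return [
        "gcloud", "compute", "networks", "subnets",
        "add-iam-policy-binding", subnet,
        f"--project={host_project}",
        f"--region={region}",
        f"--member=serviceAccount:{member_email}",
        f"--role={NETWORK_USER_ROLE}",
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    member = dataflow_service_agent("123456789012")
    cmd = grant_network_user_cmd("host-proj", "us-central1", "df-subnet", member)
    print(shlex.join(cmd))
    print(subnetwork_url("host-proj", "us-central1", "df-subnet"))
```

Passing a user-managed worker service account email as `member_email` instead produces the equivalent binding for that account.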

Discussion

9 comments
raaad (Option: A)
Jan 11, 2024

- The Dataflow service agent is responsible for setting up and managing the network resources that Dataflow requires.
- Granting the compute.networkUser role to this service agent enables it to provision the necessary network resources within the Shared VPC for your Dataflow job.

task_7 (Option: B)
Jan 12, 2024

Assign compute.networkUser to the service account that executes the Dataflow pipeline.

Matt_108 (Option: A)
Jan 13, 2024

Option A. I agree with raaad: it's the Dataflow service agent that needs the networkUser role, because it's the one that provisions the network resources. https://cloud.google.com/dataflow/docs/guides/specifying-networks#shared

tibuenoc
Feb 8, 2024

But your link explains that the "Network User role must be assigned to the Dataflow service account": Make sure the Shared VPC subnetwork is shared with the Dataflow service account and has the Compute Network User role assigned on the specified subnet. The Compute Network User role must be assigned to the Dataflow service account in the host project.

ML6
Feb 18, 2024

All projects that have used the resource Dataflow Job have a Dataflow Service Account, also known as the Dataflow service agent. Source: https://cloud.google.com/dataflow/docs/concepts/security-and-permissions#df-service-account

saschak94 (Option: A)
Feb 9, 2024

All projects that have used the resource Dataflow Job have a Dataflow Service Account, also known as the Dataflow service agent. Make sure the Shared VPC subnetwork is shared with the Dataflow service account and has the Compute Network User role assigned on the specified subnet.

GCP001 (Option: B)
Jan 7, 2024

B. Assign the compute.networkUser role to the service account that executes the Dataflow pipeline. See the ref - https://cloud.google.com/dataflow/docs/guides/specifying-networks

raaad
Jan 11, 2024

Option A makes more sense: assigning the compute.networkUser role to the pipeline's service account grants it unnecessary and possibly excessive permissions outside its core responsibility of data processing. The question focuses specifically on the deployment aspect (i.e., provisioning network resources such as VMs) rather than on what the pipeline accesses or processes once it's running.

GCP001
Jan 17, 2024

Yes, I agree, it should be A. The Dataflow service account should be the one that has this permission, instead of the worker.

chrissamharris (Option: B)
May 1, 2024

I believe the answer is B. All the authentication documentation points to service accounts: https://cloud.google.com/dataflow/docs/concepts/authentication#on-gcp The Dataflow service agent typically manages general interactions with the Dataflow service but does not execute the actual jobs.

josech (Option: B)
May 20, 2024

Option B https://cloud.google.com/knowledge/kb/dataflow-job-in-shared-vpc-xpn-permissions-000004261

BIGQUERY_ALT_ALT (Option: B)
Jan 11, 2024

Option B is correct. Explanation: you need to give the compute.networkUser role to the service account that runs the pipeline, as it will need to deploy the necessary worker nodes in the Shared VPC project. Option A is incorrect because the Dataflow service agent is a Google-managed service account that is not responsible for running or deploying workers in the Shared VPC. Options C and D are incorrect because dataflow.admin grants elevated privileges to create and manage all Dataflow components, not to deploy resources in the Shared VPC.

extraego (Option: B)
Jun 10, 2024

Dataflow service agent is a role that is assigned to a service account. So is compute.networkUser. https://cloud.google.com/dataflow/docs/concepts/access-control#example