
Professional Data Engineer Exam - Question 253


You are deploying a batch pipeline in Dataflow. This pipeline reads data from Cloud Storage, transforms the data, and then writes the data into BigQuery. The security team has enabled an organizational constraint in Google Cloud, requiring all Compute Engine instances to use only internal IP addresses and no external IP addresses. What should you do?

Correct Answer: D

To allow Compute Engine instances that have only internal IP addresses to reach Google APIs and services such as Cloud Storage and BigQuery, you must enable Private Google Access on the subnetwork those instances use. Dataflow workers are Compute Engine instances, so with Private Google Access enabled they can read from Cloud Storage and write to BigQuery without external IP addresses, which keeps the pipeline compliant with the organizational constraint.
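As a rough sketch of what enabling and verifying Private Google Access could look like, the Python helpers below build the `gcloud` command that turns the setting on for a subnetwork and check the `privateIpGoogleAccess` field in the JSON printed by `gcloud compute networks subnets describe --format=json`. The subnetwork name `my-subnet`, the region `us-central1`, and both function names are hypothetical placeholders, not values from the question.

```python
import json
from typing import List


def enable_pga_command(subnet: str, region: str) -> List[str]:
    """Build the gcloud argv that enables Private Google Access on a subnetwork.

    The subnet and region are whatever your Dataflow workers run in;
    the values used in the example below are placeholders.
    """
    return [
        "gcloud", "compute", "networks", "subnets", "update", subnet,
        f"--region={region}",
        "--enable-private-ip-google-access",
    ]


def pga_enabled(describe_json: str) -> bool:
    """Return True if a subnetwork description reports Private Google Access.

    Expects the JSON text printed by
    `gcloud compute networks subnets describe SUBNET --region=REGION --format=json`.
    A missing field is treated as "not enabled".
    """
    subnet = json.loads(describe_json)
    return bool(subnet.get("privateIpGoogleAccess", False))


if __name__ == "__main__":
    # Placeholder names; print the command instead of running it.
    print(" ".join(enable_pga_command("my-subnet", "us-central1")))
```

The command is only assembled here, not executed, so it can be reviewed before being run with something like `subprocess.run(..., check=True)`.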

Discussion

8 comments
raaad (Option: D)
Jan 5, 2024

- Private Google Access for services allows VM instances with only internal IP addresses in a VPC network or on-premises networks (via Cloud VPN or Cloud Interconnect) to reach Google APIs and services.
- When you launch a Dataflow job, you can specify that it should use worker instances without external IP addresses if Private Google Access is enabled on the subnetwork where these instances are launched.
- This way, your Dataflow workers will be able to access Cloud Storage and BigQuery without violating the organizational constraint of no external IPs.
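The launch step described above can be sketched as a set of pipeline flags for the Apache Beam Python SDK, where `--no_use_public_ips` asks Dataflow for workers without external IP addresses and `--subnetwork` points at a subnetwork that has Private Google Access enabled. The project, region, subnetwork, and bucket names are hypothetical, and the helper function is illustrative only; in a real pipeline the list would typically be passed to `apache_beam.options.pipeline_options.PipelineOptions`.

```python
from typing import List


def dataflow_internal_ip_flags(
    project: str, region: str, subnetwork: str, temp_location: str
) -> List[str]:
    """Build Beam Python SDK flags for a Dataflow job with internal-IP-only workers.

    Assumes Private Google Access is already enabled on `subnetwork`;
    without it, the workers could not reach Cloud Storage or BigQuery.
    """
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        # Subnetwork in the same region as the job, in short form.
        f"--subnetwork=regions/{region}/subnetworks/{subnetwork}",
        # Do not assign external IP addresses to the worker VMs.
        "--no_use_public_ips",
    ]


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    for flag in dataflow_internal_ip_flags(
        "my-project", "us-central1", "my-subnet", "gs://my-bucket/tmp"
    ):
        print(flag)
```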

Jordan18
Jan 7, 2024

why not C?

GCP001
Jan 8, 2024

Even if you create a VPC Service Controls perimeter, your Dataflow workers still run on Compute Engine instances, and once the policy is enforced those instances can have only private IP addresses. Without external IP addresses, you can still perform administrative and monitoring tasks, and you can still access your workers over SSH. However, the pipeline cannot access the internet, and internet hosts cannot access your Dataflow workers.

GCP001
Jan 8, 2024

Reference: https://cloud.google.com/dataflow/docs/guides/routes-firewall

BIGQUERY_ALT_ALT
Jan 11, 2024

VPC Service Controls are typically used to define and enforce security perimeters around APIs and services, restricting their access to a specified set of Google Cloud projects. In this scenario, the security constraint is focused on Compute Engine instances used by Dataflow, and VPC Service Controls might be considered a bit heavy-handed for just addressing the internal IP address requirement.

GCP001 (Option: D)
Jan 8, 2024

https://cloud.google.com/dataflow/docs/guides/routes-firewall

scaenruy (Option: C)
Jan 3, 2024

C. Create a VPC Service Controls perimeter that contains the VPC network and add Dataflow, Cloud Storage, and BigQuery as allowed services in the perimeter. Use Dataflow with only internal IP addresses.

BIGQUERY_ALT_ALT
Jan 11, 2024

C is wrong. Option D is simple and straightforward. VPC Service Controls are typically used to define and enforce security perimeters around APIs and services, restricting their access to a specified set of Google Cloud projects. In this scenario, the security constraint is focused on Compute Engine instances used by Dataflow, and VPC Service Controls might be considered a bit heavy-handed for just addressing the internal IP address requirement.

Matt_108 (Option: C)
Jan 13, 2024

Option D, as GCP001 said

Matt_108
Jan 13, 2024

Misclicked the answer <.<

pandeyspecial (Option: C)
Jan 28, 2024

It should be C

Tryolabs (Option: D)
Feb 29, 2024

https://cloud.google.com/vpc/docs/private-google-access "VM instances that only have internal IP addresses (no external IP addresses) can use Private Google Access. They can reach the external IP addresses of Google APIs and services."

Moss2011 (Option: C)
Mar 1, 2024

According to this documentation: https://cloud.google.com/vpc-service-controls/docs/overview I think the correct answer is C. Take into account the phrase "organizational constraint"; VPC Service Controls allow you to enforce that.

Lestrang (Option: D)
Jun 8, 2024

No way it is C. The use case for a VPC Service Controls perimeter is not to establish secure connectivity on its own but to control connectivity, for example allowing VMs inside a given perimeter to access a service while blocking VMs outside it, even if they are in the same VPC. D, on the other hand, makes complete sense.