Professional Data Engineer Exam - Question 216


You are developing an Apache Beam pipeline to extract data from a Cloud SQL instance by using JdbcIO. You have two projects running in Google Cloud. The pipeline will be deployed and executed on Dataflow in Project A. The Cloud SQL instance is running in Project B and does not have a public IP address. After deploying the pipeline, you noticed that it failed to extract data from the Cloud SQL instance because of a connection failure. You verified that VPC Service Controls and Shared VPC are not in use in these projects. You want to resolve this error while ensuring that the data does not go through the public internet. What should you do?

Correct Answer: D

Because the Cloud SQL instance has no public IP address, and neither VPC Service Controls nor Shared VPC is in use, the Dataflow workers in Project A need a private network path to the instance. VPC Network Peering between the two projects provides internal IP connectivity between their VPC networks. However, a Cloud SQL instance with a private IP address is a managed service: it runs in a Google-managed VPC network that is itself peered with the Project B network through private services access, and VPC Network Peering is not transitive. The Project A network therefore cannot reach the instance through the Project B network by peering alone. A Compute Engine instance without an external IP address in Project B, placed on the peered subnet and acting as a proxy, closes the gap: the Dataflow workers connect to the proxy over the peering, and the proxy connects to the Cloud SQL instance over its private IP address. Traffic stays on internal network paths and never crosses the public internet, which is why option D (VPC Network Peering plus a proxy server) is the stated answer.
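
The non-transitivity argument can be sketched as a tiny reachability model. This is only an illustration under the assumption, made in several comments below, that the Cloud SQL instance sits in a Google-managed network peered with the Project B VPC. The network names are hypothetical, and the code models routing rules rather than calling any Google Cloud API.

```python
from collections import defaultdict

# Hypothetical network names, for illustration only.
PEERINGS = [
    ("project-a-vpc", "project-b-vpc"),          # peering set up between the two projects
    ("project-b-vpc", "cloudsql-producer-vpc"),  # private services access for Cloud SQL
]


def build_peers(peerings):
    """Return a symmetric map from each network to its directly peered networks."""
    peers = defaultdict(set)
    for left, right in peerings:
        peers[left].add(right)
        peers[right].add(left)
    return peers


def can_route(peers, source, destination):
    """Peering exchanges routes only between directly peered networks (no transitivity)."""
    return source == destination or destination in peers[source]


def can_reach_via_proxy(peers, source, proxy_network, destination):
    """A proxy VM accepts one connection and opens a second one from its own network."""
    return can_route(peers, source, proxy_network) and can_route(
        peers, proxy_network, destination
    )


peers = build_peers(PEERINGS)
direct = can_route(peers, "project-a-vpc", "cloudsql-producer-vpc")
proxied = can_reach_via_proxy(
    peers, "project-a-vpc", "project-b-vpc", "cloudsql-producer-vpc"
)
print(f"direct: {direct}, via proxy in Project B: {proxied}")
# prints "direct: False, via proxy in Project B: True"
```

Under this model, each hop is a direct peering, which is the property the proxy relies on.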

Discussion

17 comments
MaxNRG (Option: D)
Jan 7, 2024

D is the correct solution. To allow the Dataflow workers in Project A to connect to the private Cloud SQL instance in Project B, you need to set up VPC Network Peering between the two projects. Then create a Compute Engine instance without an external IP address in Project B on the peered subnet. This instance serves as a proxy server for the private Cloud SQL instance: the Dataflow workers connect through the peered network to the proxy instance, which then connects to Cloud SQL. This gives access to the private Cloud SQL instance without going over the public internet. Option A would allow access but would still go over the public internet. Options B and C would not work, since the Cloud SQL instance does not have a public IP address. So D is the right approach to resolve the connection issue while keeping the data private.

chrissamharris
Feb 27, 2024

I think you're incorrect. VPC Peering does not traverse the public internet. https://cloud.google.com/vpc/docs/using-vpc-peering

BIGQUERY_ALT_ALT (Option: D)
Jan 10, 2024

Option D is the correct answer. The reason is that you cannot access Cloud SQL or AlloyDB instances over a peered VPC connection, because they are hosted in a Google-managed service project, not in Project B itself. VPC Network Peering does not provide transitive routing, so accessing Cloud SQL directly is not possible without a proxy VM. https://cloud.google.com/vpc/docs/vpc-peering#spec-general

chrissamharris (Option: A)
Feb 26, 2024

A - The proxy is unnecessary: https://cloud.google.com/sql/docs/mysql/private-ip#multiple_vpc_connectivity

raaad (Option: A)
Jan 4, 2024

VPC Network Peering allows for the connection of two VPC networks so that they can communicate internally as if they were part of the same network.

Anudeep58
May 24, 2024

"The Cloud SQL instance is running in Project B and does not have a public IP address." The correct answer would be D. Any thoughts?

lipa31 (Option: D)
Jan 25, 2024

The reason: Cloud SQL supports private IP addresses through private services access. When you create a Cloud SQL instance, Cloud SQL creates the instance within its own virtual private cloud (VPC), called the Cloud SQL VPC. Enabling private IP requires setting up a peering connection between the Cloud SQL VPC and your VPC network.

ML6 (Option: D)
Feb 19, 2024

Option D. Source: https://cloud.google.com/sql/docs/mysql/private-ip#multiple_vpc_connectivity

josech (Option: D)
May 18, 2024

https://cloud.google.com/sql/docs/mysql/connect-multiple-vpcs

ccpmad (Option: A)
May 21, 2024

Proxy? No, it is not necessary. A.

fabiogoma (Option: A)
May 24, 2024

Why are so many people voting for D? There's no need for a proxy; the peering is enough to allow network traffic between subnets.

fabiogoma
May 24, 2024

Now I see why: I put this into ChatGPT and it thinks the right answer is D. I'm pretty sure that's a hallucination.

Matt_108 (Option: D)
Jan 13, 2024

Option D is the most aligned with best practices, in my view.

datapassionate (Option: D)
Jan 15, 2024

D. Set up VPC Network Peering between Project A and Project B. Create a Compute Engine instance without an external IP address in Project B on the peered subnet to serve as a proxy server to the Cloud SQL database.
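
On the Dataflow side, option D roughly translates into two settings: workers that run without public IP addresses on a subnet of the peered network, and a JDBC URL that targets the proxy's internal address. The sketch below only assembles those values and is not taken from the question. The project, region, subnet, proxy IP, database name, and the choice of PostgreSQL are all assumptions, and the flag names are the ones used by the Beam Python SDK for Dataflow, whereas the pipeline in the question uses the Java JdbcIO connector.

```python
import ipaddress


def jdbc_url_for_proxy(proxy_ip, port, database):
    """Build a JDBC URL for the proxy VM, refusing any address that is not private."""
    address = ipaddress.ip_address(proxy_ip)
    if not address.is_private:
        raise ValueError(f"{proxy_ip} is not a private address")
    # Assumption: a PostgreSQL instance; the question does not name the engine.
    return f"jdbc:postgresql://{address}:{port}/{database}"


def dataflow_worker_args(project, region, subnetwork):
    """Return Dataflow options that keep workers on internal IP addresses only."""
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--subnetwork=regions/{region}/subnetworks/{subnetwork}",
        "--no_use_public_ips",  # workers get no external IP address
    ]


# Hypothetical values, for illustration only.
url = jdbc_url_for_proxy("10.10.0.5", 5432, "sales")
args = dataflow_worker_args("project-a", "us-central1", "dataflow-subnet")
print(url)
print(" ".join(args))
```

The private-address check is a cheap guard against pointing the pipeline at a public endpoint by mistake; it does not prove that the route to the proxy actually exists.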

saschak94 (Option: D)
Jan 16, 2024

Using VPC Network Peering, Cloud SQL implements private service access internally, which allows internal IP addresses to connect across two VPC networks regardless of whether they belong to the same project or organization. However, since VPC Network Peering isn't transitive, it only broadcasts routes between the two VPCs that are directly peered. If you have an additional VPC, it won't be able to access your Cloud SQL resources using the connection set up with your original VPC.

JyoGCP (Option: D)
Feb 17, 2024

Option D

Lestrang (Option: A)
Jun 8, 2024

People keep referencing this passage: "VPC Network Peering does not provide transitive routing. For example, if VPC networks net-a and net-b are connected using VPC Network Peering, and VPC networks net-a and net-c are also connected using VPC Network Peering, VPC Network Peering does not provide connectivity between net-b and net-c." But the question states that Cloud SQL is running in Project B, which means the instance is already part of the VPC in Project B. So with VPC Network Peering, workers from Project A can definitely access data in Project B. No proxy is needed.

kajitsu (Option: A)
Jul 2, 2024

No proxy needed.

Lenifia (Option: A)
Jul 6, 2024

A is correct

kk1211
Jul 11, 2024

Still confused between A and D.