Professional Cloud Architect Exam - Question 122

Question

You have an application that runs in Google Kubernetes Engine (GKE). Over the last 2 weeks, customers have reported that a specific part of the application returns errors very frequently. You currently have no logging or monitoring solution enabled on your GKE cluster. You want to diagnose the problem, but you have not been able to replicate the issue. You want to cause minimal disruption to the application. What should you do?

Examice · Accepted Answer

To diagnose the frequent errors in the GKE application, enable Cloud Operations for GKE on your existing cluster and use the GKE Monitoring dashboard to investigate logs from the affected Pods. This approach adds monitoring and logging with minimal disruption to the application. Migrating to a new cluster, or adding tools such as Prometheus, would introduce unnecessary complexity and potential disruption.

TotoroChina · Answer

According to the reference, the answer should be A.
https://cloud.google.com/blog/products/management-tools/using-logging-your-apps-running-kubernetes-engine

XDevX · Answer

IMHO A is the correct answer, not C.
The point is that errors in GKE happen often in this scenario: within 2 weeks, many customers complained about many errors. We have no data at all for the past because nothing was monitored, so we can only collect data from now on to find out what the problem is. The additional value of an alert is not clear, and it is not clear to me why we would also need to install Prometheus, considering that until now we had no GKE monitoring at all. Please correct me if I am wrong.

JC0926 · Answer

A. 1. Update your GKE cluster to use Cloud Operations for GKE. 2. Use the GKE Monitoring dashboard to investigate logs from affected Pods.

By updating your GKE cluster to use Cloud Operations for GKE (formerly known as Stackdriver), you enable monitoring and logging without disrupting the application. The GKE Monitoring dashboard lets you investigate logs from the affected Pods, which helps you diagnose the problem that customers have reported. This approach minimizes disruption to the application while providing the information needed to identify and resolve the issue.
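As a rough illustration of step 1, the update can be done in place with `gcloud container clusters update`. The Python sketch below only assembles that command line; the cluster name `my-cluster` and zone `us-central1-a` are hypothetical placeholders, and the choice of `SYSTEM,WORKLOAD` for logging and `SYSTEM` for monitoring is an assumption about what you would want to collect, not something stated in the question.

```python
import shlex


def build_enable_ops_command(cluster: str, zone: str) -> list[str]:
    """Build the argv that turns on logging and monitoring for an existing cluster.

    The cluster is updated in place, so no workloads need to be migrated.
    """
    return [
        "gcloud", "container", "clusters", "update", cluster,
        f"--zone={zone}",
        # Collect both system and application (workload) logs.
        "--logging=SYSTEM,WORKLOAD",
        # Collect system metrics for the Monitoring dashboard.
        "--monitoring=SYSTEM",
    ]


# Hypothetical cluster name and zone, for illustration only.
command = build_enable_ops_command("my-cluster", "us-central1-a")
print(shlex.join(command))
```

The printed string can be pasted into a shell, or the list can be passed to `subprocess.run(command, check=True)` on a machine where `gcloud` is installed and authenticated.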

midori_jn · Answer

Could anyone kindly explain why B is incorrect? Thank you.

e5019c6 · Answer

If we enable Cloud Operations, we should be able to see the logs from this point onwards. Data on past errors would not be visible. It's not reasonable to expect developers to check the logs every hour for new occurrences of the error, and that's where an alert comes in handy. It will notify you when the conditions that led to the error appear again so that developers can analyze the logs and understand the problem.
I agree that installing Prometheus is not needed today, but it seems to have been the only way to set up alerts when this question was written and, in my opinion, the alerts are vital to diagnosing the problem.
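To make the "logs from this point onwards" idea concrete, here is a minimal sketch of a Cloud Logging filter that narrows the view to error-level container logs once logging is enabled. It only builds the filter string; the cluster name `my-cluster` and namespace `checkout` are hypothetical, and which namespace holds the affected part of the application is an assumption.

```python
def build_error_filter(cluster: str, namespace: str, min_severity: str = "ERROR") -> str:
    """Build a Logging query-language filter for container logs at or above a severity."""
    clauses = [
        # k8s_container is the monitored resource type for GKE container logs.
        'resource.type="k8s_container"',
        f'resource.labels.cluster_name="{cluster}"',
        f'resource.labels.namespace_name="{namespace}"',
        f"severity>={min_severity}",
    ]
    return " AND ".join(clauses)


# Hypothetical cluster and namespace, for illustration only.
print(build_error_filter("my-cluster", "checkout"))
```

The resulting string can be pasted into the Logs Explorer query box. The same filter could also serve as the starting point for a log-based alert, which is the gap this comment points out in option A.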

kratosmat · Answer

As described here
https://cloud.google.com/stackdriver/docs/solutions/gke
it is possible to install Prometheus as part of the Cloud Operations suite.

3ana · Answer

hi guys, for those who have the complete questions for this PCA exam, would you be kind enough to share it with me? I am scheduled to take the exam this coming June,  please send it to [email protected]. Thanks!

mastak1la · Answer

hi folks, for those who have the complete questions for this PCA exam, would you be kind enough to share it with me? I am scheduled to take the exam next week, please send it to [email protected]. Thanks!

JPA210 · Answer

I think answer A is enough, but if you want a more complete solution C could be a good option: https://cloud.google.com/stackdriver/docs/managed-prometheus

CyanideX · Answer

Answer is A

thewalker · Answer

As per https://cloud.google.com/stackdriver/docs/managed-prometheus, I feel the correct option is C.

rohen21 · Answer

Is the answer marked in green the real exam answer, or the community's most-voted one? I'm confused now, hehe.

[Removed] · Answer

A

A provides a solution native to Google Cloud.
Why not C?
According to GCP best practices for GKE, we should rely on the native logging capabilities, so there is no need for additional solutions like Prometheus. Also, the task is about reviewing logs and monitoring the service, not about receiving an alert each time the error happens, which would not provide any insight into the issue.
While Prometheus could potentially help identify when the issue occurs, it doesn't directly help with diagnosing the root cause of the problem.

B & D are rejected because migration will cause disruption.

https://cloud.google.com/blog/products/management-tools/using-logging-your-apps-running-kubernetes-engine

yas_cloud · Answer

Options A and C are less disruptive. Option C adds Prometheus on top, which looks like overkill for this simple, initial level of troubleshooting. I would go with option A.

Gino17m · Answer

C
1. "You currently have no logging or monitoring solution enabled on your GKE cluster" and "you have not been able to replicate the issue", so there is nothing interesting in the GKE Monitoring dashboard yet
2. No alerting in answer A

afsarkhan · Answer

B & D require creating a new GKE cluster, so we can ignore them.
Of A and C, A alone is enough to meet the requirement. Monitoring & Logging should help investigate the issue.
