Professional Cloud DevOps Engineer Exam - Question 80


You support a user-facing web application. When analyzing the application's error budget over the previous six months, you notice that the application has never consumed more than 5% of its error budget in any given time window. You hold a Service Level Objective (SLO) review with business stakeholders and confirm that the SLO is set appropriately. You want your application's SLO to more closely reflect its observed reliability. What steps can you take to further that goal while balancing velocity, reliability, and business needs? (Choose two.)

Correct Answer: BE

If your application consistently consumes only a small portion of its error budget, you can afford more frequent or riskier application releases. This can accelerate feature development without compromising reliability. Announcing planned downtime to consume more of the error budget also keeps users from becoming dependent on availability that exceeds the agreed-upon SLO, which aligns user expectations more closely with the SLO.
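
As a rough illustration of what "consuming 5% of the error budget" means, here is a minimal sketch for a request-based SLI. The 99.9% target, the request counts, and the function name `error_budget_consumed` are assumptions made up for this example; the question does not state the actual SLO.

```python
def error_budget_consumed(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the fraction of the error budget used (0.0 = none, 1.0 = all of it)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # The error budget is the number of requests that are allowed to fail.
    allowed_failures = (1.0 - slo_target) * total_requests
    return failed_requests / allowed_failures


# Hypothetical window: 99.9% SLO, 10 million requests, 500 failed requests.
consumed = error_budget_consumed(0.999, 10_000_000, 500)
print(f"Error budget consumed: {consumed:.1%}")
```

With these assumed numbers the service could have failed about 10,000 requests and still met its SLO, so 500 failures leaves most of the budget unspent.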

Discussion

17 comments
Trony (Options: BD)
Nov 3, 2021

I would go for B+D:
- A: no, there's no reason to add capacity if we are barely scratching the error budget;
- B: everything seems fine, so it's OK to dare more innovative/risky releases;
- C: no, stakeholders said the SLO is OK;
- D: adding additional SLIs (and so SLOs) might be a way to reflect observed reliability more closely;
- E: taking the servers down for no reason is a no-no.

Biden
Dec 13, 2021

There is no mention of innovation, only "risky", hence B is not the right choice.

SahandJ
Jun 19, 2024

Risk isn't necessarily bad. The SRE book specifically says to embrace risk. The question's constraint is to "balance velocity, reliability and business needs". If the application only ever consumes 5% of its error budget, that allows for more frequent updates (frequency and business needs). And since the application is already very reliable, there is room to focus on feature development. Remember, focusing too much on reliability can slow feature development to a halt, and likewise focusing too much on feature development can cause an unreliable system.
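
The trade-off described above is often written down as an error budget policy. A minimal sketch follows; the thresholds (25% and 90%) and the posture names are hypothetical values chosen for illustration, not something the SRE book or the question prescribes.

```python
def release_posture(budget_consumed: float, low: float = 0.25, high: float = 0.90) -> str:
    """Map the fraction of error budget consumed to a release posture.

    `low` and `high` are hypothetical thresholds; a real policy would be
    agreed with stakeholders.
    """
    if budget_consumed < 0.0:
        raise ValueError("budget_consumed cannot be negative")
    if budget_consumed >= 1.0:
        return "freeze"      # budget exhausted: reliability work only
    if budget_consumed >= high:
        return "slow down"   # close to the limit: fewer, safer releases
    if budget_consumed <= low:
        return "speed up"    # plenty of budget left: room for more risk
    return "steady"


# The application in the question never consumes more than 5% of its budget.
print(release_posture(0.05))
```

Under these assumed thresholds, a service that stays at 5% consumption sits firmly in the "speed up" band, which is the argument for option B.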

Sekierer (Options: DE)
Jan 20, 2022

I vote for D+E; see "The Global Chubby Planned Outage" at https://sre.google/sre-book/service-level-objectives/

PhilipKoku (Options: BE)
Feb 13, 2022

B - You can increase the frequency of your releases and take higher risks, as you have never exceeded your error budget. E - Planned downtime to use some of your error budget will help make sure end users don’t get used to a higher availability of your service.

[Removed] (Options: DE)
Mar 26, 2022

D+E. You want the application's SLO to more closely reflect its observed reliability. The key here is that error budget consumption never goes over 5%. This means they can have additional downtime and still stay within their budget. E is correct as per the Google SRE book (https://sre.google/sre-book/service-level-objectives/): 'You can avoid over-dependence by deliberately taking the system offline occasionally (Google’s Chubby service introduced planned outages in response to being overly available)'. D is a good answer because more SLIs may reflect the system's reliability more accurately. A is wrong because adding more serving capacity would make the system even more available. C is wrong because the question states that the SLO is set appropriately.
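
To put a number on the "additional downtime" mentioned in this comment, here is a minimal sketch for a time-based availability SLO. The 99.9% target and the 30-day window are assumptions for illustration only, and `remaining_downtime_minutes` is a hypothetical helper name.

```python
def remaining_downtime_minutes(slo_target: float, window_days: int, budget_consumed: float) -> float:
    """Return the downtime (in minutes) still available in the window's error budget."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if not 0.0 <= budget_consumed <= 1.0:
        raise ValueError("budget_consumed must be between 0 and 1")
    window_minutes = window_days * 24 * 60
    # Total budget: the share of the window the service may be unavailable.
    total_budget_minutes = (1.0 - slo_target) * window_minutes
    return total_budget_minutes * (1.0 - budget_consumed)


# Assumed: 99.9% availability SLO, 30-day window, 5% of the budget already used.
left = remaining_downtime_minutes(0.999, 30, 0.05)
print(f"Downtime still within budget: {left:.2f} minutes")
```

Under these assumptions the full budget is about 43 minutes per window, so roughly 41 minutes remain that a planned outage could use without breaching the SLO.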

eks4x (Options: BE)
Dec 13, 2022

B+E. B, because if you constantly have a lot of spare error budget, it is an indication that you are not taking enough risk, i.e. releasing new features, and you are ultimately depriving users of new functionality by being too cautious. E: everyone agrees on E, as it is mentioned in the SRE book as part of "The Global Chubby Planned Outage". Why not D? The review indicated that the existing SLOs are good, so adding more SLIs is not useful here, and it does nothing for user-perceived reliability.

JayDeng (Options: BE)
Dec 22, 2022

B and E. When you consistently consume only 5% of your error budget, it means that you can take more risk by releasing features more often (B) and/or bring the service down to set user expectations close to the SLO (and the business has confirmed that this SLO is appropriate).

TNT87 (Options: DE)
Dec 27, 2021

These are the correct choices

eliC (Options: BD)
Jun 17, 2022

B & D are correct.

shefalia (Options: DE)
Dec 25, 2022

This was asked on 12/24/22, and I passed the exam. I opted for D & E.

TNT87
Dec 28, 2021

This link explains the answers to this question: https://cloud.google.com/blog/products/management-tools/sre-error-budgets-and-maintenance-windows

zygomar (Options: DE)
Feb 21, 2022

Check the link from Sekierer for why E is valid (https://sre.google/sre-book/service-level-objectives/). Then D is logical as well.

Greg123123 (Options: BE)
Dec 31, 2022

B and E. A: not relevant. B: yes, because we have a lot of budget. "Risky" isn't necessarily a negative word in SRE, because what we learn from SRE is to embrace risk and failure. C: they say the SLO is set appropriately. D: adding more SLIs doesn't necessarily help. E: SRE practice suggests that we can have planned downtime.

Catweazle1983 (Options: BE)
Feb 23, 2023

B is correct because when you don't use your error budget, you can increase the release frequency. The question even mentions "balancing velocity, reliability, and business needs". The balance here can shift from reliability to velocity and business needs. E is correct, as multiple other users have already mentioned, because of "The Global Chubby Planned Outage": https://sre.google/sre-book/service-level-objectives/

dobby_elf (Options: DE)
Aug 16, 2022

DE - You want your application's SLO to more closely reflect its observed reliability.

JonathanSJ (Options: DE)
Jan 16, 2023

I will go with D and E. Option B sounds good, but introducing new changes could add errors, which does not match the current objective: "You want your application's SLO to more closely reflect its observed reliability." A doesn't make sense. Neither does C, because the SLO has been reviewed and it's OK.

izekc (Options: BE)
Mar 11, 2023

BE is correct

jomonkp (Options: BD)
Dec 2, 2023

Option B and D