The integration of development and operations for better outcomes.
I had the opportunity to catch up with Andi Grabner, DevOps Activist at Dynatrace, during day two of Dynatrace Perform. I've known Andi for seven years, and he's one of the people who have helped me understand DevOps since I began writing for DZone.
We covered several topics that I'll share in a series of posts.
How do DevOps and SRE work together?
The term SRE itself comes from Google. Essentially, it takes the automation that DevOps put into deploying changes faster through the pipeline and applies it to operations. SRE is about automating the operational aspects of software. Maybe five years ago, you would have just called that ITOps. Now it's called SRE, or Site Reliability Engineering.
I think both DevOps and SRE have evolved to use automation and code to automate something in a smart way and also in a codified way. Code is important because you can source control code. You can keep a history of all of your pipelines. The same is true for SRE. SRE tries to use the same approach, automation through code, for the operational aspects of your software.
Therefore, SRE and DevOps work really nicely in tandem. I have a slide where DevOps and SRE are holding hands. They're holding hands because, in the end, it's all about automating delivery. SRE really focuses more on automating the resiliency of the stuff that comes out of DevOps.
How about shift left versus shift right? Is that an "and" or an "or"?
It's an "and." Shift left is really about thinking about all of these constraints earlier. It means thinking early on about how we deal with observability, and encouraging the developers to think about what type of data they need in order to figure out if the system is healthy.
Traces, logs, and starting testing earlier are the classical shifting left. Shifting right is about knowing how my system is performing. It's like knowing the heart rate of my system, such as my response time. In development, shifting right means I want to tell the SRE team that is responsible for running my software: this is how you run it, this is what I want to see from an observability perspective, and these are my thresholds. If these are not met, then I want you to execute these actions from a performance, availability, and reliability perspective.
I think we always had the classical Dev and Ops divide. Development would build something and throw it over the wall. Then Operations had to figure out how to run it properly, how to scale it, and how to do capacity control.
Now, we're saying we need to look at all of these aspects much earlier. We need to figure out up front how we do observability in development, not just in operations. That's why we define observability early, so we can test it out.
We are taking all of these ingredients and identifying what we are going to observe. Let's also observe it in production. We know what the thresholds are. We know what makes our system healthy. Let's make sure we are also validating this in production. If something is failing in testing, we know what to do to bring the system back to an ideal state. Let's codify this also in production to bring the system back in an automated way. That's my definition of shifting right.
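To make the idea of codified thresholds and actions concrete, here is a minimal Python sketch. It is an illustration only, not something Andi showed or a Dynatrace API: the metric names, limits, action names, and the `Threshold`, `breached`, and `remediate` helpers are all hypothetical. The point is that one definition, written early and kept in source control, could be evaluated against test metrics and production metrics alike.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Threshold:
    """One health rule: a metric, its upper limit, and what to do on breach."""
    metric: str        # hypothetical metric name, e.g. "response_time_ms"
    max_value: float   # values above this limit count as unhealthy
    action: str        # hypothetical name of the remediation to run


# Assumed example values; defined once in development and source controlled.
THRESHOLDS = [
    Threshold("response_time_ms", 500.0, "scale_out"),
    Threshold("error_rate_pct", 1.0, "roll_back"),
]


def breached(metrics: Dict[str, float],
             thresholds: List[Threshold]) -> List[Threshold]:
    """Return the thresholds whose metric is reported and above its limit."""
    return [t for t in thresholds
            if metrics.get(t.metric, 0.0) > t.max_value]


def remediate(metrics: Dict[str, float],
              thresholds: List[Threshold],
              actions: Dict[str, Callable[[], None]]) -> List[str]:
    """Run the action for each breached threshold; return the names that ran."""
    executed = []
    for t in breached(metrics, thresholds):
        actions[t.action]()  # look up and call the codified remediation
        executed.append(t.action)
    return executed
```

The same `THRESHOLDS` list can be passed to `remediate` with metrics from a test run or from production. Only the `actions` mapping needs to differ per environment, for example failing the build in testing versus scaling out in production.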