About SRE – and how (not) to apply it
In short, “SRE is what happens when you ask a software engineer to design an operations team.” Let's dive a bit deeper into what SRE (or “Site Reliability Engineering”) means – and then take a step back. SRE was originally created internally at Google to solve Google's challenges of running Google's production systems at Google scale. How can that even work for you if you are not Google? It won't! Unless you know how to apply Google's lessons to your possibly very different organization. Following SoundCloud's five-year mission toward a reliable site should give you plenty of inspiration.
Level: Non-technical / For everyone
Björn Rabenstein is a Production Engineer at SoundCloud and a Prometheus developer. Previously, Björn was a Site Reliability Engineer at Google and a number cruncher for science.