This Cloudflare outage is really something. I’ve worked alongside data center operations and I’ve seen multiply redundant systems fail in chaotic ways like this. It’s so hard to predict. And yet, stuff like “they needed to be physically accessed … access control system was not powered by the battery backups, so it was offline” is such an obvious design flaw in hindsight.
https://blog.cloudflare.com/post-mortem-on-cloudflare-control-plane-and-analytics-outage/