I'm not a GitHub apologist, but that graph isn't to scale at all. It's massively zoomed in, with a lower bound of 99.5%. It makes things look far worse than they are.
If you plotted it from zero, then a horrible service and a great service would be indistinguishable. Their SLA for enterprise customers is 99.9%. The low end of that chart represents 5x that much downtime. It is a reasonable scale for the range people are concerned about, and it looks bad because it is bad.
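To put the "5x" figure in concrete terms, here is a quick back-of-the-envelope calculation (assuming a 30-day month; the exact month length shifts the numbers slightly but not the ratio):

```python
# Downtime budgets at different availability levels, per 30-day month.
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes

def downtime_minutes(availability: float) -> float:
    """Minutes of downtime implied by an availability fraction."""
    return MINUTES_PER_MONTH * (1 - availability)

sla_budget = downtime_minutes(0.999)  # 99.9% SLA: ~43.2 min/month
chart_low = downtime_minutes(0.995)   # 99.5% chart floor: ~216 min/month

print(f"99.9% budget: {sla_budget:.1f} min/month")
print(f"99.5% floor:  {chart_low:.1f} min/month "
      f"({chart_low / sla_budget:.0f}x the budget)")
```

So the bottom of the chart corresponds to roughly 3.6 hours of downtime in a month, against an SLA budget of about 43 minutes.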
> If you started the y-axis at zero, you wouldn't see much of anything.
That's... kind of my point.
As a reliability engineer, I'm disappointed in GitHub's 99.5% availability periods, especially as they impact paying customers. On the other hand, most users are non-paying users, and a 99.5% availability for a free service seems to me to be a reasonable tradeoff relative to the potential cost of improving reliability for them.
> the other hand, most users are non-paying users, and a 99.5% availability for a free service seems to me to be a reasonable tradeoff relative to the potential cost of improving reliability for them.
If they are using your data, you're still paying, just not in cash.
As a former reliability engineer, I'm trying hard to remember a time when we had multiple months in a row without ever reaching 100% uptime, and I can't. Yes, we've seen runs of painful months, but also runs of easy months without downtime.
But let's talk root cause here: the cost of improving these numbers is someone caring. This isn't simply a hard problem, it's a well-understood hard problem that no one who makes decisions cares about. As a reliability engineer, I find that an embarrassment. Uptime is one of those foundational aspects that you can build on top of. If you're not willing to invest in something as core as "your code or service works," what are you even doing?
I don't think so. Even before Microsoft acquired GitHub, you could have as many private repos as you wanted, but you couldn't have more than 3 collaborators. This change happened back in 2019: