I work for a company that has operated like this for 20 years. The system goes down sometimes, but we can fix it in less than an hour. At worst the users get a longer coffee break.
A single click in the software can often generate 500 SQL queries, so if you go from 0.05 ms to 1 ms of latency per query, you add roughly half a second to each click in the UI, and that would piss our users off.
Definitely not saying this is the best way to operate at all times. But SQL has a huge problem with false dependencies between queries, and with APIs that make it very difficult to pipeline queries, so in my experience I/O-bound applications easily become extremely sensitive to latency.
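To make the arithmetic concrete, here is a back-of-the-envelope sketch. It assumes the queries run strictly one after another and that only the per-query round-trip latency changes; the function name is made up for illustration.

```python
def added_click_latency_ms(queries: int, old_rtt_ms: float, new_rtt_ms: float) -> float:
    """Extra wall-clock time per click when every query waits for the previous one."""
    return queries * (new_rtt_ms - old_rtt_ms)


# Numbers from the comment: 500 queries, 0.05 ms -> 1 ms per round trip.
extra = added_click_latency_ms(500, 0.05, 1.0)
print(f"{extra:.0f} ms added per click")
```

That comes to about 475 ms, so "half a second" is a fair rounding.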
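A toy model of why the dependencies matter, assuming a query can only be sent once the queries it depends on have returned, and that the dependency map has no cycles. The click then costs one round trip per level of the dependency chain instead of one per query. All names and the dependency maps below are hypothetical.

```python
def round_trips(deps: dict[str, list[str]]) -> int:
    """Length of the longest dependency chain, i.e. the round trips needed
    if all independent queries could be sent together."""
    depth: dict[str, int] = {}

    def visit(query: str) -> int:
        # A query sits one level below its deepest dependency.
        if query not in depth:
            depth[query] = 1 + max((visit(d) for d in deps[query]), default=0)
        return depth[query]

    return max((visit(q) for q in deps), default=0)


# Hypothetical click: four queries, only q4 truly needs an earlier result.
real = {"q1": [], "q2": [], "q3": [], "q4": ["q1"]}
# The same queries through a blocking API: each one waits for the previous one.
serialized = {"q1": [], "q2": ["q1"], "q3": ["q2"], "q4": ["q3"]}

rtt_ms = 1.0
print(round_trips(real) * rtt_ms, "ms if pipelined")
print(round_trips(serialized) * rtt_ms, "ms if serialized")
```

In this sketch the false dependencies double the cost of the click (four round trips instead of two), and the gap grows with the number of queries.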
I’m going to guess quite a few people here work at businesses where “sometimes breaks, but fixed in less than an hour” isn’t good enough for reliability.
Yeah, if you need even 99.9% uptime, the most downtime you can accept in a year is under nine hours.
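The downtime budget for a given uptime target is easy to work out. A quick sketch, assuming a 365-day year (the helper name is just for illustration):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_budget_hours(uptime_percent: float) -> float:
    """Hours of downtime per year allowed at the given uptime target."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


for target in (99.0, 99.9, 99.99):
    print(f"{target}% uptime -> {downtime_budget_hours(target):.2f} h/year")
```

For 99.9% this gives roughly 8.76 hours a year, and each extra nine cuts the budget by a factor of ten.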
Most businesses don’t require that kind of uptime though. If I killed our servers for a couple of hours between 02:00 and 04:00 every night, probably nobody would notice for at least a year if it wasn’t for the alerts we’d get.