When our cat, Mishka, stopped eating, the vet said there was very little chance it was anything other than her infected tooth driving her white blood cell count through the roof. Three months later, she died of the stomach cancer that had gone undetected.
Ordinarily I wouldn’t get so personal or morbid, but the death of my cat illustrates something we typically forget. As a culture, in business and otherwise, we’ve convinced ourselves that a single backup is sufficient. And usually we’re right. In the face of a single catastrophic event, redundancy delivers reliability. In the short term, it can push your reliability as high as 100%. What the death of my cat taught me is that the universe is not that predictable.
If you could get your business right 99.9% of the time, you’d feel pretty good about yourself, right?
In the world of service providers, 99.9% uptime means almost nine hours of unavailability a year, just over a minute a day. Is that good enough? Not for those contract calls that didn’t get through, or the factories without power, or those transit riders stuck in a dark tunnel instinctively wondering if they’ll ever see the light of day again.
Last week our SIP trunking customers experienced a disruption in service for almost an hour. The cause was an unforeseen fault in a piece of open-source switch code. Averaged over the year, that’s an uptime of approximately 99.99%. Is that good enough? No. A few of our customers threatened to leave, or at least to drop us to backup status. And that’s to be expected. As unachievable as it is, perfection is in demand.
The pace of business is constant and relentless. Companies, particularly SMBs, don’t have time to slow down for interruptions. To be nimble, businesses need results they can count on. And the reality is, sometimes that means better than 99.99% uptime (under one minute of downtime a week).
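For anyone who wants to check the arithmetic behind these uptime figures, here is a minimal sketch in Python. The function names are my own, made up for illustration, and the year is assumed to be 365 days, so treat the results as approximate.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Hours of downtime allowed in a period at a given uptime percentage."""
    return period_hours * (1 - uptime_percent / 100)


def uptime_percent(outage_hours: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Uptime percentage implied by a total outage time within a period."""
    return 100 * (1 - outage_hours / period_hours)


# 99.9% uptime: roughly 8.76 hours a year, or about 1.44 minutes a day
print(round(downtime_hours(99.9, HOURS_PER_YEAR), 2), "hours per year")
print(round(downtime_hours(99.9, 24) * 60, 2), "minutes per day")

# 99.99% uptime: roughly one minute a week
print(round(downtime_hours(99.99, 7 * 24) * 60, 2), "minutes per week")

# A single one-hour outage, averaged over a year
print(round(uptime_percent(1.0), 3), "percent uptime")
```

Each extra nine cuts the downtime budget by a factor of ten, which is why the gap between 99.9% and 99.99% feels so much larger to customers than it looks on paper.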
Most of us, with a few notable exceptions, back up our business. But no amount of planning and redundancy can completely insulate you against failure. You can be prepared to react to disaster, but you cannot control your 360-degree environment. Given time, a meteor will get through. The perfection demanded by today’s users is the goal we should all strive toward, but 100% forever is an unachievable dream. The future is too uncertain.
For the most part, people will be reasonable. Customers will give you slack from time to time. But regular or repeated failure is hard to reward with loyalty.
So when you do fail, stand up, dust off, and take the steps to keep what tripped you up from ever happening again. You will never be good enough for everyone, and that’s OK. That’s business. But if you fail to learn from your failures, and they happen again, soon you won’t be good enough for anyone.