Amazon finally restored the “great majority” of its EC2 customers after a failure in its Elastic Block Store (EBS) cloud storage service left websites in limbo last Thursday and Friday. The outage has already generated more than its share of commentary, and as usual there are more extreme views than useful ones. Some say the problem demonstrates that the cloud isn’t reliable, some that it demonstrates that the cloud is perfect. It demonstrates both and neither, in my view.
What Amazon’s problem shows, first, is that our view of the cloud is simplistic to the point of being dangerous, though that seems to be true of the popular view of just about everything these days. Second, it shows that people aren’t looking deeply enough into cloud computing when they commit to it. There’s no substitute for knowing what you’re doing.
But third, providing any form of high reliability is always harder when you’ve ceded control to a third party. Nobody at the enterprise level has a good feel for how EBS might affect reliability. Given that, all you can do is rely on an SLA, and anyone who’s ever looked at a cloud SLA knows that in the end it’s not particularly valuable.