At least the downtime gods picked a notoriously low-traffic day to punish Amazon’s S3 storage service. The darling of many web apps, including the popular Twitter, was down for eight hours Sunday.
Amazon’s service provides storage and transfer of data for a small fee, so that developers can let their own servers focus on more important issues. For small upstart sites, it’s cheaper to buy storage a tiny slice at a time than to invest in a lot of hardware. S3 is especially popular for hosting images, which can be large files in comparison to trim HTML. There are also often many images per web page, which puts extra strain on a server.
Among the revelations for some developers after a third of a day without S3 is that no single service can be counted on for 100% uptime. Of course, there’s oodles of redundancy built into Amazon’s service. Still, it can go down, leaving many sites that count on it with a single point of failure.
Dave Winer sees a business opportunity in S3 redundancy:
“It would be easy to hook up an external service to S3, and for a fee, keep a mirror on another server. Then it would be a matter of redirecting domains to point at the other server when S3 goes down.”
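To make the mirroring idea concrete, here is a minimal sketch in Python. It is an illustration only: `InMemoryStore` and `sync_mirror` are hypothetical names, and the in-memory dictionaries stand in for S3 and the second server rather than calling any real storage API.

```python
from typing import Dict


class InMemoryStore:
    """Hypothetical stand-in for a storage service (S3 or a mirror server)."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self.blobs[key] = data


def sync_mirror(primary: InMemoryStore, mirror: InMemoryStore) -> int:
    """Copy every object that is missing or stale on the mirror.

    Returns the number of objects copied.
    """
    copied = 0
    for key, data in primary.blobs.items():
        if mirror.blobs.get(key) != data:
            mirror.put(key, data)
            copied += 1
    return copied


# Example: two objects live on the primary, none on the mirror yet.
primary = InMemoryStore()
mirror = InMemoryStore()
primary.put("logo.png", b"png-bytes")
primary.put("avatar.jpg", b"jpg-bytes")

first_run = sync_mirror(primary, mirror)   # copies both objects
second_run = sync_mirror(primary, mirror)  # nothing left to copy
```

Run on a schedule, a job like this keeps the second copy close to current. Redirecting domains during an outage, the other half of the idea, is a DNS matter that this sketch does not cover.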
Developers could achieve the same result Winer mentions on their own. Robert Accettura notes how WordPress.com weathered the S3 outage gracefully:
“They have (slower) back up’s in house for when S3 is down and can failover if S3 has a problem. This means they can leverage S3 to their advantage, but aren’t down because of S3.”
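The failover described in that quote might look something like the following Python sketch. It is not how WordPress.com actually does it; the function names, the `StoreUnavailable` exception, and the simulated outage are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence


class StoreUnavailable(Exception):
    """Raised by a fetcher when its storage backend cannot be reached."""


def fetch_from_s3(key: str) -> bytes:
    # Hypothetical primary fetcher; here it simulates an S3 outage.
    raise StoreUnavailable("S3 is down")


def fetch_from_backup(key: str) -> bytes:
    # Hypothetical in-house backup: slower, but still up.
    backup = {"logo.png": b"png-bytes"}
    try:
        return backup[key]
    except KeyError:
        raise StoreUnavailable(f"{key!r} not in backup") from None


def fetch_with_failover(
    key: str, fetchers: Sequence[Callable[[str], bytes]]
) -> bytes:
    """Try each fetcher in order and return the first successful result."""
    last_error: Optional[Exception] = None
    for fetch in fetchers:
        try:
            return fetch(key)
        except StoreUnavailable as exc:
            last_error = exc  # remember the failure, then try the next store
    raise StoreUnavailable(f"no store could serve {key!r}") from last_error


image = fetch_with_failover("logo.png", [fetch_from_s3, fetch_from_backup])
```

Listing the fast store first and the slow one second means visitors only pay the backup's slower response time while the primary is actually failing.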
As many have noted, true web scalability and redundancy can be a tough sheep to shear. While Larry Dignan questions whether S3 is too complicated, I think the larger issue is that it is too simple. Too many have viewed it as a silver bullet, with Amazon doing the dirty work for them. This outage (and another back in February) has shown that S3 and services like it can help us a lot, but we still need to do our own work.