Yottabytes: Storage and Disaster Recovery

Aug 31 2015   10:49PM GMT

Lightning, A/C Failures Take Out Data Centers

Sharon Fisher

Tags:
Data Center
Disaster Recovery
power

Typically when we think of disasters taking out data centers, we think of hurricanes, floods, even snowstorms. But August has been particularly rough on data centers this year.

First, lightning struck the utility grid used by one of Google’s three data centers in St. Ghislain, Belgium, a small town about 50 miles southwest of Brussels, knocking the center offline and causing some data loss.

You know how they say lightning never strikes twice? Well, it hit the grid that Google uses four times.

Naturally, there were backup systems, but they failed too, writes Yevgeniy Sverdlik in Data Center Knowledge. “Besides failover systems that switch to auxiliary power when primary power source goes offline, servers in Google data centers have on-board batteries for extra backup,” he writes. “But some of the servers failed anyway because of ‘extended or repeated battery drain,’ according to the incident report.”

The storage in question was part of the Google Compute Engine (GCE) disks, which allow customers to run cloud-based virtual machines, according to Mike Brown in the International Business Times. “It’s not the first time GCE has had issues,” Brown writes. “In February, GCE experienced a global outage that lasted for nearly two hours affecting businesses that depend on GCE for their day-to-day operations. GCE is seen as a competitor to Amazon AWS and Microsoft Azure for dominance of the cloud, but instances like these will shake consumer confidence in the GCE brand as they look for the most stable cloud services possible.”

Brown noted that “To be sure, AWS and Azure have also had their share of outages,” such as Virginia thunderstorms in 2012 that took out major Internet services such as Netflix, Pinterest, and Instagram.

Altogether, Google servers had problems for about five days, with a resultant loss of 0.000001 percent of data, Sverdlik writes. (How many bytes that is, Google didn’t say.) It’s not known which clients were affected, or what type of data was lost, according to the BBC.
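For a rough sense of scale, here is a minimal sketch that converts the reported 0.000001 percent into bytes. Google did not disclose how much data was stored, so the capacities below are purely hypothetical round numbers, and the function name bytes_lost is made up for illustration.

```python
# 0.000001 percent is 1e-6 / 100 = 1e-8, i.e. one part in 100 million.
# Integer arithmetic keeps the results exact.
LOSS_DENOMINATOR = 10**8


def bytes_lost(total_bytes: int) -> int:
    """Bytes lost if 0.000001 percent of total_bytes is lost (hypothetical helper)."""
    return total_bytes // LOSS_DENOMINATOR


# Hypothetical capacities (decimal units); none of these figures come from Google.
EXAMPLES = [
    ("1 terabyte", 10**12),
    ("1 petabyte", 10**15),
    ("1 exabyte", 10**18),
]

for label, total in EXAMPLES:
    print(f"{label}: about {bytes_lost(total):,} bytes lost")
```

Under those assumptions, the loss ranges from about 10 kilobytes per terabyte stored to about 10 gigabytes per exabyte.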

“Having worked in data recovery, that’s a remarkable achievement and a definite feather in Google’s bow,” commented one reader.

Google staff apparently also had to intervene manually to get the servers back to normal, the company wrote in its incident report. “In almost all cases the data was successfully committed to stable storage, although manual intervention was required in order to restore the systems to their normal serving state.” The company also pointed out that users need to keep additional copies of their data in case of such incidents. “GCE instances and persistent disks within a zone exist in a single Google data center and are therefore unavoidably vulnerable to data center-scale disasters,” Google wrote, recommending GCE snapshots and Google Cloud Storage.

Next, a failed chilled water pipe caused the air conditioning system to fail in a CenturyLink data center in Weehawken, N.J. This data center provides facilities for a number of companies, including education company Pearson, Thomson Reuters, and trading companies BATS Global Markets and Investment Technology Group.

As a precaution, CenturyLink reportedly shut down its systems, meaning that the companies went offline as well. Incidentally, this was happening at the same time the stock market was tanking last week.

And this is just in August.

By the way, the hurricane season typically enters its heaviest phase on September 1. We’re already up through Fred.

1  Comment on this Post

 
  • hestana
    Wonderful, you always care to help every customer.
