Cut costs, improve efficiency. Such is the mantra of many a data center manager. While tech giants like Google and Facebook strive to create better, more energy efficient data centers, a small team of researchers from Cornell University and Microsoft have gone back in time 120 years and come up with a way to eliminate another threat to efficiency: cables.
Mathematician Arthur Cayley published a paper in 1889 called On the Theory of Groups – mathematical groups, that is – that was full of graphs and equations.
In 2012, those graphs and equations were used to design a wireless data center network running on a 60 GHz wireless band.
According to the paper’s abstract, the benefits, besides eliminating networking cable and switch costs, would include higher bandwidth and fault tolerance and lower latency. Adopting a spoke-and-wheel rack system for servers would facilitate communication, and specially built Y-switches would help direct that traffic between racks.
This would mean a complete change in server form factor. The basic parts would remain the same – hard or solid state drive, CPU and RAM – with networking cards replaced with Y-switches. The paper goes on to mention changes to data center routing protocols, MAC layer arbitration and design schematics for the customized Y-switch. The full document is available on Cornell’s Website.
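For readers who haven't met a Cayley graph before, here is a minimal sketch of the idea in Python. It is not the topology from the Cornell/Microsoft paper, whose group and generators aren't described here; the group (integers mod 8) and the generator set {1, 4} are assumptions chosen only to keep the example small. The point it shows is why such graphs appeal to network designers: every node ends up with the same number of links and the same view of the network.

```python
def cayley_graph(n, generators):
    """Adjacency sets for the Cayley graph of the integers mod n under addition.

    Illustrative only: n and the generators are assumed values, not the
    parameters used in the wireless data center paper.
    """
    # Close the generator set under inverses so every link is two-way.
    gens = {g % n for g in generators} | {(-g) % n for g in generators}
    gens.discard(0)  # the identity element would only add self-loops
    # Each node v is linked to v + g for every generator g.
    return {v: {(v + g) % n for g in gens} for v in range(n)}


graph = cayley_graph(8, [1, 4])
for node, neighbors in graph.items():
    print(node, sorted(neighbors))
```

Swapping in a different group or generator set changes the wiring, but the uniformity stays, which is what makes routing rules easy to state once and apply everywhere.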
Once servers are cylindrical, it’ll be exciting to see how the buildings surrounding them change to suit. What do you think? More data centers in old missile silos?
That’s right, science has just upped the ante for cool data center locations. Forget Iceland or Oregon, now there’s a proposal to put a supercomputer on the moon.
Ouilang Chang, a graduate student at the University of Southern California, presented what he calls the Lunar Supercomputer Complex to help ease the burden of scientific “big data” processing and network traffic on terrestrial facilities. Chang says the rate of information is starting to exceed the ability of the networks to keep up. Not only that, but Chang feels the lunar site would give scientists the ability to boost the power and effectiveness of telescopes here on Earth.
In spite of the fact that this is such a vast undertaking, Chang wants to see it done in the next 10-15 years. If you consider that there’s already a plan in place to start mining asteroids, this isn’t too huge a leap.
The Lunar Supercomputer Complex would be built on the far side of the moon, which would allow moderate protection against extreme temperatures and access to potential ice under the surface for cooling. The complex would include communication arrays, a data center with supercomputers, data storage, power, cooling, radiation shielding and ultimate nerd-cred for any data center architects lucky enough to draw the straw to work on the project.
The good news is they can probably save money on locks and security personnel. But would this facility ever be built?
Back in January, when Newt Gingrich proposed his moon base, people jumped on it as an “impossible dream” due to lack of available funding. But maybe if the billionaire asteroid miners take notice, the moon data center could come to fruition.
Here’s the scenario: You’re the only tech-savvy guys in a very small town, and you come across mysterious devices labeled “Pluto Switch” sitting in your distribution center. What do you do?
Post about it on the Internet, of course! And that’s exactly what happened, according to an article on Wired.com.
The gentlemen posted images of the hardware, which turned out to be a custom-built network switch, on a networking forum and tried to get it working. Many forum users offered help, wanted clearer pictures or asked to buy the switch, but in the end a little bit of sleuthing revealed the switches were meant for Google.
Google builds much of its own hardware, as shown by hardware job postings on the company’s site, and further deduction revealed the town – Shelby, Iowa – was only 30 miles from one of Google’s data centers. One of the men theorized the switches were left at their facility by mistake during a delivery run.
The device was sent back to Google, the men who found it got T-shirts and Google tried to shut down the post with the images. So far, the forum hasn’t removed the post, but it has scrubbed all reference to the original posters’ identities.
Sherlock Holmes would be proud, Internet sleuths.
The last time I bought a video game, I grudgingly went to a brick and mortar store but checked the ratings on Amazon before I went. Buy local, shop global and whatnot. The last time I bought a computer – or parts for one, rather – I bought from NewEgg after scouring reviews for hours.
If you wanted to shop for enterprise software or hardware, there was no business equivalent of, say, an Amazon where you could see what other users had to say about a particular product. And if you wanted to warn other IT pros about a particular blade server that melted in its rack or you just needed to rant about XenServer, you were relegated to various forums scattered around the Web.
But a new site called IT Central Station attempts to fill that void. According to the company, the site will feature social networking, user validation and a bunch of different categories.
The site presents itself as a peer review site for IT with no vendor bias – though they’ll happily accept ads! – and as a thriving community for IT guys to bemoan terrible products or sing praises for the good ones.
I don’t know about you, but I always take online reviews with a grain of salt. I’ve seen enough padded, biased reviewers to know when I’m being conned. Are you that savvy? Do you think IT Central Station can somehow avoid the bias?
To celebrate the Labor Day weekend, here are a few light-hearted gems from the Twitterverse you might hear on the server room floor.
In response to the news that President Obama would do an “Ask me anything” session on social hub Reddit, users of the service crashed the site.
Twitter denizen Ethan Kaplan posted the following in response:
Next up is a facepalm-worthy tweet from Rob Malda, aka @cmdrtaco. Let’s hope the referenced study by Citrix was exaggerating a bit.
Trolling, for those who don’t know, is a little like playing pranks, only generally mean-spirited and performed on an undeserving target. Nagios, an open source network monitoring tool, sends lots of alerts, which can get annoying very quickly.
This one is straight from the fingertips of the venerable DevOps Borat. This little quip raises fourth generation programming languages to “old man on the mountain” level.
The CoderVersion hashtag floated its way around Twitter and left us giggling. Though there were too many to post, here’s one that might hit close to home.
Early reports on the acquisition picked up this definition of Xsigo’s I/O virtualization appliance and ran with it, lumping the Xsigo purchase in with VMware’s blockbuster acquisition last week of networking virtualization player Nicira.
Then came the social media outrage. The term “SDN-washing” was thrown around.
Quoth Joe Onisick (@jonisick) on Twitter:
“Xsigo is to SDN what McDonald’s is to fine dining.”
A little while later, Nicholas Weaver (@lynxbat) chimed in with this gem:
“Every time a tech reporter compares Xsigo to Nicira, a puppy dies.”
The IT world has had a decades-long love triangle with air- and water-cooling. Air-cooling takes IT to the prom, but now water-cooling is holding up a boom box outside IT’s window to win it back.
IBM has made plenty of headlines with the “world’s fastest” supercomputer, Sequoia. But it also made waves by introducing a new commercial supercomputer, the SuperMUC. It boasts direct hot-water cooling and superb energy efficiency – using 40% less energy than air-cooling, says IBM.
The PR video from the Leibniz Supercomputing Centre says the SuperMUC’s cooling system is based on the human circulatory system – a fun medicine/technology crossover. Cold water goes directly to the processors, and the heated water flows out to a heat exchanger, which then heats the facility.
Apparently, the facility housing SuperMUC has successfully eliminated CRACs from the equation and is saving Leibniz a million euros a year. IBM used to cool mainframes with water, but increased processor density and cheaper air conditioning drove data centers to adopt air-cooling. Now that energy costs are on the rise and there’s an emphasis on going green, companies are once again looking to liquids to cool their machines. Plus, according to Robert McFarlane, Principal at Shen Milsom and Wilke, it’s hard to argue with the fact that “water is approximately 3,500 times more efficient than air.”
The hurdle for many facilities is infrastructure. Liquids require pipes. Even SuperMUC wouldn’t be able to use that capillary-inspired cooling system without the supporting infrastructure.
Internap, a data center hosting facility in various U.S. cities, has built the newest expansions of their facility with underfloor piping infrastructure to get glycol directly to servers. Older parts of the facility use hot/cold aisle air-cooling with the underfloor space used only for air.
Then there’s Google, which built a waste water processing facility to provide water for cooling, thus eliminating some of the strain on the community.
But both of those examples are new builds. It will be interesting to see how invasive and disruptive adding water-cooling infrastructure would be to an existing data center.
Do you think more facilities are going to pony up the infrastructure cost and switch (back?) to water-cooling, or is the relative comfort of air-cooling enough to keep data centers happy?
Because speculation is fun, let’s talk a little about artificial intelligence and its potential in data centers. Automation and DCIM tools have come a long way, but as those technologies evolve, they might benefit greatly from an infusion of cutting-edge AI engineering.
The general public has seen artificial intelligence (AI) in movies and videogames where the typical scenario involves crazed robots or homicidal computers running amok. More tech-savvy consumers have Apple’s Siri in their iPhone to help fulfill a request or an adorable Roomba to vacuum a room. Once we push past the fears of a robot uprising, we realize AI can be an incredible tool to ease our workload.
The idea of AI as a functional part of a technologically advanced society isn’t remotely new. Alan Turing, the groundbreaking mathematician considered the father of AI, wrote about it back in the 1950s.
Modern uses for AI have been pioneered in many circles such as social media, board and video games, healthcare, Internet research and cat videos. And this recent story details the use of image recognition software to learn board game moves and defeat humans.
Why would any of this be useful for a data center environment? Well, let’s take automation and data center infrastructure management (DCIM) as our starting point. How great would it be if we could replace some of IT’s on-call overtime hours with AI hours? Instead of simply setting temperature or power parameters for automation software, we could have a DCIM program mimic the behavior of the human technicians to make decisions when an issue arises during the wee hours of the morning.
Automation without learning suffers from an inability to react to things outside its programming. This is one of the arguments for sending humans as well as robots to Mars. It’s not much of a leap to understand why human hours are still incredibly important for data center facilities management.
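To make that limitation concrete, here is a toy sketch of rule-based automation in Python. Everything in it is hypothetical: the rule names, the 27 °C and 8 kW thresholds and the actions are assumptions for illustration, not settings from any real DCIM product. It handles the conditions someone thought to write down and can only hand anything else to a human.

```python
# Hypothetical rule table: (name, check, action). Thresholds are assumed values.
RULES = [
    ("inlet_too_hot", lambda r: r.get("inlet_temp_c", 0) > 27, "raise fan speed"),
    ("power_over_budget", lambda r: r.get("rack_power_kw", 0) > 8, "throttle workloads"),
]


def respond(reading):
    """Return the actions for every rule that fires on this sensor reading.

    If an alarm is raised but no rule recognizes the situation, the only
    option left is to escalate to a person.
    """
    actions = [action for _, check, action in RULES if check(reading)]
    if not actions and reading.get("alarm"):
        return ["page on-call technician"]
    return actions


print(respond({"inlet_temp_c": 30}))                    # a case the rules anticipate
print(respond({"alarm": True, "water_on_floor": True}))  # a case they do not
```

The second call is where a learning system would earn its keep: the rule table has no entry for water on the floor, so the 3 a.m. phone call still happens.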
Let’s have an AI with cameras for eyes watch us work, then set it to work as a kind of “second shift” for monitoring and managing our facilities during off hours. If we want to expand into the realm of science fiction, then we can also develop a human chassis for the AI – think Stepford Wives with more coffee stained shirts – but we might be getting ahead of ourselves.
Building a new data center is a costly endeavor, as evidenced by Apple’s data center bid in Reno, Nev. But here’s a novel idea: If you don’t have the cash, build your data center out of Lego® bricks and Raspberry Pi clusters. Then install Minecraft and start computing.
That’s right, the increasingly-inventive Minecraft community has come up with several working computers built with blocks in the game. Stick enough of these puppies in your plastic data center and you might just have enough computing power to run Minecraft within Minecraft.
Silliness aside, inventing new, cheap and interesting ways to build data centers is important, especially with companies ever more budget conscious. If experimenting with games and toys is how to spur innovation, then bring on afternoon playtime.
An outage at Amazon’s Virginia data center last Friday, which affected Web services including Pinterest, Netflix and Instagram, was due to a multi-generator failure, the company reported Monday.
It was the second failure involving generators to hit the same region in the month of June.
While related to generators generally, the problems stem from different issues in different data centers, according to Julius Neudorfer, CTO of North American Access Technologies, Inc. But the compound failures in each case could mean that the backup systems weren’t tested in failure mode, he said.
“Clearly they’re trying to learn from every mistake,” he said of Amazon. “The common element here seems like they only tested when everything was operating rather than inducing a failure during the test.”
Amazon’s Summary of the AWS Service Event in the US East Region report states that during an electrical storm in the northern Virginia area June 29, two of ten data centers in Amazon’s East Region availability zone were forced by a large electrical spike to fail over to generator power.
One of these data centers did not successfully fail over to the generators because “each generator independently failed to provide stable voltage as they were brought into service. As a result, the generators did not pick up the load,” according to Amazon’s summary of the incident. Thus, servers began to run on Uninterruptible Power Supply (UPS) power instead.
As Amazon worked to stabilize the primary and backup power generators, the UPS systems were depleting and servers began losing power at 8:04pm PDT. Ten minutes later, the backup generator power was stabilized, the UPSs were restarted, and power started to be restored. The full facility had power to all racks by 8:24pm PDT, according to the Amazon statement.
The outage didn’t end there, though. A bottleneck in the EC2 recovery process and a bug in the Elastic Load Balancer control plane meant that some of the affected customers didn’t come back online until between 11:15 p.m. and midnight PDT, according to the report.
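The timeline above is easier to weigh as elapsed time. The short Python sketch below does the arithmetic using only the clock times quoted from Amazon's report; the year is an assumption (it doesn't affect the durations), and the recovery window is read as 11:15 p.m. to midnight.

```python
from datetime import datetime, timedelta

# Times are PDT as quoted in the report; the year is assumed for illustration.
day = datetime(2012, 6, 29)
power_lost = day + timedelta(hours=20, minutes=4)       # servers begin losing power
generators_stable = power_lost + timedelta(minutes=10)  # "ten minutes later"
full_power = day + timedelta(hours=20, minutes=24)      # all racks powered
first_recovered = day + timedelta(hours=23, minutes=15)
last_recovered = day + timedelta(days=1)                # midnight


def minutes_since_loss(moment):
    """Whole minutes between the first power loss and the given moment."""
    return int((moment - power_lost).total_seconds() // 60)


for label, moment in [
    ("generators stable", generators_stable),
    ("full facility power", full_power),
    ("earliest late recovery", first_recovered),
    ("latest late recovery", last_recovered),
]:
    print(f"{label}: {minutes_since_loss(moment)} min after power loss")
```

By these figures the power problem itself lasted about 20 minutes, while the software recovery stretched the outage to between roughly three and four hours for the slowest customers.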
An earlier failure, on June 14, was initiated by a cable fault inside one of the East Region data centers, but then a fan inside a backup generator failed to kick on; in this instance, secondary backup power also failed, according to widespread reports.
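Neudorfer's point about inducing a failure during a test can be illustrated with a toy drill in Python. This is a deliberately simplified sketch, not a model of Amazon's facilities: the three-source chain, its priority order and the all-or-nothing "healthy" flag are assumptions made for the example.

```python
class PowerSource:
    """A power source that is either able to carry the load or not."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy


def select_source(sources):
    """Return the name of the first healthy source in priority order, or None."""
    for source in sources:
        if source.healthy:
            return source.name
    return None


def failover_drill(fail):
    """Build a fresh chain, induce the named failures, report who carries the load."""
    chain = [PowerSource("utility"), PowerSource("generator"), PowerSource("ups")]
    for source in chain:
        if source.name in fail:
            source.healthy = False
    return select_source(chain)


# Testing only with everything working never exercises the backups.
print(failover_drill(set()))
# Inducing failures shows what actually happens further down the chain.
print(failover_drill({"utility"}))
print(failover_drill({"utility", "generator"}))
print(failover_drill({"utility", "generator", "ups"}))
```

The first drill always passes and proves little. Only the later ones, where failures are stacked on purpose, reveal whether anything is left to carry the load.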