It was just a year ago that the Thailand flooding, only a few months after the Japan earthquake, devastated the storage industry, causing a temporary shortage of disk drives and an increase in prices. But now that it’s all over, a funny story is coming out of BackBlaze, which found itself literally thinking outside the box.
The company, which is known for providing low-cost constant backups for its subscribers, is also known for building its cloud out of a whole lot of teeny (well, 3 TB) commodity disk drives rather than a few great big ones. This saves money and helps the company grow more granularly.
The only problem is if you suddenly run out of teeny commodity disk drives, or find that, in a matter of two weeks, they’ve tripled in price, as BackBlaze did when it was adding 50 TB of capacity a day. At the same time, the company wasn’t buying enough to be able to get deals from the manufacturers.
In an extremely detailed, hysterically funny blog post, the company is now relating how it dealt with the crisis — basically, by buying them as consumer commodities rather than as parts, and turning them into the parts they needed to build the “storage pods” on which their service was based.
“With our normal channels charging usury prices for the hard drives core to our business, we needed a miracle,” writes Andrew Klein, director of product marketing. “We got two: Costco and Best Buy. On Brian [Wilson, CTO]’s whiteboard he listed every Costco and Best Buy in the San Francisco Bay Area and then some. We would go to each location and buy as many 3 TB drives as possible.”
While the company then had to “shuck” the drives from their cases, this saved the company $100 per drive over buying them from its usual suppliers. Problem solved.
For a while.
“The “Two Drive Limit” signs started appearing in retail stores in mid-November,” Klein writes. “At first we didn’t believe them, but we quickly learned otherwise.” So workers started making the circuit, circling the San Francisco Bay and hitting local Costco and Best Buy stores: 10 stores, 46 disk drives, 212 miles. It put a lot of miles on the cars and took a lot of time, but it solved that problem.
For a while.
Then BackBlaze employees started getting banned from stores.
At that point, they started hitting up friends and family, and not just in the Bay Area, but nationwide. “It was cheaper to buy external drives at a store in Iowa and have Yev’s dad, Boris, ship them to California than it was to buy internal drives through our normal channels,” Klein writes.
(The company also apparently considered renting a moving van to drive across the country, hitting stores along the way — a variation on the “bandwidth of a station wagon of tapes” problem — but decided it wouldn’t be economical.)
By the time internal drive prices got to their normal level, the company had bought 5.5 petabytes of storage through retail channels — or more than 1800 disk drives. But finally, it could go back to its normal practices.
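As a rough sanity check on those figures, the drive count can be worked out in a few lines of Python. This is only a sketch: it assumes decimal units (1 PB = 1,000 TB) and that every drive bought at retail was a 3 TB model, which the post implies but does not state outright. The function name is made up for illustration.

```python
import math

TB_PER_PB = 1000  # assumption: decimal units, as drive makers use


def drives_needed(petabytes: float, tb_per_drive: float) -> int:
    """Return the whole number of drives needed to reach the given capacity."""
    return math.ceil(petabytes * TB_PER_PB / tb_per_drive)


# Figures from the post: 5.5 PB bought at retail, 3 TB per drive (assumed uniform).
retail_drives = drives_needed(5.5, 3)

# At the 50 TB-a-day growth rate mentioned earlier, how many days of growth is that?
days_of_growth = 5.5 * TB_PER_PB / 50

print(retail_drives, days_of_growth)
```

Under those assumptions the result lands a little over 1,800 drives, in line with the figure quoted, and covers a bit more than three months of growth at the stated rate.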
“On July 25th of this year, Backblaze took $5M in venture funding,” Klein writes. “At the same time, Costco was offering 3TB external drives for $129 about $30 less than we could get for internal drives. The limit was five drives per person. Needless to say, it was a deal we couldn’t refuse.”
Disclosure: I am a BackBlaze customer.
First it was HGST with helium. Now it’s Hitachi itself with glass. The company has announced a technology that enables it to store data for what it says is forever.
The technology works with a 2cm square piece of glass that’s 2mm thick, and is etched in binary with a laser. There are four layers, which results in a density of 40MB per square inch. “That’s better than a CD (which tops out at 35MB per square inch), but not nearly as good as a standard hard disk, which can encode a terabyte in the same space,” writes Sam Grobart in Bloomberg. The company said it could also add more layers for more density.
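To put that density in perspective, here is a back-of-the-envelope Python sketch of how much a single piece of glass might hold. It assumes the quoted 40MB per square inch applies across the full 2cm by 2cm face, which Hitachi did not confirm, and the function name is hypothetical.

```python
CM_PER_INCH = 2.54


def slab_capacity_mb(side_cm: float, density_mb_per_sq_in: float) -> float:
    """Capacity of a square slab, given its side length and areal density."""
    area_sq_in = (side_cm / CM_PER_INCH) ** 2
    return area_sq_in * density_mb_per_sq_in


# Figures from the announcement: 2cm square, 40MB per square inch (four layers).
capacity = slab_capacity_mb(2.0, 40.0)
print(f"{capacity:.1f} MB per slab")
```

On those assumptions, each slab holds roughly 25MB, so this is clearly an archival medium for carefully chosen data rather than a replacement for bulk storage.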
Of course, the selling point is not how dense it is, but that it will, supposedly, last forever, without the bit rot that degrades magnetic storage and is leading some to fear a “digital dark ages” where we will lose access to large swathes of our history and culture because it’s being stored magnetically.
The technology was developed in 2009 and may be made available as a product by 2015, Hitachi said, according to Broadcast Engineering.
There’s more to the digital dark ages than simply preserving the media, however — there’s also the factor of having the hardware and software that enables people to read the data. Anyone who’s found a perfectly pristine 78-rpm record in their grandparents’ attic is familiar with that problem.
Hitachi says that won’t be a problem because all computers, ultimately, store data in binary, and the glass could be read using a microscope. But how it’s encoded in binary — the translation between the binary and turning it into music or movies or whatever — the company didn’t say. The microscope could read it, but how would it know what it meant?
The way it may work is to have organizations with a great deal of data to preserve, such as governments, museums and religious organizations, send their data to Hitachi to encode it, wrote Broadcast Engineering.
The quartz glass is said to be impervious to heat — the demonstration included being baked at 1000 degrees Celsius for two hours to simulate aging — as well as to water, radiation, radio waves and most chemicals, which is why many laboratory containers are made of glass.
On the other hand, the glass is vulnerable to breakage. As anyone who’s used a microscope has probably experienced, it’s easy, while trying to improve the focus, to turn the knob too far. Imagine reading the data that way and watching in horror as centuries-old data gets crunched.
Virtualization. In talking about how under-utilized data center servers are, and in appearing to limit himself to less than state-of-the-art facilities, Glanz failed to notice how prevalent virtualization is becoming. It enables an organization to set up numerous “virtual servers” inside a physical server, which in the process results in much higher utilization. “[V]irtualized systems can be easily run at greater than 50% utilization rates, and cloud systems at greater than 70%,” writes Clive Longbottom in SearchDataCenter.
“[I]n many cases the physical “server” doesn’t even exist since everyone doing web at scale makes extensive use of virtualization, either by virtualizing at the OS level and running multiple virtual machines (in which case, yes, perhaps that one machine is bigger than a desktop, but it runs several actual server processes in it) or distributing the processing and storage at a more fine-grained level,” writes Diego Doval in his critique of the New York Times piece. “There’s no longer a 1-1 correlation between “server” and “machine,” and, increasingly, “servers” are being replaced by services.”
“Although the article mentions virtualization and the cloud as possible solutions to improve power utilization, VMware is not mentioned,” agrees Dan Woods in Forbes’ critique of the piece. “If the reporter talked to VMware or visited their web site, he would have found massive amounts of material that documents how thousands of data centers are using virtualization to increase server utilization.”
Storage. Similarly, Glanz appeared to not be aware of advances in storage technology, even though some of them are taking place in the very data centers he lambasted in his articles. In Prineville, Ore., for example, not all that far from the Quincy, Wash., data centers he criticized, Facebook is working on designing its own storage to eliminate unnecessary parts, as well as setting up low-cost slow-access storage that is spun down most of the time.
Facebook — which does this research precisely because of the economies of scale in its massive data centers — is making similar advances in servers. Moreover, the company’s OpenCompute initiative is releasing all these advances to the computer industry in general to help it take advantage of them, too.
In addition, Glanz focused on the “spinning disks” of the storage systems, apparently not realizing that increasingly organizations like eBay are moving to solid-state “flash” storage technology that uses much less power.
Also, storage just isn’t as big a deal as it used to be and as the story makes out. “A Mr Burton from EMC lets slip that the NYSE ‘produces up to 2,000 gigabytes of data per day that must be stored for years’,” reports Ian Bitterlin of Data Center Dynamics in its critique of the New York Times piece. “A big deal? No, not really, since a 2TB (2,000 gigabytes) hard-drive costs $200 – less than a Wall Street trader spends on lunch!”
Disaster recovery. Glanz also criticized data centers for redundancy — particularly their having diesel generators on-site to deal with power failures — apparently not realizing that such redundancy is necessary to make sure the data centers stay up.
And yet, even with all this redundancy, there have been a number of well-publicized data center failures in recent months caused by events as mundane as a thunderstorm. Such outages can cost up to $200,000 per hour for a single company — and a data center such as Amazon’s can service multiple companies. If anything, one might argue that the costs of downtime require more redundancy, not less.
Of course it’s important to ensure that data centers are making efficient use of power, but it’s also important to understand the context.
The only problem with HGST’s helium-filled disk drive is that any audio ends up sounding like this.
The company — formerly known as Hitachi Global Storage Technologies, and now a Western Digital company — has announced a helium-filled hard disk platform, scheduled to ship next year for an undetermined price without specifications, all of which are supposed to be announced when it ships. The technology was demonstrated at a recent Western Digital investor event.
Okay, so why helium? Said the company:
The density of helium is one-seventh that of air, delivering significant advantages to HGST’s sealed-drive platform. The lower density means dramatically less drag force acting on the spinning disk stack so that mechanical power into the motor is substantially reduced. The lower helium density also means that the fluid flow forces buffeting the disks and the arms, which position the heads over the data tracks, are substantially reduced allowing for disks to be placed closer together (i.e., seven disks in the same enclosure) and to place data tracks closer together (i.e., allowing continued scaling in data density). The lower shear forces and more efficient thermal conduction of helium also mean the drive will run cooler and will emit less acoustic noise.
That’s seven platters as opposed to the current five, though HGST didn’t specify how much more dense the data could be nor what this could mean in terms of improved disk capacity. However, storage analyst Tom Coughlin wrote in Forbes that this means “HGST could ship close to 6 TB drives in 2013 and even 10 TB drives with 7 platters could be possible within two years after that.”
The company did say, however, that the helium-filled drive used 23 percent less power, for a 45 percent improvement in watts-per-TB. In addition to consuming less power, the drive operates four degrees Celsius cooler, requiring less cooling in the system rack and data center, the company said.
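Those two percentages actually hint at the capacity gain HGST left unstated. The Python sketch below works it out under one assumption of mine: that a “45 percent improvement in watts-per-TB” means the watts-per-TB figure itself dropped by 45 percent. The function name is made up for illustration.

```python
def implied_capacity_ratio(power_reduction: float, watts_per_tb_reduction: float) -> float:
    """Capacity ratio (new/old) implied by a drop in power and a drop in watts-per-TB.

    watts_per_tb = power / capacity, so
    capacity_ratio = power_ratio / watts_per_tb_ratio.
    """
    power_ratio = 1.0 - power_reduction
    watts_per_tb_ratio = 1.0 - watts_per_tb_reduction
    return power_ratio / watts_per_tb_ratio


# Figures from HGST: 23 percent less power, 45 percent better watts-per-TB.
ratio = implied_capacity_ratio(0.23, 0.45)
print(f"{ratio:.2f}x capacity")
```

If that reading is right, the implied capacity ratio matches seven platters over five, which suggests the quoted efficiency gain comes from the extra platters at the same per-platter density rather than from any increase in density.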
HGST has been working on the technology — the operative part of which is designing a leakproof case — for six years, before Western Digital bought it in March, 2011, and took possession in March, 2012.
What the companies didn’t mention, however, is how they might deal with a worldwide shortage of helium that is causing a ballooning of the price, literally — helium balloons now cost three times as much as they did just six months ago. As it turns out, the gas is heavily used in the computer industry.
“Helium is usually generated as a byproduct of natural gas mining, and we’re currently in the middle of a shortage of helium, due partly because the recession has slowed natural gas production,” wrote Brad Tuttle in Time. “About three-quarters of the world’s helium is produced in the U.S., according to the Kansas City Star, and while production is supposed to be increased by the end of the year in spots ranging from Wyoming to Russia, the element is expected to be in short supply for months, if not years.”
OMG. Hold the presses. In a shocking power grab, EMC CEO Joe Tucci fought off attacks by underlings to maintain his position.
No, not really.
Tucci had announced a year ago that he planned to step down from EMC (as well as VMware, of which it owns a majority) by the end of this year. (In fact, the Boston Globe suggested that he had first announced his retirement in September 2009.) He then announced in January that, never mind, he was going to stay through 2013.
While there has been some executive reshuffling since then, on the whole it appears to be an orderly transition, with several potential competent successors.
Now Tucci says he’s going to stay through at least February 2015, and at some time before that he’s supposed to pick a successor and transition to a purely chairman of the board role.
Roger Cox, vice president of research for Gartner Inc., told the Globe that Tucci’s decision to stay longer is probably more about his unwillingness to let go than dissatisfaction by the EMC board with potential successors, of which there are at least three internal ones. While Tucci is 65, he is reportedly in good health and the company is doing well — so well that perhaps the board and stockholders are leery about turning the company over to someone else, no matter how well-groomed they are for the position. And perhaps he is hoping that one or more of the three will move on and make the decision easier.
EMC’s orderly transition is in sharp contrast to the traumatic ones in other companies such as HP, notes Channelnomics.
Oh, and should Tucci achieve “certain performance targets, including targets relating to total shareholder return, revenue and other metrics” for 2013 and 2014, he also stands to gain $8 million in stock by the February 2015 deadline.
If you needed a reason to implement e-discovery in your company, you now have one. 1.05 billion of them, in fact.
A number of legal experts — as well as e-discovery vendors — have pointed to discovery of electronic documents such as email as an important factor in Apple’s patent victory over Samsung. Writes Doug Austin in E-Discovery Daily:
Interviewed after the trial, some of the jurors cited video testimony from Samsung executives and internal emails as key to the verdict. Jury foreman Velvin Hogan indicated that video testimony from Samsung executives made it “absolutely” clear the infringement was done on purpose. Another juror, Manuel Ilagan, said, “The e-mails that went back and forth from Samsung execs about the Apple features that they should incorporate into their devices was pretty damning to me.”
E-discovery vendors, such as Jeffrey Hartman of EDiscovery Labs, were quick to pounce on the case as an example.
This is yet another clear reminder that otherwise smart people continue to create electronic documents that are both dangerous and discoverable; even as awareness of these pitfalls increases. This is bad news for general counsels and company shareholders…but good news for plaintiff’s attorneys seeking the digital goodies that will help them win lawsuits. A large courtroom display of a blow-up of an emotionally charged internal report or email is often worth even more than technical testimony or other hard evidence.
Another important e-discovery aspect to the case is that first Samsung, and then Apple as well, were hit with “spoliation” charges for failing to preserve electronic evidence. In Samsung’s case, for example, the company failed to turn off a function that automatically deletes email that’s more than two weeks old. While a number of e-discovery experts do recommend implementing such an autodelete feature, you have to turn it off once a case starts, a step known as a “litigation hold,” to preserve evidence that could be useful to the case.
There’s a compilation of articles about the case if you want to read more — seriously, a lot more — about this.
For the second time this year, and the third time since 2006, MD Anderson Cancer Center in Texas has had to alert patients that it had lost access to their personal data.
“On July 14, 2012, MD Anderson learned that on July 13 a trainee lost an unencrypted portable hard drive (a USB thumb drive) on an MD Anderson employee shuttle bus,” the company reported earlier this month. “We immediately began a search for the device and conducted a thorough investigation. Unfortunately, the USB thumb drive has not been located.” In the thank goodness for small favors department, the data did not include Social Security numbers. Some 2,200 patients were affected.
Similarly, on June 28, the company announced a previous breach. “On April 30, 2012, an unencrypted laptop computer was stolen from an MD Anderson faculty member’s home. The faculty member notified the local police department. MD Anderson was alerted to the theft on May 1, and immediately began a thorough investigation to determine what information was contained on the laptop. After a detailed review with outside forensics experts, we have confirmed that the laptop may have contained some of our patients’ personal information, including patients’ names, medical record numbers, treatment and/or research information, and in some instances Social Security numbers. We have no reason to believe that the laptop was stolen for the information that it contained. We have been working with law enforcement, but to date the laptop has not been located.” Another 30,000 patient notifications.
This follows a 2006 notification incident where private health information and Social Security numbers of nearly 4,000 patients were at risk after a laptop containing their insurance claims was stolen the previous November at the Atlanta home of an employee of PricewaterhouseCoopers, an accounting firm reviewing the patient claims.
Security experts were unsympathetic.
“Wow, is that dumb,” international cyber security expert Bruce Schneier told the Houston Chronicle. “This isn’t complicated. This is kindergarten cryptography. And they didn’t do it. I’d be embarrassed if I were them. Of course, it’s not them whose privacy could be violated. It’s the innocent patients who trusted them. To be fair,” he said in an email, “the drive could simply be lost and will never be recovered. We don’t know that patient information was leaked. But it’s still shockingly stupid of the hospital.”
The center said it was beginning a several-month plan to encrypt all the computers at the hospital, and that 26,000 had been encrypted thus far. The hospital has also ordered 5,000 encrypted thumb drives. In addition, employees will receive training on thumb drives and security.
If nothing else, at least MD Anderson is apparently in good company. “According to a records search of the Privacy Rights Clearinghouse, which keeps a running tab on data breaches and the like, so far this year 387 357 medical-related records have been compromised in 68 reported incidents involving lost, discarded or stolen laptop, PDA, smartphone, portable memory device, CD, hard drive, data tape, etc,” writes IEEE Spectrum. “Last year there were 66 such breaches with 6 130 630 records compromised.”
This week was supposed to be Mark Durcan’s last. In late January, he’d announced his retirement from Micron, the U.S.’s only memory chip maker and second largest worldwide after Samsung.
Instead, the following week, CEO Steve Appleton died in a plane crash, and Durcan agreed to take over the CEO role at the company where he’d worked since 1984.
Speaking before the City Club in Boise, Idaho, where Micron is based, Durcan talked about his first six months on the job and where Micron is going.
There used to be 40 memory producers in the world, and now there are only 9, Durcan said. How did Micron manage to be one of them, especially continuing to be based in the U.S.? By focusing on using technology, and being clever on using capital and partnering, he said. In particular, the company was careful not to run out of cash, which is the downfall of many companies, he said.
Micron, which has received a number of tax breaks from Idaho to encourage it to stay in the state where it is one of the largest employers, has come under some criticism for moving jobs overseas, but Durcan denied that, saying that while it does have a number of overseas facilities, they were primarily through acquisition rather than through development.
The company is currently in the process of acquiring Elpida, a Japanese company focused on low-power DRAM and mobile DRAM that is going through bankruptcy. The bankruptcy is actually delaying the acquisition to some extent, Durcan says. “Japan is working through the process,” he says, because there isn’t much bankruptcy there.
Currently, the worldwide market for memory is $345 billion, and of that, Micron earns $33.4 billion in DRAM and $35.9 billion in flash memory, Durcan said. Asia accounts for 68% of its revenue, America for 21%, and Europe for 11%, he said. Solid-state drives (SSD) provide 10% of the company’s revenue, while mobile provides 17% and is likely to increase after the Elpida acquisition is finalized.
To help, Micron is partnering with other firms such as Intel, and he expects that in the future, the company is going to become even more dependent on partnerships, including with its customers, Durcan says.
Durcan also said he expected the company to move up the value chain to include controllers and processing, and sell systems rather than just silicon. “The cloud is a huge opportunity for us,” he noted, both because people are increasingly gaining access to it through smartphones and tablets, to which Micron contributes about 40% and 10% respectively, and because cloud infrastructure is increasingly making use of SSD to improve performance. SSD itself in the enterprise is also expected to be a major factor, as the company has shipped 2 million client SSDs but they make up only 0.3% of enterprise storage, he said. In other innovations, the company is also known for its hybrid memory cube technology.
Asked about his reaction to the Apple-Samsung lawsuit, Durcan said he “didn’t really have a horse in the race” because both of them were Micron customers. He noted, however, that part of the reason Apple won is through the similar design of the smartphone families. “It’s easier for the public to understand design than technology,” he said.
Durcan didn’t say whether he’d rescheduled his retirement.
Delayed-retrieval low-cost storage is suddenly cool.
In the cases of both Amazon, with its Glacier service, and Facebook, the vendors are offering low-cost storage for long-term archiving in return for customers being willing to wait several hours to retrieve their data. In Facebook’s case, though, the customer appears to be primarily itself, at least for the time being.
“To keep costs low, Amazon Glacier is optimized for data that is infrequently accessed and for which retrieval times of several hours are suitable,” says Amazon. “With Amazon Glacier, customers can reliably store large or small amounts of data for as little as $0.01 per gigabyte per month.”
A penny per gigabyte per month equals $10 per terabyte (1,000 gigabytes) per month. By comparison, the cheapest 1-TB external drive from Amazon’s product search costs $79.99, while Dropbox’s 1-TB plan costs $795 annually, notes Law.com.
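Because the Glacier price is monthly while the drive price is one-time, the comparison is easier to see with the arithmetic written out. The Python sketch below covers storage charges only, using the prices quoted above. It ignores retrieval and request fees, and the function name is hypothetical.

```python
GB_PER_TB = 1000  # decimal units, as in the comparison above


def glacier_monthly_cost(terabytes: float, price_per_gb_month: float = 0.01) -> float:
    """Monthly storage charge in dollars, excluding retrieval and request fees."""
    return terabytes * GB_PER_TB * price_per_gb_month


monthly = glacier_monthly_cost(1)      # one terabyte for one month
annual = monthly * 12                  # the same terabyte for a year
breakeven_months = 79.99 / monthly     # months until storage fees match the $79.99 drive

print(f"${monthly:.2f}/month, ${annual:.2f}/year, break-even after {breakeven_months:.1f} months")
```

On these numbers a terabyte in Glacier costs about as much as the external drive after roughly eight months, though it remains far cheaper per year than the Dropbox plan. The comparison leaves out what the monthly fee buys, such as offsite redundancy.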
The service is intended not for the typical consumer, but for people who are already using Amazon’s Web Services (AWS) cloud service. Amazon describes typical use cases as offsite enterprise information archiving for regulatory purposes, archiving large volumes of data such as media or scientific data, digital preservation, or replacement of tape libraries.
“If you’re not an Iron Mountain customer, this product probably isn’t for you,” notes one online commenter who claimed to have worked on the product. “It wasn’t built to back up your family photos and music collection.”
The service isn’t intended to replace Amazon’s S3 storage service, but to supplement it, the company says. “Use Amazon S3 if you need low latency or frequent access to your data,” Amazon says. “Use Amazon Glacier if low storage cost is paramount, your data is rarely retrieved, and data retrieval times of several hours are acceptable.” In addition, Amazon S3 will introduce an option that will allow customers to move data between Amazon S3 and Amazon Glacier based on data lifecycle policies, the company says.
There is also some concern about the cost to retrieve data, particularly because the formula for calculating it is somewhat complicated.
While there is no limit to the total amount of data that can be stored in Amazon Glacier, individual archives are limited to a maximum size of 40 terabytes, and customers can create up to 1,000 “vaults” of data, Amazon says.
While it doesn’t deal with the issue of data for software that no longer exists, the Glacier service could help users circumvent the problem of the “digital dark ages” of data being stored in a format that is no longer readable, notes GigaOm.
Can similar services for other cloud products, such as Microsoft’s Azure, or for consumers, be far behind?
Remember when Facebook started designing its own servers and data center? And then its own disk drives?
Now it’s designing its own archival backup.
The story, broken by Robert McMillan at Wired, is that the company is, over the next six to nine months, working to design a storage archive system. Because it stores a second copy of data and is intended to be used only for restores, the system powers down the drives when not in use. Such technology could reduce power use by the data center to one-third, according to the Facebook spokesman quoted by Wired.
More generally, Facebook has been working on what it calls the Open Compute Initiative, which basically means that it is designing new, minimalist hardware for standard functions that — due to the enormous scale of the company’s hardware — saves space, money, energy, and so on. The intention, once the design is complete, is to open source the data and offer the designs to the industry.
It isn’t clear whether this method of archival storage is also going to be open-sourced, according to the Verge. However, Facebook has been talking about the notion of drives that spin down when not in use — what it calls a “hard drive thermostat” — for almost exactly a year in connection with the Open Compute project.
Storage that is saved but rarely used is called “cold” storage, so the proposed building, part of the Facebook data center complex in Prineville, Ore., is nicknamed Sub-Zero, presumably after the line of high-end refrigerators. The company is also considering building a similar facility as part of its Forest City, N.C., data center.
It’s important with such systems to ensure that the data on them really isn’t used very much, because it can take up to 30 seconds for the disk to start from zero, and up to 15 seconds from the slower speed.