Yottabytes: Storage and Disaster Recovery

April 19, 2012  9:39 AM

Programmer’s ‘Prince of Persia’ Story Exemplifies Danger of ‘Digital Dark Ages’

Sharon Fisher

Jordan Mechner, the designer of the game Prince of Persia (which went on to be a movie), recently wrote a blog post describing the day-long ordeal he and at least three other guys had trying to get copies of the original source code for his game from some Apple ][ disks.

Mechner and his team had to deal with multiple possible problems:

  • Finding a drive to read the disk
  • Finding software to read the disk
  • Dealing with whatever forms of copy protection the disk might have had
  • Finding software to run the software on the disk
  • Dealing with whatever damage the disk itself might have suffered during its 22 years in his dad’s garage
  • Dealing with whatever “bit rot” the data might have suffered

The thing is, Mechner’s situation is one that we’ll all be dealing with, whether it’s our digital pictures or, worse, our own history. Since the late 1990s, archivists have expressed concern about digital preservation and an upcoming “Digital Dark Ages,” when historical material available only on computer media will be unreadable to future generations.

Writes Mechner:

Try popping your old 1980s VHS and Hi-8 home movies into a player (if you can find one). Odds are at least some of them will be visibly degraded or downright unplayable. Digital photos I burned onto DVD or backed up onto Zip disks or external hard drives just ten years ago are hit and miss — assuming I still have the hardware to read them.

Whereas my parents’ Super 8 home movies from the 1960s, and my grandparents’ photos from the 1930s, are still completely usable and will probably remain so fifty years from now.

Pretty much anything on paper or film, if you pop it in a cardboard box and forget about it for a few decades, the people of the future will still be able to figure out what it is, or was. Not so with digital media. Operating systems and data formats change every few years, along with the size and shape of the thingy and the thing you need to plug it into. Skip a few updates in a row, and you’re quickly in the territory where special equipment and expertise are needed to recover your data. Add to that the fact that magnetic media degrade with time, a single hard knock or scratch can render a hard drive or floppy disk unreadable, and suddenly the analog media of the past start to look remarkably durable.

As an example, writes Science Daily, “Magnetic tape, which stores most of the world’s computer backups, can degrade within a decade. According to the National Archives Web site by the mid-1970s, only two machines could read the data from the 1960 U.S. Census: One was in Japan, the other in the Smithsonian Institution. Some of the data collected from NASA’s 1976 Viking landing on Mars is unreadable and lost forever.”

And that’s just accidental damage. There’s also the issue of potentially embarrassing data deliberately being destroyed.

Similarly, though companies such as Microsoft are working with organizations such as Britain’s National Archives to help preserve their data, it’s the proprietary nature of software from exactly such companies — Word and Outlook, for example — that is contributing to the problem, critics say.

Think of how many early movies and television programs are no longer available because the film deteriorated (in some cases actually spontaneously combusting) or were thrown out.

Organizations such as the Internet Archive, the Library of Congress, and the Long Now are working to help preserve data access, but that doesn’t necessarily help us as individuals. For that, digital archivist Jason Scott, who helped Mechner with his project, recommends the following: “If you have data you want to keep for posterity, follow the Russian doll approach. Back up your old 20GB hard drives into a folder on your new 200GB hard drive. Next year, back up your 200GB hard drive into a folder on your new 1TB hard drive. And so on into the future.”

That won’t necessarily solve the problem of having software that can read the data, but at least the data itself will be intact. (This is something I did a few months back when I reorganized my office — collected all the random CDs, DVDs, Zip drives, thumb drives, and 3 1/2-inch floppies cluttering up my office, and put them on my new 2-TB NAS drive.)
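The Russian doll approach can be sketched in a few lines of Python. This is a minimal illustration, not a tool Scott or Mechner describe: the function names `sha256_of` and `nest_backup`, the folder label, and the choice of SHA-256 checksums to confirm that each copy matches its source are all assumptions added here.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def nest_backup(old_root: Path, new_root: Path, label: str) -> List[Path]:
    """Copy everything under old_root into new_root/label, then verify it.

    Returns the source files whose copies are missing or differ
    (an empty list means every file was copied intact).
    """
    target = new_root / label
    # copytree keeps the directory layout of the old drive inside the new folder.
    shutil.copytree(old_root, target, dirs_exist_ok=True)

    mismatched = []
    for source in old_root.rglob("*"):
        if source.is_file():
            copy = target / source.relative_to(old_root)
            if not copy.is_file() or sha256_of(source) != sha256_of(copy):
                mismatched.append(source)
    return mismatched
```

Calling something like `nest_backup(Path("/mnt/old_drive"), Path("/mnt/new_drive"), "old_20gb_drive")` (hypothetical mount points) and repeating it with each new, larger drive produces the nested folders Scott describes. The checksum pass is there because a copy that silently went wrong is exactly the kind of bit rot the archive is meant to avoid.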

Mechner ends with a warning. “From a preservationist point of view, the POP source code slipped through a window that is rapidly closing. Anyone who turns up a 1980s disk archive 20 or 30 years from now may be out of luck. Even if it’s something valuable that the world really cares about and is willing to invest time and money into extracting, it will probably be too late.”

April 14, 2012  11:14 PM

How to Leverage a Disaster Without Looking Like an Opportunistic %*^$

Sharon Fisher

One of the toughest jobs out there must be marketing director for a disaster recovery product or service. There’s no better time to promote one’s product or service than when there’s just been a disaster, yet doing so makes you look like you’re exploiting people’s tragedy and can backfire.

Take for example Microsoft, which came under a barrage of criticism during the Japan earthquake last year for offering to donate a dollar for every retweet of its message promoting its Bing search engine; after being attacked, the company swiftly backpedaled and just made a straight donation, no retweeting required.

That’s why I hesitated at posting about this upcoming vendor — my initial reaction was negative, and it’s only several days later, after checking out the coverage elsewhere, that I can look at its announcement more objectively.

With Tornado Alley this spring looking more like Tornado Interstate, and numerous regions and businesses affected, it’s not surprising that some vendors would want to use it as a news hook — though perhaps waiting until the twisters had actually stopped forming might have been more tasteful timing.

“In response to the severe damage caused by tornadoes touching down in the Dallas area, Nirvanix, the leading provider of enterprise-class cloud storage services, today announced that it is expanding its Disaster Avoidance Program to customers currently storing data in its Node 3 data center in Dallas enabling them to exercise the option of moving their data to other locations in the Nirvanix Cloud Storage Network—either on a temporary or full-time basis—free of charge.”

(Ironically, the Storage Networking World show was being held in Dallas at the same time and was itself disrupted by the severe weather, although Nirvanix did not appear to attend that event.)

On its face, this is a reasonable offer. Users in a disaster area can store their data outside the area. Great. So what’s the problem?

Perhaps it’s the use of the phrase “the leading provider of enterprise-class cloud storage services” in the first sentence. Really, did Dallas people need to have that pointed out to them just then?

Perhaps it’s the Johnny-on-the-spot nature of the announcement, which was issued the same day the tornadoes actually occurred. By Googling, one can ascertain that such an announcement is not unusual for Nirvanix, with the company making similar offers during disasters such as the Japan earthquake and Hurricane Irene.  Pull out the boilerplate press release, drop in the name of the disaster and its location, and you’re good to go.

One does wonder at what point the trigger occurs to send out such a release. After a certain amount of property damage occurs or a certain number of people are killed? Does it depend on how many customers Nirvanix has in the affected area? Will Nirvanix issue a similar press release and offer this week regarding the Midwest tornadoes, or did they not come up to snuff?

While some disasters, such as the Japan earthquake, are unpredictable, it’s no secret to anybody that we have tornadoes in the spring and hurricanes in the fall. If one wants to offer such a service to one’s clients, how about issuing a generic press release at the beginning of the disaster seasons so that it looks less like a vendor exploiting a particular tragedy? The anniversary of the Great San Francisco earthquake is coming up, too; that might be a marketing opportunity as well that is far enough removed from actual events and tragedies that it won’t appear so opportunistic.

April 6, 2012  10:22 PM

How Did You Celebrate World Backup Day?

Sharon Fisher

I know I’ll never forget the heartwarming family traditions or the look on my daughter’s little face on the morning of World Backup Day.

Just kidding. Actually, it was last Saturday, and I didn’t even hear about it til a day or so afterwards. It was, in fact, only the second time the holiday had been celebrated.

As it happens, World Backup Day came into being from a reddit discussion a year ago.

I just think it would be for the good of everyone to have a reminder to save all your cherished pictures, videos and other important data to somewhere secure.

Companies should also get involved, making sure that their customers and their own data is secure and safe. Maybe even the back-up providers could offer discounts and rates based on the date to encourage sales and participation.

Why March 31? The theory was to have your computer all backed up in case there were tricks or viruses associated with April Fool’s Day. There’s now a web page and a Facebook page, as well as a Twitter feed that seems to look for people mentioning hard drive failures and then asks brightly whether they’d remembered to do a backup first — safely out of punching range.

Not surprisingly, backup vendors have jumped on the notion of World Backup Day, with — just as the original poster suggested — discounts and suchlike to encourage people to back up their data, as well as several helpful infographics and even Pinterest sites talking about the scourge of data loss. The holiday is also starting to make it to the mainstream media, and user organizations such as Lawrence Berkeley National Laboratory picked it up as well.

All kidding aside, it’s not a bad mnemonic idea, on the order of changing the batteries in your smoke detector during the switches to and from Daylight Saving Time. (By the way, when do people in Arizona and Hawaii change their smoke alarm batteries, if those states don’t observe Daylight Saving Time?) Anything that encourages consumers to do backups is probably a good thing, though an annual backup probably isn’t that much help.

Unlike some holidays such as National Telework Week, which asks people to pledge to work at home and then calculates the hours they worked and the savings they made, World Backup Day doesn’t do any followup, so we don’t actually know how many people observed World Backup Day and from how many data losses we were saved. Perhaps that’s an idea for World Backup Day #3.

March 31, 2012  3:22 PM

U.S. Border Laptop Search Can Be Challenged

Sharon Fisher

It hasn’t gotten a lot of play in the news media, but a recent U.S. District Court decision may at least weaken a policy that theoretically gives the Department of Homeland Security the right to search laptop storage of more than two-thirds of Americans.

In case you’ve forgotten, in August 2009, the U.S. government implemented a new policy for the Department of Homeland Security giving the department the right to search laptops in border areas. The problem, according to Udi Ofer, Advocacy Director for the New York Civil Liberties Union, in a letter he wrote to the New York Times in August 2010, was that Border Patrol agents have the right to conduct such seizures within 100 miles of the U.S. border, which covers much more of the United States than it sounds. In fact, two-thirds of the population of the U.S. lives in one of those areas, he wrote, and people in those areas could be subject to losing their laptops. (Indeed, the Ninth Circuit Court ruled that such laptops could be transported more than 100 miles away to do a more thorough search.)

In a particular case filed last May, the U.S. government was charged with targeting David House, a Massachusetts programmer, due to his association with Bradley Manning, the soldier accused of leaking material to WikiLeaks, for one of these searches. The American Civil Liberties Union and ACLU of Massachusetts had filed suit against the government for this, which the government moved to dismiss.

The 27-page court decision this week denied the government’s motion, meaning that the lawsuit against the government can proceed. Moreover, although the decision supported the government’s right to search laptops at the border, it did put some sideboards on that right, such as:

  • Not allowing laptops and other equipment to be seized for an indefinite period of time (House’s were seized for seven weeks)
  • Not allowing people to be targeted for First Amendment-protected political speech (it has been suggested that House was targeted due to his association with Manning)

This doesn’t eliminate the searches — which also have criminal defense attorneys concerned, due to loss of attorney-client privilege, not to mention students with majors in Islamic Studies — but this and some other lawsuits challenging the policy give hope that it may be modified in the future.

March 29, 2012  3:40 PM

Romney Campaign Apparently Planned Ahead for Stolen Laptops

Sharon Fisher

We’ve heard this sort of story before (here, and here, and here):  While members of the Mitt Romney presidential campaign staff were at dinner in San Diego, their parked SUV was broken into, leading to the loss of two laptops, two iPads, and two two-way radios.

The conversation quickly turned to whether this was a random act of burglary or a targeted political action. 10News in San Diego, which broke the story, interviewed local political analyst Carl Luna.

“This could just be a coincidence,” said political analyst Carl Luna. “Then again, given this campaign season and how negative it’s been, dirty tricks are not alien to American politics. Best case scenario for the Romney camp… these things are going to be sold at a swap meet on the side. Worst case scenario… some of this stuff makes it onto the Internet and if somebody could spin it against them they might.”

But unlike most of these stories, this one has a happy ending, or, at least, not an unhappy one. While the aides said the theft was a bummer in that they had to replace the equipment (not to mention their clothes and other items that were stolen), they weren’t worried about the theft of information, because the equipment was reportedly all remotely erased, according to the UT San Diego newspaper.

“My understanding is once they found out they called people and had everything shut down,” [Detective Gary] Hassen told the paper.

Of course, we shouldn’t be surprised by Romney’s perspicacity. This was, after all, the administration that had the foresight to buy all its hard drives when it left office, preventing later generations, as well as opponents and journalists, from researching Romney’s time in the Massachusetts governor’s office.

Maybe the campaign was just planning ahead.

March 25, 2012  11:06 PM

File Sharing Site RapidShare Required to Monitor All Content

Sharon Fisher

You thought the MegaUpload seizure was bad? Check this out. The Swiss file uploading service RapidShare, accused of harboring copyrighted content, has been ordered by a German court to monitor all uploads to its service — which amount to thousands per day.

While it was agreed that RapidShare had copyrighted content, the company had been working on its reputation, clearing its name in a single year and then throttling downloads to its free users to discourage content distribution, because people who were pirating content didn’t want to sign up for the premium service and use their name. But after the MegaUpload seizure, pirates had reportedly started turning to RapidShare, which resulted in the slowdown to 30 Kbps. According to the Village Voice,

“RapidShare has been faced with a severe increase in free user traffic and unfortunately also in the amount of abuse of our service ever since, suggesting that quite a few copyright infringers have chosen RapidShare as their new hoster of choice for their illegal activities. We have thus decided to take a painful yet effective step: to reduce the download speed for free users. We are confident that this will make RapidShare very unpopular amongst pirates and thus drive the abusive traffic away.”

Unfortunately for RapidShare, that wasn’t enough, leading to the German court’s decision.

Another interesting aspect is that the European Court of Justice had found earlier this year that networks couldn’t be forced to install an anti-piracy filter, saying that the privacy of users was more important than protecting copyright. How this will affect the implementation of the order isn’t clear.

What remains to be seen is how RapidShare is going to comply with the court order. The number of files uploaded on a daily basis probably exceeds the number that can be monitored manually, meaning that software will likely need to be developed to scan for copyrighted content. But such software isn’t foolproof; in 2009, in response to a takedown request, file sharing site Hotfile gave Warner Brothers such a tool to find its content, with the result that a great deal of material that Warner Brothers didn’t own was incorrectly tagged and removed. And as with MegaUpload, legitimate users stand to be inconvenienced.
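To see why automated scanning tends to over-match, consider the crudest possible scanner: one that flags any upload whose filename contains a protected title. The sketch below is purely illustrative. It is not the tool Hotfile provided or anything RapidShare is known to use, and the function names, the title, and the filenames are all invented for the example.

```python
import re
from typing import Iterable, List


def normalize(text: str) -> str:
    """Lowercase the text and turn every run of punctuation into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def flag_by_title(filenames: Iterable[str], title: str) -> List[str]:
    """Flag every filename whose normalized form contains the title."""
    needle = normalize(title)
    return [name for name in filenames if needle in normalize(name)]


# Hypothetical uploads: only the first is meant to be a copy of the film.
uploads = [
    "The.Box.2009.DVDRip.avi",
    "Outside_the_Box_lecture_notes.zip",
    "family_photos_2011.tar",
]

flagged = flag_by_title(uploads, "The Box")
```

Here `flagged` contains both the film and the unrelated lecture notes, because a title match says nothing about who owns the file. Fingerprinting the file contents would narrow this, at the cost of missing re-encoded copies, so neither approach removes the need for human review.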

March 19, 2012  12:14 AM

What SMB Data Backup and Rush Limbaugh Have in Common

Sharon Fisher

It’s funny sometimes how a perfectly ordinary press release can have a lot more to it than appears at first.

Take Carbonite (NASDAQ:CARB). The company issued a press release a few days ago citing a study finding that many small businesses were using old, unreliable methods such as external hard disks, USB drives, and CD-ROMs to back up their data. The report noted the following:

  • 50% use external hard drives, yet 20% backing up their business data indicated they started to do so because of a hard drive failure
  • 42% use USB/flash drives primarily because it is perceived as easy, yet only 6% believe USB/flash drives to actually be reliable
  • More than one-third use CDs/DVD drives to back up data, even though 62% feel they are inconvenient or risky
  • 21% of small businesses using online backup were using a free product; since free online backup services are typically capped at two gigabytes, small businesses using these methods could be vulnerable to data loss
  • 24% of small businesses using this method noted USB/flash drives do not work well for backup specifically because they have limited storage space
  • 22% of small businesses surveyed pay for outside tech assistance
  • 40% of those who manage the process in-house spend more than an hour per week backing up their company data, with 6% spending more than five hours per week
  • Only 24% have backed up their data in the past day, and 24% haven’t backed up their data within the past week

Gosh. Sounds serious.

If one reads further, however, one notes two things. First of all, by an amazing coincidence, Carbonite just happens to sell a service, at what is no doubt a reasonable price, that solves all these problems.

Second of all, there is absolutely no information in the press release about the study itself, other than its name: Carbonite Small Business Data Backup Usage Study, July 2011. Nothing about how many people were surveyed, how they were chosen, or anything. For any vendor survey, this tends to cast suspicion on its results.

Not to mention, July? Really?

If one uses one’s favorite search engine to search for the title of said study, one discovers that Carbonite has in fact referenced the same study in three other press releases, in July, October, and November. It’s in the July one that we learn that the survey itself on which the study was based was actually performed in April 2011.

That said, several outlets, including no less than eWeek, picked up the survey and ran it as a straight news story.

But Carbonite, which went public last summer, was in the news for something else recently. In response to the Rush Limbaugh lambasting of Sandra Fluke as a “slut” for implying that she actually, gasp, had sex, Carbonite pulled its advertising on March 3 from the conservative radio show — one of some 40 radio talk shows on which it advertises, according to a blog post from the company president.

There have been two results from that. First, Carbonite has been slagged by any number of sites in the right-wing echo chamber, as well as on its own Facebook page, for daring to question Rush — not to mention, as it turns out, because the company CEO had donated money to left-wing candidates and causes. Second, the company’s stock dropped some 10% in a day, from which it is slowly — very slowly — recovering.

So, did the company issue yet another press release on the same July study — now with data nearly a year old — to deflect interest from the Rush flap?

March 14, 2012  5:10 PM

Western Digital Finally Closes Hitachi GST Purchase

Sharon Fisher

Almost exactly a year after it was first announced, Western Digital has announced that it has closed its purchase of Hitachi GST, after being required to sell off a portion of the business to satisfy the FTC.

Western Digital announced the deal on March 7, 2011, and said it expected it to close in September of that year. It seems to have slipped a bit. In the meantime, Hitachi GST changed its name to Viviti Technologies Ltd.

Western Digital said the acquisition cost $3.9 billion in cash and 25 million shares of WDC common stock valued at approximately $0.9 billion, in comparison to the original deal of $3.5 billion in cash and $750 million in stock. Hitachi, Ltd. now owns approximately 10% of WDC shares outstanding, and it has the right to designate two individuals to the board of directors, the company said.

On antitrust grounds, the Federal Trade Commission required that Western Digital sell assets to Toshiba Corp. that Hitachi uses to make and sell desktop hard-disk drives, according to Bloomberg. The European Commission had also required Western Digital to sell one of Viviti’s 3.5-inch manufacturing plants and associated intellectual property for making these drives. In return, Western Digital received a Toshiba plant that had been damaged in last year’s Thai floods. Chinese regulators also required the two companies to remain separate entities for two years.

This is all after Seagate bought Samsung storage last April and Toshiba bought Fujitsu storage in February 2009. And imagine, some people think the storage industry is boring.

“So what have we got here?” summarizes Chris Mellor of Register UK. “We have a 5-player industry featuring Hitachi GST, Samsung, Seagate, Toshiba and Western Digital shrinking to three over an (at least) two year period. Seagate is buying Samsung but has to operate it at arms length for one year due to Chinese conditions. WD is buying Hitachi GST but has a two year limbo before it can apply to the Chinese guy to formally integrate its two subsidiaries. Toshiba is getting two legs up into the 3.5-inch disk drive business by getting Hitachi GST’s disk production and some off-loaded WD production too. It is, in manufacturing capacity and HDD technology terms, an unanticipated gainer from the WD-HGST acquisition. Furthermore, because it has its own flash foundry, unlike either Seagate or WD, it is arguably well-placed to add flash caches to its disk drives.”

Combining the production volumes of Seagate and Samsung, and of Western Digital and Viviti (HGST), market share in CQ4 2011 would have been 47% Seagate Technology, 37% Western Digital and 16% Toshiba, according to storage analyst Tom Coughlin. At the time of the announcement, Western Digital held about 31% of the hard disk drive market, followed by Seagate Technology with 29%. Hitachi had about 18%, wrote Grant Gross of IDG News Service.

March 1, 2012  12:13 AM

Judge Rules that Legal Firms Can Use Computer-Based ‘Predictive Coding’ in E-Discovery

Sharon Fisher

In a decision that may be as far-reaching as the 2006 changes in rules for civil proceedings that essentially created the e-discovery market, Southern District of New York Magistrate Judge Andrew Peck has issued a ruling that litigants may (that word is important) use computer-assisted review software based on “predictive technology” to help determine the relevance of documents.

Ironically, this all happens almost exactly a year after the New York Times published an article on the subject, which though it didn’t use the term “predictive coding” described the practice and its effect on the legal community. Studies have also found that computer programs are better at it than legal staff.

The “may” is important for two reasons. The first is that, due to some confusion, some people believed that Peck’s ruling, in the case of Monique Da Silva Moore, et al., Plaintiffs, v. Publicis Groupe & MSL Group, Defendants, 11 Civ. 1279 (ALC)(AJP), required the use of predictive coding, which it does not do. The second is that a different case, Kleen Products LLC v. Packaging Corporation of America, et al., still in court, does hinge on the question of requiring predictive coding.

Indeed, in the particular case to which Peck refers, the litigants agreed between themselves to use predictive coding in principle — but have been unable to agree on the details, and in fact the plaintiffs have filed an objection to Peck’s ruling, saying they are concerned that the software process is not transparent enough.

Peck’s opinion is not a surprise; last October, he wrote an article describing predictive coding and its role in e-discovery. While he uses charming phrases such as “A basic problem is that absent cooperation, the way most lawyers engage in keyword searches is, as Ralph Losey suggests, the equivalent of “Go Fish,””, one hopes he is a better judge than a prophet:

Perhaps they are looking for an opinion concluding that: “It is the opinion of this court that the use of predictive coding is a proper and acceptable means of conducting searches under the Federal Rules of Civil Procedure, and furthermore that the software provided for this purpose by [insert name of your favorite vendor] is the software of choice in this court.” If so, it will be a long wait.

Four months isn’t all that long.

Needless to say, e-discovery vendors are kvelling about the ruling, and not just because Peck uses charming phrases such as, “The Court recognizes that computer-assisted review is not a magic, Staples-Easy-Button, solution appropriate for all cases.” (Peck emphasizes that he isn’t endorsing any particular vendor.)

Clearwell, for example — recently purchased by Symantec (which had specified growth in technology-assisted review as one of its 2012 predictions) as one of the first e-discovery acquisition dominoes to fall — noted five major points about the decision:

  • The Court did not order the use of predictive coding
  • Computer-assisted review is not required in all cases
  • The opinion should not be considered an endorsement of any particular vendors or tools
  • Predictive coding technology can still be expensive
  • Process and methodology are as important as the technology utilized

Organizations that have held off on implementing predictive coding now have a green light to proceed.

February 27, 2012  10:37 PM

Facebook Starts Designing Its Own Storage

Sharon Fisher

Remember when Facebook started designing its own servers and data center?

Now it’s designing its own disk drives.

This is all supposed to be part of the company’s Open Compute initiative, according to Wired, though it’s not yet included on the website, and details were thin. (For example, it isn’t clear whether they include the hard drive thermostat the project described last summer.) However, the company said it will release its new storage designs in early May at the next Open Compute Summit.

Facebook is doing all this because it has such a heavy load — 845 million users and 140 billion digital photographs, Wired said — so savings that it can achieve in hardware, whether in the hardware itself, the power it uses, or the cooling it requires, can aggregate to quite a lot. The company has already made a number of changes to its servers to save cost, space, and heat.

For example, in its Prineville, Ore., data center, the company has eliminated chillers and uninterruptible power supplies, Wired said. The article quoted a Facebook engineer, originally from Dell, as saying that the really valuable part of storage is the disk drive itself and the software that controls how the data gets distributed to and recovered from those drives, and that the company would do what it could to eliminate the other ancillary parts, as well as make the valuable parts easier to get at and fix. For example, the company would like to eliminate the handles and screws that are currently part of some disk drives.

So why does this matter to you? Because Facebook intends to open source the storage design when it’s finished, meaning it could end up in the marketplace, as it has with its servers. So chances are, what Facebook decides will affect your data center, too.
