Our story on Google’s storage assistance to academic and research institutions focused on the Archimedes Palimpsest, but this article in Wired has some interesting further info on the Hubble telescope project, which our piece mentioned, though we did not interview anyone involved with it.
How do you get 120 terabytes of data — the equivalent of 123,000 iPod shuffles (roughly 30 million songs) — from A to B? For the most part, the old-fashioned way: via a sneakernet. It’s not glamorous, but Google engineers hope to at least end the arduous process of transferring massive quantities of data — which can literally take weeks to upload onto the internet — with something affectionately called “FedExNet” by the scientists who use it…The near totality of all the astronomical data and images that Hubble has ever collected [is] about 120 terabytes.
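To put the "weeks to upload" claim in perspective, here is a rough back-of-the-envelope sketch. The 120 terabyte figure comes from the Wired excerpt; the link speeds are hypothetical values chosen only for illustration, and the calculation assumes decimal terabytes and a fully sustained link with no protocol overhead.

```python
def transfer_days(terabytes: float, megabits_per_second: float) -> float:
    """Days needed to move `terabytes` (decimal TB) over a sustained link."""
    bits = terabytes * 1e12 * 8
    seconds = bits / (megabits_per_second * 1e6)
    return seconds / 86_400  # seconds per day


HUBBLE_TB = 120  # figure quoted in the Wired article

# Hypothetical sustained link speeds, not taken from the article
for label, mbps in [("100 Mbit/s", 100), ("1 Gbit/s", 1000)]:
    print(f"{label}: {transfer_days(HUBBLE_TB, mbps):.1f} days")
```

Under these assumptions, even a dedicated gigabit link needs more than a week and a half, which is why a box of drives on a truck remains competitive.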
Do also check out the glamour shot of Google’s open source program manager Chris DiBona posted with the article–we reckon we’ve never seen such a creative executive headshot.
In this hilarious post over at StorageMojo.com, an EMC lawyer issues a “cease and desist” order over the recent publication on the site of an EMC price list, calling it a “trade secret.” He uses some ominous language indeed in his missive, which is reprinted in full by StorageMojo blogger Robin Harris.
Now that you know the facts of the matter I expect an email from you confirming that you have examined the links and documents provided above and that you now understand that EMC’s price list is not a trade secret, despite what you were led to believe by the person who referred StorageMojo.com to you.
Also, you might want to consult with EMC’s public relations and analyst relations groups as to the advisability of continuing to press confidentiality claims against StorageMojo. The internet community – StorageMojo.com had over 100,000 visitors last month – does not take kindly to attempts to limit the free flow of information and First Amendment rights.
We have to say we also got a chuckle out of the Obi-Wan Kenobi reference.
The Alaskan Department of Revenue has just learned the hard way that your backups are only as good as your restores.
An AP report picked up by the Woonsocket Call, a local paper in Rhode Island, says the department wiped out a disk drive containing information on an account worth $38 billion. Worse still, when it turned to the backup tapes to recover the data, they were unreadable.
P.S.: Not every backup issue can be avoided, but there are resources to help. Here’s one if you have time: a free seminar that’s coming to a city near you this year.
Jonathan Schwartz has an interesting post up right now that calculates the relative transfer power of Internet networks vs. a sailboat. The sailboat wins.
“Now you understand why tape based storage has such a lasting appeal to so many enterprises recording, compiling, transporting or just plain archiving, very large quantities of data. From video surveillance to trading data. Standard tapes are 500GB each (currently), and fit nicely into cardboard boxes with overnight express labels[...]tape isn’t perfect for a lot of applications (near line storage, eg) – but it plays a prominent role in some remarkably cutting edge high performance computing applications, along with social networking and content aggregation sites (who think nothing of gathering terabytes of data every day) – tape archive isn’t just for banks or telcos running mainframes (although we’re good there, too).”
Er…we’re thinking maybe ixnay on the “cardboard box with overnight express labels” part, but Sun incidentally has at least one large customer announced to back this up.
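For a sense of why shipped tape compares well with a network link, the effective throughput of a box of tapes can be estimated as below. The 500GB-per-tape figure is from Schwartz's post; the number of tapes per box and the 24-hour transit time are assumptions made up for this sketch, and it ignores the time needed to write and read the tapes.

```python
def effective_mbps(tapes: int, gb_per_tape: float, hours_in_transit: float) -> float:
    """Effective throughput, in Mbit/s, of shipping tapes door to door."""
    bits = tapes * gb_per_tape * 1e9 * 8
    return bits / (hours_in_transit * 3600) / 1e6


# Hypothetical shipment: 20 tapes of 500GB each, delivered in 24 hours
print(f"{effective_mbps(20, 500, 24):.0f} Mbit/s")
```

Doubling the number of tapes in the box doubles the effective throughput at almost no extra cost, which is the heart of the sneakernet argument.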
Meanwhile, Schwartz’s commenters also raise some good counterpoints on the post. One supporting commenter also linked to an article about Jim Gray, founder of the Terraserver project and perhaps the biggest proponent of station-wagon data migration. Unless, of course, it’s Google, which is also biting the bandwidth bullet for some users in heavy-duty academic research.
Seagate is finally shipping its self-encrypting laptop drive, the Momentus 5400 FDE.2. We first covered the plans around these drives in July 2005, and covered them again last October, when Seagate’s big SNW announcement was that Momentus would ship…in another few months.
Now, just two weeks under the wire of its promised ship date in the first quarter of ’07, we finally have Momentus.
We’ve asked what’s taken so long, but Seagate isn’t talking.
Way back in 2005, when stegosauruses wandered the Earth, HP had yet to pretext and hundreds of fresh data breaches had yet to be reported around the world, Seagate told us they were working on bringing encryption to enterprise drives:
“We feel we have developed a technology that could be applied broadly,” said Mark Pastor, strategic marketing director for Seagate. “We see a lot of resonance in the enterprise space, because there’s a lot of confidential data out there at the enterprise level. This is a good and efficient way of accomplishing the task of encrypting data on drives.”
“You will see FDE [Full Disc Encryption] and other security capabilities and others on enterprise and other products from Seagate, across the spectrum,” according to [Seagate's executive director of global product marketing Henry] Fabian.
There’s still no sign of anything resembling an enterprise drive-level product, and after the wait for Momentus, we aren’t holding our breath.
An Amazon spokesperson sent us this email in response to our story, Users rethink Amazon S3 after performance issues:
“If you call this story balanced, then I was misled by your reporter. She only reported on companies with a negative experience that was only “balanced” by a response from Amazon to this one type of experience. If you want to balance the story, then she should write the second half of the story covering companies with positive experiences.”
Finding the ideal balance on each and every story, at short notice, is always a challenge. We called both users that Amazon provided. One was SmugMug.com, quoted in the story. The second was Jungle Disk. This company has still not returned our calls or emails. We found a third user, Mochi Media, quoted in the story. These responses, plus Amazon’s reply…
“We’ve had a few problems over the past year and each time we learned something and instituted a new process or safeguard to prevent the problem from happening in the future.”
…was the story we were able to write under the time constraints of daily news.
But should we be aiming for a perfect balance anyway? Yes and no. News is a snapshot in time of what has happened over the last 24 hours. To us, the important part is that over time, possibly over several stories, we have an accurate reflection of what Amazon S3 users think. We have asked for more S3 customers to talk with, and look forward to hearing their experiences…
Silicon Valley Watcher and its commenters have an interesting reaction to IDC’s report yesterday (commissioned by EMC) that we’re generating more data than we can store.
SVW blogger Tom Foremski writes:
How is it that we would be able to generate almost 1 zettabyte of data in the first place–without having a place to store it…?
Surely, if we can generate it, we are able to store it, because data comes to us from data storage systems…
Is IDC talking about data that we might like to store but we won’t be able to store?
Then that figure is meaningless, because there is no end of data we might want to capture and store in the future. And there is no end of these type of useless market research forecasts, imho.
Commenter Roger Bohn adds:
The IDC conclusion that “we will produce more data than we can store” is poorly explained in their report. What they mean is that the ANNUAL data production will be greater than the CUMULATIVE storage. Not a big deal: much data is stored only for days to months, if at all. Example: email, surveilance videos, Bittorrent downloads. So, there is no inherent reason why the two numbers should be directly comparable.
Meanwhile, EMC blogger Chuck Hollis says this validates his previous theory of a digital “big bang”…Hollis and EMC have an obvious vested interest here, but what we find most interesting is Hollis’s discussion of the issue of “who owns information”, as well as long-term archiving.
Speaking of who owns information, it’s not directly storage-related, but anytime Microsoft starts doing battle with Google, you just have to make some popcorn and sit back to watch…
The friendly folks at Zantaz Inc. wanted to make sure that we were aware of the news that Intel now stands accused of deleting emails material to ongoing litigation between it and rival chip maker AMD. Quoth Zantaz in an email, “this is a prime example of ‘what NOT TO DO’ in terms of email retention policies.” No kidding.
EMC has sponsored a report by IDC, out today, on the exploding growth of data. It attempts to perform a census of the amount of digital information created/copied in the world and project its growth rates. Here it is:
Here’s a taste of some of the numbers:
- In 2006, the amount of digital information created and copied worldwide was equal to 161 billion gigabytes, or 161 exabytes. In layman’s terms, that figure is roughly equivalent to three million times the information in all the books ever written – or the equivalent of 12 stacks of books, each extending more than 93 million miles from the earth to the sun.
- In 2006, if divided evenly across the global population — currently 6.6 billion (6,579,247,264) people — approximately 24 gigabytes of digital information was created per person.
- In 2010 alone, the amount of digital information created and copied worldwide will rise sixfold to 988 exabytes. The unprecedented nature of this growth is symbolized by the fact that the word “exabyte” doesn’t exist in any word processing program’s spell checker.
- The book and gigabyte analogies above will jump in 2010 to approximately 150 gigabytes per person and approximately 75 stacks of books to the sun.
- By 2010, nearly 70% of digital information will be created by individuals; however, organizations will be responsible for the security, privacy, reliability, and compliance of at least 85% of the digital universe.
Gosh, could this mean we need to buy some more storage? Joking aside, the numbers are staggering. I’d like to know more about how IDC gathered this information.
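As a quick sanity check on the figures in the list above, the sketch below recomputes the per-person number, the growth factor and the scaled book-stack analogy from the quoted totals. It assumes decimal units (1 exabyte = 1 billion gigabytes) and uses only numbers that appear in the list; it says nothing about how IDC arrived at the totals themselves.

```python
GB_PER_EXABYTE = 1e9  # decimal units: 1 EB = 10**9 GB

data_2006_eb = 161                 # exabytes created and copied in 2006
data_2010_eb = 988                 # projected exabytes for 2010
population_2006 = 6_579_247_264    # population figure quoted in the list
stacks_2006 = 12                   # book stacks to the sun, 2006 analogy

per_person_2006_gb = data_2006_eb * GB_PER_EXABYTE / population_2006
growth = data_2010_eb / data_2006_eb
stacks_2010 = stacks_2006 * growth

print(f"2006 per person: {per_person_2006_gb:.1f} GB")
print(f"2006 to 2010 growth: {growth:.2f}x")
print(f"2010 book stacks: {stacks_2010:.1f}")
```

The results land close to the rounded figures quoted: roughly 24 gigabytes per person in 2006, about a sixfold rise by 2010, and a little under 75 stacks of books.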
McAfee has announced that Dave DeWalt, EMC Corp.’s current head of customer relations, will become its new CEO, effective April 2. DeWalt is replacing George Samenuk, who was forced to step down in October because of a stock options investigation that has forced McAfee to restate 10 years’ worth of results. “I have to restore the image of McAfee,” DeWalt is quoted as saying about the transition.
The move is reminiscent of former EMC CTO George Symons’ move to Yosemite Technologies in October. We suppose it’s only natural for execs who have come up the ranks at a big company like EMC to branch out later in their careers, but we also wonder if it’s coincidence that Symons’ defection to a separate SMB backup company preceded the apparent death of EMC Retrospect by just a few months.
Meanwhile, MarketWatch reports that Symantec execs are promising more acquisitions to come. Enrique Salem, Symantec’s group president of consumer products and solutions, wouldn’t even rule out another billion-dollar deal at the Morgan Stanley Technology Conference in San Francisco. Wondering how Symantec customers feel about that…
Anyway, signing off for now–someone from a big company could be spying on us, after all.