October 6, 2011 11:37 PM
Posted by: Sharon Fisher
IBM announced this week that it had been selected for a 10-year $240 million operations and maintenance contract with the National Archives and Records Administration, but there’s a lot more to the story than that. IBM is actually taking over from Lockheed Martin after several years of a project that’s fallen behind schedule and over budget.
The project is to manage the Electronic Records Archive, and is intended to ensure the transparency of government documents, allowing broader citizen access to public records. The project was started in 2001 to preserve and provide both internal and external electronic access to the records. But it had its problems, noted Elizabeth Montalbano of Information Week:
“NARA began working on the digital archive in 2001 and in 2005 awarded Lockheed Martin a $317 million contract to develop it. However, the project has not been without its troubles along the way. Earlier this year a report by the Government Accountability Office found that the project likely will cost $1.2 billion to $1.4 billion, exceeding its estimated cost of $995 million by 21% to 41%. The report cited poor project management as the reason for the soaring costs.”
In fact, due to its inclusion in the GAO report, NARA cut some of the functionality from the project in February and decided to do no new development past September, which is what enabled IBM to get an O&M contract after the contract with Lockheed ended on September 30, the end of the federal fiscal year, about a year earlier than planned. Originally, NARA had had a sixth option year on the Lockheed Martin deal for development, and a seventh year for operations and maintenance, FederalNewsRadio.com reported.
According to the IT Dashboard, NARA has spent $383 million, which is $5.6 million more than it planned, as of Aug. 3, noted FederalNewsRadio.com. To add insult to injury, a session with the Office of Management and Budget (OMB) showed that, after five years of development, few agencies were using the system’s functionalities. As a result, the project’s budget was reduced by $215.5 million, and the project was required to deliver functionality faster, to increase the usage of the system from 80 terabytes to 122 terabytes, and to move to a modular development approach.
The project was officially launched in April, particularly with what were called three “pathfinder” agencies, so-called because of the number of requests those agencies received: Justice, Health & Human Services, and State. Twenty-seven other agencies were supposed to start bringing their records online by the end of November, while independent agencies were supposed to start bringing their records online in July, FederalNewsRadio.com noted.
But IBM’s role will be more than just maintenance and operations. An agency spokesman said that IBM would be adding functionality to the system through a series of work orders and other enhancements, in particular by improving the search system.
September 30, 2011 11:08 PM
Posted by: Sharon Fisher
One of the most interesting aspects of the announcement this week that EMC CEO Joe Tucci was planning to step down by the end of next year was how blasé everyone was about it. He wasn’t fired. He isn’t dying (so far as we know, existential aren’t-we-all-dying questions aside). He’s not part of a parade of CEOs who have come and gone. It’s just, hey, next year I’ll be 65, time to go.
Part of this, of course, is in contrast to other CEO departures this year where people were fired, dying, part of a parade, and so on. Compared to, say, HP, Apple, or HP again, respectively, the notion of a guy who became CEO ten years ago, did his job, and is leaving at a normal retirement age seems almost quaint.
Part of this, too, is the company culture. EMC may be one of the biggest storage companies out there, but it’s not a rock star consumer-driven company the way Apple is. It’s normal there for the succession to be a relatively gentlemanly affair. Tucci did his time before he became CEO, serving under the previous CEO as executive chair for two years, and will serve as executive chair for the next EMC CEO, whoever he may be (nobody’s suggesting that the next CEO of EMC might be female).
Part of it is also the lack of drama around the succession. Yes, it’s true, nobody was named as the next CEO yet, and of course there’s always the potential of a bunch of little storage Borgias backstabbing and poisoning each other. But EMC is the sort of company where people use the term “deep bench” a lot. Most articles around Tucci’s announcement (which he made to the Wall Street Journal, naturally) named at least four potential successors, any one of whom would be qualified to run the company. Nobody’s wringing their hands suggesting that EMC will have to go outside the company to find someone qualified.
Part of it is that even with his more than one-year notice, this isn’t a surprise; Tucci started talking about succession a year ago — with the same four guys as potential successors. (And nobody’s trying to out any of them, as people are doing with Apple’s Tim Cook.)
The biggest problem cited in the very few articles around the announcement (there are more articles about the fact that Tucci is going to be speaking at an Oracle conference next week than there are about his retirement) is whether he should continue to stay after stepping down, a question the Wall Street Journal raised by including EMC in a list of companies where CEOs stay on as executive chairs.
The Motley Fool is trying to beat the drum for a shareholder revolt against the fact that the next EMC CEO will continue to be both CEO and chairman, but they’re pretty alone in that.
At this point, about all we can do is wait to see who gets appointed the next EMC CEO — and there’s no timetable for that yet.
September 26, 2011 11:15 PM
Posted by: Sharon Fisher
The Electronic Frontier Foundation has announced that two vendors, Apple and Dropbox, have signed a pledge to help support its Digital Due Process initiative, which calls for a rewrite of the Electronic Communications Privacy Act to better protect user data.
The initiative has more than 50 members, including Amazon, AT&T, Facebook, Google, Microsoft, Twitter, and Yahoo!, which were called out in April as being major computer vendors that should support the proposal. The proposal’s steps include telling users about data demands, being transparent about government requests, fighting for user privacy in the courts, and fighting for user privacy in Congress. Companies received from one to four stars (including partial stars) depending on how well they are implementing each of these policies.
Dropbox was a particularly interesting addition, because the company has been criticized about its policies regarding protecting user data in its cloud storage service.
Other vendors of the 13 that the EFF called out in April that have not yet responded include Comcast, Myspace, Skype (since purchased by Microsoft, which is a member), and Verizon.
Organizations such as the American Civil Liberties Union and the Center for Democracy & Technology are also members.
September 20, 2011 1:30 PM
Posted by: Sharon Fisher
It’s typically a good idea to take vendor surveys with a grain of salt; they tend to be slanted and unscientific. Not so with Symantec; they have actual scientific surveys with margins of error and everything.
Not to say, of course, that they’re completely unbiased; recall in this case that Symantec purchased Clearwell earlier this year in an attempt to improve its ranking after a recent Gartner Magic Quadrant on eDiscovery vendors.
That said, its Information Retention and eDiscovery Survey makes some interesting points, not the least of which is actual evidence from users that implementing an information retention policy saves money.
- Respondents using best practices reported a 64% faster response time with a 2.3 times higher success rate when responding to eDiscovery requests.
- They were 78% less likely to be sanctioned by the courts and 47% less likely to find themselves in a compromised legal position.
- They were also 20% less likely to have fines levied against them. In addition, they were 45% less likely to disclose too much information.
That said, many respondents indicated that they had not yet implemented an information retention plan.
- Nearly half of respondents do not have an information retention plan in place.
- 30% are only discussing how to do so.
- 14% have no plan to do so.
- When asked why they don’t have information retention programs, respondents indicated the top reasons are: lack of need (41%), too costly (38%); nobody has been chartered with that responsibility (27%); don’t have time (26%); and lack of expertise (21%).
The part about “too costly” is particularly telling in light of the results.
Respondents who said they’d been asked to respond to a legal, compliance or regulatory request for electronically stored information reported the following results:
- Completely failed to fulfill the request: 10%
- Partially failed to fulfill the request: 10%
- Successfully fulfilled the request, but more slowly than the requestor would like: 25%
- Successfully fulfilled the request in a timeframe that is acceptable to the requestor: 35%
How this correlated with whether organizations had an information retention strategy in place, Symantec didn’t say.
Finally, in situations where an organization did not successfully fulfill the request in an acceptable timeframe, respondents reported the following results:
- Damage to enterprise reputation or embarrassment: 42%
- Fines: 41%
- Compromised legal position: 38%
- Sanctions by courts: 28%
- Hampered our ability to make decisions in a timely fashion: 26%
- Raised our profile as a potential litigation target: 25%
Again, not clear how these correlated with the different types of organizations, but useful to have some specific information about results and sanctions.
September 14, 2011 12:25 PM
Posted by: Sharon Fisher
The thing is, it’s true. Even though Internet speeds continue to increase, the amount of data we want to transmit continues to increase, too.
Which is why the various Internet denizens have developed... workarounds for large file transfers, which also provide the opportunity for the wonderful Internet pastime of geekly arguing.
Which brings us to station wagons, pigeons, and Blu-ray.
The canonical statement, by Andrew Tanenbaum in his 1996 book Computer Networks, is basically “Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.” And ever since then, there have been numerous websites devoted to how-many-angels-can-dance-on-the-head-of-a-pin discussions about just what that bandwidth would be.
You can tell how old the websites are based on what figures they use for comparable Internet bandwidth, the size of a magnetic tape, and so on. The Wikipedia entry for “Sneakernet” appears to have the most up-to-date calculations.
(The actual calculation using today’s technologies is left as an exercise for the reader.)
The Internet being the Internet, the calculations have been extended, ranging from petabytes in a sailboat to Blu-ray discs in a 747 (which, as it turns out, would actually be too heavy for a 747 to carry), to, more mundanely, the number of SD cards that fit into a Fed Ex box — as well as the bandwidth of a Netflix movie shipment through the mail.
And then there’s the pigeons.
Really truly, carrier pigeons have been used for a remarkable amount of data transfer in history — not just short messages, and aerial photography predating satellites, but things like blueprints from military installations in the U.S.
In fact, in 1982, Computerworld ran an article about how Lockheed Missile & Space Co. used pigeons to carry microfilm copies of blueprints to a research facility in Santa Cruz, because it was cheaper than printing out and transporting hard copies. And if you have $100 per half hour for someone to dig it up, you can apparently get a copy of Dan Rather introducing a story about it on CBS News.
Consequently, not one but two April Fool’s Internet protocols were developed — Transmission of IP Datagrams on Avian Carriers, and Transmission of IP Datagrams on Avian Carriers with Quality Control — for transmitting Internet data by carrier pigeon. The first one was even demonstrated, and while the experiment left something to be desired, Wikipedia points out that “during the last 20 years, the information density of storage media and thus the bandwidth of an Avian Carrier has increased 3 times faster than the bandwidth of the Internet.”
That’s not all. In various remote areas, such as rural U.K., Australia, and parts of South Africa, people have used carrier pigeons to demonstrate that they’re faster than what passes for high-speed Internet there.
The point is this: No matter how fat a pipe you have to the Internet, at some given amount of data, it’s going to be faster, cheaper, or both to use some manual method to ship data on some storage medium. It makes sense for you to do a back-of-the-envelope calculation to figure out where the data boundaries are for different mediums and different shipping methods, and update them as technology changes.
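A back-of-the-envelope calculation of that kind might look like the following sketch. The capacity, delivery time, and link speed are illustrative assumptions rather than measurements, and the function names are made up for this example.

```python
def shipment_bandwidth_mbps(capacity_tb, transit_hours):
    """Effective bandwidth, in megabits per second, of physically shipping media."""
    bits = capacity_tb * 1e12 * 8          # decimal terabytes -> bits
    seconds = transit_hours * 3600
    return bits / seconds / 1e6


def break_even_tb(link_mbps, transit_hours):
    """Data size (TB) at which a network transfer takes as long as the shipment."""
    bits = link_mbps * 1e6 * transit_hours * 3600
    return bits / 8 / 1e12


if __name__ == "__main__":
    # Assumed scenario: a box of media holding 10 TB, delivered in 24 hours,
    # compared with a 100 Mbit/s Internet link.
    print(round(shipment_bandwidth_mbps(10, 24), 1), "Mbit/s for the box")
    print(round(break_even_tb(100, 24), 2), "TB is the break-even size")
```

Anything larger than the break-even size arrives sooner in the box than over the wire, under those assumptions. Rerunning the numbers with a different medium, courier, or link speed moves the boundary, which is the reason to redo the sums as technology changes.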
September 7, 2011 4:46 PM
Posted by: Sharon Fisher
Tape’s not dead. Really. Products supporting the Linear Tape Open (LTO) 5 specification just began shipping this year, but already vendors are starting to make noises about LTO 6, for which there isn’t even an availability date announced yet.
In sort of the tape storage equivalent to Moore’s Law, a consortium of three vendors — Hewlett Packard, IBM, and Quantum, known as the Technology Provider Companies (TPC) — get together every few years and decide upon specifications for tape cartridges with a steady increase in speed and capacity. This helps keep users convinced that there’s still a future for tape.
For example, the specifications for LTO 5 (as well as LTO 6) were announced in December 2004, but it took until January 2010 before licenses for the LTO 5 specification were available, and products supporting it started to be available in the second quarter of that year.
Similarly, the LTO TPCs announced in June of this year that licenses for the LTO 6 specification were available. By extrapolation, one can assume that LTO 6 products could be announced any day.
LTO 6 is defined as having a capacity of 8 TB with a data transfer speed of up to 525 MB/s, assuming a 2.5:1 compression. This is in comparison to LTO 5, which has a capacity of 3 TB with a data transfer speed of up to 280 MB/s, assuming a 2:1 compression.
Lest people get fidgety about the future of tape after that, the LTO TPC announced this spring the next two generations, LTO 7 and LTO 8, with compressed capacities of 16 TB and 32 TB and data transfer speeds of 788 MB/s and 1180 MB/s, respectively. As with LTO 6, no dates were announced, but one might expect each to come out about two to three years after its predecessor.
The thing to remember, also, is that each LTO generation can typically only read two generations before it, meaning users need to either rewrite their tape library every few years or keep a bunch of old LTO machines around. “By the time LTO 8 is released, organizations will need, at a minimum, LTO 3 drives to read LTO 1 through LTO 3 cartridges; LTO 6 drives to read LTO 4 through LTO 6 cartridges; and LTO 8 drives to read the LTO 7 and LTO 8 cartridges,” wrote Graeme Elliott earlier this year.
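The drive list in that quote can be worked out mechanically. Here is a minimal sketch, assuming only the rule stated above (a drive reads its own generation and the two before it); the function name and the `read_back` parameter are made up for illustration.

```python
def drives_needed(oldest_gen, newest_gen, read_back=2):
    """Pick the fewest drive generations that can read every cartridge generation.

    Assumes a drive of generation g reads cartridges g - read_back through g.
    """
    drives = []
    gen = oldest_gen
    while gen <= newest_gen:
        # The newest drive that still reads `gen` also covers the most later generations.
        drive = min(gen + read_back, newest_gen)
        drives.append(drive)
        gen = drive + 1
    return drives


if __name__ == "__main__":
    print(drives_needed(1, 8))   # drive generations to keep for LTO 1 through LTO 8
```

For a library spanning LTO 1 through LTO 8, this picks LTO 3, LTO 6, and LTO 8 drives, the same three named in the quote.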
August 31, 2011 2:46 PM
Posted by: Sharon Fisher
The best part about IBM’s experimental 120-petabyte hard drive is reading all the ways that writers try to explain how big it is.
- 2.4 million Blu-ray disks
- 24 million HD movies
- 24 billion MP3s
- 1 trillion files
- Eight times as large as the biggest disk array available previously
- More than twice the entire written works of mankind from the beginning of recorded history in all languages
- 6,000 Libraries of Congress (a standard unit of data measure)
- Almost as much data as Google processes every week
- Or, four Facebooks
It is not one humungo drive; it is, in fact, an array of 200,000 conventional hard drives (not even solid-state disk) hooked together (which would make them an average of 600 GB each).
Unfortunately, you’re not going to be able to trundle down to Fry’s and get one anytime soon. No, this is something being put together by the IBM Almaden research lab in San Jose, Calif., according to MIT Technology Review.
What exactly it’s going to be used for IBM wouldn’t say, only that it was “an unnamed client that needs a new supercomputer for detailed simulations of real-world phenomena.” Most writers speculated that that meant weather, though Popular Science thought it could be used for seismic monitoring — or by the NSA for spying on people.
Like the Cray supercomputer back in the day, and some high-powered PCs even now, the system is reportedly water-cooled rather than fan-cooled.
Needless to say, it also uses a different file system than a typical PC: IBM’s General Parallel File System (GPFS), which according to Wikipedia has been available on IBM’s AIX since 1998, on Linux since 2001, and on Microsoft Windows Server since 2008, and which some tests have shown can work up to 37 times faster than a typical system. (The Wikipedia entry also has an interesting comparison with the file system used by big data provider Hadoop.)
“GPFS provides higher input/output performance by ‘striping’ blocks of data from individual files over multiple disks, and reading and writing these blocks in parallel.”
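To make the striping idea concrete, here is a toy sketch of dealing a file’s blocks round-robin across several disks and reading them back in order. It illustrates the general technique only, not GPFS’s actual implementation; the function names and block size are assumptions, and a real system would issue the per-disk reads and writes in parallel rather than in a loop.

```python
def stripe(data, num_disks, block_size):
    """Split data into fixed-size blocks and deal them round-robin across disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def reassemble(disks):
    """Read the blocks back in their original order by cycling through the disks."""
    blocks = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return b"".join(blocks)


if __name__ == "__main__":
    payload = bytes(range(10))
    layout = stripe(payload, num_disks=3, block_size=2)
    print(layout)                          # which blocks landed on which disk
    print(reassemble(layout) == payload)   # round trip check
```

Because consecutive blocks sit on different disks, a large sequential read can keep every disk busy at once, which is where the throughput gain comes from.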
The system also has a kind of super-mondo RAID that lets dying disks store copies of themselves and then get replaced, which reportedly gives the system a mean time between failure of a million years.
Technology Review didn’t say how much space it took up, but if a typical drive is, say, 4 in. x 5.75 in. x 1 in., we’re talking 4.6 million cubic inches just for the drives themselves, not counting the cooling system and cables and so on. That’s a 20-ft. x 20-ft. square about 6.7 feet high, just of drives. (These are all back-of-the-envelope calculations.)
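For anyone who wants to rerun the arithmetic, here is a short sketch using the figures quoted in this post. The drive dimensions are the assumed ones above, not a published specification, and the variable names are made up for the example.

```python
TOTAL_BYTES = 120e15                 # 120 petabytes, decimal
DRIVE_COUNT = 200_000
DRIVE_DIMS_IN = (4.0, 5.75, 1.0)     # assumed drive size in inches, from the text

# Average capacity each drive must supply.
per_drive_gb = TOTAL_BYTES / DRIVE_COUNT / 1e9

# Volume of the drives alone, ignoring cooling, cabling, and enclosures.
drive_volume = DRIVE_DIMS_IN[0] * DRIVE_DIMS_IN[1] * DRIVE_DIMS_IN[2]
total_cubic_inches = drive_volume * DRIVE_COUNT

# Height of that volume if packed solid onto a 20 ft x 20 ft footprint.
footprint_sq_in = (20 * 12) * (20 * 12)
height_ft = total_cubic_inches / footprint_sq_in / 12

print(f"{per_drive_gb:.0f} GB per drive")
print(f"{total_cubic_inches:,.0f} cubic inches of drives")
print(f"{height_ft:.2f} ft high on a 20 ft x 20 ft footprint")
```

Changing the assumed drive dimensions or the footprint is a one-line edit.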
In fact, the system needs two petabytes of its storage just to keep track of all the index files and metadata, Technology Review reported.
August 24, 2011 11:16 PM
Posted by: Sharon Fisher
In the winter, I keep my thermostat set to a particular temperature. When I leave the house, or go to bed, I turn the thermostat down, and when I get home or wake up, I turn it back up. This ensures that the house is comfortable when I’m using it, and more energy-efficient when I’m not.
Now, someone is talking about doing the same thing for hard disk drives.
Eran Tal, a hardware engineer at Facebook, is talking about the idea. In case you didn’t know, Facebook has some of the largest data centers in the world, and has begun publicizing some details of their design to help other data center managers leverage what Facebook has learned in the process.
Consequently, earlier this year, Facebook created what it called the Open Compute Project, which is, essentially, to hardware design what open source is to software design. Thus far, the site’s blog has a grand total of two postings, along with a number of comments on them.
And that’s where Tal comes in. A few days ago, he made one of those two posts, musing about what it would be like to have hard disks with a toggle switch between low speed and high speed, so that as the data on them became older and less actively used, the switch could be toggled to put the hard disks on a lower speed — saving energy in the process, without having to do the data migration that active tiering requires.
Reducing HDD RPM by half would save roughly 3-5W per HDD. Data centers today can have up to tens and even hundreds of thousands of cold drives, so the power savings impact at the data center level can be quite significant, on the order of hundreds of kilowatts, maybe even a megawatt. The reduced HDD bandwidth due to lower RPM would likely still be more than sufficient for most cold use cases, as a data rate of several (perhaps several dozen) MBs should still be possible. In most cases a user is requesting less than a few MBs of data, meaning that they will likely not notice the added service time for their request due to the reduced speed HDDs.
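The data-center figures follow directly from the per-drive estimate. A minimal sketch, using the 3-5 W range quoted above and drive counts chosen here purely to represent “tens and even hundreds of thousands”:

```python
def fleet_savings_kw(cold_drives, watts_saved_per_drive):
    """Total power saved, in kilowatts, when every cold drive spins at reduced RPM."""
    return cold_drives * watts_saved_per_drive / 1000


if __name__ == "__main__":
    # Drive counts are assumptions for illustration, not Facebook's actual numbers.
    for drives in (50_000, 100_000, 200_000):
        low = fleet_savings_kw(drives, 3)
        high = fleet_savings_kw(drives, 5)
        print(f"{drives:>7,} drives: {low:,.0f} to {high:,.0f} kW")
```

At 200,000 cold drives and 5 W saved apiece, the total reaches 1,000 kW, which is the megawatt mentioned in the post.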
Once upon a time (seven whole years ago) there was a vendor that did something like this: Copan, with what it called its Massive Array of Idle Disks (MAID) technology, produced disk arrays in which only up to 25% of the drives were on at a time. Unfortunately, after getting new funding as recently as February 2009, Copan declared bankruptcy in 2010 and was bought by SGI (yes, it’s still around), which still markets the technology, after a fashion at least.
Several other vendors, including Nexsan with its AutoMAID technology, also have products in this area.
The big trick with any of these systems is ensuring that the data on them really isn’t used very much, because it can take up to 30 seconds for the disk to start from zero, and up to 15 seconds from the slower speed. But as Derrick Harris of GigaOm writes, the savings for a data center the size of Facebook’s can be considerable, and the technology could end up trickling down in the process.
August 18, 2011 9:20 PM
Posted by: Sharon Fisher
Another e-Discovery vendor has been purchased: Hewlett-Packard has announced its intent to purchase UK vendor Autonomy — which, like Symantec purchasee Clearwell earlier this year, was also in the Leaders section of Gartner’s e-Discovery Magic Quadrant released in May.
In that report, Gartner predicted that consolidation would have eliminated one in four enterprise e-Discovery vendors by 2014, with the acquirers likely to be mainstream companies such as Hewlett-Packard, Oracle, Microsoft, and storage vendors. Autonomy itself acquired Iron Mountain’s archiving, e-discovery and online backup business in May for US$380 million in cash.
HP offered the US equivalent of $42.11 per share for Autonomy, which it said was a 64% premium over the one-day stock price and a 58% premium over the one-month average stock price. The overall price is on the order of $10 billion.
“Autonomy is a brand and marketing powerhouse that appears on many clients’ shortlists,” Gartner said in its earlier report. “Although we have seen little appetite for ‘full-service e-discovery platforms’ from clients as yet, Autonomy is positioned to seize these opportunities when they do arise; indeed, the overall market may evolve in that direction.”
HP’s chief executive officer, Leo Apotheker, formerly of SAP, has said he wants to focus on higher-margin businesses such as software and de-emphasize the personal computer business, said the New York Times. The company also said it is eliminating its WebOS business and is reportedly considering spinning off its PC business, just a decade after acquiring major PC vendor Compaq.
The AP, in fact, went so far as to say:
“[T]he decision to buy Autonomy also marks a change of course for HP, one that makes HP’s trajectory look remarkably similar to rival IBM’s nearly a decade ago. IBM, a key player in building the PC market in the 1980s, sold its PC business in 2004 to focus on software and services, which aren’t as labor- or component-intensive as building computer hardware.”
However, such a transition may not be easy, said an article in the Wall Street Journal, which examined how IBM had made that transition.
The Autonomy deal offered another advantage to HP, noted a different New York Times article. Like Microsoft’s purchase of Skype earlier this year, it gives HP the opportunity to spend money it had earned outside the U.S. — reportedly as much as $12 billion — without having to pay taxes on that money by bringing it into the U.S.
Other e-Discovery vendors include FTI Technology, Guidance Software, and kCura, the remaining vendors in the “Leaders” section in the Gartner Magic Quadrant. Less attractive, but also likely to be less expensive and, maybe, more desperate, will be the other vendors, such as AccessData Group, CaseCentral, Catalyst Repository Systems, CommVault, Exterro, Recommind and ZyLab in the “visionaries” quadrant, and Daegis, Epiq Systems, Integreon, Ipro, Kroll Ontrack, as well as the e-Discovery components of Lexis/Nexis and Xerox Litigation Services in the “niche” quadrant.