Jesse at SanGod wrote an interesting post the other day entitled “Enterprise storage vs….not.”
I have a cousin. Very well-to-do man, owns a company that does something with storing and providing stock data to other users. I don’t pretend to know the details of the business, but what I do know is that it’s storage and bandwidth intensive.
He’s building his infrastructure on a home-grown storage solution – Tyan motherboards, Areca SATA controllers, infiniband back-end, etc. Probably screaming fast but I don’t have any hard-numbers on what kind of performance he’s getting.
Now I understand people like me not wanting to invest a quarter-mil on “enterprise-class” storage, but why would someone whose complete and total livelihood depends on their storage infrastructure rely on an open-source, unsupported architecture?
Jesse goes on to point out the resiliency and services benefits of shelling out enterprise bucks. His post sparked a conversation between me and an end user I know well whose shops (in the two jobs I’ve followed him through) are as enterprise as they come. This guy also knows his way around a Symmetrix, and Clariion, and NetApp filers, and when it comes to the secondary disk storage and media servers he’s building for his beefy Symantec NetBackup environment…he’s going with Sun’s Thumper-based open-source storage.
Obviously it’s a little different from cobbling together the raw parts, and Sun offers support on this, so it’s kind of apples and oranges compared with what Jesse’s talking about. But I’ve also heard similar withering talk about Sun’s open storage in particular, and can only imagine Sun’s open-source push is making this topic timely.
This is the second person I’ve talked to from a big, booming enterprise shop who picked Thumper to support NetBackup. The first, who had the idea more than a year ago, was a backup admin from a major telco I met at Symantec Vision.
Obviously it’s not mission-critical storage in the sense that Symmetrix or XP or USP are, but I’d venture to guess that for a backup admin, his “complete and total livelihood” does depend on this open-source storage. As for the reasons to deploy it instead of a NetApp SATA array or EMC Disk Library or Diligent VTL? Both users cited cost, and the one I talked to more recently had some pointed things to say about what enterprise-class support often really means (see also the Compellent customer I talked with last week, who found that the dollars he spent made him less appreciative of the support he got from EMC).
This ties in with a recent conversation I had with StorageMojo’s Robin Harris. He compares what’s happening in storage to the relationship between massively parallel systems and the PC in the era of the minicomputer. When the PC arrived, the workstation market was dominated by makers of minicomputers, the most famous being Digital. Minicomputers were proprietary, expensive and vertically integrated with apps by vendors, much like today’s storage subsystems. Just as the PC introduced a low-cost, industry-standard workstation and the concept of a standardized OS, Harris predicts clustered NAS products built on lower cost, industry-standard components will bring about a similar paradigm shift in enterprise storage.
While there will obviously remain use cases for all kinds of storage (after all, people still run mainframes), I suspect people are starting to think differently about what they’re willing to pay for storage subsystems in the enterprise, regardless of the support or capabilities they’d get for the extra cash. And I do think that on several fronts, whether open-source storage or clustered NAS, it is looking, as Harris put it, like the beginnings of a paradigm shift similar to those that have already happened with PCs and servers.
That’s not to say I think Sun will win out, though. For all Sun’s talk about the brave new world of open-source storage, I haven’t heard much emphasis placed on the secondary-storage use case for it. And that so far is the only type of enterprise deployment for Thumper I’ve come across in the real world.
I’ve watched the story unfold about Microsoft and Yahoo, but from a removed perspective because it has little to do with the storage industry and when it comes to most things Web-based and search or email related, I’m a Google user. Still, it’s been a good story to sit back with some popcorn and watch develop.
Recently, though, it’s hit home a little more for me. First, I saw that the New York Times/AP reported that the co-founders of Flickr, a photo sharing service bought by Yahoo in 2005, have left the company. Then I found out that the founder of Del.icio.us is also leaving Yahoo–which was the first time I even realized Del.icio.us was a Yahoo property.
Now I wonder two things–1) How many other staples of my Web 2.0 life are part of Yahoo and I didn’t know it? (One helpful resource for this question: TechCrunch has posted a big table to keep track of the Yahoo exodus). 2) What’s going to happen to them?
It’s as close as I’ll ever come to the experience my enterprise storage audience must have regularly when dealing with the effects of mergers and acquisitions. Anxiety frequently accompanies these events, causing people to wonder how the user experience will change with the product, how support might change, how well might the company keep up with features…
It’s not like products can’t survive without their original innovators, and for the moment, Yahoo does still exist as we know it (though there’s speculation that’s not for long). But I have seen in the storage industry how, after the guys who first built the machine in the garage leave the company, innovation diminishes, and the company itself is more likely to move on to the next shiny object.
That’s what I’m afraid will happen now to Flickr and Del.icio.us, and then I’d have to face another nightmare common among enterprise folks–how to get my 8,000-plus photos and 2,000-plus bookmarks migrated over to another service.
I wasn’t convinced at first when an alert blog reader flagged an error in my previous posts about Symantec and SwapDrive: a comment from “kataar” pointed out that yearly, SwapDrive actually charges $500 (five hundred) for 2 GB, not $50 (fifty).
That couldn’t possibly be right, I thought. I clicked the site, saw the same price list, read down the column for individual users–ah! 2 GB, $50. I was all ready to post a reply when I went back and checked it one more time, just to be sure. That’s when I noticed “Monthly” over the cost I was looking at. Under “Yearly” was, indeed, $500. For 2 GB of storage per year. For multi-user plans of up to 10 GB, the yearly cost is $2,800.
My bad. And thanks to kataar!
EMC, of course, is having a field day with this. Even comparing a relatively modest price of $49.50 a year (you’ll notice Mark Twomey made the same mistake I did), they are only too happy to point out that you can get 2 GB of storage free from Mozy (I’ll let the irony of EMC gloating about another vendor’s pricing pass for now). Meanwhile, you can get up to 5 GB free from Windows SkyDrive, GMail will give you a 2 GB inbox for free, and Carbonite will let you back up unlimited capacity to its cloud for $49.95 per year.
I’ve heard of some of the older data hosting services, like certain specialized deals with Iron Mountain, costing in the neighborhood of SwapDrive’s quoted price, but I haven’t heard of too many in the consumer/SOHO/SMB space charging on that scale.
When I asked Symantec about the pricing, this was the response: “SwapDrive’s current online pricing will keep pace with the market and the value derived. Our service is more robust and redundant than many others offered in the market today.” The spokesperson added that 2 GB of online storage comes included with Norton 360 for an MSRP of $79.99.
I’d really like to learn more about exactly what makes SwapDrive hundreds of dollars more robust and redundant per year. And what makes it worth $500 standalone but worth some percentage of $80 with Norton 360? That seems like a big swing to me.
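For what it’s worth, the per-gigabyte arithmetic behind that swing can be sketched in a few lines of Python. The prices are the ones quoted in this post; the function name is mine, and charging the entire Norton 360 MSRP against its 2 GB of storage is a deliberately generous assumption, since the bundle obviously includes more than storage.

```python
def cost_per_gb_per_year(yearly_price, capacity_gb):
    """Yearly price divided by capacity, in dollars per GB per year."""
    return yearly_price / capacity_gb

# Prices as quoted in this post.
swapdrive_single = cost_per_gb_per_year(500.00, 2)    # individual plan, 2 GB
swapdrive_multi = cost_per_gb_per_year(2800.00, 10)   # multi-user plan, up to 10 GB

# Assumption: the whole $79.99 MSRP is charged against the 2 GB of storage,
# which overstates what the storage costs inside the bundle.
norton_bundle = cost_per_gb_per_year(79.99, 2)

# The $50 monthly plan, paid for twelve months, versus the $500 yearly plan.
monthly_plan_over_a_year = 50.00 * 12

for label, value in [
    ("SwapDrive individual", swapdrive_single),
    ("SwapDrive multi-user", swapdrive_multi),
    ("Norton 360 bundle (upper bound)", norton_bundle),
]:
    print(f"{label}: about ${value:.0f} per GB per year")
```

Even with that generous assumption, the standalone plan works out to roughly six times the bundled price per gigabyte.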
Tory Skyers’ post about dedupe and the law jogged my memory about recent conversations I’ve had with users about data compliance and archiving. It’s become a big topic for this industry, and as stewards of data, storage managers are part of the legal e-Discovery process.
But some storage managers are beginning to draw a line when it comes to the extent of their role in that process. A discussion about compliance only goes so far these days before frustration starts to show. Someone from a municipal government shop I met at Symantec Vision last week extolled the virtues of Symantec’s Enterprise Vault for data retention and said his organization has policies for dealing with litigation. But he was clear that his role in the process involves managing bits on disk, period. “I don’t delete anything without the department that owns it giving me explicit instructions,” he said. “It’s not up to me to decide to delete data–it’s up to me to keep the storage and backups running on whatever data departments want to keep.”
This week I spoke to a storage guy from a hospital about email management and archiving, and he told me his shop deletes all email after 60 days. “We wrote policies that say we don’t keep email very long because of the storage cost,” he said, and then added that he’d been told by some vendors pushing archiving that a short enough retention period could “make him look guilty.”
“I’m not guilty of anything,” retorted the user. “I’m an IT guy trying to keep email running.”
And he’s right. As long as a company’s retention policy is clearly defined and followed scrupulously, it can be just about any length of time.
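To make “clearly defined and followed scrupulously” concrete, here is a minimal Python sketch of a retention rule applied mechanically. The 60-day figure is the hospital’s; the message structure, field names and function name are hypothetical, and a real mail system would expose its own API for this.

```python
from datetime import datetime, timedelta

RETENTION_DAYS = 60  # the retention period the hospital's policy specifies

def messages_past_retention(messages, now, retention_days=RETENTION_DAYS):
    """Return the messages older than the retention window.

    `messages` is a list of dicts with a 'received' datetime (a hypothetical
    structure used only for illustration).
    """
    cutoff = now - timedelta(days=retention_days)
    return [m for m in messages if m["received"] < cutoff]

# Example: one message well past 60 days, one still inside the window.
now = datetime(2008, 6, 20)
mailbox = [
    {"id": 1, "received": datetime(2008, 3, 1)},
    {"id": 2, "received": datetime(2008, 6, 1)},
]
to_delete = messages_past_retention(mailbox, now)
print([m["id"] for m in to_delete])
```

The point of writing the rule down this way is that the same cutoff applies to every mailbox, every time, with no judgment call left to the storage admin.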
As everybody and their uncle tries to get in on selling e-Discovery products and services, new players emerge and the competition gets fiercer. It sounds to me like this is leading some vendors to use scare tactics to push sales by exaggerating how much liability the storage people have when it comes to data compliance and retention. Analysts increasingly agree that organizations of sufficient size should dedicate a liaison between IT and corporate governance to oversee policy instead of tossing legal liability onto the shoulders of IT.
The problem is, IT people remain responsible for understanding and following policies. They also may be called upon to testify as to what those policies are. While I don’t think users should have to take on the legal burden alone, I hope they’re not being pushed too far in the opposite direction, so caught up in shrugging off false expectations that they aren’t mindful of the real ones.
Data deduplication is the poster child of 2008. Everyone is rushing to add this capability to just about everything that could possibly ever sit on a network–I thought I saw an ad for a cable tester with de-dupe built in! On the face of it, de-dupe looks like the savior it’s made out to be (except in very isolated instances where it actually inflates the size of stored data, but that’s another subject for another time).
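For anyone who hasn’t looked under the hood, here is a toy Python sketch of the general idea: split the data into chunks, fingerprint each chunk, and keep only one copy per fingerprint. The fixed 4 KB chunk size, the function names and the in-memory dictionary are all assumptions for illustration; shipping products use their own (often variable-size) chunking and on-disk indexes.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, for illustration only

def dedupe(data, chunk_size=CHUNK_SIZE):
    """Split data into chunks and keep one copy of each distinct chunk."""
    store = {}   # fingerprint -> chunk bytes, each unique chunk stored once
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        store.setdefault(digest, chunk)
        recipe.append(digest)
    return store, recipe

def restore(store, recipe):
    """Rebuild the original bytes from the chunk store and the recipe."""
    return b"".join(store[d] for d in recipe)

def stored_size(store, recipe):
    """Bytes kept: the unique chunks plus one fingerprint per original chunk."""
    return sum(len(c) for c in store.values()) + sum(len(d) for d in recipe)

# Highly repetitive data shrinks dramatically.
repetitive = b"A" * (CHUNK_SIZE * 10)
store, recipe = dedupe(repetitive)

# Data with no repeated chunks ends up slightly larger than the original,
# because the fingerprints are pure overhead.
unique = b"".join(bytes([i]) * CHUNK_SIZE for i in range(4))
u_store, u_recipe = dedupe(unique)
```

The second case is a small-scale version of the inflation mentioned above: with nothing to deduplicate, the index is all cost and no savings.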
But take a look a little deeper with my paranoid, curmudgeon-y, semi-lawyer-esque hat on.
De-dupe technology has been likened to “zip” on the fly (no pun intended), which is where I have a couple of problems while wearing my pseudo-legal hat. The first is the act of compression. Way back in the olden days of computing there was a product appropriately named Stacker; its purpose in life was to allow you to fit more on the ridiculously expensive devices we had in our computers called “hard drives”. Microsoft, not content with Stac backing out of a licensing deal, created DoubleSpace (got sued and lost), then DriveSpace (DOS 6.22).
Via the use of a TSR (even the acronym is dusty), these products would intercept all calls destined for your hard drive and compress the data before it got there. Sound familiar? Those disk compression tools had their run. I used them, but they presented problems with memory management (this was back when Bill Gates had decided no one would ever need more than 640KB), amongst other things. This presented a phenomenally large problem when I would load up one of my favorite games at the time, Falcon 3.0 from Spectrum Holobyte. Falcon fans know what sorts of contortions one had to endure to get enough lower memory to run Falcon, but I digress.
So I would try to get around having Stacker or DoubleSpace turned on all the time. That didn’t work out well for me, and I spent quite a bit of time compressing and re-compressing my hard drive, enabling and disabling Stacker and DoubleSpace and setting up various non-compressed partitions.
While I don’t see that specific instance as an issue now per se, I do have that (bad) experience, and because of it I have a problem with something sitting inline with my data, compressing it with a proprietary algorithm that I can’t undo if/when the device decides it doesn’t like me anymore. Jumping back 16 years, it wasn’t that hard to format and reinstall DOS, which was a small part of my (then gigantic) 160MB ESDI hard drive, to get around the problems I had. But today when we are talking about multiple Terabytes and such, I want to be sure that I can get to my data unfettered when I need it.
The reason I am paranoid about getting access to my data when I need it: compliance and legal situations. Which brings me to my second point. How will de-dupe stand up in court? Is it even an issue? Is compression so well understood and accepted that it wouldn’t even be a problem? Even as paranoid as I am I would have to say … maybe.
Compression has been around for a very long time. We are used to it, we accept it, and we accept some of its shortcomings (ever try to recover a corrupted zip file?) and its limitations, but will that stand up in court? In today’s digital world there are quite a few things that are being decided in our court systems that may not necessarily make sense. Are we sure our legislators understand the differences between zip (lossless) and JPEG (lossy) compression? How does the act of compressing affect the validity of the data? Does it affect the metadata or envelope information? The answers to these questions, while second nature for us technology folks, may not be so second nature for the people deciding court cases. Because compressing and decompressing data is a physical change to the data itself, I can imagine a lawyer trying to invalidate data based on that fact.
I hope that doesn’t turn out to be the case. The de-dupe products currently on the market have some astounding technology and performance. They also return quite a bit to the bottom line when used as prescribed, and the solid quantifiable return on investment they represent does for most outweigh any risks.
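On the lossless question raised above, the technical half of the answer is easy to demonstrate. This is a minimal Python sketch using the standard library’s zlib (a stand-in for whatever algorithm a given product actually uses, which I can’t speak to): fingerprint the data, compress it, decompress it, and compare fingerprints. The function names are mine.

```python
import hashlib
import zlib

def fingerprint(data):
    """SHA-256 digest of the data, as a hex string."""
    return hashlib.sha256(data).hexdigest()

def round_trip_is_lossless(original):
    """Compress and decompress, then check the bytes come back identical."""
    compressed = zlib.compress(original)
    restored = zlib.decompress(compressed)
    return fingerprint(restored) == fingerprint(original)

# A hypothetical message body, repeated so there is something to compress.
sample = b"From: someone@example.com\r\nSubject: retention policy\r\n\r\n" * 100
print(round_trip_is_lossless(sample))
```

A matching fingerprint before and after is the kind of evidence a technologist would point to; whether a court finds it persuasive is the open question.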
I had a technology demo Tuesday with Xiotech, where they showed off their new baby, the Emprise storage system. A technology demo might seem like a worse fate than death to most, but I appreciate the opportunity to get out from behind my phone and computer screen and actually see things in the flesh (or silicon, as it were).
Xiotech’s reps showed me a pre-recorded demo of the Emprise self healing process, including automated power cycling on a drive and the process of copying data off a drive to the others in its DataPac storage unit, remanufacturing the drive, and bringing it back online, restriping the data. Lots of blinky lights and bar graphs of I/O going up and down.
To say Xiotech officials are excited about Emprise would be a vast understatement. But in the midst of discussing power supply and airflow designs, SCSI command sets and their varying quality from device to device, future storage media such as solid state drives, and parallelized application performance, a little light bulb suddenly went off in the back of my mind.
“What ever happened to Daticon?” I asked. I’ll admit it was something of a non sequitur but it occurred to me at random.
There was a pause. Marketing communications guy looked at CTO Steve Sicola, Sicola looked back at marketing communications guy. “Well, there was a press release last week…”
Last week I was dead to the world beyond Symantec, but it doesn’t appear this press release was exactly heavily broadcast, either: as of June 6, Daticon has been sold to Electronic Evidence Discovery Inc. (EED). According to Xiotech director of marketing communications Bruce Caswell, “the opportunity to buy [Seagate’s Advanced Storage Architecture (ASA) group] came to light about a year ago, and we had two opportunities to pursue: e-Discovery and storage. We had to decide what we really wanted to pursue.”
He added, “that’s why we announced some evidence management solutions with Daticon and then sort of went dark.”
Xiotech also went dark for about nine months before the Daticon acquisition. At the time, Mike Stolz, vice president of corporate marketing, said “adding this functionality gets us out of day-to-day combat with EMC and IBM…evidence management and data discovery evolve around the storage system but at a higher level.” That made it appear that Xiotech would transform from a general storage array vendor to an ediscovery specialist.
Now Xiotech appears to be putting all of its resources into the Emprise and its relationship with drive vendor Seagate, which owned Xiotech at one time and remains the sole drive supplier for the Emprise (it has to be for the drive diagnostic firmware to work). Generally, array vendors use more than one manufacturer to force better pricing and overcome manufacturing anomalies, which crop up from time to time for particular suppliers.
Sicola says Xiotech has a contract with Seagate made to keep raw material costs competitive, but otherwise Xiotech makes no apology for slightly more expensive components, which also include fans and power supplies engineered to use the same bearings as the disk drives, cutting down on vibration within each DataPac. Xiotech argues that spending more on better parts cuts down on failure rates, SCSI errors and services costs. “You can build a better mouse trap, but you need better parts,” he told me today.
As for manufacturing anomalies affecting whole batches of disk drives, “even when they reach epidemic proportions, they affect 10% of the product on the market,” Sicola said. “Problems with vibration, cooling and bad controller software make them worse–we want to fix that stuff by getting down to clean code.”
What do you think? Does that approach sound risky, or clever? Does ISE seem like another false start a la Daticon, or is it really the next big thing for Xiotech?
Better late than never. Backup software vendor Atempo has ventured into the email archiving market by coming out with the first full integration into its product line of intellectual property it acquired with Lighthouse Global Technologies in February.
Obviously, the release of an email archiving product is hardly earth-shattering. That market is headed for maturity very rapidly. Atempo knows this, which is why it actually released its email archiving software, called the Atempo Digital Archive for Messaging (ADAM), after releasing its file archiving software (ADA).
The first edition of ADAM will be integrated with ADA from the get-go. Atempo has also included features that not all of its predecessors have, such as message stubbing and support for Lotus email. But it’s the file archiving integration, according to Atempo’s VP of marketing Karim Toubba, “that shrinks a competitive landscape of more than 20 players down to just a few.”
In another attempt to differentiate ADAM, Atempo is using search from Exalead, rather than the more commonly used FAST or open-source search engines. This allows for automated retention according to message header info for e-Discovery.
“The downside for Atempo is that their brand is associated with Apple and the Mac,” said Enterprise Strategy Group analyst Brian Babineau, referring to Atempo’s TimeNavigator backup software. “On the positive side, they have a strong European and channel presence.”
The march of 2.5-inch SAS drives into networked storage took another step today when Dell launched its PowerVault MD1120 storage expansion enclosure. The enclosure is designed with the small form factor drives.
The MD1120 expansion enclosure isn’t a SAN, but a JBOD that connects to Dell’s PowerEdge server. And it’s not the first external storage system with 2.5-inch SAS drives; Infortrend has been shipping one since January. But Dell will obviously drive a lot more adoption than Infortrend, and Dell execs expect 2.5-inch SAS drives to co-exist with 3.5-inch drives in SANs before long.
“We see 3.5-inch drives being relevant for a long time in external storage, with 2.5-inch becoming a relevant complement in the next few years,” said Howard Shoobe, Dell’s senior manager of storage product management.
Small form factor drives allow for denser enclosures and reduce power consumption, but capacity is the main inhibitor for their inclusion in enterprise SANs. The new Dell enclosure holds 24 drives that are either 10,000 RPM 146GB or 15,000 RPM 73GB models. Shoobe expects the tipping point to come when 300 GB SAS 2.5-inch drives are shipping. Seagate has announced a 300 GB 2.5-inch drive that should begin shipping in systems later this year. Shoobe says Dell will incorporate them into the MD1120 when they’re available.
“The capacity we offer today will double, and that’s the trigger point,” he said.
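The “double” claim is easy to check against the numbers in this post. A quick Python sketch, assuming all 24 slots are filled with the same drive model and counting raw capacity only (no RAID or formatting overhead); the function name is mine.

```python
DRIVE_SLOTS = 24  # drives per MD1120 enclosure, per Dell

def raw_capacity_gb(drive_gb, slots=DRIVE_SLOTS):
    """Raw enclosure capacity with every slot holding the same drive model."""
    return drive_gb * slots

today_10k = raw_capacity_gb(146)   # 10,000 RPM 146GB drives
today_15k = raw_capacity_gb(73)    # 15,000 RPM 73GB drives
future = raw_capacity_gb(300)      # the announced 300 GB drives

print(today_10k, today_15k, future)           # 3504 1752 7200
print(f"growth factor: {future / today_10k:.2f}")
```

So a fully populated enclosure goes from about 3.5 TB raw to 7.2 TB raw, which is a shade more than double.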
Dell isn’t giving a forecast on when we might see 2.5-inch drives in its EqualLogic PS iSCSI SANs, and certainly not for the Clariion systems it co-markets with EMC. But even if it takes longer than expected to show up in enterprise SANs, Dell sees 2.5-inch SAS helping to give a new life to DAS because of the small form factor and coming bump from 3 Gbps to 6 Gbps.
“We have invested in DAS while the rest of the industry has been abandoning it,” said Praveen Asthana, global director of storage and networking for Dell. “DAS boxes are becoming more capable, especially with SAS. Why are we getting excited about a DAS announcement? It’s big business for us, and it’s growing.”
It never ceases to amaze me how easy it is sometimes to turn grown men into kids again. IT geeks gathered at the robotics competition at Symantec Vision actually giggled with delight at the mechanical violence.
The Geek Squads battle it out
Copan has enhanced its Revolution 300 Series to beef up its virtual tape library (VTL) capabilities and try to keep distance between the MAID pioneer and those who have followed with disk spin-down products.
The most interesting Revolution enhancements involve data deduplication that Copan added late last year for its VTL via an OEM deal with FalconStor. Now Revolution customers can set up a 40-drive cache landing zone, which supports more than 1,000 concurrent data streams. Up to 40 drives will run separate from the MAID pool, so those drives always spin and increase the ingestion rate while deduping. The cache will spin down with the rest of the drives after it finishes ingestion.
Copan also added a hot standby deduplication option that provides a spare dedupe engine that replaces a failed unit for high availability deduping.
Other enhancements include support for 1 TB SATA drives that bring maximum capacity to 896 TB in a single frame, data shredding to destroy tape data and tape caching to automate moving data from the VTL to physical tape.
Copan was the first to deliver MAID systems in 2004, but rivals Nexsan, Hitachi Data Systems, and EMC have since come out with their own spin-down drives. Still, Copan CTO and founder Chris Santilli says there is more to MAID than spin-down, and the new enhancements make Copan’s MAID more enterprise ready than the competition.
“MAID does not equal spin down,” Santilli said. “There’s more to it than saving power by spinning down drives. Enterprise MAID is a combination of density and reliability, and adding software services and features. We ingest as fast as we can, stage the data on MAID, and now we can do dedupe, replication, and encryption on the data.”
Analyst Mark Peters of Enterprise Strategy Group agrees the caching and other enhancements make MAID more valuable than merely spinning down disks.
“Caching shows they understand there are people who want to get a lot of data in their system fast,” Peters said. “If 25 percent of your disks are doing something else, that creates a problem. They created a side stream to the main river. This is a special section where you can keep the drives on.”