Oracle is getting into the archiving game with the Oracle Universal Online Archive, which will archive email as well as unstructured files. The product will use Oracle’s own database as the underlying infrastructure, with Oracle Fusion Middleware on top for data ingestion and user interface.
Despite the name, the product is on-site software. There will also be an email-only option, Oracle E-Mail Archive Service, which supports Exchange, Notes and SMTP mail. The products are expected to be available sometime this year. The Universal Archive goes for $20 per named user or $75,000 per CPU, while the Email Archive is priced at $50 per named user or $40,000 per CPU.
Not only am I not surprised to see Oracle get into the data archiving space, to be honest, I’m wondering what took them so long. And while writing the previous paragraph, I said “Ouch” a few times–when it was noted that Oracle can archive multiple content types in one repository, which most third-party archivers can’t do yet; when it was noted that Oracle can support not only Notes but SMTP on top of Exchange, which most third-party archivers can’t do yet; and again when I saw the steep pricing.
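For a rough sense of where the two pricing models cross over, the list prices quoted above can be turned into a break-even user count. This is only a back-of-the-envelope sketch: it takes the quoted figures at face value, assumes a single-CPU deployment unless told otherwise, and ignores support, hardware and any discounts, none of which are detailed here. The function and variable names are purely illustrative.

```python
def breakeven_users(per_cpu_price: float, per_user_price: float, cpus: int = 1) -> float:
    """Named-user count at which per-user licensing costs the same as per-CPU."""
    return (per_cpu_price * cpus) / per_user_price


# List prices as quoted in the post (US dollars).
UNIVERSAL_ARCHIVE = {"per_user": 20, "per_cpu": 75_000}
EMAIL_ARCHIVE = {"per_user": 50, "per_cpu": 40_000}

for name, price in (("Universal Archive", UNIVERSAL_ARCHIVE),
                    ("Email Archive", EMAIL_ARCHIVE)):
    users = breakeven_users(price["per_cpu"], price["per_user"])
    print(f"{name}: per-CPU pricing is cheaper above {users:,.0f} named users")
```

On those assumptions, the per-CPU option only pays off above 3,750 named users for the Universal Archive and above 800 for the email-only product.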
Be that as it may, it’s been well known that databases like SQL Server are the basis for most third-party archiving software today. It’s also been well known that customers are catching on to archiving for database data as well. Finally, it’s bleedin’ obvious that Exchange is the dominant email platform and the dominant focus in email archiving. And I’ve wondered for a long time why companies like Oracle and Microsoft didn’t get in on this, since they have what seems like a slam dunk: ownership of the application and core technology, and mighty brand power that could conceivably crush the third-party market.
Easy, there, killer, was the response from ESG analyst Brian Babineau, who studies the archiving space. He pointed out that database archiving systems have to understand both the underlying database structure and the overlaying application, something Oracle isn’t doing. They may have an 800-lb. gorilla brand, he said, “but they have a tougher fight because there are native database archiving and native enterprise application vendors.”
To me this still leaves open the question of why Microsoft doesn’t just add archiving to Exchange, but Babineau pointed out the folks from Redmond already dipped a toe into the archiving market with FrontBridge and didn’t get too far. But I still have trouble believing that the Exchange archiving market would last long if Microsoft were to make a stronger move, say by acquiring a company like Mimosa and making stubbing and archiving a part of the Exchange interface.
My previous post about the value-add of online backup got me thinking about another series of conversations I’ve had recently about data storage SaaS in general (more on the compliance and archiving side than in backup, per se).
One value prop I hadn’t really thought about was suggested to me today by Jim Till, CMO of a company called Xythos. Xythos began as a SaaS-architected content management product during the tech bubble, watched that bubble and the market for storage service providers burst, re-architected for on-premise deployment at midsized to large enterprises, and is just now coming full circle with a SaaS offering again. Till said that customers of Xythos’s online product tend to be small organizations or remote and branch offices of larger organizations.
But Till said the reason organizations cite for going to a service for storage has little to do with bandwidth or expertise. He says the uptake has been among organizations that are relatively small in manpower but work in “knowledge manager” industries such as tech consulting, law, or medicine. “They tend to be organizations where the biggest challenge is that standard methods of content storage aren’t accessible to distributed groups of people, and they need to uniformly apply policy against distributed content,” he said.
Any organization with data that’s widely distributed is unlikely to have a lot of data in one place. But it’s the distribution of that data, not its size or the experience of data management staff, that makes SaaS make sense, at least from Till’s point of view.
At least one recent case study I did on email archiving SaaS is consistent with this picture, too. For one of Fortiva’s email archiving SaaS customers, the Leukemia and Lymphoma Society, the problem wasn’t a 1.5 TB Exchange store, but 25,00 full and part-time employees receiving 12 million inbound messages a year at 103 different locations.
If this becomes a trend, the landscape of SaaS vendors might extend beyond traditional on-premise backup vendors to those who sell storage consolidation and accessibility over a wide area, such as Riverbed and Silver Peak.
Now, wouldn’t that be fun?
I was very happy to see one of my regular blog-stops, Anil Gupta’s Network Storage, pick up on a recent post I wrote–the one about HP’s new online storage services.
In his response post, Gupta picks up on this graf in particular:
Like most online storage offerings to date, this offering is small in scale and limited in its features when compared with on-premise products. Most analysts and vendors say online storage will be limited by bandwidth constraints and security concerns to the low end of the market, with most services on the market looking a lot like HP Upline.
Gupta responds: “there is nothing unique in most Online Backup Services that couldn’t be in traditional backup for laptop/desktop. At least traditional backup also come with peace of mind that all backups are stored on company’s own infrastructure. In last few years, I tried over a dozen online backup services in addition to putting up with traditional backup clients for laptop/desktop and I don’t see much difference among the two.
“IMO, most online backup services are just taking existing on-premise backup strategy for laptops/desktops and repackaging it to run backups to somebody else’s infrastructure instead of your own.”
I see what he’s saying, but in my opinion Gupta probably has “too much” experience with backup clients to necessarily see things from the SMB customer’s point of view. For him, installing a backup client isn’t a big deal–for some, it might be enough of a reason to let somebody else deal with it. Or at least, backup SaaS vendors are hoping so.
In case you’re like me and can’t get enough of the technical nitty-gritty on the new self-healing storage systems from Atrato and Xiotech, here are some tidbits from the cutting room floor, so to speak, that didn’t make it into the article I did this week comparing the two systems.
This in particular was a paragraph that could have been fleshed out into a whole separate piece: “Both vendors use various error correction codes to identify potential drive failures, and both said they can work around a bad drive head by storing data on the remaining good sectors of the drive.”
This is where I’m running into each vendor’s unwillingness to expose their IP, which is understandable, and so trying to get to the bottom of this may be a fruitless endeavor. But that’s never stopped me before, so here’s a few more steps down the rabbit hole for those who are interested.
Xiotech’s whitepapers and literature talk a lot about the ANSI T10 DIF (Data Integrity Field), which is part of how its system checks that virtual blocks are written to the right physical disk, and that physical blocks match up with virtual blocks. The standard, which is also used by Seagate, Oracle, LSI and Emulex in their data integrity initiative, adds 8 bytes of data integrity information to each 512-byte block. I asked Xiotech CTO and ISE mastermind Steve Sicola what kind of overhead that adds to the system, but the only answer I got was that it’s spread out over so many different disk drives working in parallel that it’s not noticeable.
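To make the integrity field a little more concrete, here is a minimal sketch of the general T10 DIF layout: an 8-byte tuple (a 2-byte guard CRC, a 2-byte application tag and a 4-byte reference tag) appended to each 512-byte block. It is not Xiotech’s code. The function names are invented for illustration, and it assumes the “Type 1” convention in which the reference tag carries the low 32 bits of the block address.

```python
import struct

T10_DIF_POLY = 0x8BB7  # generator polynomial of the CRC-16/T10-DIF guard tag


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16/T10-DIF: MSB-first, initial value 0, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def protect_block(data: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Append the 8-byte integrity tuple to a 512-byte block (520 bytes total)."""
    if len(data) != 512:
        raise ValueError("expected a 512-byte block")
    guard = crc16_t10dif(data)      # detects corrupted contents
    ref_tag = lba & 0xFFFFFFFF      # detects blocks written to the wrong address
    return data + struct.pack(">HHI", guard, app_tag, ref_tag)


def verify_block(sector: bytes, expected_lba: int) -> bool:
    """Check both the contents (guard) and the location (reference tag)."""
    if len(sector) != 520:
        return False
    data, tuple_bytes = sector[:512], sector[512:]
    guard, _app_tag, ref_tag = struct.unpack(">HHI", tuple_bytes)
    return guard == crc16_t10dif(data) and ref_tag == (expected_lba & 0xFFFFFFFF)
```

Eight extra bytes on every 512 works out to 8/512 = 1.5625% more raw capacity per block, before counting any cost of computing and checking the CRC.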
Then along comes Atrato, claiming to base its self-healing technology on a concept from satellite engineering called FDIR, for Fault Detection, Isolation and Recovery. The term was first coined, according to Wikipedia, in relation to the Extended Duration Orbiter in the 90’s.
An Atrato whitepaper reveals three standard codes used for the first step in that process–fault or failure detection. Among them are S.M.A.R.T., which, again according to Wikipedia, “tests all data and all sectors of a drive by using off-line data collection to confirm the drive’s health during periods of inactivity”; SCSI Enclosure Services (SES), which tests non-data characteristics including power and temperature; and the SCSI Request Sense command, which retrieves error details (sense data) from a drive.
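As a rough illustration of what the detection step can look like with nothing but standard S.M.A.R.T. data, the sketch below flags a drive when any of three commonly watched attributes is above zero. This is an assumption-laden toy, not Atrato’s or Xiotech’s logic: the zero threshold, the choice of attributes and the sample readings are all made up for the example, and real arrays layer vendor-specific checks on top.

```python
# SMART attribute IDs widely treated as early-warning signs of media trouble.
WATCHED_ATTRIBUTES = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def detect_faults(raw_values: dict, max_allowed: int = 0) -> list:
    """Return the names of watched attributes whose raw count exceeds max_allowed."""
    return [
        name
        for attr_id, name in sorted(WATCHED_ATTRIBUTES.items())
        if raw_values.get(attr_id, 0) > max_allowed
    ]


# Hypothetical readings for two drives (attribute ID -> raw value).
healthy = {5: 0, 197: 0, 198: 0}
suspect = {5: 12, 197: 3, 198: 0}

print(detect_faults(healthy))   # no attributes flagged
print(detect_faults(suspect))   # reallocated and pending sectors flagged
```

Detection is only the “FD” in FDIR; deciding which part of the drive to fence off and how to recover the data is where the proprietary work would come in.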
The thing about all of these methods is that they existed long before either the ISE or Atrato’s Velocity array. There are, of course, key differences between the way the systems are packaged, including the fact that Xiotech puts the controller right next to groups of between 20 and 40 disk drives, while Atrato manages 160 drives at once. But when it comes down to the actual self-healing aspects, the vendors are not disclosing anything about what new codes are being used to supplement those standards.
As Sicola put it to me, “What we’re doing is like S.M.A.R.T., but it goes way beyond that.” How far ‘way beyond that’ actually is, is proprietary. Which is kind of too bad, because it’s hard to tell how much of a hurdle there would be to more entrants in this market.
An analyst I was talking to about these new systems said some are talking about them as a desperation move for Xiotech, which has not exactly been burning down the market in recent years (it reinvented itself once already as an e-Discovery and compliance company after the acquisition of Daticon, which I haven’t heard much about lately).
Then again, others point out, Xiotech has Seagate’s backing (and can start from scratch with clear code on each disk drive, as well as use Seagate’s own drive testing software within the machine). Meanwhile, the ability to adequately market this technology has also been called into question with regards to Atrato.
But while it’s obviously going to take quite some time to assess the real viability of these particular products, it’s exciting for me as an industry observer to see vendors at least trying to do something fundamentally different with the way storage is managed. I think both of them share the same idea, that the individual disk drive is too small a unit to manage at the capacities today’s storage admins are dealing with.
Even if the products don’t perfectly live up to the claims of zero service events in a full three or five years, as an ISE beta tester I was speaking with put it, “anything that will make the SAN more reliable has benefits.” It’s pretty easy to get caught up in all the marketechture noise and miss that forest for the trees.
Even further reading: IBM’s Tony Pearson is less than enthused (but has links to lots of other blogs / writeups on this subject)
The inimitable Robin Harris summarizes his thoughts on ISE, and gets an interesting comment from John Spiers of LeftHand Networks (another storage competitor heard from!).
This blog is about three months in the making.
First, a bit of background. Several posts ago, I predicted the death of SATA in favor of SAS, which is only marginally more expensive (not talking the dirt-cheap integrated SATA controllers, but higher-end cache-carrying SATA RAID controllers) for an admittedly smaller capacity but much higher speed.
After using SAS on some of the servers and blades at work, I came home to my SATA-based desktop computer and wept silently whenever I did anything disk-intensive, because it was soooooo much slower. I have SCSI for the OS in all my server equipment, but even those machines weren’t as peppy as the SAS stuff at work. Taking these two things into account, plus the fact that the games I like to play are all disk I/O intensive, then throwing in a bit of friendly rivalry for good measure, I decided to upgrade my desktop machine to use SAS storage.
It’s fairly routine for EMC to certify a multitude of different products as interoperable with its own, based on customer requests. But a recent press release about official compatibility between EMC and a Linux-based mail server positioned as an alternative to Microsoft Exchange made me pay more attention than I usually do to such proclamations.
One thing especially sticks out from this arrangement: several EMC customers, with plenty of Microsoft integration available from EMC’s product line, have instead chosen to go with this alternative mail server. From a startup called PostPath, no less.
Moreover, Barry Ader, EMC’s senior director of product marketing, acknowledged that there are several customers who have asked for the integration. “There are a handful I’m aware of, but there may be more,” was as specific as he would get, but he added, “They tend to be important customers to drive this kind of application work for us.”
EMC’s “important” customers tend to be large. In my book, if more than one important EMC customer is catching on to a product, it might be worth paying attention to.
In and of itself, PostPath’s application is a little bit outside our realm in storage, but it’s the way that the mail server handles storage that chiefly sets it apart from Exchange. According to CEO Duncan Greatwood, PostPath uses a file system (NFS or XFS depending on how servers are attached to storage) rather than the JET database, which allows for more efficient indexing schemes and a more organized layout of data on disk. The JET database, which was never designed for the kinds of workloads enterprise Exchange servers are seeing today, has a deadly sequential-reads-with-random-writes issue slowing its storage I/O. PostPath also does a single write when a message is received, as opposed to Exchange, which writes blocks to multiple areas of storage based on different database fields with each message.
What all of this means is that attached to the right storage (ahem), PostPath allows email admins to offer virtually “bottomless” mailboxes to users.
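To illustrate the single-write, file-per-message idea in the abstract, here is a minimal sketch that stores each incoming message as one file, written in one pass and then moved into place. It is not PostPath’s code or on-disk format, which the company hasn’t detailed here. The `deliver` function, the `tmp`/`new` directory names (borrowed from the Maildir convention) and the `.eml` suffix are all assumptions made for the example.

```python
import os
import uuid
from pathlib import Path


def deliver(mailbox: Path, raw_message: bytes) -> Path:
    """Store one message as one file: a single sequential write, then a rename."""
    tmp_dir = mailbox / "tmp"
    new_dir = mailbox / "new"
    tmp_dir.mkdir(parents=True, exist_ok=True)
    new_dir.mkdir(parents=True, exist_ok=True)

    name = uuid.uuid4().hex + ".eml"   # unique per message, so no locking needed
    tmp_path = tmp_dir / name
    with open(tmp_path, "wb") as handle:
        handle.write(raw_message)      # the whole message goes down in one write
        handle.flush()
        os.fsync(handle.fileno())      # make sure it is on disk before publishing

    final_path = new_dir / name
    os.replace(tmp_path, final_path)   # atomic on the same file system
    return final_path
```

The contrast with a database-backed store is that nothing here scatters pieces of the message across separate structures on arrival; any indexing can be layered on afterward.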
Still, Greatwood acknowledges that he has an uphill battle on his hands. “Most of the Linux-based mail server alternatives to Exchange have not gone very far,” he said. But he maintains a key difference with PostPath is that the product speaks the same language as familiar Microsoft peripherals such as Outlook and Active Directory, so end users don’t have to stop using the tools they’re comfortable with. He also says that with all of Microsoft’s recent antitrust woes, especially in Europe, they’re not keen on crushing upstart competitors lately.
I know that storage managers (to say nothing of admins who have managed Exchange) have been looking for a better mousetrap for quite some time. And cozying up to EMC customers can’t be hurting PostPath’s cause.
HP has taken the wraps off a new online storage service for consumers and small offices, called HP Upline. The service has three levels: Home and Home Office, Family Account and Professional Account. Home accounts include one license, unlimited storage, online backup and basic support for $59 per year; a family account adds 3 licenses and a management dashboard for $149 per year; and a professional account gets 3 licenses, expandable to 100, as well as priority support.
The product is limited to PCs and doesn’t include some of the more advanced features being offered by online storage services such as file versioning. However, it does offer users the ability to tag content for later search and share, and to publish files online using the service through a feature called the Upline Library.
Like most online storage offerings to date, this offering is small in scale and limited in its features when compared with on-premise products. Most analysts and vendors say online storage will be limited by bandwidth constraints and security concerns to the low end of the market, with most services on the market looking a lot like HP Upline. Symantec has focused its backup software as a service (SaaS) within its Windows-centric Backup Exec product, traditionally sold into smaller shops; EMC’s Mozy Enterprise service, despite the name, is at this point recommended only for workstation-level backup. However, a “hybrid” approach for larger shops is now being proposed by EMC.
Wading into bickering between vendors is always fun. My most recent go-round with this has been the AutoCAD compatibility debate between Silver Peak and Riverbed. It began with the difficulties Riverbed users were seeing with optimizing AutoCAD 2007 and 2008 files, and progressed into a weeklong followup process culminating in a conference call between me, Riverbed VP of marketing Alan Saldich, Riverbed chief scientist Mark Day, Silver Peak director of product marketing Jeff Aaron, and Silver Peak CTO David Hughes, which led to this story.
Don’t think this drama’s over yet, either. While on that rather unusual conference call they seemed to reach a consensus that further testing is necessary on both products, neither company has stopped sending little hints my way since that the other guy’s full of it. Meanwhile, another contact I spoke with for the followup story wrote me late last week to suggest they’re both perhaps piling it higher and deeper.
“After reading the back and forth between Silver Peak and Riverbed, and finding neither firms’ claims especially credible, we’ve put forth a public offer to test in a controlled environment,” wrote James Wedding, an Autodesk consultant who blogs at Civil3D.com. “Shockingly, neither company has responded or replied. We have visitors logged from both firms, so they are reading, but no takers. Color me shocked that neither firm wants independent testing on this problem that will continue for a minimum of another year as Autodesk decides to make a change to accommodate the WAN accelerator market.” The Taneja Group has also offered to carry out testing, also with no discernable response from the vendors.
We’re ready when you are, guys.
After I covered the launch of Atrato’s self-maintaining array of identical disks (SAID) product, there were some unanswered questions, which I blogged about last week. Shortly after that, I had a followup call with Atrato’s chief scientist Sam Siewart and executive vice president of marketing Steve Visconti to tie up at least some of the loose ends.
There were inconsistent reports on the size of disk drives used by the Atrato system; the officials confirmed they are using 2.5-inch SATA drives.
More detail was desired on exactly what disk errors the system can fix and how. That’s proprietary, Siewart said, which I’d anticipated, but he gave one example – the system can do region-level remapping on a drive with nonrecoverable sector areas. The system also runs continual diagnostic routines in the background and can fix such problems proactively, meaning it can find such an error and force the controller to remap the sector before a user’s I/O comes in.
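Since the details are proprietary, the following is only a toy model of the one example Siewart gave: a background pass probes each region of a drive and redirects unreadable ones to spare regions before any user I/O lands on them. Everything in it (the `RemappedDrive` class, the region and spare counts, the `is_readable` probe) is hypothetical, and it leaves out the hard part, which is rebuilding the data that lived in the bad region.

```python
class RemappedDrive:
    """Toy model of region-level remapping on a single drive."""

    def __init__(self, regions: int, spares: int):
        self.regions = regions
        self.remap = {}  # logical region -> spare physical region
        self.free_spares = list(range(regions, regions + spares))

    def physical(self, logical: int) -> int:
        """Where a logical region currently lives on the platters."""
        return self.remap.get(logical, logical)

    def scrub(self, is_readable) -> list:
        """Background pass: probe every region and remap the ones that fail."""
        remapped = []
        for logical in range(self.regions):
            if not is_readable(self.physical(logical)):
                if not self.free_spares:
                    raise RuntimeError("out of spare regions")
                self.remap[logical] = self.free_spares.pop(0)
                remapped.append(logical)
        return remapped


# Hypothetical drive: 8 regions, 2 spares, physical region 3 has gone bad.
bad_regions = {3}
drive = RemappedDrive(regions=8, spares=2)
print(drive.scrub(lambda region: region not in bad_regions))
print(drive.physical(3))
```

Because the scrub runs ahead of user requests, a later read of region 3 goes straight to its spare instead of hitting the unreadable area.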
I asked them if their IOPS number (ranging anywhere from 10,000 to 20,000 IOPS depending on which news source or press release you were reading) was achieved through testing or estimated based on a 512KB block size. To which they replied, “definitely tested.” The average result of those tests was about 11,000, though their mysterious government customer reported 20,000 with a tuned application. “What we need to do is have discipline in talking about approximately 11,000 and then describing how the results may vary,” Visconti said of the inconsistent numbers that appeared in the first round of Atrato coverage. The bandwidth is about 1.2 GBps.
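A quick sanity check shows why the block size matters when comparing these figures. Throughput is just IOPS multiplied by the size of each I/O, so the sketch below works out what the quoted numbers would imply in each direction. It assumes decimal units (1 GB = 1,000,000 KB) and treats the 11,000 IOPS and 1.2 GBps figures as if they came from the same workload, which the vendor did not confirm; the function names are illustrative.

```python
KB_PER_GB = 1_000_000  # decimal units, an assumption for this back-of-envelope check


def bandwidth_gb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput implied by a given I/O rate and I/O size."""
    return iops * io_size_kb / KB_PER_GB


def implied_io_size_kb(bandwidth_gb: float, iops: float) -> float:
    """Average I/O size implied by a given throughput and I/O rate."""
    return bandwidth_gb * KB_PER_GB / iops


# 11,000 IOPS at a 512 KB block size would mean far more than 1.2 GBps...
print(f"{bandwidth_gb_per_s(11_000, 512):.3f} GB/s")
# ...while 1.2 GBps at 11,000 IOPS implies a much smaller average I/O.
print(f"{implied_io_size_kb(1.2, 11_000):.1f} KB per I/O")
```

At 512 KB per I/O, 11,000 IOPS would be about 5.6 GB/s, while 1.2 GBps spread over 11,000 IOPS works out to roughly 109 KB per I/O. The likeliest reading is that the IOPS and bandwidth numbers come from different test workloads, which is one more reason the results “may vary.”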
Part of the problem when it comes to talking about this system is that so many of the parameters are variable, including price. “Pricing ranges from the low hundreds [of thousands of dollars] to upwards of 200K depending on the configuration,” Visconti said. So in a way, all of us who reported anything in that range were right. Performance is the chief configuration factor that influences price–a system that’s populated with 160 slower, higher-capacity drives will be more at the $100,000 end of the range. “Most customers are opting for higher-speed 7200 RPM SATA drives,” Siewart said. Added Visconti, “We shouldn’t be quoting a precise number.”
Clarification on the 5U/3U question – 3U is the size of the storage array, but doesn’t include the size of its controller, which might be either 2U or 3U depending on whether or not it’s an x3650 from IBM (2U) or a souped-up one from SGI (3U). Atrato’s developing its own controller to be announced later this year.
The array attaches to servers using a modular front-end that today is shipping with a 4 Gbps FC interface that can also accommodate NFS. “We’re close on 8-gig Fibre Channel,” Siewart said, and working on iSCSI, 10 GbE and InfiniBand as well. Distance replication and mirroring also remain on the roadmap.
Meanwhile, it seems Atrato is looking for marketing help.
If you’ve been following the data archiving and compliance markets, you’ve probably heard the consensus that the real boom in software as a service (SaaS) will come from small to midsized businesses (SMBs). That’s the prevailing wisdom among analysts, anyway, as far as I’ve heard.
But EMC revealed today at a Writers Summit in Boston that it intends to push its Fortress-based SaaS offering into the high-end space with a hybrid approach to on-site and off-site archiving.
The event today was unusual, at least compared with the rest of my experience in the industry. There were no end users or high-profile industry analyst firms represented and hardly any trade press, either. Most of the attendees were technology writers from new media such as blogs and wikis. EMC executives explained that they wanted the summit to be an interactive discussion around industry trends (read: free help for their marketing research?).
It was an odd situation for me, since I’m used to listening and asking questions at industry events, rather than offering opinions. Along the way, though, the EMC execs dropped a few nuggets about their plans. Convergence was a pervasive theme–and the SaaS plans fit into it. EMC predicted a convergence not only between traditional technologies and new mobile technologies (that’s why they bought a stealth startup with no product on the market yet, in Pi) but between on-site and off-site data repositories.
The new aim of the Content Management and Archiving unit at EMC is to use Documentum to unify pieces of its archiving portfolio (CMA president Mark Lewis says EmaileXtender will be integrated into Documentum by mid-year), and also to unify those repositories. Lewis and Documentum founder Howard Shao, now EMC senior VP of CMA, said in their view there are four factors influencing this approach: enterprise content management and archiving place significant demands on outsourced infrastructures, especially when it comes to network bandwidth; companies are wary of letting sensitive, regulated data outside their firewalls; any application you’d want to deliver through SaaS is inextricable from applications that remain on-site; and the volume and value of archival storage dictate a tiered approach.
This sparked some debate among some of the pundits at the meeting. Carl Frappaolo, VP of market intelligence for enterprise content management (ECM) industry association AIIM, pointed out that the biggest reason companies resist deploying ECM is complexity. “Aren’t you just adding complexity to the equation?” he asked. Shao countered that a complex problem or a complex back-end doesn’t mean that management can’t be simple.
Kahn Consulting Inc.’s Barclay Blair piped up in support of Shao’s view that users will be wary of letting certain data outside their firewalls, but said “our clients would be attracted to a model that keeps the information on-site, but has the applications which manage the information being managed for them by someone else.”
Countered Frappaolo, “If EMC is doing its job right, shouldn’t users be willing to trust data to them? The whole idea is that you’re supposed to be better at security than me, and I should trust you to keep from exposing private data both inside and outside the data center.”
At any rate, the upshot according to Lewis will be a rollout of this hybrid ECM SaaS model by the end of this year. Another thing I got out of this discussion, with all its focus on security and privacy within a multitenant repository, is a clearer reason why EMC spent all that money on RSA.