I had a brief conversation last week with Ed Chapman, Riverbed’s VP of cloud storage acceleration products, hired away from Cisco in May. Chapman (and senior vice president of marketing and business development Eric Wolford, who chimed in frequently) stopped short of divulging much detail on the planned Cloud Storage Accelerator product, but did offer some new information about its origins…
What are the goals for Riverbed’s cloud business this year?
Wolford: We haven’t launched yet, but we told our user group about the Cloud Storage Accelerator that Ed is going to head up getting to market. It’s a bit of a phoenix from the ashes of Atlas in part, and Steelhead in part, and some new development, in part. I don’t want you to think it’s the same product [as Atlas] – we took key components from that product and put them into the Cloud Storage Accelerator, to sit in the data center and accelerate access to AT&T Synaptic Storage, Amazon S3, etc.
While Riverbed’s working on its cloud product, other vendors like Nasuni, TwinStrata and StorSimple are already out in the market selling cloud storage gateway appliances to interface securely – and in some cases with deduplication features – with the cloud. How does Riverbed plan to differentiate its cloud product against those offerings?
Chapman: We are not going after the same market segments they’re going after. We’re going after a focused market segment that we think is more applicable in the marketplace. Just to give you a viewpoint of what customers are looking for in general, if you look at the application of new storage technologies in the market, it sort of follows a hierarchy…[users say] ‘maybe I’ll use this for backup’… then archive…then they look at all the rest, including primary storage. What we’ve heard from our customers that we’ve spoken to, they want to utilize cloud storage infrastructure in the same sort of mechanism, backup and archive and then going down the rest of the hierarchy from a storage perspective. Our goal looking at the marketplace is to leverage things our customers will want to leverage and utilize first, along the parameters of backup and archive rather than primary storage filer replacement.
Doesn’t Riverbed already allow replication to cloud services for DR? How is this different from that?
Wolford: Well…we’re just going to have to wait.
Similarly, EMC and other vendors are working on systems like VPlex and Atmos, which they claim can replicate data at scale among data centers, with little mention of WAN optimization technology as a necessary component of the infrastructure. Do those products represent a threat to Riverbed’s market? What would your role be in that environment?
Wolford: We’ll help them. The parallel I would make is that when cloud first came out, nobody mentioned WAN optimization. Nobody mentioned, ‘we have this great product but unfortunately, it has this problem.’ Any time users have distance between them and their data they have a problem. There’s been a trend toward consolidation in remote offices and data centers – the cloud is a variant of that reality. The more that reality occurs, we are just lovin’ it because it spotlights the performance problem.
Chapman: EMC has been selling SRDF and SRDF/A and never said WAN optimization is needed, but we just won EMC Select Channel Partner of the Year because we could be used as the primary WAN optimization tool with SRDF/A. So my point there is while EMC talks and launches VPlex and talks about distributed caching – and I think it’s a fabulous technology – that doesn’t mean we’re not going to be able to add a lot of value to it the way we have with SRDF, SRDF/A and other technology used for replication.
Emulex today acquired its 10-Gigabit Ethernet silicon partner ServerEngines for $78 million in cash and eight million shares of Emulex stock to be issued at closing, which could bring the price to around $160 million.
The deal is expected to close next month. The Emulex shares would be worth $81 million using Friday’s closing price of $10.11.
Emulex has sold ServerEngines silicon in its OneConnect Universal Converged Network Adapters (UCNAs) for the past two years as part of an OEM and joint development deal between the two. ServerEngines contributed the Ethernet and iSCSI portions of the ASICs for the converged adapters.
Emulex CEO Jim McCluney says ServerEngines gave Emulex a fast path to combining Ethernet with its own Fibre Channel stack, and now that the UCNAs are coming to market it makes sense to bring the technology in-house. That gives Emulex more control of the technology.
“We saw this as a new game where the old rules didn’t apply,” McCluney said of converged networking. “Instead of repurposing our Fibre Channel ASICs, we wanted to do something unique. We went out and found best of breed 10-gigabit ASIC to combine with our own Fibre Channel stack. We felt the time was right to take things to the next level.”
ServerEngines has two main product families: the BladeEngine 10-GigE ASICs that Emulex uses in OneConnect, and a Pilot family of server management controllers sold by Cisco, Hewlett-Packard, NEC, and Unisys. McCluney said the Pilot products will bring Emulex around $4 million in revenues each quarter.
ServerEngines has about 170 employees, mostly engineers based in Sunnyvale, CA, Austin, TX, and Hyderabad, India.
The reorganization of the Storage and Availability Management group within Symantec was the source of widespread speculation in April that the vendor was preparing to divest entirely from the storage business. Layoffs in the first quarter affected engineers working on Veritas file system, Backup Exec System Recovery (BESR), Enterprise Vault and Storage Foundation. The speculation prompted the vendor to post a statement on its website reiterating its commitment to its storage products.
“A lot of the rumors were way way wild in my view, a lot of the numbers were way off,” Anil Chakravarthy, the storage and availability management group’s new senior vice president, told me this week. He would not disclose the actual numbers, but said there are 920 people in the business unit worldwide now, including 860 engineers.
He confirmed earlier reports that much of the engineering work in this product group has been moved to Symantec’s development centers in China and India, though development continues at Mountain View, Calif. Chakravarthy previously headed up Symantec’s global services division, he said, but prior to that he ran Symantec’s Indian engineering business.
The reorganization came about, at least in part, after the Unix server market “took a sharp dive last year, and we took a dive along with it,” Chakravarthy said. Unix represented the biggest market for Symantec’s storage management and file system products.
For more on Symantec’s specific plans for its Storage Foundation and Veritas Cluster Server (VCS) products, see our story on SearchStorage.com.
According to a press release circulated this morning by Storage Newsletter, storage software startup Seanodes has sold off its assets to a French VMware distributor called Deletec after filing for bankruptcy last December.
The report says Deletec will continue to sell Seanodes’ Exanodes software, which turns commodity servers into iSCSI SANs.
Calls to phone numbers listed on the Seanodes website for its offices in Cambridge, MA and France returned a busy signal and “please check the number and dial again” message, respectively. Meanwhile, the most recent press release on the Seanodes website is from last September, and one source close to the company speaking on condition of anonymity has confirmed the company went out of business after failing to get more funding last fall.
A receptionist answering the phone at Deletec’s Paris office said no one at the company would be available for comment until next week.
We’ve also heard reports of funding troubles for cloud storage vendor Parascale. Another source who asked not to go on record says Parascale failed to get Series B funding and is being reorganized. The company quietly changed CEOs last month, bringing in Ken Fehrnstrom to replace Sajai Krishnan. Nobody at Parascale was available for official comment on this Friday before Memorial Day, either.
Yesterday I met with execs from a company called Gluster, which is developing an open-source, software-only, scale out NAS system for unstructured data. As we discussed their market, products and competitors, we got into the nitty gritty of their technical differentiation as well – pasted below is an extended Q&A with CTO and co-founder Anand Babu Periasamy about Gluster’s way of handling metadata, most often a bugaboo when it comes to file system scalability.
Beth Pariseau: So as I’m sure you’re aware, there are many scale-out file system products out there addressing unstructured data growth. What’s Gluster’s differentiation in this increasingly crowded market?
ABP: What we figured out was that centralized and distributed metadata both have their own problems, so you have to get rid of them both. That’s the most important advantage when it comes to the Gluster file system. The reason why we got to a production-ready stage very quickly – we wrote the file system in six months and took it to production, because a customer had already paid for it, and they had a desperate need to scale with seismic data that was very critical, and they could no longer reason with that data because it was all sitting on tapes. I looked around, there was no file system around – the file systems they had used before were for scratch data, they had found a scalability advantage [to scale-out], but the problem was metadata.
The problem with the metadata is if you have centralized metadata it becomes a choke point, and distributed gets extremely complicated, and the problem with both is if your metadata server is lost, your data is gone, it’s finished…became very clear we had to get rid of the metadata server. The moment you separate data and metadata you are introducing cache coherency issues that are incredibly complicated to solve. By eliminating the need to separate data and metadata we made the file system resilient. On the underlying disks, you can format them with any standard file system – we don’t need any of its features. We just want the disk to be accessible by a standard interface, so even tomorrow if you don’t like the Gluster file system or there is a serious crash in your data center, you can just pull the drives out, put them in a different machine and have your data with you – you’re not tied to Gluster at all. Because we didn’t have any metadata the data can be kept, as files and folders, the way users copied it onto the global namespace.
Within the file system the scalability problem became seamless because we didn’t have to put a lock on metadata and slow down the whole thing, we can pretty easily scale because every machine in the system is self-contained and intelligent, equally, as all other machines. So if you want more capacity, more performance, you just throw more machines at it, and the file system pretty much linearly scales, because there’s nothing centrally holding the scalability.
BP: So it’s an aggregation of multiple file systems rather than one coherent file system that has to maintain consistency?
ABP: No. The disk file system is just a matter of formatting the drives. The Gluster file system is a complete storage operating system stack. We did not rely on the underlying operating system at all, because we figured out very quickly [things like] memory manager, volume manager, software RAID, we even already support RDMA over 10 Gigabit Ethernet or InfiniBand, you pretty much have the entire storage operating system stack that’s a lot more scalable than a Unix or Linux kernel. We treat the underlying kernel more like a hypervisor, or a microkernel and don’t rely on any of its features. By pushing everything to user space we were able to very quickly innovate new complicated things that were not possible before and pretty much scale very nicely across multiple machines.
Gluster VP of marketing Jack O’Brien: The three big architectural decisions we made early on…one is that we were in user space rather than kernel space and the second is that rather than having a centralized or distributed metadata store, there’s this concept called elastic caching where essentially you algorithmically determine where the data lies and the metadata is with the data rather than being separated. And the third is open source.
BP: Did you see the EMC announcement about VPlex or are you familiar with YottaYotta and what they did with cache coherency, having a pointer rather than having to make sure all data is replicated across all nodes? Is it similar to that?
ABP: What it sounds like they’re describing is basically asynchronous replication with locking, that’s how they bring you the cache coherency issue. But what I explained was, the file system is completely self-contained and distributed so we don’t have to handle the cache coherency issue. The cache coherency issue comes when you separate the data and metadata so when you’re modifying a file you have to hold the lock until a change appears…because we don’t have to hold metadata separately, we don’t have to hold the lock in the first place because we don’t have the cache coherency issue.
JO: Another way to think of it is, every node in the cluster runs the same algorithm to identify where a file’s located and every file has its own unique filename. The hash translates that into a unique ID—
BP: Oh, so it’s object based.
ABP: It is a hashing algorithm inside the file system, but for the end user it’s still files and folders.
BP: But this is how Panasas is, too, right? Underneath that file system interface to the user they have an object-based system with unique IDs.
ABP: But those IDs are stored in a distributed metadata server. We don’t have to do that.
JO: Our ID is part of the extended attributes of the file itself.
ABP: The back end file name is already unique enough, you don’t really need to store it in a separate index in a separate metadata server, we figured out we can come up with a smarter approach to do this. The reason [competitors] all had complications is because they parallelized it at the block layer, basically they took a disk file system and parallelized it, it’s a very complicated problem…you should parallelize at a much higher layer, at the VFS layer, and have a much simpler, more elegant approach.
JO: So a node doesn’t have to look up something centrally and it doesn’t have to ask anybody else in the cluster. It knows where the file’s located algorithmically.
BP: I think that’s what’s giving me the ice-cream headache here. So each node has a database within it? The thing I’m sticking on is ‘algorithmically knows where to look for it.’
ABP: At the highest level …given a file name and path that’s already unique, if you hash it it comes out to a number. If you have 10 machines, the number has to be between 1 and 10. No matter how many times from wherever you calculate it, you get the same number. So if the number is, for example seven, then the file has to be on the seventh node on the seventh particular data tree. The problem in hashing is when you add the 11th and 12th node you have to rehash everything. Hashing is a static logic, as you copy more and more data you can easily get hot spots and you can’t solve that problem. The others parallelize at the block layer and put the blocks across. Because we solve the problem at the file level, if you want to find a file…internally what happens is the operating system sends a lookup call…to verify whether the home directory exists and [the user] has the necessary permissions…and then it sends an open call on the file. Internally what happens is by the time the directory calls come, the call on the directory…has all the information about the file properties. We also send information about a bit map.
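To make the plain-hashing problem concrete, here is a minimal Python sketch of the naive scheme described above: hash the file path, reduce it to a node number, and count how many files land on a different node when a 10-machine cluster grows to 12. It is an illustration only, not Gluster’s code; the function names, the sample paths and the choice of MD5 as the hash are all assumptions made for the example.

```python
import hashlib


def node_for(path: str, node_count: int) -> int:
    """Map a file path to a node number in 1..node_count (illustrative only)."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % node_count + 1


def moved_fraction(paths, old_count: int, new_count: int) -> float:
    """Fraction of paths whose node changes when the cluster is resized."""
    moved = sum(
        1 for p in paths if node_for(p, old_count) != node_for(p, new_count)
    )
    return moved / len(paths)


# Hypothetical sample paths, used only to exercise the functions.
paths = [f"/home/user/file{i}.dat" for i in range(10_000)]

# Any node computes the same answer for the same path, with no lookup.
print(node_for(paths[0], 10))

# Growing from 10 to 12 nodes relocates most files under plain modulo hashing.
print(moved_fraction(paths, 10, 12))
```

Every caller gets the same node for the same path, which is the “no lookup” property. The second number shows the cost: with a simple modulo mapping, most files change nodes on a resize, which is the rehashing problem Periasamy raises.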
Instead of taking a simple plain hash logic which cannot scale…you don’t have to physically think that you only have 10 machines. You can think logically, mathematically, you can think you have a thousand machines, there is nothing stopping you from doing that, it’s the idea of a virtual storage solution. It’s like with virtual machines, you may have only 10 machines but you think you have a thousand virtual machines, so we mathematically think we have a thousand disks. It can be any bigger number, and the actual number is really big. Then we present each logical disk as a bit, so the entire information is basically just a bit array, and the bit array is stored as a standard attribute on the data tree itself. By the time the OS or application tries to open the file, a stat call comes and the stat call already has this bitmap, and the hash logic will index into a virtual disk which really doesn’t exist, it could be some 33,000th cluster disk. And whichever directory wants that bit, you know that the file is in that machine, and don’t need to ask the metadata server, ‘tell me where my block is, hold the lock on the metadata because I need to change this bit.’
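The virtual-disk idea can be sketched the same way. In the hypothetical Python below, paths hash into a fixed, large number of virtual disks, and each physical node holds a bit array marking which virtual disks it owns. This is a rough reading of the description above, not Gluster’s implementation: the count of 1,024 virtual disks, the round-robin starting layout, the rebalancing rule and all function names are assumptions.

```python
import hashlib

VIRTUAL_DISKS = 1024  # assumed count; the interview only says "really big"


def virtual_disk(path: str) -> int:
    """Hash a path to a virtual disk index; independent of cluster size."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % VIRTUAL_DISKS


def initial_bitmaps(node_count: int):
    """One bit array per node; virtual disks are dealt out round-robin."""
    bitmaps = [[False] * VIRTUAL_DISKS for _ in range(node_count)]
    for v in range(VIRTUAL_DISKS):
        bitmaps[v % node_count][v] = True
    return bitmaps


def add_node(bitmaps):
    """Return new bitmaps where a new node takes a fair share of bits."""
    new = [row[:] for row in bitmaps] + [[False] * VIRTUAL_DISKS]
    target = VIRTUAL_DISKS // len(new)
    for _ in range(target):
        # Take one virtual disk from whichever existing node owns the most.
        donor = max(range(len(new) - 1), key=lambda n: sum(new[n]))
        v = new[donor].index(True)
        new[donor][v] = False
        new[-1][v] = True
    return new


def locate(path: str, bitmaps) -> int:
    """Return the 1-based node whose bitmap has the path's bit set."""
    v = virtual_disk(path)
    for node, bits in enumerate(bitmaps, start=1):
        if bits[v]:
            return node
    raise LookupError(f"no node owns virtual disk {v}")


ten = initial_bitmaps(10)
eleven = add_node(ten)
sample = [f"/data/seismic/run{i}.seg" for i in range(5_000)]  # made-up paths
moved = sum(locate(p, ten) != locate(p, eleven) for p in sample) / len(sample)
print(moved)
```

Because the hash targets the fixed virtual-disk space rather than the machine count, adding an eleventh node only moves the files whose bits were handed over, a small share of the total instead of nearly everything.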
BP: But then if two people want to write at the same file at the same time…
ABP: We have a distributed locking mechanism. Because the knowledge of files is there across the stack, we only had to write a locking module that knows how to handle one file.
Hitachi Data Systems’ latest earnings results show a modest year-over-year increase as the recession fades. They also show an interesting shift in HDS sales towards services and software.
Remember when HDS was known as a high-end storage array vendor with little software or services? That’s no longer the case. HDS’ $882 million in revenue last quarter increased 6% over the previous year, despite a “single digit” decline in revenue from its USP enterprise storage platform. The USP platform still makes up most of HDS revenue, but services accounted for 30% and software 15% of its revenue last quarter.
HDS VP of corporate marketing Asim Zaheer says services and software both increased in double digits over last year, as did the HDS Adaptable Modular Storage platform. File and content (archiving) storage grew 200%, thanks to a new midrange NAS system that HDS gets from its OEM relationship with BlueArc.
USP sales may be impacted by customers waiting for a widely anticipated product refresh, although HDS execs won’t confirm any upgrades are coming. They say the change in product mix reflects a shift in buying patterns away from traditional high-end enterprise arrays towards modular SAN and file-based storage, as well as tiering enabled by virtualization.
Claus Mikkelsen, CTO of storage architectures for HDS, says customers are combining the USP virtualization capability with lower-cost disk and using features such as dynamic provisioning to save money through tiering and prolong the life of their storage arrays.
“We view the USP as a virtualization engine, and not a storage array per se,” he said. “There is clearly a blurring of lines in terms of tiering storage. We’re starting to see this new tier one-and-a-half that seems to be emerging, bringing high-end feature sets to other use cases that traditionally have not been considered high end.”
Zaheer says the lower priced NAS system based on BlueArc’s Mercury platform “revitalized our NAS portfolio” by making it more attractive to mainstream shops. The HDS execs say their midrange AMS storage platform has grown in sales each quarter since it was introduced in late 2008. Mikkelsen says a lot of that storage is being used behind the USP virtualization controller.
“We used to talk about high-end customers and midrange customers, but I think that was the wrong way of looking at it,” he said. “It’s more a case of customers that have different needs. Now we have more native software support in the midrange with features such as replication, copy on write, and dynamic provisioning.”
Mikkelsen also said customers are looking at storage costs differently now. “It’s no longer about dollars per gigabyte, that went out about 20 years ago,” he said. “Now you factor in storage, maintenance, power and cooling, and the burden rate for employees.”
Mikkelsen says Oracle’s decision to end the Sun OEM deal for the USP platform won’t hurt sales.
“If customers used to buying Hitachi storage from Sun can’t do that anymore, they’ll buy it from HDS,” he said.
The Justice Dept. today said EMC paid $87.5 million to settle a lawsuit that charged the vendor with false pricing claims and taking part in a kickback scheme with consulting firms who do business with government agencies.
The Justice Dept. claims EMC committed fraud by inducing the General Services Administration (GSA) to enter a contract with prices that were higher than they should have been. The GSA purchases products for the federal government. The Justice Dept. said EMC claimed during contract negotiations that for each government order under the contract, the vendor would conduct a price comparison to ensure that the government received the lowest price provided to any of its commercial customers – claims EMC could not live up to because it could not make such price comparisons.
Under the kickback scheme detailed in the Justice Dept. press release, EMC paid consulting companies fees whenever the consultants recommended that a government agency buy EMC products. EMC is not alone here – the DOJ said it has settled with three other technology companies and other investigations are pending. It did not name the other vendors.
“Misrepresentations during contract negotiations and the payment of kickbacks or illegal inducements undermine the integrity of the government procurement process,” Tony West, assistant Attorney General for the Civil Division of the Department of Justice, said in the Justice Dept. release. “The Justice Department is acting to ensure that government purchasers of commercial products can be assured that they are getting the prices they are entitled to.”
EMC denied any wrongdoing when the charges were first made public in March of 2009, and an EMC spokesman today emailed a statement to StorageSoup saying the vendor “has always denied these allegations and will continue to deny any liability arising from the allegations made in this case. We’re pleased that the expense, distraction and uncertainty of continued litigation are behind us.”
The EMC spokesman said some of the charges are almost 10 years old.
Saying it’s looking to appeal to larger shops with its online data backup service, Iron Mountain Digital released version 7.0 of its LiveVault SaaS product today with new support for multithreaded applications and larger data sets.
Previously, LiveVault’s “sweet spot” was protecting servers up to 1 TB, according to Jackie Su, senior product marketing manager for Iron Mountain Digital. The new version will protect up to 7 TB thanks to beefier processors and memory in the LiveVault TurboRestore on-site appliance, and the Data Shuttle option becoming a built-in feature. Previously, if users wanted to transport large data sets on portable hard drives, it was done only on request in special circumstances. The new TurboRestore appliance can now hold up to 24 TB of disk, and has a 64-bit memory cache.
Iron Mountain claims it’s seen growing adoption of cloud data protection among midsized enterprises in its LiveVault customer base, and cites this shift as the reason for the scalability updates in this release. The company did not provide a specific number of midsized customers, the percentage of growth in those customers compared with last year, or average deal size, though chief marketing officer TM Ravi said deal sizes are growing, which “indicates we’re covering larger and larger environments.”
Online data backup so far has been among the most popular uses of cloud data storage, particularly among enterprise users, but according to Storage Magazine’s most recent storage purchasing survey, “it’s still more hype than happening”.
Hewlett-Packard Co. added another scale-out NAS system to its portfolio yesterday when it announced DataDirect Networks (DDN)’s S2A9900 disk array will be bundled with the Lustre File System resold by the Scalable Computing and Infrastructure (SCI) group within HP.
HP began collecting scale-out file systems when it acquired PolyServe in 2007, then saw some false starts with its ExDS9100 product for Web 2.0 and HPC use cases. HP continued its track record of acquiring its partners in the space with the acquisition of Ibrix last July. Yet HP still found a gap in its scale-out file system portfolio for DataDirect and Lustre with this agreement, according to Ed Turkel, manager of business development for SCI.
“Basically, both the X9000 [based on Ibrix] and [the new offering with] DDN are scale-out file systems sold as an appliance model,” Turkel said. But Lustre is geared more toward “the unique demands of HPC users” in which multiple servers in a cluster read and write to a single file at the same time, requiring very high single-file bandwidth. “The X9000 is more general purpose, with scalable aggregate bandwidth” rather than high single-file performance.
DDN’s VP of marketing Jeff Denworth said the two vendors have “a handful” of joint customers already, but Denworth and Turkel both dismissed the idea that DDN could be HP’s next scale-out acquisition. “If I respond to that question in any fashion, I’m probably going to get my hand slapped, but it’s certainly not the purpose of this announcement,” Turkel said. However, this product will replace a previous offering HP launched in 2006, also based on Lustre, called the Scalable File Share (SFS).
DDN is now partnered for storage with every large HPC OEM vendor there is — previously it has announced reseller and OEM relationships with IBM, Dell and SGI. “This sounds similar to the arrangement that DDN has with IBM, Dell and SGI to provide a turnkey solution to certain niche customers, more likely aligned with the HP server group than the storage group,” wrote StorageIO founder and analyst Greg Schulz in an email to Storage Soup.