Storage Soup

October 26, 2011  8:45 PM

Violin tunes up for Big Data analytics

Profile: Dave Raffo

Violin Memory’s CTO of software, Jonathan Goldick, sees solid state playing a key role in storage for Big Data, and he’s not talking about scale-out NAS for large data stores.

Goldick says solid-state drives (SSDs) can help run analytics for Hadoop and NoSQL databases better in storage racks than in shared-nothing server configurations.

“We’re focused on the analytics end of Big Data – getting Hadoop and NoSQL into reliable infrastructures while getting them to scale out horizontally,” he said. “Scale-out NAS is a different part of the market.”

Today, Violin said its 3000 Series flash Memory Arrays have been certified to work with IBM’s SAN Volume Controller (SVC) storage virtualization arrays. Goldick pointed to this combination as one way that Violin technology can help optimize Big Data analytics. The vendors say SVC’s FlashCopy, Easy Tier, live migration and replication data management capabilities work with Violin arrays.

Goldick said running Violin’s SSDs with storage systems speeds the Hadoop “shuffle phase” and provides more IOPS without having to add spindles. SVC brings the management features that Violin’s array lacks.

“Hadoop is well-optimized for SATA drives, but there’s always a phase when it’s doing random I/O called the ‘shuffle phase,’ and you’re stalled waiting for disks to catch up,” said Goldick, who came to Violin from LSI to set the startup’s data management strategy. “We’re looking at a hybrid storage model for Big Data. You’ve heard of top-of-the-rack switches, we look at Violin as the middle-of-the-rack array. It gives you fault tolerance and the high performance you need to make Big Data applications run at real-time speeds.”
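For readers unfamiliar with the shuffle phase Goldick mentions: it is the step between map and reduce where intermediate key/value pairs are regrouped by key. A toy, in-memory word-count sketch (real Hadoop spills these pairs to local disk and moves them across the network, which is where the random-I/O stalls he describes come from):

```python
from collections import defaultdict

def map_phase(lines):
    # Emit (word, 1) pairs; in Hadoop these are spilled to local disk
    for line in lines:
        for word in line.split():
            yield word, 1

def shuffle_phase(pairs):
    # Regroup all values by key. On a real cluster this means reading
    # scattered spill files and moving them between nodes -- the
    # random-I/O-heavy step that flash is claimed to accelerate.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {word: sum(counts) for word, counts in groups.items()}

counts = reduce_phase(shuffle_phase(map_phase(["big data", "big flash"])))
# counts == {"big": 2, "data": 1, "flash": 1}
```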

He said Hadoop holds data in transient data stores and persistent data stores. It’s the persistent data – which is becoming more prevalent in Hadoop architectures – where flash can help. “So you think of Hadoop not just as analytics but as a storage platform,” he said. “That’s where IBM SVC bridges a gap for us. When data is transient you don’t need data management services as much. When you start keeping the data there, it becomes a persistent data store of petabytes of information. You need data management features that enterprise users have come to expect – things like snapshotting, metro-clustering, fault tolerance over distance.”

Violin’s 3000 series is also certified on EMC’s Vplex federated storage system. EMC is talking about Big Data more than any other storage vendor, with its Isilon clustered NAS as well as its Greenplum analytics systems. EMC president Pat Gelsinger last week said Big Data technologies will be the focus of EMC’s acquisitions over the coming months.

If Goldick is correct, we’ll be hearing a lot more about Big Data analytics in storage.

“Last year Big Data was about getting it to work,” he said. “This year it’s about optimizing performance for a rack. People don’t want to run thousands of servers if they can get the efficiency from a rack.”

There are other ways of using SSDs to speed analytics — inside arrays, or as PCIe cards in storage systems or servers. Violin’s Big Data success will be determined by its performance against a crowded field of competitors.


October 26, 2011  3:13 PM

Who makes the call on archiving?

Profile: Randy Kerns

Data archiving makes sense when primary storage gets filled up with data that is no longer active. Data growth on primary storage – the highest performing storage with the most frequent data protection policies – results in increasing capital and operational costs.

Organizations can save money by moving the inactive data or data with a low probability of access to secondary storage or archive storage. The question is, who owns the decision of what to move?

IT directors and managers I’ve talked to have a mixed response to that question. Some say it is the business unit’s decision, but IT cannot get a response from them about what data can be archived or moved to secondary storage. Others say that IT has the responsibility but does not have the systems or software in place to do the archiving effectively, usually because they lack a budget for this. And a few say it is IT’s responsibility, and they are in the process of archiving data.

Those who archive with the initiative coming from IT say it is important to make the archiving and retrieval seamless from the user standpoint. Seamless means the user can access archived data without needing to know that the data has been archived or moved. It’s acceptable if the retrieval takes a few extra seconds, as long as there are no extra steps (operations) added to the user’s access.

Implementing archives with seamless access and rules-based archiving by IT requires specific system capabilities. These systems must work at the file system (or NAS) level to be able to move data to secondary or archive systems, and then to retrieve that data.
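One common way to deliver that seamless access is a stub-and-recall scheme: the file system keeps a small stub pointing at the archived copy and recalls the data transparently when a user opens the file. A hypothetical sketch (the stub format and recall logic here are illustrative, not any particular vendor’s implementation):

```python
import os
import shutil
import tempfile

STUB_MARKER = b"ARCHIVED:"  # hypothetical stub format, for illustration only

def archive(path, archive_dir):
    """Move a file to the archive tier, leaving a small stub behind."""
    dest = os.path.join(archive_dir, os.path.basename(path))
    shutil.move(path, dest)
    with open(path, "wb") as f:
        f.write(STUB_MARKER + dest.encode())

def read_file(path):
    """Read a file, transparently recalling it from the archive if stubbed."""
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(STUB_MARKER):
        archived_path = data[len(STUB_MARKER):].decode()
        shutil.move(archived_path, path)  # recall: extra seconds, no extra steps
        with open(path, "rb") as f:
            data = f.read()
    return data

primary = tempfile.mkdtemp()
archive_dir = tempfile.mkdtemp()
report = os.path.join(primary, "report.txt")
with open(report, "wb") as f:
    f.write(b"quarterly numbers")

archive(report, archive_dir)            # inactive data moved off primary storage
recalled = read_file(report)            # user sees the same content as before
```

The user accesses the same path either way; the only visible difference is the recall latency.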

External tiering, or archiving, is highlighted in the Evaluator Group report that can be downloaded here. It is a major tool in the IT repertoire for controlling costs and meeting expanding capacity demands. The decision about archiving needs to be made by IT, but IT requires the system capabilities that make it a seamless activity for users.

(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).

October 25, 2011  8:06 AM

HDS rolls out private cloud services, eyes Big Data

Profile: Brein Matturro

By Sonia R. Lelii, Senior News Writer

Hitachi Data Systems is putting technology from its BlueArc and Parascale acquisitions to work in its private storage cloud and Big Data plans.

HDS today upgraded its Cloud Service for Private File Tiering, and rolled out its Cloud Service for File Serving and Cloud Service for Microsoft SharePoint Archiving as part of its infrastructure cloud strategy.

HDS also outlined its vision for its infrastructure, content and information clouds. BlueArc’s NAS products will provide file storage capabilities in the infrastructure and content clouds while Parascale Cloud Storage (PCS) fits into the content and information clouds.

HDS acquired Parascale for an undisclosed price in August 2010 and bought its long-time NAS OEM partner BlueArc for $600 million last month.

HDS’ strategy is to make its content cloud a single platform for data indexing, search and discovery.

HDS rolled out its Private File Tiering service in June 2010 for tiering data from a NetApp filer to the Hitachi Content Platform (HCP). Now it adds HCP support for EMC NAS. The file serving and SharePoint cloud services let users share files and SharePoint content from different geographic locations over a LAN, WAN or MAN. These services require a Hitachi Data Ingestor (HDI) caching device in remote sites or branches to tier data to a central location that houses the HCP.

Tanya Loughlin, HDS’ manager of cloud product marketing, said these services already exist but now HDS is packaging them as a cloud that it will manage for customers. The cloud services include a management portal to access billing, payment and chargeback information.

“It’s a private cloud service,” Loughlin said. “Customers don’t have to pay for hardware. They pay on a per-gigabyte basis. This is a way to augment staff and push some of the less-used data to us. We’ll manage it.”

Pricing is not available yet. “We are finalizing that now,” she said. “The products that fit into these services are already priced, so this is a bundling exercise now.”

HDS plans to tackle Big Data through its information cloud strategy by integrating analytics tools and processes into PCS. PCS aggregates Linux servers into one virtual storage appliance for structured and unstructured data. Loughlin said HDS will also use Parascale, BlueArc NAS and the HDS Virtual Storage Platform (VSP) SAN array to connect data sets and identify patterns for business intelligence in the health, life sciences and energy research fields.

October 24, 2011  8:18 PM

Quantum adds SMB NAS and backup, eyes the cloud

Profile: Dave Raffo

Quantum today took a break from upgrading its DXi data deduplication platform, and rolled out its first Windows-based NAS systems and expanded its RDX removable hard drive family. The SMB products include a new backup deduplication application.

Quantum launched two NAS boxes based on the Windows Storage Server OS. The NDX-8 is an 8 TB primary storage system that uses an Intel Core i3 3.3 GHz processor and 4 GB of RAM with four 2 TB drives. The NDX-8d is a backup system based on the same hardware, with Quantum’s Datastor Shield agentless backup software with data deduplication installed. The NDX-8d includes licenses to back up 10 Windows desktops or laptops and one Windows server or virtual server.

The NAS systems are available in 1U or tower configurations. Pricing starts at $4,029 for the NDX-8 and $5,139 for the NDX-8d.

Quantum also rolled out its RDX 8000 removable disk library, its first automated RDX system to go with its current desktop models. The RDX 8000 has eight slots for RDX cartridges, which range in capacity from 160 GB to 1 TB. The RDX 8000 comes pre-configured with Datastor Shield or Symantec Backup Exec Quickstart software.

The RDX 8000 costs $3,889 with Backup Exec and $4,999 with Datastor Shield. John Goode, director of Quantum’s devices product line, said he expects customers will use two-thirds fewer cartridges with Datastor Shield dedupe.

“We felt it was important with disk backup to use deduplication,” Goode said.

Datastor Shield has a different code base than Quantum’s DXi dedupe for its disk target systems. The biggest difference is that Datastor Shield does a bit-level compare, while the DXi software performs variable-block dedupe.
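To illustrate what variable-block (content-defined) dedupe buys over fixed blocks: boundaries are derived from the data itself, so an insertion early in a stream does not shift every later chunk the way fixed-size blocking would. This is a teaching sketch, not either product’s algorithm; the window size, mask and minimum chunk size are arbitrary choices:

```python
import hashlib

def fixed_chunks(data: bytes, size: int = 64):
    """Fixed-size blocks: one inserted byte shifts every subsequent block."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def variable_chunks(data: bytes, window: int = 16, mask: int = 0x0F,
                    min_size: int = 16):
    """Content-defined chunking: cut wherever a hash of the trailing
    window matches a target pattern, so boundaries re-sync after an edit."""
    chunks, start = [], 0
    for i in range(len(data)):
        if i + 1 - start < min_size or i + 1 < window:
            continue
        fingerprint = hashlib.sha1(data[i + 1 - window:i + 1]).digest()[0]
        if fingerprint & mask == 0:  # roughly 1-in-16 bytes is a cut point
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

data = bytes(range(256)) * 4   # 1 KB of repeating content
edited = b"!" + data           # one byte inserted at the front
```

Comparing `set(variable_chunks(data)) & set(variable_chunks(edited))` against the fixed-block case shows how the variable scheme keeps finding shared chunks after the edit, which is why it typically finds more duplicates in backup streams.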

Backup Exec Quickstart is good for one server. If customers need to back up more servers, they must upgrade to the full Backup Exec application.

Datastor Shield can replicate between NDX-8 and RDX boxes, and Goode said it will be able to replicate data to the cloud in early 2012. He said Quantum will offer customers cloud subscriptions and work with a cloud provider, and will also have cloud-seeding options.

October 24, 2011  3:28 PM

EMC changes the channel on Dell sales

Profile: Brein Matturro

By Sonia R. Lelii, Senior News Writer

EMC president Pat Gelsinger said EMC had already moved on by the time Dell officially ended their storage partnership last week after a 10-year relationship.

Gelsinger said it was no secret that EMC’s partnership with Dell had to drastically change or end after Dell expanded its storage presence by acquiring EMC competitors EqualLogic and Compellent.

“It got to a natural point where the relationship had to be restructured or it had to come to an end. Unfortunately, it came to an end,” Gelsinger told a group of reporters last Thursday at EMC Forum 2011, held at Gillette Stadium in Foxborough, Mass.

Dell sold EMC’s Clariion, Celerra, Data Domain and VNX systems through OEM and reseller deals, with the bulk of the revenue generated from Clariion midrange SAN sales. Dell will also no longer manufacture EMC’s low-end Clariion.

Dell-generated revenue for EMC has been sliding since last year, Gelsinger said. EMC reported $55 million in Dell-generated revenue in the fourth quarter of 2010, and that fell to under $40 million in the first quarter of this year. EMC has not given a figure for Dell revenue since then, but its executives said non-Dell channel sales for the mid-tier increased 44% year-over-year in the third quarter of this year.

EMC has built up its channel this year, making the SMB VNXe product a channel-only offering that directly competes with Dell products. Earlier this month, EMC launched a channel-only Data Domain DD160 SMB system.

EMC has also continued to upgrade its VNX midrange platform. Last week it launched an all-flash model (VNX5500-F) as well as a high-bandwidth VNX5500 option with four extra 6 Gbps SAS ports, and support for 3 TB SAS drives throughout the VNX family.

“Now that we are no longer continuing forward [with Dell], we have to do it ourselves,” Gelsinger said. “It’s a clear, simple focus on our part.”

Dell began selling EMC storage in 2001, and in late 2008 the vendors said they were extending their OEM agreement through 2013. Dell also widened the deal in March 2010 by adding EMC Celerra NAS and Data Domain deduplication backup appliances to their OEM arrangement. However, the relationship had already started to deteriorate by then, going back to when Dell acquired EqualLogic in early 2008.

The rift became irreparable last year when Dell followed an unsuccessful bid for 3PAR by completing an $820 million acquisition of Compellent in December.

October 19, 2011  4:15 PM

Symantec expands FileStore’s dedupe, DR features

Profile: Brein Matturro

By Sonia R. Lelii, Senior News Writer

Symantec Corp. today upgraded its FileStore N8300 clustered network-attached storage (NAS) appliance, adding deduplication for primary storage, metro-clustering and cascading replication, and cloning of VMware images.

FileStore N8300 5.7 can now be used as a storage target in virtual machine environments, where a file-level cloning feature creates a golden image that users can clone into thousands of VMDK files, said Yogesh Agrawal, Symantec’s vice president and general manager for the FileStore Product Group. Symantec also leveraged code from Veritas Cluster File System to create a deduplication module in the FileStore appliance to reduce redundancies in the VMDK files.

The NAS device also now supports metro-clustering replication for disaster recovery, which automates the process of bringing up the disaster recovery site when the primary site goes down. Previously, the disaster recovery site had to be made live manually. Metro-clustering is based on synchronous volume mirroring, and the cluster is limited to a distance of 100 kilometers. Cascading replication now allows replication from a secondary to a tertiary site. “In that scenario, we can do synchronous replication,” Agrawal said.

FileStore starts with about 10 TB of capacity and can scale to 1.4 PB. A common customer configuration is a two-node, 24 TB system that has a list price of $69,796.

October 17, 2011  10:20 PM

Dell pulls the plug on EMC relationship

Profile: Dave Raffo

Dell today officially ended its 10-year partnership with EMC, saying it would no longer sell EMC products after making a series of storage acquisitions over the past four years.

Customers who purchased EMC storage from Dell will continue to receive support, Dell said in a statement, but it is ending its OEM and reseller deals for EMC Clariion, Celerra, Data Domain and VNX systems.

The move isn’t much of a surprise, considering Dell had already driven a large wedge into the relationship by buying its own storage companies – including several direct competitors to EMC.

Dell has sold EMC storage since 2001, and in late 2008 the vendors said they were extending their OEM agreement through 2013. Dell also widened the deal in March of 2010 by adding EMC Celerra NAS and Data Domain deduplication backup appliances to their OEM arrangement. However, the relationship had already started to deteriorate by then, going back to when Dell acquired EMC competitor EqualLogic in early 2008. The rift became irreparable last year when Dell followed an unsuccessful bid for 3PAR by completing an $820 million acquisition of Compellent in December.

Dell also acquired data reduction vendor Ocarina and the assets of scale-out NAS vendor Exanet in 2010, giving it more storage IP to integrate with its platforms.

Even before Dell bought Compellent, EMC CEO Joe Tucci a year ago said the once tight relationship between the vendors “cooled off” after Dell tried to buy 3PAR. A Dell spokesman responded by saying EMC still played an important role in Dell’s storage strategy.

There has been no OEM deal for EMC’s VNX unified storage system launched last January, although Dell did have an OEM deal for the Clariion and Celerra platforms that VNX replaced. EMC has built up its channel this year, making the SMB VNXe product a channel-only offering that directly competes with Dell products. Last week EMC launched a channel-only Data Domain DD160 SMB system.

October 17, 2011  12:56 PM

Storage tiering and caching bring different values, costs

Profile: Randy Kerns

We hear a lot these days about tiering and caching in storage systems. These are not the same thing. Some systems implement tiering across types of media, while others cache data into a solid-state device as transient storage. Other storage systems have both capabilities.

IT professionals may wonder what the differences between tiering and caching are, and whether they need to tier or cache data. There are clear differences, but the performance implications between the approaches vary primarily based on the specific storage system implementation.

Tiered storage systems use different device types such as solid-state devices, high-performance disks and high-capacity disks. Each device type makes up a tier. The systems intelligently move data between the tiers based on patterns of access, a process known as automated tiering.

Tiering greatly increases the overall system performance, with access to the most active data coming from the highest performance devices. The higher performance allows the systems to support more demanding applications. Tiering also lets an organization get by with smaller amounts of the most expensive types of storage by moving less frequently accessed data to cheaper drives.
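The movement policy behind automated tiering can be sketched in a few lines: promote blocks whose access count crosses a threshold, demote idle ones. The thresholds and two-tier layout below are hypothetical; real arrays use far more sophisticated heat maps and sub-LUN granularity:

```python
def retier(placement, access_counts, promote_at=10, demote_at=2):
    """placement maps block_id -> 'ssd' or 'hdd'; returns a new placement
    after one tiering pass driven by observed access counts."""
    new_placement = {}
    for block, tier in placement.items():
        hits = access_counts.get(block, 0)
        if tier == "hdd" and hits >= promote_at:
            new_placement[block] = "ssd"   # hot data moves up to flash
        elif tier == "ssd" and hits <= demote_at:
            new_placement[block] = "hdd"   # cold data moves down to capacity disk
        else:
            new_placement[block] = tier
    return new_placement

placement = {"a": "hdd", "b": "ssd", "c": "hdd"}
placement = retier(placement, {"a": 50, "b": 1, "c": 5})
# 'a' is promoted, 'b' is demoted, 'c' stays where it is
```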

Caching systems use memory or solid state to hold highly active data as transient copies that can be served from the higher-performing technology. The cached data also resides in a permanent location in the storage system.

Caching may be done in RAM or in solid-state devices used specifically for caching. RAM cache can be protected by a battery or capacitor.

Caching has been used effectively to speed storage performance for many years. In the mainframe world, the caching is controlled with information communicated from the operating system. In open systems, the storage systems contain the intelligence to stage or leave copies of active data in the cache. Storage systems can cache read data only, or they can also accelerate writes.
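The read-caching behavior described above can be sketched as a small LRU cache in front of a slower backing store. This is a teaching sketch only; real array caches also handle write-back, prefetch and battery protection:

```python
from collections import OrderedDict

class ReadCache:
    """Tiny LRU read cache: data always lives in the backing store;
    the cache only holds transient copies of recently read blocks."""
    def __init__(self, backing_store, capacity=2):
        self.backing = backing_store       # e.g. dict of block -> data
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = self.misses = 0

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)  # mark as most recently used
            self.hits += 1
            return self.cache[block]
        self.misses += 1
        data = self.backing[block]         # slow path: go to disk
        self.cache[block] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False) # evict the least recently used copy
        return data

store = {0: b"alpha", 1: b"beta", 2: b"gamma"}
cache = ReadCache(store, capacity=2)
for block in (0, 0, 1, 2, 0):
    cache.read(block)
# Only the repeated read of block 0 while it was still cached is a hit.
```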

If a storage system features tiering and caching, the features need to work in concert to avoid wasted or conflicting data movement. There can be improved performance if the two capabilities work together.

IT professionals need to consider the cost/benefit tradeoffs of tiering and caching. What performance is gained versus the cost? The overall performance benefit needs to be considered in the context of the workload from the applications that use the stored information. Most of the vendors of tiered storage systems have effective tools that analyze the environment and report on the effectiveness of tiering. This is necessary to optimize performance.

There is no easy answer to the choice of tiering, caching, or doing both in a storage system. It becomes a matter of maximizing the performance capabilities of the storage system and what value it brings in consolidation, reduced costs, and overall efficiency gains. An analysis of the value gained versus the cost must be done for any individual system.

(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).

October 14, 2011  1:16 PM

FCoE still lacking support

Profile: Brein Matturro

By Sonia R. Lelii, Senior News Writer

Brocade this week showcased its 1860 Fabric Adapter at Storage Networking World (SNW) in Orlando, Fla. The adapter gives customers the option to implement 16 Gbps Fibre Channel, 10 Gigabit Ethernet (10 GbE) or Fibre Channel over Ethernet (FCoE) connectivity, and the company describes it as “any I/O.” But Brocade product marketing manager James D. Myers doesn’t see many companies implementing FCoE so far.

“There isn’t a lot of adoption yet,” Myers said. “They are buying a lot of converged networks, but they are not turning [FCoE] on yet. There are a few early adopters. Most are hedging their bets. I think it will take upwards of a decade for FCoE to be prevalent.”

Brocade hasn’t been a huge advocate for FCoE the way its rival Cisco Systems has been, but at least one SNW attendee confirms Myers’ thoughts. Mitchel Weinberger, IT manager for Seattle-based GeoEngineers, said he researched FCoE and found the performance gain wasn’t significant enough to justify introducing a new technology into his infrastructure. The company uses an iSCSI SAN from Compellent that connects 10 GbE switches to virtual servers.

“We don’t see the benefit,” Weinberger said. “All the studies I’ve seen say the benefits are minimal. We really didn’t see enough advantage to put Fibre Channel over Ethernet. It’s another technology for us to learn, and we don’t have the staff.”

FCoE encapsulates Fibre Channel frames in Ethernet frames, and its benefits include reducing the number of I/O adapters, cables and switches in the data center. But the convergence of Fibre Channel and Ethernet means storage and network administrators must share management responsibilities, or one team must cede control to the other. That can be a big problem in organizations where the two groups don’t get along.
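A rough sketch of that encapsulation: a native FC frame is carried as the payload of an Ethernet frame with the FCoE EtherType (0x8906) plus start-of-frame and end-of-frame delimiters. The layout below is simplified for illustration (it omits the FCS, full version-field handling and padding rules from the FC-BB-5 framing):

```python
import struct

FCOE_ETHERTYPE = 0x8906  # IEEE-assigned EtherType for FCoE

def encapsulate_fc_frame(fc_frame, src_mac, dst_mac):
    """Wrap a raw Fibre Channel frame in a simplified FCoE Ethernet frame."""
    eth_header = dst_mac + src_mac + struct.pack("!H", FCOE_ETHERTYPE)
    fcoe_header = bytes(13) + b"\x2e"   # version/reserved bytes, then SOFi3 delimiter
    fcoe_trailer = b"\x41" + bytes(3)   # EOFn delimiter plus padding
    return eth_header + fcoe_header + fc_frame + fcoe_trailer

def decapsulate(frame):
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != FCOE_ETHERTYPE:
        raise ValueError("not an FCoE frame")
    return frame[28:-4]  # strip Ethernet header, FCoE header and trailer

fc_frame = b"\x22" * 36                              # stand-in for a real FC frame
wire = encapsulate_fc_frame(fc_frame, b"\x02" * 6, b"\x0e" * 6)
```

Because the FC frame rides inside ordinary Ethernet, the same converged adapter and cable can carry both storage and LAN traffic, which is where the adapter, cable and switch savings come from.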

“It makes total sense,” said chief scientist Howard Marks. “Except for the politics.”

October 13, 2011  7:00 PM

Sepaton sets its sights on Big Data

Profile: Brein Matturro

By Sonia R. Lelii, Senior News Writer

Sepaton is looking to move beyond its data protection specialty into Big Data.

At Storage Networking World in Orlando, Fla., this week, new Sepaton CTO Jeffrey Tofano offered a broad description of where the vendor plans to go within the next five years. Tofano didn’t offer too many technology specifics, but said Sepaton’s plan is to position itself for the broader Big Data market.

The idea is to expand its use of NAS protocols to its backup products within the next year, then over the subsequent two years provide “solution stacks” for snapshot archiving, specialized archiving and data protection environments. The goal is to use all of that technology for nearline storage and Big Data. “Our technology is skating to where the puck will be, and that’s Big Data,” said Tofano, who was previously CTO of Quantum Corp.

There still is a lot of marketing hype around the term Big Data, but Tofano puts it into two buckets: either large data sets or analytics of petabytes of data for business intelligence. Sepaton will be targeting both, he said, by using the company’s technology to “bring specialized processors closer to the storage to do clickstreaming, web logs or email logs.

“It turns out that Big Data is a perfect fit for our [technology] core, which is a scalable grid, content-aware deduplication and replication technology,” Tofano said. “Our technology is not the limiting factor. We have a lot of the pieces in place. We are not building a new box. We are refining a box to get into the Big Data market. Right now, we have a scalable repository bundled behind a Virtual Tape Library [VTL] personality.”

Tofano said the VTL market is mature, and this new direction does not mean Sepaton will get out of the backup space. Obviously, he said, it depends on revenues. “We will become more general purpose over time. We will do storage and support loads outside of data protection,” Tofano said.
