The term archiving can be used in different contexts. Its use across vertical markets and in practice leads to confusion and communication problems. Working on strategy projects with IT clients has led me to always clarify what archive means in their environments. To help with this, here are a few basics about what we mean when we say “archive.”
Archive is a verb and a noun. We’ll deal with the noun first and discuss what an archive means depending on the perspective of the particular industry.
In the traditional IT space such as commercial business processing, an archive is where information that is not normally required in day-to-day processing activities is moved. The archive is a storage location for the information and is typically seen as either an online archive or a deep archive.
An online archive holds data that has been moved from primary storage but can still be seamlessly and directly accessed by applications or users without involving IT or running additional software processes. This means the information is seen in the context in which the user or application would expect. The online archive is usually protected with replication to another archive system separate from the backup process. The size of an online archive can be capped by moving information based on criteria to a deep archive.
A deep archive is for storing information that is not expected to be needed again but cannot be deleted. While it is expected to be much less expensive to store information there, accessing the information may require more time than the user would normally tolerate. Moving data to the deep archive is one of the key areas of differentiation. Some online archives can have criteria set to automatically and transparently move data to the deep archive while others may require separate software to make the decisions and perform the actions.
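As a rough illustration of criteria-based movement between tiers, here is a minimal Python sketch. The idle-time thresholds, the `FileRecord` type and the `choose_tier` function are all assumptions made for this example; real archive products use their own policy engines and criteria.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical thresholds; real policies are set per organization.
ONLINE_AFTER = timedelta(days=90)      # idle time before leaving primary storage
DEEP_AFTER = timedelta(days=3 * 365)   # idle time before leaving the online archive


@dataclass
class FileRecord:
    name: str
    last_access: date


def choose_tier(record: FileRecord, today: date) -> str:
    """Return the tier a record belongs in, based only on idle time."""
    idle = today - record.last_access
    if idle >= DEEP_AFTER:
        return "deep"
    if idle >= ONLINE_AFTER:
        return "online"
    return "primary"


today = date(2013, 4, 30)
for rec in [
    FileRecord("q1-orders.db", date(2013, 4, 15)),
    FileRecord("2012-invoices.pdf", date(2012, 11, 1)),
    FileRecord("2008-audit.zip", date(2009, 1, 10)),
]:
    print(rec.name, "->", choose_tier(rec, today))
```

Idle time is only one possible criterion. Retention class, owner or file type could be added as further checks in the same function.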
In healthcare, information such as radiological images is initially stored in an archive (which translates to primary storage for those in the traditional IT space). Usually as images are stored in the archive, a copy is made in a deep archive as the initial protected copy. The deep archive will be replicated as a protected copy. Based on policies, the copy in the archive may be discarded after a period of time (in many cases, this may be one year) with the copies on the deep archive still remaining. Access to the copy on the deep archive is done by a promotion of a copy to the archive in the case of a scheduled patient visit or by a demand for access due to an unplanned visit or consultative search.
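The healthcare flow described above (store in the archive with a deep archive copy, discard the archive copy after a retention period, promote on access) can be sketched as follows. The `ImageStore` class and the fixed 365-day retention are hypothetical; actual PACS policies vary.

```python
from datetime import date, timedelta

RETENTION = timedelta(days=365)  # assumed; the text says one year "in many cases"


class ImageStore:
    def __init__(self):
        self.archive = {}   # image id -> date stored in (or promoted to) the archive
        self.deep = set()   # image ids held in the deep archive

    def store(self, image_id, today):
        """Store a new image and make its initial protected copy."""
        self.archive[image_id] = today
        self.deep.add(image_id)

    def expire(self, today):
        """Discard archive copies older than the retention period."""
        for image_id, stored in list(self.archive.items()):
            if today - stored >= RETENTION:
                del self.archive[image_id]

    def access(self, image_id, today):
        """Serve an image, promoting it from the deep archive if needed."""
        if image_id in self.archive:
            return "served from archive"
        if image_id not in self.deep:
            raise KeyError(image_id)
        self.archive[image_id] = today
        return "promoted from deep archive"


store = ImageStore()
store.store("mri-001", date(2012, 1, 5))
store.expire(date(2013, 4, 1))
print(store.access("mri-001", date(2013, 4, 1)))
print(store.access("mri-001", date(2013, 4, 2)))
```

The first access finds only the deep archive copy and promotes it. The second is served directly from the archive.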
For media and entertainment, the archive is the repository of content representing an asset such as movie clips. The archive in this case may have different requirements than a traditional IT archive because of the performance demands on access and the information value requirements for integrity validation and for the longevity of retention, which could be forever. Discussing the needs of an archive in this context is really about an online repository with specific demands on access and protection.
As a verb, archive is about moving information to the physical archive system. This may be done by the actual application that stores the information in the archive. An example of this would be a Picture Archiving and Communications System (PACS) or Radiology Information System (RIS) in healthcare. In other businesses, third-party software may move the information to the archive. In the traditional IT space, this could be a solution such as Symantec Enterprise Vault that could move files or emails to an archive target based on administrator-set criteria.
As archiving attracts more interest because of the economic savings it provides, there will be additional confusion added with solution variations. It will always require a bit more explanation to draw an accurate picture.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
Startup Nimble Storage is taking a page out of NetApp’s playbook with its private cloud reference architecture put together with Cisco and Microsoft. And it is going beyond other storage vendors’ monitoring and analytics capabilities with its InfoSight services.
This week Nimble launched its SmartStack for Microsoft Windows Server and System Center reference architecture. It includes a three-rack unit of Nimble’s CS200 hybrid storage, Cisco UCS C-Series rackmount servers and Windows Server 2012 with Hyper-V and Microsoft System Center 2012. The reference architecture is designed to speed deployment of private clouds with up to 72 Hyper-V virtual machines.
Last October, Nimble rolled out a reference architecture for virtual desktop infrastructure (VDI) with Cisco and VMware.
The reference architecture model is similar to that of NetApp’s FlexPod, which also uses Cisco servers and networking. NetApp has FlexPod architectures for Microsoft and VMware’s hypervisors. EMC added Vspex reference architectures last year, two years after NetApp launched FlexPods.
Nimble’s InfoSight appears ahead of other storage vendors’ analytics services. It goes beyond “phone-home” features to collect performance, capacity, data protection and system health information for proactive maintenance. Customers can access the information on their systems through an InfoSight cloud portal.
What makes InfoSight stand out is the depth of the information amassed. Nimble claims it collects more than 30 million sensor values per array per day, grabbing data every five minutes. It can find problems such as bad NICs and cables, make cache and CPU sizing recommendations and give customers an idea of what type of performance they can expect from specific application workloads.
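To put those figures in perspective, here is a quick back-of-the-envelope calculation. It assumes, for simplicity, that all 30 million values are gathered at the five-minute cadence; Nimble’s actual collection schedule may differ.

```python
SENSOR_VALUES_PER_DAY = 30_000_000   # figure Nimble claims, per array
INTERVAL_MINUTES = 5                 # collection interval from the article

collections_per_day = 24 * 60 // INTERVAL_MINUTES
values_per_collection = SENSOR_VALUES_PER_DAY / collections_per_day
values_per_second = SENSOR_VALUES_PER_DAY / (24 * 60 * 60)

print(collections_per_day)            # 288
print(round(values_per_collection))   # 104167
print(round(values_per_second))       # 347
```

Under that assumption, each array reports roughly 100,000 values per collection.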
“Nimble collects a much larger amount of data than is traditionally done in the industry,” said Arun Taneja, consulting analyst for the Taneja Group. “Traditionally, an array would grab something from a log file at the end of the day. These guys are grabbing 30 million data points. Then they return that information proactively to users in the form of best practices and provide proactive alerts about product issues. I think everybody will end up there, but it might take five years.”
The National Association of Broadcasters (NAB) conference has become a big focus for storage vendors. The growth in media content and the increased resolution of recordings make for a fast growing market for storage demand. And, the data is not thrown away (deleted). Media and entertainment (M&E) industry data is primarily file-based with a defined workflow using files of media in a variety of formats.
The large amount of content favors storage archiving solutions to work with media asset management for repositories of content. But, these archives are different than those used in traditional IT. The information in M&E archives is expected to be retrieved frequently and the performance of the retrieval is important. For rendering operations, high performance storage is necessary and the sharing capabilities for the post-production processes determine product usability.
Evaluator Group met with a number of storage vendors at this month’s NAB conference. Below are some of the highlights from a few of those meetings.
• For tape vendor Spectra Logic, Hossein Ziashakeri, the VP of Business Development, talked about changes in the media and entertainment market and at Spectra Logic. He said media and entertainment is becoming more of an IT environment. Software is driving this, particularly automation tools. And the new generation of people in media and entertainment is more IT savvy than in the past. M&E challenges include the amount of content being generated. The need to keep everything is driving an overwhelming storage demand. The cost and speed of file retrieval are major concerns. Spectra Logic is a player because the M&E market has a long history with tape, which has become more of an archiving play than a backup play.
• Mike Davis, Dell’s director of marketing and strategy for file systems, said Dell’s M&E play is primarily file-based around its Compellent FS8600 scale-out NAS. Davis said M&E customers also use Dell’s Ocarina data reduction, which allowed one customer to reduce 3 PB of data. The FS8600 now supports eight nodes and 2 PB in a single system.
• Quantum has had a long term presence in the media and entertainment market with StorNext widely deployed for file management and scaling. StorNext product marketing manager Janet Lafleur said Quantum will announce its Lattus-M object storage system integrated with StorNext in May. Quantum’s current Lattus-X system supports CIFS and NFS along with objects. Quantum also has a StorNext AEL appliance that includes tape for file archiving.
• Hitachi Data Systems (HDS) had a major presence at NAB with several products on display, including Hitachi Unified Storage (HUS), HNAS and Hitachi Content Platform (HCP) archiving systems. Ravi Chalaka, VP of solutions marketing, Jeff Greenwald, senior solutions marketing manager, and Jason Hardy, senior solutions consultant, spoke on HDS media and entertainment initiatives. HDS is looking at solid state drives (SSDs) to improve streaming and post-production work. HNAS to Amazon S3 cloud connectivity has been available for two months, and HDS has a relationship with Crossroads to send data from HCP to Crossroads’ StrongBox LTFS appliances.
• StorageDNA CEO Tridib Chakravrty and director of marketing Rebecca Greenwell spoke about the capabilities of their company’s data movement engine. StorageDNA’s DNA Evolution includes a parallel file system built from LTFS that extracts information into XML for searching. StorageDNA technology works with most media asset management software now. The vendor plans to add S3 cloud connectivity.
• Dot Hill sells several storage arrays into the M&E market through partnerships, including its OEM deal to build Hewlett-Packard’s MSA P2000 system. Jim Jonez, Dot Hill’s senior director of marketing, said the vendor has several partners in the post-production market.
• CloudSigma is a cloud services provider that uses solid state storage to provide services for customers such as content product software vendor Gorilla Technology. CloudSigma CEO Robert Jenkins said the provider hosts clouds in Zurich and Las Vegas built on 1U servers with four SSDs in each. The SSDs solve the problem of dealing with all random I/Os. He said CloudSigma plans to add object storage through a partnership with Scality, which will provide geo-replication.
• Signiant sells file sharing and file movement software into the M&E market. Doug Cahill, Signiant’s VP of business development, said his vendor supports the new Framework for Interoperable Media Services (FIMS) standard and recently added a Dropbox-like interface for end users. Signiant’s software works as a browser plug-in to separate the control path from the data path.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
The massive amount of unstructured data being created has vendors pushing to deliver object storage systems.
There are many object systems available now from new and established vendors, and others are privately talking about bringing out new object systems soon.
Objects, in the context of the new generation of object storage systems, are viewed as unstructured data elements (think files) with additional metadata. The additional metadata carries information such as the required data protection, longevity, access control and notification, compliance requirements, original application creation information, and so on. New applications may directly write a new form of objects and metadata but the current model is that of files with added metadata. Billions of files. Probably more than traditional file systems can handle.
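The “file plus metadata” model described above can be sketched in a few lines of Python. The `StoredObject` type, the metadata keys and the `select` helper are all hypothetical names chosen for illustration; they do not correspond to any particular vendor’s object API.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    object_id: str
    data: bytes                                     # the unstructured content (think file)
    metadata: dict = field(default_factory=dict)    # protection, retention, origin, etc.


def select(objects, **criteria):
    """Return the objects whose metadata matches every given key/value pair."""
    return [
        obj for obj in objects
        if all(obj.metadata.get(key) == value for key, value in criteria.items())
    ]


objects = [
    StoredObject("a1", b"...", {"application": "email", "retention_years": 7, "copies": 2}),
    StoredObject("b2", b"...", {"application": "imaging", "retention_years": 10, "copies": 3}),
    StoredObject("c3", b"...", {"application": "email", "retention_years": 3, "copies": 2}),
]

print([obj.object_id for obj in select(objects, application="email")])
print([obj.object_id for obj in select(objects, application="email", retention_years=7)])
```

Because the metadata travels with each object, selection by application or retention does not depend on where the object sits in a directory hierarchy.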
Looking at the available object storage systems leads to the conclusion that these systems are not developed to meet real IT needs. Vendors are addressing the issue of storing massive numbers of objects (and selling lots of storage), but the real problem is about organizing the information. File systems usually depend on users and applications to define the structure of information as they store the information. This is usually done in a hierarchical structure that is viewed through applications, the most ubiquitous being Windows Explorer.
We need a way to make it easier to organize the information according to a different set of criteria, such as the type of application, user (person viewing the information) needs, age of information, or other selectable information. The management should include controls for protection and selectivity for user restores of previously protected copies of information. Other information management should be available at the control view rather than through management interfaces of other applications. This seems only natural but it has not turned out this way.
Vendor marketing takes advantage of opportunities to ride a wave of customer interest. Vendors will characterize some earlier developed product as an object file system just as today almost everything that exists is being called “software-defined something.” But the solution for managing the dramatic growth of unstructured data must be developed specifically to address those needs and include characteristics to advance management of information as well as storage.
The investment in addressing object management needs to be made. Otherwise, the object storage systems will be incomplete. Linking the managing of information and the object storage systems seems like a major advantage for customers. This will be an interesting area to watch develop.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
Silver Peak Systems Inc. is building out its Virtual Acceleration Open Architecture (VXOA) that allows storage administrators to bypass network administrators when they need to improve application performance through WAN acceleration.
The company announced Web-based downloadable software products aimed at accelerating offsite data replication workloads. The Silver Peak VRX-2, VRX-4 and VRX-8 software are virtual WAN-optimizing products that support VMware vSphere, Microsoft Hyper-V, Citrix Xen and KVM hypervisors. The virtual WAN optimization software is compatible with IP-based array replication software from Dell, EMC, IBM, Hitachi Data Systems, Hewlett-Packard and NetApp.
SilverPeak VRX-2 can handle up to per replication throughput per hour, while the VRX-4 can handle 400 GB of replication throughput per hour and the VRX-8 handles up to 1.5 TB of replication throughput per hour. Annual licenses for each cost $2,764, $8,297 and $38,731, respectively.
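As a rough way to compare the two models with stated throughput, the sketch below estimates replication time and license cost per unit of throughput. The 3 TB dataset is a hypothetical example, decimal units (1 TB = 1,000 GB) are assumed, and the VRX-2 is left out because no throughput figure is given for it.

```python
# (throughput in GB per hour, annual license in USD), from the figures above
MODELS = {
    "VRX-4": (400, 8297),
    "VRX-8": (1500, 38731),   # 1.5 TB/hour, assuming 1 TB = 1,000 GB
}

DATASET_GB = 3000  # hypothetical 3 TB replication job


def hours_to_replicate(dataset_gb, throughput_gb_per_hour):
    """Best-case hours to move a dataset at the rated throughput."""
    return dataset_gb / throughput_gb_per_hour


for name, (throughput, license_usd) in MODELS.items():
    hours = hours_to_replicate(DATASET_GB, throughput)
    usd_per_gb_hour = license_usd / throughput
    print(f"{name}: {hours:.1f} hours, ${usd_per_gb_hour:.2f} per GB/hour of throughput")
```

These are upper bounds on throughput, so actual replication windows would depend on link speed and how well the data deduplicates and compresses.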
Silver Peak CEO Rick Tinsley said the VRX-8 is positioned more for large deployments such as EMC’s Symmetrix Remote Data Facility (SRDF) asynchronous product, RecoverPoint and EMC Data Domain backup. The small VRX versions are tailored more for Dell EqualLogic replication.
In December 2012, Silver Peak Systems brought out Virtual Acceleration Open Architecture 6.0 WAN optimization software with expanded support for virtualization hypervisors. The WAN acceleration software, which operates on Silver Peak’s NX physical and VRX virtual appliances, is part of the company’s strategy to give storage administrators the ability to improve application performance more efficiently and reduce bandwidth costs without involving network administrators to reconfigure network switches and routers.
“Back in December, we did make enhancements to our software that made it easier for storage managers to deploy our technology, which we call our Velocity initiative, but it was not productized specifically for storage managers at that time,” according to a SilverPeak spokesperson. “This is the next phase and culmination of those Velocity developments, where these new VRX software products are uniquely priced and positioned with the storage managers in mind by addressing storage concerns such as ‘shrinking RPOs’ and how many terabytes-per-hour can be moved to an offsite location.”
In March, Silver Peak announced its Virtual Acceleration Open Architecture (VXOA) software can be used for WAN optimization in Amazon cloud deployments for off-site replication and lower disaster recovery costs.
A recent conversation I had about the cost of storage made me think that talking about the cost of storage is the wrong way to approach it. The discussion should be about the value that storage delivers.
Trying to explain the complex nature of meeting specific demands for storing and retrieving information and advanced features for management and access is difficult when discussing it with someone who is focused only on how much it costs to store the information.
When the discussion is about storage costs, there is an implicit assumption that all factors are equal in storing and retrieving information. But several factors should take priority:
• How fast must the information be stored and retrieved? The ingestion rate (how fast data arrives) and how long it takes for the data to be protected on non-volatile media with the required number of copies have a big impact on applications and potential risk. Retrieving information is about how fast the data can be accessed (latency) and the amount of IOPS or continuous transfer (bandwidth) that can be sustained.
• What type of protection and integrity are required? Information has different value and the value changes over time. Information protection may be as simple as a single copy on non-volatile storage or as complex as multiple copies with geographical dispersion. Integrity is another concern. Protection from external forces so the loss of one or more bits of data can be detected and corrected is highly valuable and often assumed without understanding what is involved. Additional periodic integrity checking is another assurance for the information. It also answers the question posed for many in IT: “How do you know that is the same data that was written?”
• The longevity of the information can have a major influence on storing and retrieving. A significant percentage of information is kept more than 10 years. Compliance requirements dictate the length of time and manner of control of information in regulated industries. Storing information on devices that have limited lifespans (such as when you can no longer purchase a new device to retrieve information), means that other considerations must be made. If the information can be transparently and non-disruptively migrated to new technology without additional administrative effort or cost, that should be a factor in the selection process.
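The integrity question raised in the list above (“How do you know that is the same data that was written?”) is commonly answered by storing a checksum at write time and recomputing it later. The sketch below uses SHA-256 from Python’s standard library as one possible choice. It only detects changes; correcting them would need redundancy such as extra copies or erasure coding, which is not shown.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Compute a SHA-256 digest to record alongside the data at write time."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_digest: str) -> bool:
    """Recompute the digest on read (or on a periodic scan) and compare."""
    return fingerprint(data) == expected_digest


original = b"quarterly results, final"
stored_digest = fingerprint(original)

print(verify(original, stored_digest))                      # True
print(verify(b"quarterly results, fina1", stored_digest))   # False
```

A periodic integrity check is the same `verify` call run on a schedule against every stored item.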
Here’s an example of how this works with a real IT operation that needed to increase its transactions per second. Increasing the number of transactions allowed the organization to get more done over a period of time, expand its business and provide better customer service. In this case, more capacity was not the issue – the capacity for the transaction processing was modest. After evaluating where the limitations were, it was clear that adding non-volatile solid state technology for the primary database met and even exceeded the demands for acceleration. Storage selection was not based on the cost as function of capacity ($/GB). It was based on the value returned in improving the transaction processing and gaining more value from the investments in applications and other infrastructure elements.
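To make the contrast concrete, the sketch below compares two storage options by cost per gigabyte and by cost per transaction per second. All prices, capacities and transaction rates are invented for illustration and do not describe the organization in the example or any real product.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    price_usd: float
    capacity_gb: float
    transactions_per_sec: float

    @property
    def cost_per_gb(self) -> float:
        return self.price_usd / self.capacity_gb

    @property
    def cost_per_tps(self) -> float:
        return self.price_usd / self.transactions_per_sec


# Hypothetical figures, for illustration only.
options = [
    Option("disk array", 50_000, 20_000, 2_000),
    Option("solid state", 80_000, 2_000, 40_000),
]

cheapest_by_capacity = min(options, key=lambda o: o.cost_per_gb)
cheapest_by_throughput = min(options, key=lambda o: o.cost_per_tps)

print("Best $/GB: ", cheapest_by_capacity.name)     # disk array
print("Best $/TPS:", cheapest_by_throughput.name)   # solid state
```

The same two options rank in opposite order depending on which metric is used, which is why the metric should follow the usage model.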
Storage must be evaluated on the value it brings in the usage model required. Comparing costs as a function of capacity can make for bad judgments or bad advice.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
With large established vendors planning to launch all-flash storage arrays, startup Pure Storage is offering a money-back guarantee to customers who want to try their systems now.
Pure calls the promotion “Love your storage” and is telling customers they can return their FlashArray for a full refund within 30 days if they’re not happy for any reason.
Matt Kixmoeller, Pure’s VP of product management, said Pure’s guarantee differs from those of other vendors, which offer guarantees if certain performance conditions are not met. He said Pure will cancel the sale unconditionally, as long as the array isn’t damaged.
“If a customer doesn’t love Pure Storage, we don’t deserve to have their money,” he said. “We don’t define love, we let the customer define love. If they’re not happy for any reason, all they have to do is raise their hands and we will return their money.”
Pure and a few other startups such as Nimbus Data, Violin Memory, Whiptail and Kaminario have had the all-flash array market to themselves for the past year or so. But that is changing. IBM already has Texas Memory Systems, EMC is preparing to make its XtremIO arrays generally available in a few months and NetApp has pre-announced its FlashRay that won’t go GA until 2014. Also, Hitachi Data Systems is working on an all-flash array and Hewlett-Packard is making its 3PAR StoreServ arrays available with all flash.
But it was likely the recent announcements that NetApp and EMC made that spurred Pure to its money-back offer. NetApp and EMC wanted to make it clear they will enter the market, which could prompt some of Pure’s would-be customers to wait.
“The reason they did pre-announcements was they want to freeze the market, but customers are smarter than that,” Kixmoeller said. “We suggest customers get one of ours and they try it out.”
Besides fending off vendors that don’t have their products out yet, Pure and the other startups find their potential customers wondering about their long-term fate. The all-flash startups are well funded, but a lot of people in the industry are waiting for the next acquisition. Hybrid flash startup Starboard Storage has publicly admitted it is for sale, as Texas Memory did before IBM acquired it. But Pure execs say they are committed to staying independent.
CEO Scott Dietzen hears so many acquisition questions that he wrote a blog post this week claiming he refused to even discuss deals with large companies that have approached him, and has no intention of selling.
“As more companies get acquired, we get more customers asking what our long-term future is,” Kixmoeller said. “We’re committed to growing our company.”
Fusion-io CEO David Flynn said Linux and open source have emerged as the keys to software development for flash, and that is why his company this week acquired U.K.-based ID7.
ID7 developed the open source SCSI Target Subsystem (SCST) for Linux. SCST is a SCSI target subsystem that allows companies to turn any Linux box into a storage device. It links storage to the system’s SCSI drivers through Fibre Channel, iSCSI, Ethernet, SAS, Fibre Channel over Ethernet and InfiniBand to provide replication, thin provisioning, deduplication, automatic backup and other storage functions.
Fusion-io already licenses SCST for its ION Data Accelerator virtual appliance that turns servers into all-flash storage devices. But the ID7 acquisition gives Fusion-io greater control of the SCST technology, as well as the engineers who developed it.
Flynn said Linux is the crucial operating system for flash developers, and SCST is used by most vendors who build flash storage systems.
“Linux is the new storage platform and open is the new storage architecture,” Flynn said. “Anybody building a flash memory appliance is using Linux. We believe software-defined storage systems are the future, Linux is the foundation of that, and we have accumulated many key Linux kernel contributors.”
Flynn won’t say how much Fusion-io paid for ID7 or even how many engineers it will add from the acquisition. He did say he is committed to honoring ID7’s license deals, maintaining an open source version of SCST and contributing to the open source distribution.
“We believe in open systems,” he said. “We will continue to support the industry, competitors included. But our only real competitor is EMC.”
Flynn said he expects Fusion-io to be an active acquirer of flash technology that it does not develop internally, such as the caching software it gained by buying startup IO Turbine for $95 million in 2011.
“Flash changes the game in a lot of ways,” Flynn said. “The industry is growing so quickly it would be silly to presume we can build everything internally.”
Starboard has slashed sales and marketing staff, and notified its reseller partners that it would concentrate on developing its intellectual property instead of sales until it finds a buyer or OEM partner.
Tom Major, who joined Starboard as president in January, told StorageSoup the new strategy came after the company went looking for funding. He said Starboard’s investors, venture capitalists Grazia Equity GmbH and JP Ventures GmbH, were approached by strategic partners and decided to explore an acquisition. He said Grazia and JP Ventures have invested more money into Starboard to fund the transition period.
“We received interest from outside companies,” Major said. “Then we thought, ‘Who else might be interested?’ And that list gets long.”
Major said Starboard has received “more than one, but less than five” inquiries from suitors. The board will also pursue others in the industry. “I wouldn’t say a deal is imminent, but we are having conversations,” he said. “The board has decided to focus on technology and continue to develop it. We still have a small number of sales and marketing resources, but we’re not actively seeking resellers and VARs now. We are aggressively talking to companies that could take the technology to market through an acquisition or licensing arrangement.”
He said potential suitors include established storage vendors and others looking to get into storage, particularly solid-state storage.
All of Starboard’s AC Series of multiprotocol arrays use solid state drives and DRAM to accelerate reads and writes. Lee Johns, Starboard’s VP of product management, said the vendor will upgrade its operating system over the next few months with enhanced caching algorithms, multiple write caches and the ability to compress data on the cache.
“Our IP is in being able to effectively leverage high speed and lower speed media together,” Johns said.
Starboard built on unified storage technology sold by Reldata, and several Reldata executives – including CEO Victor Walker and CTO Kirill Malkin – were part of the original Starboard team in February 2012. But the current Starboard team has a strong influence of former LeftHand Networks execs, including Major and CEO Bill Chambers. Johns also worked with LeftHand technology as director of product marketing for Hewlett-Packard after HP acquired iSCSI SAN vendor LeftHand.
InfraScale, Inc. is gunning for Dropbox. The newcomer is offering a year of free online file sharing service to organizations whose IT administrators are willing to drop their Dropbox service.
InfraScale will give free FileLocker accounts with 100 gigabytes of storage per user to Dropbox customers with between 250 and 500 employees. Dropbox is the leader in this crowded space, and InfraScale’s FileLocker is trying to set itself apart from the pack by emphasizing how rogue online file sharing accounts (also called shadow IT) present a security risk for companies.
“Dropbox says it has 95 percent of the organizations in the U.S.,” said Sheilin Herrick, InfraScale’s director of marketing. “So this is primarily for IT administrators that want to drop Dropbox.”
The offer is good until April 30.
Dropbox moved to strengthen its security features in the latest version of its business-focused Dropbox for Teams service released last month.
InfraScale has focused on security from the start with FileLocker, which launched in November 2012. FileLocker has a three-tier security model, in which the service is installed behind the company’s firewall for private cloud deployments. It also secures data in transit with 256-bit SSL encrypted connections and uses 256-bit AES encryption for data at rest.
“We want to help IT managers deal with shadow IT,” said Stephen Gold, InfraScale’s director of business development.
This service allows IT managers to control permissions, set up bulk accounts, delete files and accounts, and apply other centralized controls. Fueled by the BYOD movement, many employees have started to deploy online file sharing products like Dropbox as a way to synchronize data with their mobile devices.
“But rogue accounts represent a serious security and compliance risk to organizations. When end-users store company files in the OFS provider’s data center in a public cloud, the files are placed outside the reach of the organization’s privacy policies and security controls,” according to an Enterprise Strategy Group report titled “Spotting and Stopping Rogue Online File Sharing.”