There are many opinions about how to handle information storage for big data analytics. By big data analytics, I mean the information associated with an analytics operation that performs its analysis in near real time to present immediately actionable results. The most common approach is to deliver the source data for the real-time analytics process to the compute nodes with minimal latency and at a high data rate.
This requirement has led many data scientists who design analytics systems to insist that data come from storage directly attached to the compute nodes. If solid state devices (SSDs) are used for that storage, all the better. This runs contrary to most IT organizations’ strategy of delivering efficient storage utilization through networked storage. The approaches for storing source data will continue to evolve with new storage systems and methods, but for now the decisions are driven by the designers of the analytics systems.
A more important question is where the data goes after the initial analysis has been done. Some say the data has already been used and can be discarded. However, future analysis on a larger set of data with different criteria may prove valuable. The problem is where to store that potentially massive amount of data that might be used again.
The most discussed approach is to archive the data for subsequent usage. The target for the data could be:
• A local storage system as a content repository. Usually this would be a NAS system for the unstructured file content used in data analytics, but it could also be a new generation object storage system capable of handling potentially billions of objects.
• Cloud storage may be the target for the analyzed data, either as files or objects. Compared with adding archiving infrastructure in IT for what may be a highly variable amount of required capacity, cloud storage could reduce costs, though the costs depend on how long the data is retained (see the sketch after this list).
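To make the cloud option concrete, here is a minimal sketch of archiving an analyzed result set to an S3-compatible object store. It assumes the boto3 library and uses hypothetical bucket and file names that are not from this article; a real deployment would also set lifecycle and retention policies to control how long the data is kept.

```python
# Minimal sketch: archive one analyzed result file to an S3-compatible
# object store for possible future re-analysis. Assumes boto3 is installed
# and credentials are configured; the bucket and key names are hypothetical.
import boto3


def archive_result(local_path: str, bucket: str, key: str) -> None:
    """Upload an analyzed result file so it can be re-analyzed later."""
    s3 = boto3.client("s3")
    # Storage class is the main cost lever: infrequent-access tiers cost less
    # to hold but more to retrieve, which suits "keep it in case it proves valuable".
    s3.upload_file(local_path, bucket, key,
                   ExtraArgs={"StorageClass": "STANDARD_IA"})


if __name__ == "__main__":
    archive_result("results/clickstream-2013-02-15.parquet",
                   "analytics-archive",               # hypothetical bucket
                   "clickstream/2013/02/15.parquet")  # hypothetical object key
```

How long objects stay in that bucket, and in which storage class, is where the retention question above turns into an actual cost line.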
Ultimately this could be a massive amount of data. Archiving storage systems are typically self-protecting with remote replication to another archiving system or to cloud storage. The requirement for data protection may be another variable depending on the value of the data.
The big in big data analytics can mean big money if the decisions about where to store the information and how long to retain it are not made strategically. The main focus for big data analytics so far has been on the speed of the initial analysis. Where to put the data that must be retained also needs to be considered, and it can be a major concern for IT.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
Nexenta Systems, which sells storage systems based on ZFS technology, revamped its leadership team and pulled in $24 million in funding today with an eye on going public.
Mark Lockareff takes over as CEO from Evan Powell, who is shifting to chief strategy officer. Nexenta also hired Bridget Warwick – formerly at BlueArc and NetApp – as chief marketing officer.
The Santa Clara, Calif.-based company has raised a total of $55 million in funding, including a $21 million round last year. The latest funding is Nexenta’s D round.
Lockareff comes to Nexenta from Bridge Advisory Partners, where he served as managing director. He said he will focus on driving Nexenta’s next stage of growth as a software-defined storage vendor. The company’s core product is NexentaStor, which is based on open-source ZFS technology. The software runs on commodity servers, turning them into multiprotocol storage systems.
“There are a lot of different directions our product can get pulled into, so we have to be disciplined in the direction,” Lockareff said. “We have the two hardest parts underneath us now [building a product and generating revenues]. Now it’s time to build a management team and the infrastructure for growth. We are moving to become a public company someday.”
Lockareff said the $24 million will be used to build out field engagement with partners and joint marketing efforts. It also will be used to build out core features in the product and produce a road map for resellers. Nexenta is working on getting its software to run on SSDs.
“There is an array of SSD providers and each might have different approaches in configurations,” Lockareff said. “Also, a lot of plug-in players want to work with us.”
Nexenta’s latest financing is led by new investor Four Rivers Group, with participation by previous Nexenta investors Menlo Ventures, TransLink Capital, Javelin Ventures, Sierra Ventures, Razor’s Edge Ventures, and West Summit Capital. In addition to Four Rivers, Presidio Ventures and UMC Capital participated in the funding.
Skyera, preparing to make its low-cost SkyHawk all-flash storage arrays generally available in a few months, has $51.6 million in funding to market that platform and fuel development on its next system.
Skyera this week said it closed a mega-B Round led by Dell Ventures with other strategic partners participating. The round was actually $45.6 million, with Skyera’s $6 million seed money included in the $51.6 million figure.
Skyera came out of stealth last August, claiming it can sell all-flash storage at less than $3 per gigabyte. That would make the systems about the same price as spinning disk arrays. SkyHawk arrays have been in limited production through its beta program.
Tony Barbagallo, Skyera’s VP of marketing, said the startup is working on the next generation that will include more enterprise features such as active-active controllers, Fibre Channel support, high availability and the ability to scale up and out.
He said the first-generation systems are all solid state and include storage management software such as LUN management, thin provisioning, read-only and writeable snapshots, and encryption.
“This is disruptive technology, which is why Dell was excited,” Barbagallo said.
He said Skyera clears the biggest obstacle to flash adoption.
“There is one reason – and one reason only – why people aren’t dumping hard drive storage and moving to flash. That’s the cost of flash,” he said. “A number of vendors have picked fringe areas relative to the storage market to sell flash into. VDI, Hadoop data clusters and anything high performance computing are fringe areas of the mainstream market. They need the performance only flash can provide, and they’re willing to pay thirty dollars a gig to get it.
“[Skyera CEO] Rado [Danilak] said we need a way to break that price barrier, and that’s been our strategy from the start.”
The all-flash market is dominated by well-funded startups, but that will change over the next year or so. EMC is expected to release its “Project X” array this year, based on technology acquired from XtremIO, and NetApp this week said it will have a freshly designed FlashRay system in 2014 to go with its EF540 all-flash array for high performance computing.
Dell is the one major vendor without a fully fleshed-out flash strategy. There is no agreement for Dell to use or sell Skyera technology, but its investment gives it a say in development.
Skyera’s funding release included a quote from Marius Haas, president of Dell’s enterprise solutions group, praising the startup for its “innovative technology that is breaking new ground in enterprise solid-state storage systems, including controllers, memory and software.”
Dell is the only named Skyera investor. Barbagallo said all of the investors are strategic and there is no traditional venture capitalist money behind the startup.
You’ve probably heard of software-defined networking by now. The next step in storage, according to startup Jeda Networks, is software-defined storage networking.
Jeda came out of stealth this week with its Fabric Network Controller (FNC) software, which it describes as intelligent software that installs on a hypervisor and gives standard Ethernet switches the ability to run the most powerful SANs. Jeda’s software runs with adapters from Intel and Broadcom, said CEO Stuart Berman. The software virtualizes the connection between servers and storage. FNC supports the Fibre Channel and Fibre Channel over Ethernet (FCoE) protocols.
The idea is to remove the need for dedicated storage switches, which greatly reduces cost and fits in with the current converged infrastructure strategy pursued by many companies.
“Storage networks are too expensive and too complex, and they don’t scale,” said Berman, a veteran of storage networking companies Emulex and Vixel.
Berman said his software installs as a virtual machine on VMware ESX, and will eventually run on hypervisors from Microsoft, Red Hat, Citrix and others.
“We’ve virtualized the way servers talk to storage,” he said. “People will see the virtual machine that they install on their VMware server. We talk to switches and adapters in storage and servers. We can be an application on top of [VMware-owned SDN play] Nicira or Big Switch.”
Can Jeda’s software match Fibre Channel’s low latency? That is a requirement to make it in storage networking. We should find out soon enough. Berman said Jeda has two unannounced OEM wins, and he hopes to see his software show up in shipping products around the middle of the year.
He intends to set up a VAR program as well, but this type of software seems best suited to OEM distribution.
Barracuda Networks has jumped into the crowded online file sharing pond, and today enlisted Drobo as a partner to help get started.
Barracuda’s Copy cloud file sharing service launched as a private beta last year, and will enter public beta this year. The security/data protection vendor will offer customers of Drobo’s new 5N SMB/prosumer NAS box 5GB of free cloud file storage on Copy. Drobo customers can license additional capacity from Barracuda.
Besides seeding its cloud with potential customers, Barracuda general manager Guy Suter said the partnership can make for a smoother interaction between on-premise and cloud storage.
“To us, the Drobo looks like another device that we synch files to,” he said. “Having local storage and cloud storage interact with each other seamlessly helps your workflow a lot.”
For Drobo, the deal gives its customers a quick way to use the cloud as a complement to the storage inside the box. Erik Pounds, Drobo VP of product management, said he expects customers to embrace the cloud even after they buy on-premise storage. The cloud can serve as backup of critical files.
“A lot of data stored in remote or home offices is inhibited by the four walls of that home or office,” he said. “We’re not afraid of the cloud because the amount of data that needs to be stored and shared is massive. The average data on Drobo storage is 3 TB, so there’s a lot of desire to use both.”
Copy is also available as a standalone service, but Barracuda can use the help from Drobo in making its way among dozens of competitors already in the market, including Dropbox, Box and EMC-owned Syncplicity.
Suter points out the cloud file sharing market is young, and current contenders are still grappling with the best way to serve both users and companies. He said the goals for Copy are to facilitate “easier sharing, and to make it more secure than what’s out there, and company friendly.”
Under the company-friendly category, he said Copy gives administrators the ability to create separate areas to keep proprietary company data. “Users can have an area for personal data, but there’s another area for company data,” he said. “Companies can revoke access to company data.”
Barracuda is known mostly for its firewall products, but it does offer a hybrid backup service based on technology gained when it bought backup software vendor Yosemite Technologies in 2009. Some of that data protection technology is used for Copy.
And you can expect Barracuda to go deeper into data protection. BJ Jenkins joined Barracuda as CEO last October after running EMC’s backup and recovery division.
It’s a sign of the times that news of NetApp’s FlashRay all-flash storage system this week overshadowed its FAS6200 high-end disk array upgrade.
The FAS6200 is the highest performing and largest capacity platform of NetApp’s mainstream storage family. FlashRay won’t be available for another year, and probably won’t approach FAS6200 sales for years.
But flash storage is so much more interesting these days and, besides, it’s not every day that NetApp reveals it is developing a non-Data ONTAP storage system.
The FAS6200 hardware isn’t much different from the previous versions. The main differences are that the new systems have substantially more memory and support 4 TB drives. The memory boost results in better performance, and the larger drives bring the maximum cluster capacity to 65 PB. The FAS6220, 6250 and 6290 replace the FAS6210, 6240 and 6280 arrays and V-Series gateways.
The dual-controller 6220 holds 1,200 drives and 4.7 PB in a 6U chassis with 96 GB of memory. The 6250 and 6290 have two 6U chassis, and each system holds 1,440 drives and 5.6 PB. The 6250 has 144 GB of memory and the 6290 has 192 GB of memory.
Flash can play a big part in these systems, too. The 6290 holds up to 16 TB of total Flash Cache and Flash Pool capacity, the 6250 holds 12 TB of flash and the 6220 holds 4 TB. Flash Cache is controller based and optimizes performance of data throughout the array. Flash Pools accelerate performance of data inside a volume.
The FAS6200 series competes mostly with EMC’s VMAX 10K entry-level enterprise system and the higher end of the midrange VNX family, IBM’s XIV and V7000, the larger of Hewlett-Packard’s StoreServ arrays, and Hitachi Data Systems’ Virtual Storage Platform (VSP) and Unified Storage VM systems.
Three months after closing its StorSimple acquisition, Microsoft is still keeping its roadmap plans under wraps. The only sign of StorSimple integration so far is what Microsoft calls ASAP – the Azure storage acceleration program.
ASAP is a quick and easy way to purchase cloud storage using StorSimple’s controllers and the Microsoft Windows Azure cloud service. Customers can buy a StorSimple iSCSI storage controller with 50 TB or 100 TB of capacity provisioned to move data to the Azure cloud for a hybrid setup using on-premise and cloud storage. That means the purchase and provisioning are handled in one step instead of a customer having to engage StorSimple and a cloud provider separately.
Mark Weiner, a StorSimple executive and now a director of product marketing for Microsoft storage, said purchasing through ASAP lowers the cost of storage capacity by at least 60% versus traditional storage infrastructure.
Weiner said the biggest change since the acquisition is that StorSimple’s product has gone global under Microsoft. Before the sale, it was U.S.-focused. When asked if StorSimple still worked with other cloud providers, he said, “Technically, there’s no reason why we can’t. But obviously we are focused on a joint solution with Azure, either purchased on ASAP or purchased separately.”
Weiner assures us that StorSimple is expanding and improving its technology under Microsoft, and Microsoft sees cloud storage as a big growth area.
“You will see a lot of ongoing innovation from StorSimple as part of Microsoft,” he said. “I still see my engineering colleagues late in the office, there’s no slowing down.”
Brocade’s new CEO Lloyd Carney said one of the reasons he took the job is that he sees an exciting future for Fibre Channel storage networks.
On Brocade’s earnings call Thursday, Carney spoke on the record extensively for the first time since replacing Mike Klayko as CEO last month. He said technologies such as virtualization, cloud and flash create more demand for FC, especially the newer 16 Gbps switches that Brocade has been selling since mid-2011. Brocade has had the 16-gig switch market to itself because its main rival Cisco has yet to upgrade from 8 Gbps. That is expected to change over the next few months, but Carney and other Brocade executives said Cisco’s entry to 16-gig should pump even more life into FC storage.
Although Brocade has spent a lot of resources developing its Ethernet switching business since acquiring Foundry Networks in 2008, it has continued to lead the FC switching market in share. Brocade has taken the approach that FC will continue as the dominant storage protocol for the foreseeable future, while Cisco has maintained that the future of storage lies in Ethernet and converged Fibre Channel over Ethernet (FCoE) networks.
“One of the reasons I joined Brocade was it was clear to me, from the outside looking in, that Fibre Channel wasn’t dead despite the cloud that our friends at Cisco had put on it,” Carney said. “FCoE didn’t take over the world, and Cisco has drawn back from that now that FCoE has become a bit player in the overall scheme of things. Every trend that’s out there in storage points back to Fibre Channel.
“Fibre’s not dead anymore … I’m confident in the growth of Fibre Channel.”
Carney’s last job was CEO of I/O virtualization startup Xsigo Systems, which Oracle acquired last year. Carney has a lot more experience in networking than storage but said the “SAN market continues to represent an exciting opportunity for Brocade.”
Jason Nolet, VP of Brocade’s Data Center Networking Group, said he welcomes Cisco’s move into 16-gig FC.
“We expect that product to be in the market in the first half of the year and, candidly, we’re excited to see that happen,” Nolet said. “We’ve been telling you guys quarter after quarter that the Fibre Channel market is alive and well and growing, and customers want to continue to invest. We’ve been a lone voice there until now. The fact that Cisco was pushing an Ethernet-only agenda and an FCoE agenda almost exclusively the last several years, and they’re now coming forth with a dedicated Fibre Channel product, is the best testament of all to the strength that remains in this market.”
Brocade reported $362 million in SAN product revenue last quarter — its highest ever — compared to $140.5 million from Ethernet networking and $86.5 million from services. And 42% of its storage revenue came from 16-gig directors and switches.
Brocade projects the demand for storage capacity will increase 37% per year over the next five years. “And as long as storage demands increase, the demand for Fibre Channel will also increase,” Carney said.
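As a quick check on what that projection implies, here is a short back-of-the-envelope calculation. The 37% growth rate is Brocade's figure; the arithmetic is simply an illustration of what it compounds to.

```python
# Compound Brocade's projected 37% annual growth in storage capacity demand
# over five years to see what it implies in total.
growth_rate = 0.37
years = 5
multiplier = (1 + growth_rate) ** years
print(f"Capacity demand after {years} years: about {multiplier:.1f}x today's")
# Prints roughly 4.8x - demand nearly quintuples over the five-year horizon.
```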
The use of solid state-based storage systems is rapidly increasing. So far, solid state technology has been deployed to accelerate applications in specific environments. Successes have been demonstrated in increasing the number of transactions a system can perform, the number of virtual machines per physical server (commonly referred to as virtual machine density), and the number of virtual desktops supported by a storage system.
The continuing advance of all-solid state storage systems is leading to strategies where primary storage — defined as the most active storage of information for applications — will be all solid state. Planning for all-solid state primary storage requires the fabric infrastructure to be considered as a critical element in delivering the maximum value from solid state technology.
Solid state storage systems have much lower latency than systems designed for spinning disks. The latency is measured in microseconds, and the systems can sustain a much greater number of operations per second. Solid state technology is really a memory technology, and using low-level disk-based access protocols may not be optimal. Faster or more streamlined protocols may reduce overhead and further reduce latency.
So, what fabric interconnect is best? Most arguments about deploying new fabrics or making an infrastructure change have been based on cost. But even if a fabric technology requires additional hardware components to reduce latency, the fabric with the lowest overhead or latency may still be the most economical. Economic valuation needs to be based on the increase in the efficiency of the system. Solid state systems can store and retrieve information faster, allowing applications to generate more transactions per second and deliver more value from the investment in servers and other hardware. The fabric choice to support solid state systems carries much bigger economic potential than the cost of the fabric or its administration alone.
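To make that kind of valuation concrete, here is a minimal sketch of comparing fabrics by cost per unit of delivered work rather than by fabric price alone. Every number in it is a hypothetical placeholder, not a measurement from this article; the point is the method.

```python
# Hypothetical comparison of two fabric options for an all-solid-state system.
# Value the fabric by the work the whole system delivers, not by its sticker
# price alone. All figures below are made-up placeholders.

def cost_per_million_transactions(fabric_cost, server_cost,
                                  transactions_per_sec, lifetime_years=3):
    total_cost = fabric_cost + server_cost
    total_transactions = transactions_per_sec * 60 * 60 * 24 * 365 * lifetime_years
    return total_cost / (total_transactions / 1_000_000)

# Cheaper fabric, but its higher latency caps the transaction rate the servers reach.
commodity = cost_per_million_transactions(fabric_cost=50_000, server_cost=400_000,
                                           transactions_per_sec=150_000)
# Pricier low-latency fabric lets the same servers complete more work per second.
low_latency = cost_per_million_transactions(fabric_cost=120_000, server_cost=400_000,
                                            transactions_per_sec=220_000)

print(f"commodity fabric:   ${commodity:.3f} per million transactions")
print(f"low-latency fabric: ${low_latency:.3f} per million transactions")
```

Under these made-up numbers the more expensive fabric wins on cost per transaction, which is the efficiency argument in a nutshell.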
Low-latency interfaces used today, such as Fibre Channel or InfiniBand, might deliver the greatest value when economic measures are used. Or another fabric or variation could evolve and disrupt the storage infrastructure. The economic value could make a compelling case for a transition.
Storage infrastructure transitions have always been slow, and that trend is expected to continue. But the efficiency and economic value delivered by solid state storage may accelerate a change of some type. The fabric decision will be based on how to enable applications to do more work and provide faster access to information. Initially, primary storage will be the focus for a fabric that can maximize value. Secondary storage may not be as demanding, but a common fabric may be preferred over a specialized one. Vendors will promote what they have now as a practical matter but will also look for competitive advantage in delivering economic value with future offerings. It just may take some time to evolve.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
Alex Bouzari, CEO of HPC storage vendor DataDirect Networks (DDN), said that will change in 2013. He expects big data to come into the mainstream and drag HPC with it, and he expects DDN to come along for the ride after more than a decade of handling big data needs before anybody called it big data.
“High performance computing has come of age,” Bouzari said, “and it’s now called big data.
“Big data is really the democratization of high performance computing. What was limited to a small number of extreme requirements has become commonplace. Now we’re seeing big data and high growth data across markets – the web, cloud and commercial high performance segments.”
It could be wishful thinking on his part, but Bouzari said big data requirements have already spilled into the enterprise, especially financial services, healthcare and manufacturing. He expects that to result in more cloud storage implementations as companies struggle to store and analyze data for their businesses.
That could be a boon to object-storage systems built for cloud scale, such as DDN’s Web Object Scaler (WOS) system. Bouzari also sees a need to make Hadoop work better with storage built for big data.
“Customers say ‘we love what Hadoop can do for us, but we need it as a product that can solve a business problem,’” Bouzari said. “Customers are looking to maximize the performance of Hadoop. We can greatly accelerate Hadoop.”
Bouzari also sees flash playing a role in big data, although until now the price of solid state storage has made it cost-effective only for smaller data sets that need high performance. Bouzari said he is starting to see that change among DDN’s customer base.
“All-flash storage for high performance computing is one of those things that seemed to make a lot of sense, but proved cost prohibitive in many environments,” he said. “We did some all-solid state deployments [in 2012] but they were typically deployed as part of much larger IT infrastructures. As the cost of non-volatile memory continues to decrease, the ability to use it to serve huge amounts of content will make solid-state more attractive.”