There was a time not too long ago when backup software handled pretty much only one scenario: backing up physical servers.
Then came virtualisation. And for quite some time the long-standing backup software providers – Symantec, EMC, IBM, Commvault et al – did not support it, while newcomers like Veeam arose to specialise in protecting virtual machines and gave the incumbents a shove.
Then we had the rise of the cloud, which initially featured in backup products simply as an option for a target.
But as the cloud provided a potential off-site repository in which to protect data it also became the site for business applications as a service.
That meant the cloud became a backup source.
There is some data protection capability in the likes of Office 365 but this doesn’t always fulfil an organisation’s needs.
There’s the risk of losing access to data via a network outage, there are compliance needs that might require, for example, an e-discovery process, and there’s simply the need to make sure data is kept in several locations.
So, companies like Veeam now allow a variety of backup options for software services like Office 365.
You can, for example, use Veeam to bring data back from the cloud source to the on-prem datacentre as a target. That way you can run processes such as e-discovery that would be difficult or impossible in the cloud application provider’s environment.
Or you can back up from cloud source to cloud target. This could be to a service provider’s cloud services, or to a repository built by the customer in Azure, AWS etc. Either option might make advanced search and cataloguing easier, or might simply provide a secondary location.
With the possibility of backup of physical and virtual machines in the datacentre and the cloud, and then spin-up to recover from any of these locations, full interoperability between environments is on the horizon.
For now the limits are not those of the backup product – assuming it has full physical-to-virtual interoperability – but those of the specific scenario. A very powerful dedicated physical server running high-performance transaction processing for a bank, for example, could likely not be failed over to the cloud.
But nevertheless, the trends in backup indicate a future where the site of compute and storage can slide seamlessly between cloud and on-prem locations.
Veritas dates back to the early 1980s, but disappeared for 10 years when it became part of Symantec, its NetBackup and Backup Exec products leading the way in the data protection market.
But, in 2014 Veritas was burped out by Symantec into an environment where its leadership in backup could no longer be taken for granted.
The virtualisation revolution had transformed the likes of Veeam into mainstream contenders, while newcomers and reinvented rivals such as Rubrik, Cohesity and Arcserve started snapping at its heels.
And while the virtualisation transformation had largely done its work, other long waves started to break.
These were: the drive towards big data and analytics, also fuelled by the upsurge of machine and remote data; a greater need for compliance, driven in particular by regulations such as Europe’s GDPR; and the emergence of mobile and the cloud as platforms operating in concert/hybrid with in-house/datacentre IT environments.
Such changes appear to have driven Veritas to focus on “broad enterprise data management capabilities”, according to EMEA head of technology, Peter Grimmond.
According to Grimmond, Veritas’s thinking centres on four aims: data protection, ie backup and archiving; data access and availability, ie ensuring the workload can get to the data; gaining insight from the organisation’s data; and monetising that data where possible.
Its product set fits those general aims: data protection and availability products (NetBackup, Backup Exec, plus hardware backup appliances); software-defined storage products (file and object storage); and tools to help with information governance (data mapping and e-discovery tools, for example).
Compliance and the kind of data classification tasks that arise from it are strong drivers for Veritas right now.
“We are particularly focussed on unstructured data and how that can pile up around the organisation,” said Grimmond. “And whether that is a risk or of value to the organisation.”
That’s of particular use in, for example, any kind of e-discovery process, and in meeting regulatory requirements such as Europe’s GDPR. This gives the customer the “right to be forgotten” following a transaction, which can mean organisations need to locate personal data and do what is necessary with it.
Veritas has also built intelligence into its storage products. Its object storage software product – announced recently at its Vision event – incorporates its data classification engine, for example, so that data is logged, classified and indexed as it is written.
This functionality has in mind, for example, Internet of Things and point-of-sale scenarios, said Grimmond.
DR in the cloud is available. Options exist that range from simply using cloud backup and recovering from that to customer infrastructure on-prem or in the cloud, to full-featured DraaS offerings.
“We’re looking at how to leverage the public cloud to do rapid recovery,” said Alon Yaffe, product management VP at Barracuda.
“The way cloud disaster recovery exists, the industry is asking customers to go with a certain vendor for everything,” said Yaffe. “But, there will be advantages for customers to make use of the public cloud and do it on their own.”
If they do that though, how does Barracuda benefit? After all, it makes its living providing services and products in this sphere.
According to Yaffe it will be by offering the intelligence to help orchestrate disaster recovery that customers put together using public cloud services. The company will be working on that in the next couple of years, said Yaffe, with the aim of providing orchestration for on- and off-site DR functionality.
That presumably means the type of orchestration that can allocate and provision data and storage – within the bounds of RTOs and RPOs – and make it available following an outage, in a fully access-controlled fashion so that customers can build DIY disaster recovery infrastructure from a mix of public cloud and on-prem equipment.
It’s an adaptation of the idea of cloud orchestration to the sphere of DR and should be a valuable addition to the datacentre.
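One way to picture that orchestration is as a scheduler that picks the cheapest recovery target which still meets a workload’s RTO and RPO. A minimal sketch in Python – the options, timings and costs below are all invented for illustration, not taken from any vendor’s product:

```python
from dataclasses import dataclass

@dataclass
class RecoveryOption:
    name: str                     # e.g. "on-prem replica", "cloud DRaaS"
    recovery_time_hours: float    # how long failover takes (drives RTO fit)
    replication_lag_hours: float  # age of newest usable copy (drives RPO fit)
    cost_per_hour: float          # illustrative relative running cost

def pick_target(options, rto_hours, rpo_hours):
    """Return the cheapest recovery option that meets both RTO and RPO."""
    viable = [o for o in options
              if o.recovery_time_hours <= rto_hours
              and o.replication_lag_hours <= rpo_hours]
    return min(viable, key=lambda o: o.cost_per_hour) if viable else None

options = [
    RecoveryOption("restore from cloud backup", 8.0, 24.0, 1.0),
    RecoveryOption("cloud DRaaS warm standby", 1.0, 0.25, 12.0),
    RecoveryOption("on-prem replica", 0.5, 0.1, 30.0),
]

# A workload that must be back in 2 hours with at most 1 hour's data loss
choice = pick_target(options, rto_hours=2.0, rpo_hours=1.0)
print(choice.name)
```

Real orchestration would add access control, provisioning and failback, but the core decision is this kind of constraint-plus-cost selection across a mix of public cloud and on-prem targets.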
Backup appliance maker Rubrik plans to add analytics to its products, including in the cloud.
Talking to ComputerWeekly.com this week, CEO Bipul Sinha would not give details, but did say the company plans to add analytics – not restricted to those that report on backup operations, but more widely using metadata captured in backup and archive operations.
“To date what Rubrik has done has been to manage data backup, recovery, archiving. Going forward we’re looking at more analytics and reporting, doing more with the content stored,” he said.
Sinha said he felt Rubrik had won customer trust with its scale-out appliance offerings, and that the company now “wanted to give more intelligence”, with analytics that would enable customers to “interrogate data to gain useful business information”.
The Rubrik CEO also said: “There’s a definite trend to making one single platform on premises and across the cloud” and said that any analytics functionality offered by the company would span the two.
“Competing legacy companies have not innovated so it’s breaking new ground,” he added.
That’s not strictly true, as Druva claims e-discovery and data trail discovery functionality with its inSync product.
And backup behemoth Veritas recently added functionality that uses machine learning to ID sensitive and personal data to help with GDPR compliance.
To date though, the extent of analytics functionality in backup products has been limited, and some question to what extent backup and analytics can be merged, so we’ll have to wait and see what Rubrik comes out with.
Rubrik provides flash-equipped backup appliances that can scale out and which support most physical and virtual platforms, including the Nutanix AHV hypervisor.
NVMe offers huge possibilities for flash storage to work at its full potential, at tens or hundreds of times what is possible now.
But it’s early days, and there is no universally accepted architecture to allow the PCIe-based protocol for flash to be used in shared storage.
Several different contenders are shaping up, however. We’ll take a look at them, but first a recap of NVMe, its benefits and current obstacles.
Presently, most flash-equipped storage products rely on methods based on SCSI to connect storage media. SCSI is a protocol designed in the spinning disk era and built for the speeds of HDDs.
NVMe, by contrast, was written for flash, allows vast increases in the number of I/O queues and the depth of those queues and enables flash to operate at orders of magnitude greater performance.
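The scale of that increase shows up in the raw queue arithmetic. The NVMe specification allows up to 65,535 I/O queues, each up to 65,536 commands deep, where AHCI/SATA offers a single queue 32 commands deep and SAS typically a single queue of around 254. A quick illustration:

```python
# Theoretical outstanding commands per device, from the interface specs:
# AHCI (SATA): 1 queue x 32 commands; SAS: 1 queue x ~254 commands;
# NVMe: up to 65,535 I/O queues x 65,536 commands each.
interfaces = {
    "AHCI/SATA": 1 * 32,
    "SAS": 1 * 254,
    "NVMe": 65_535 * 65_536,
}

for name, depth in interfaces.items():
    print(f"{name}: {depth:,} outstanding commands")

print(f"NVMe vs SATA: {interfaces['NVMe'] // interfaces['AHCI/SATA']:,}x")
```

Real-world gains are bounded by the media and the rest of the I/O path, of course, but the spec headroom is what lets NVMe exploit flash’s internal parallelism in a way SCSI-era protocols never could.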
But NVMe currently is also roadblocked as a shared storage medium.
You can use it to its full potential as add-in flash in the server or storage controller, but when you try to make it work as part of a shared storage setup with a controller, you start to bleed I/O performance.
That’s because – consider the I/O path here from drive to host – the functions of the controller are vital to shared storage. At a basic level the controller is responsible for translating protocols and physical addressing, with the associated tasks of configuration and provisioning of capacity, plus the basics of RAID data protection.
On top of this, most enterprise storage products also provide more advanced functionality such as replication, snapshots, encryption and data reduction.
NVMe can operate at lightning speeds when data passes through untouched. But put it in shared storage and attempt to add even basic controller functionality, and it all slows down.
Some vendors – Pure with its FlashArray//X, for example – have said to hell with that for now and put NVMe into their arrays with no change to the overall I/O path. They gain something like a 3x or 4x improvement over existing flash drives.
So, how is it proposed to overcome the NVMe/controller bottleneck?
On the one hand we can wait for CPU performance to catch up with NVMe’s potential speeds, but that could take some time.
On the other hand, some – Zstor, for example – have decided not to chase controller functionality, with multiple NVMe drives offered as DAS, with NVMf through to hosts.
A different approach has been taken by E8 and Datrium, with processing required for basic storage functionality offloaded to application server CPUs.
Apeiron similarly offloads to the host, but to server HBAs and application functionality.
But elsewhere, controller functionality is seen as highly desirable, and ways of providing it seem to focus on distributing controller processing between multiple CPUs.
Kaminario CTO Tom O’Neill has identified the key issue as the inability of storage controllers to scale beyond pairs – or, even where they nominally can, their tendency to become pairs of pairs as they scale. For O’Neill the key to unlocking NVMe will come when vendors can offer scale-out clusters of controllers that bring enough processing power to bear.
Meanwhile, hyper-converged infrastructure (HCI) products have been built around clusters of scaled-out servers and storage. Excelero has built its NVMesh around this principle, and some kind of convergence with HCI could be a route to providing NVMe with what it needs.
So, with hyper-converged as a rising star of the storage market already, could it come to the rescue for NVMe?
More than 70 vendors will be involved in the NVMe flash market by 2020 and the market will be worth $57 billion. Meanwhile, nearly 40% of all-flash arrays will be based on NVMe drives by 2020.
That’s according to research company G2M, which has predicted a compound annual growth rate of 95% for NVMe-based products between 2015 and 2020.
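As a sanity check on those figures, a 95% CAGR means the market almost doubling every year; working backwards from the $57bn 2020 figure gives the implied starting point (the five-year compounding period is an assumption based on the 2015-2020 range quoted):

```python
# CAGR relationship: final = base * (1 + rate) ** years
rate, years, final_bn = 0.95, 5, 57.0

# Implied 2015 market size, working back from the 2020 prediction
implied_2015_base_bn = final_bn / (1 + rate) ** years
print(f"Implied 2015 market size: ${implied_2015_base_bn:.2f}bn")

# Forward check: compound the implied base back up to 2020
print(f"2020 projection: ${implied_2015_base_bn * (1 + rate) ** years:.1f}bn")
```

That implies a base of roughly $2bn in 2015, so the headline numbers are at least internally consistent with the quoted growth rate.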
But how accurate can those figures be?
The research predicts NVMe will make inroads across servers and storage hardware but also as storage networking equipment to speed NVMe across the likes of Fibre Channel and Ethernet (in so-called NVMe-over-fabrics).
The G2M report reckons more than 50% of enterprise servers will have NVMe bays by 2020, while that will be the case for 60% of enterprise storage appliances.
Meanwhile, it predicts “nearly 40%” of all-flash arrays will be NVMe-based by 2020.
There’s an interesting distinction here that reflects the current difficulties of realising the potential of NVMe flash. Namely, “NVMe bays” on the one hand and “NVMe-based arrays” on the other.
Because NVMe so radically reworks storage transport protocols, it is currently held back by storage controller architectures and cannot realise its potential of performance tens or hundreds of times better than current SAS- and SATA-based flash drives.
NVMe can slot in and realise its full potential as direct-attached storage – lending it to use in server and storage “NVMe bays” – and some vendors have delivered what are effectively direct-attached arrays that lack features such as provisioning, RAID configuration, replication etc.
But the storage controller that must handle protocols, provisioning and more advanced storage functionality currently forms a bottleneck to NVMe use in storage arrays and so we are yet to see a true NVMe flash array hit the market.
So, when the G2M report predicts 40% of flash arrays will be NVMe-based by 2020, it is very much a best guess. A guess that hedges its bets on the possibility, but not the certainty, that some arrays will have cracked the NVMe controller bottleneck, and/or that a number of products will run NVMe in less-than-optimal architectures, while some “arrays” will effectively be banks of direct-attached storage.
Nevertheless, the research is interesting and shows the perceptions that exist around the potential for NVMe flash.
The future of data storage will not be in the binary switching of electrical cells as in flash storage.
It may also not be in magnetism-based potential successors to flash such as Racetrack Memory, where one or a few bits per cell are replaced by around 100 bits per physical unit of memory.
Instead, scientists are working on ways to mimic the way the human brain stores memories.
So far, things are still at the stage of trying to work out exactly how the brain does what it does, but the potential gains to be had of mimicking it are huge.
Current estimates are that the brain has a storage capacity of possibly several petabytes. Also, according to Professor Stuart Parkin – experimental physicist, winner of the 2014 Millennium Technology Prize and IBM fellow – it is estimated the brain uses one million times less energy than silicon-based memory.
Science is yet to come up with anything like a consensus on how memories are stored in the brain. It is thought – to simplify hugely – that the release and uptake of neurotransmitting chemicals (of different types) between brain cells (of different types, such as neurons) is the vehicle for memories.
And that – a network of connections among which storage is shared, with the connections themselves defining the thing being stored – is the model for research by Professor Parkin, who is also director of research centres at Stanford University in the US and the Max Planck Institute at Halle in Germany.
He said: “What we’re looking to do is go beyond charge-based computing. We could be inspired by how biology computes, using neurons and synapses, with data stored in a distributed fashion and currents of ions manipulating information.”
“We believe the brain stores data by distributing it among synaptic connections,” he added. “We want to build a system of connections and learn how to store information on it.”
“That’s in contrast to how we do things now with individual devices. Instead it would be a network of connections and distributed storage of information among them, but built in a totally different way with, say, 1 bit of information stored between 20,000 different connections.”
Nominations are now open for The 2018 Millennium Technology Prize (also known as the Technology Nobels).
Violin Memory – an all-flash storage pioneer whose trials and tribulations we have followed here – finally hit what is just about rock bottom for a tech company in the past months.
That is, Violin was declared bankrupt and in January was auctioned off over a period of three days, with the winner at $14.5 million being Quantum Partners, an investment vehicle of the Soros Group.
Violin had been a pioneer of flash storage and had raised $162 million at its IPO in 2013. But by last year it had lost in excess of $600 million and was heading for an unseemly demise.
In early 2017 it was set for the aforementioned fire sale, of which Quantum Partners was the winner.
One must presume that the investors believe they can turn the company around.
They have appointed a new CEO, namely Ebrahim Abbassi. He comes with credentials trumpeted by Violin of having rescued three other tech outfits from the doldrums, namely Redback (turned around and acquired by Ericsson in 2006), Force 10 (acquired by Dell in 2011) and Roamware (now Mobileum).
Mr Abbassi seems to have a record of taking companies and preparing them for acquisition, so maybe that’s what Quantum Partners hopes for with Violin.
He has been at Violin for more than a year now, as COO from March 2016 before landing the top job at the end of April 2017.
Was he not able to start the turnaround sooner? Perhaps what was needed was to rid the company of debts via bankruptcy and sweep out the board. A new board was announced, featuring a couple of software experts but no hardware-experienced execs.
That might indicate the direction the company will take: restructure with a bias towards software innovation and aim for a profitable acquisition.
It’s arguable that Violin’s focus on its proprietary hardware flash modules was always a potential weakness. Perhaps it is even more so now that software-defined, hyper-converged and NVMe-based approaches are de rigueur.
It’ll be interesting to see what can be done with Violin. A potential competitor/acquisition target or destined ultimately for the Where Are They Now file?
The recent announcement by Cisco of 32Gbps capability in its MDS 9700 Director switch and UCS C-series server products means the two major storage networking hardware makers can now offer the next generation of Fibre Channel bandwidth to customers; Brocade got there last summer with its Gen 6 products.
With 768 ports and a maximum bandwidth of around 1.5Tbps, Cisco is targeting the MDS 9700 at customers with high bandwidth requirements, such as those in virtualised environments, those using flash arrays, and those looking to support NVMe flash storage.
For now, such products are unlikely to make a lot of difference to those using NVMe arrays, because simply replacing SAS and SATA drives in the array with NVMe drives means there’s still a bottleneck at the storage array controller.
But, Cisco and Brocade now also offer NVMe-over-Fibre Channel fabric connectivity. There aren’t any storage array products that can fully take advantage of NVMf-FC yet, but when they do the potential of flash storage will be hugely boosted.
That’s because NVMe – a PCIe-based protocol – offers huge performance gains for flash storage over existing SAS and SATA connection protocols.
For now, however, NVMe-connected flash drives mostly only realise their full potential in what are effectively direct-attached storage deployments.
Even with NVMf – which allows NVMe to be transported over Ethernet or Fibre Channel – NVMe can’t reach its full potential as back-end storage accessed across a fabric/network as shared storage as we know it.
That’s because the gains brought by NVMe are nullified to a large extent by the storage array controller, which provides protocol handling, storage addressing and provisioning, RAID, and features such as data protection and data reduction.
That is, until storage controllers have sufficient processing power to not be a bottleneck.
But for now at least NVMe-capable drives exist, a transport (NVMf) exists and the two major storage networking suppliers support it. All we’re waiting for are the array vendors.
There are lots of scale-out, parallel file systems about, from those of the big six array makers – such as NetApp’s clustered Ontap and EMC’s Isilon OneFS – to open source offerings and distributions thereof, such as Lustre and Red Hat’s GlusterFS.
But we have a new entrant in Elastifile, a software-only startup of Israeli origin that has built a new parallel file system from the ground up that it says can form a single namespace across on-prem and cloud locations. It aims to take on object storage, and in fact uses object representation to allow customers to burst workloads in the cloud.
It aims at traditional secondary storage use cases such as backup and restore but also analytics workloads.
Elastifile says its file system can scale from a minimum of three nodes to potentially thousands, although its largest deployment so far is 100 nodes. “So far we have found no limitations,” said CEO Amir Aharoni. “But we are working with billions of files. There’s no limit. We assume we can go to 1,000s of nodes.”
Ordinarily, scale-out storage begins to slow as it reaches very large numbers of files, as the tree-like hierarchy of the file system becomes cumbersome. Elastifile execs claim their file system design distributes metadata so that there are no bottlenecks.
Replication is anything from two-way upwards. Aharoni says the company may add erasure coding at a later date, but that this isn’t high on the agenda because the file system was developed to use data reduction suited to replication rather than erasure coding.
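The capacity trade-off behind that decision can be seen in the raw overheads of the two schemes. The parameters below – 3-way replication and an 8+3 erasure code – are chosen for illustration and are not Elastifile’s:

```python
def replication_overhead(copies: int) -> float:
    """Raw capacity consumed per usable byte with n-way replication."""
    return float(copies)

def erasure_overhead(data_shards: int, parity_shards: int) -> float:
    """Raw capacity consumed per usable byte with k+m erasure coding."""
    return (data_shards + parity_shards) / data_shards

# 3-way replication: 3.0x raw capacity per usable byte
print(replication_overhead(3))
# 8+3 erasure coding: 1.375x raw capacity per usable byte
print(erasure_overhead(8, 3))
```

Erasure coding is far leaner on raw capacity, but rebuilds are costlier, and aggressive data reduction can narrow the gap by shrinking each replica – which seems to be the bet Elastifile is making.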
The interesting bit is that the Elastifile file system can extend to the cloud, where data is represented as native scale-out file storage when active, or as objects when inactive. When customers want to burst a workload to the cloud they can “check out” data from that state and run, for example, analytics on it in file format. Then, when finished, they check the data back in to object representation mode.
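Elastifile hasn’t published the API behind this here, but the check-out/check-in cycle it describes can be sketched with a hypothetical client – the class and method names below are invented for illustration, not a real interface:

```python
# Hypothetical sketch of Elastifile's described check-out/check-in cycle.
# ElastifileClient and its methods are invented for illustration only.
class ElastifileClient:
    def __init__(self):
        # dataset name -> representation: "object" (inactive, cheap)
        # or "file" (active, usable by file-based workloads)
        self.datasets = {"chip-sim-results": "object"}

    def check_out(self, name: str) -> None:
        """Promote an inactive object-mode dataset to file mode for compute."""
        if self.datasets[name] != "object":
            raise ValueError(f"{name} is already checked out")
        self.datasets[name] = "file"

    def check_in(self, name: str) -> None:
        """Demote the dataset back to object representation when done."""
        self.datasets[name] = "object"

client = ElastifileClient()
client.check_out("chip-sim-results")   # burst: expose data as files in cloud
# ... run cloud analytics against the file namespace here ...
client.check_in("chip-sim-results")    # done: park data as objects again
print(client.datasets["chip-sim-results"])
```

The real system presumably handles this at file-system level with no data copy, but the lifecycle – object for cold storage, file for the duration of the burst, then back – is the shape of the workflow described.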
Aharoni gave an example of a microchip designer that does “lift and shift” to the cloud in that way.
So, for some use cases the aim is access to data that need not be particularly rapid and may be infrequent, where storage in the cloud would be cost-effective.
Aharoni said the company is aiming at scientific analytics, financial services, oil & gas.