More than 70 vendors will be involved in the NVMe flash market by 2020 and the market will be worth $57 billion. Meanwhile, nearly 40% of all-flash arrays will be based on NVMe drives by 2020.
That’s according to research company G2M, which has predicted a compound annual growth rate of 95% for NVMe-based products between 2015 and 2020.
But how accurate can those figures be?
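As a rough sanity check on those growth figures, the sketch below compounds a 95% annual rate over the five years from 2015 to 2020 and works backwards from the $57 billion forecast. It assumes the growth rate and the market value refer to the same market and that there are exactly five compounding periods. Neither is confirmed by the report summary, and the function names are purely illustrative.

```python
def project(start_value, annual_growth, years):
    """Compound a starting value forward at a fixed annual growth rate."""
    return start_value * (1 + annual_growth) ** years


def implied_base(end_value, annual_growth, years):
    """Work backwards from an end value to the starting value it implies."""
    return end_value / (1 + annual_growth) ** years


growth = 0.95            # 95% CAGR, as quoted
years = 2020 - 2015      # assumption: five compounding periods
forecast_2020 = 57.0     # $ billion, as quoted

multiplier = project(1.0, growth, years)
base_2015 = implied_base(forecast_2020, growth, years)

print(f"Growth multiple over {years} years: {multiplier:.1f}x")
print(f"Implied 2015 market size: ${base_2015:.2f} billion")
```

On those assumptions the market would have to grow roughly 28-fold, from about $2 billion in 2015, to reach the forecast figure.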
The research predicts NVMe will make inroads across servers and storage hardware but also in storage networking equipment to speed NVMe across the likes of Fibre Channel and Ethernet (in so-called NVMe-over-fabrics).
The G2M report reckons more than 50% of enterprise servers will have NVMe bays by 2020, while that will be the case for 60% of enterprise storage appliances.
Meanwhile, it predicts “nearly 40%” of all-flash arrays will be NVMe-based by 2020.
There’s an interesting distinction here that reflects the current difficulties of realising the potential of NVMe flash. Namely, “NVMe bays” on the one hand and “NVMe-based arrays” on the other.
Because NVMe so radically reworks storage transport protocols, it is currently held back by storage controller architectures and so cannot realise its potential of tens or hundreds of times better performance than current SAS and SATA-based flash drives.
NVMe can slot in and realise its full potential as direct-attached storage – lending it to use in server and storage “NVMe bays” – and some vendors have delivered what are effectively direct-attached arrays that lack features such as provisioning, RAID configuration, replication etc.
But the storage controller that must handle protocols, provisioning and more advanced storage functionality currently forms a bottleneck to NVMe use in storage arrays and so we are yet to see a true NVMe flash array hit the market.
So, when the G2M report predicts 40% of flash arrays being NVMe-based by 2020, it is very much a best guess. It is a guess that hedges its bets on the possibility, but not the certainty, that some arrays will have cracked the NVMe-controller bottleneck, and/or that there will be a number of products that run NVMe in less-than-optimum architectures, and that there will also be some “arrays” that are effectively banks of direct-attached storage.
Nevertheless, the research is interesting and shows the perceptions that exist around the potential for NVMe flash.
The future of data storage will not be in the binary switching of electrical cells as in flash storage.
It may also not be in magnetism-based potential successors to flash such as Racetrack Memory, where one or a few bits per cell are replaced by 100 bits per physical unit of memory.
Instead, scientists are working on ways to mimic the way the human brain stores memories.
So far, things are still at the stage of trying to work out exactly how the brain does what it does, but the potential gains to be had from mimicking it are huge.
Current estimates are that the brain has a storage capacity of possibly several petabytes. Also, according to Professor Stuart Parkin, an experimental physicist, winner of the Millennium Prize in 2014 and IBM fellow, it is estimated the brain uses one million times less energy than silicon-based memory.
Science is still to come up with anything like a consensus on how memories are stored in the brain. It is thought – to simplify hugely – that it is the release and uptake of neurotransmitting chemicals (of different types) between brain cells (of different types, such as neurons) that are the vehicle for memories.
And that model – a network of connections across which storage is shared, with the connections themselves defining the thing being stored – is the basis for research by Professor Parkin, who is also director of research centres at Stanford University in the US and the Max Planck Institute at Halle in Germany.
He said: “What we’re looking to do is go beyond charge-based computing. We could be inspired by how biology computes, using neurons and synapses, with data stored in a distributed fashion and currents of ions manipulating information.”
“We believe the brain stores data by distributing it among synaptic connections,” he added. “We want to build a system of connections and learn how to store information on it.”
“That’s in contrast to how we do things now with individual devices. Instead it would be a network of connections and distributed storage of information among them, but built in a totally different way with, say, 1 bit of information stored between 20,000 different connections.”
Nominations are now open for The 2018 Millennium Technology Prize (also known as the Technology Nobels).
Violin Memory – an all-flash storage pioneer whose trials and tribulations we have followed here – finally hit what is just about rock bottom for a tech company in recent months.
That is, Violin was declared bankrupt and in January was auctioned off over a period of three days, with the winner, at $14.5 million, being Quantum Partners, an investment vehicle of the Soros Group.
Violin had been a pioneer of flash storage and had raised $162 million at its IPO in 2013. But by last year it had lost in excess of $600 million and was heading for an unseemly demise.
In early 2017 it was set for the aforementioned fire sale, of which Quantum Partners was the winner.
One must presume that the investors believe they can turn the company around.
They have appointed a new CEO, namely Ebrahim Abbassi. He comes with credentials trumpeted by Violin of having rescued three other tech outfits from the doldrums, namely Redback (turned around and acquired by Ericsson in 2006), Force 10 (acquired by Dell in 2011) and Roamware (now Mobileum).
Mr Abbassi seems to have a record of taking companies and preparing them for acquisition, so maybe that’s what Quantum Partners hopes for with Violin.
He has been at Violin for more than a year now, as COO from March 2016 before landing the top job at the end of April 2017.
Was he not able to start the turnaround sooner? Perhaps what was needed was to rid the company of debts via bankruptcy and a sweep-out of the board. A new board has been announced, and it includes a couple of software experts but no hardware-experienced execs.
That might indicate the direction the company will take: restructure with a bias towards software innovation and aim for a profitable acquisition.
It’s arguable that Violin’s focus on its proprietary hardware flash modules was always a potential weakness. Perhaps it is even more so now that software-defined, hyper-converged and NVMe-based approaches are de rigueur.
It’ll be interesting to see what can be done with Violin. A potential competitor/acquisition target or destined ultimately for the Where Are They Now file?
The recent announcement by Cisco of 32Gbps capability in its MDS 9700 Director switch and UCS C-series server products means the two major storage networking hardware makers are now able to offer the next generation of Fibre Channel bandwidth to customers; Brocade already did last summer with its Gen 6 products.
With 768 ports and a maximum bandwidth of around 1.5Tbps, Cisco has targeted potential MDS 9700 customers as those with high bandwidth requirements, such as in virtualised environments, those using flash arrays, and to support NVMe flash storage.
For now they’re unlikely to make a lot of difference to those using NVMe arrays because simply replacing SAS and SATA drives in the array with NVMe drives means there’s still a bottleneck at the storage array controller.
But Cisco and Brocade now also offer NVMe-over-Fibre Channel fabric connectivity. There aren’t any storage array products that can fully take advantage of NVMf-FC yet, but when there are, the potential of flash storage will be hugely boosted.
That’s because NVMe – a PCIe-based protocol – offers huge performance gains for flash storage over existing SAS and SATA connection protocols.
For now mostly, however, NVMe-connected flash drives only realise their full potential in what are effectively direct-attached storage deployments.
Even with NVMf – which allows NVMe to be transported over Ethernet or Fibre Channel – NVMe can’t gain its full potential as back end storage and across a fabric/network as shared storage as we know it.
That’s because the gains brought by NVMe are nullified to a large extent by the storage array controller that provides protocol handling, storage addressing and provisioning, RAID, features such as data protection, data reduction etc.
That is, until storage controllers have sufficient processing power to not be a bottleneck.
But for now at least NVMe-capable drives exist, a transport (NVMf) exists and the two major storage networking suppliers support it. All we’re waiting for are the array vendors.
There are lots of scale-out, parallel file systems about, from those of the big six array makers, such as NetApp’s clustered Ontap and EMC’s Isilon OneFS, to open source file systems and distributions thereof, such as Lustre and Red Hat’s GlusterFS.
But we have a new entrant in Elastifile, a software-only startup of Israeli origin that has built a new parallel file system from the ground up that it says can form a single namespace across on-prem and cloud locations. It aims to take on object storage, and in fact uses object representation to allow customers to burst workloads in the cloud.
It aims at traditional secondary storage use cases such as backup and restore but also analytics workloads.
Elastifile says its file system can scale from a minimum three nodes to potentially thousands, although it has only deployed 100. “So far we have found no limitations,” said CEO Amir Aharoni. “But we are working with billions of files. There’s no limit. We assume we can go to 1,000s of nodes.”
Ordinarily, scale-out storage begins to slow down as it reaches very large numbers of files because the tree-like hierarchy of the file system becomes cumbersome. Elastifile execs claim that their file system design distributes metadata so that there are no bottlenecks.
Replication is anything from 2-way upwards. Aharoni says they may add erasure coding at a later date but that this isn’t high on the company agenda because the file system was developed to use data reduction that is suited to replication rather than erasure coding.
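To put the replication versus erasure coding trade-off in numbers, here is a minimal sketch of the raw capacity each scheme consumes per unit of usable data. It uses the generic textbook formulas, not anything specific to Elastifile. The 8+2 erasure coding layout and the 3:1 data reduction ratio are hypothetical values chosen for illustration.

```python
def replication_overhead(copies):
    """Raw capacity used per unit of usable data with N-way replication."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies)


def erasure_coding_overhead(data_fragments, parity_fragments):
    """Raw capacity used per unit of usable data with a k+m erasure code."""
    if data_fragments < 1 or parity_fragments < 0:
        raise ValueError("need at least one data fragment and no negative parity")
    return (data_fragments + parity_fragments) / data_fragments


def effective_overhead(raw_overhead, reduction_ratio):
    """Overhead after data reduction (e.g. 3.0 means a 3:1 reduction)."""
    return raw_overhead / reduction_ratio


two_way = replication_overhead(2)            # 2.0x raw capacity
ec_8_2 = erasure_coding_overhead(8, 2)       # 1.25x raw capacity (hypothetical layout)
two_way_reduced = effective_overhead(two_way, 3.0)  # hypothetical 3:1 reduction

print(f"2-way replication: {two_way:.2f}x")
print(f"8+2 erasure coding: {ec_8_2:.2f}x")
print(f"2-way replication after 3:1 reduction: {two_way_reduced:.2f}x")
```

The point of the comparison is that replication costs more raw capacity than erasure coding, but effective data reduction can claw much of that back, which is consistent with the company's stated priorities.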
The interesting bit is that the Elastifile file system can extend to the cloud, where files are represented as native scale-out files for active data or as objects when inactive. When a customer wants to burst a workload to the cloud they can “check out” data from that state and run, for example, analytics on it in the cloud in file format. Then when finished they check the data back in to object representation mode.
Aharoni gave an example of a microchip designer that does “lift and shift” to the cloud in that way.
So, for some use cases the aim is access to data that’s not particularly rapid and possibly infrequent and where storage in the cloud would be cost effective.
Aharoni said the company is aiming at scientific analytics, financial services, oil & gas.
The NV24P allows for up to 24 2.5” NVMe-mounted flash drives of up to 8TB for a total capacity of 184TB.
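The quoted 184TB total is a little short of 24 × 8TB. One plausible reading, and it is only an assumption here, is that “8TB” is the nominal size of drives at the common 7.68TB capacity point. The short calculation below compares the two; the 7.68TB figure does not come from Zstor.

```python
from decimal import Decimal


def total_capacity_tb(drive_count, drive_tb):
    """Total raw capacity in TB for a shelf of identical drives."""
    return drive_count * drive_tb


DRIVE_BAYS = 24  # as quoted for the NV24P

nominal_total = total_capacity_tb(DRIVE_BAYS, Decimal("8"))     # nominal 8TB drives
assumed_total = total_capacity_tb(DRIVE_BAYS, Decimal("7.68"))  # assumption: 7.68TB drives

print(f"24 x 8TB    = {nominal_total}TB")
print(f"24 x 7.68TB = {assumed_total}TB")
```

With 7.68TB drives the total comes to 184.32TB, which matches the quoted figure once rounded down.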
So, what the Zstor JBOF provides is direct-attached storage (DAS) for hosts with none of the functionality expected of a shared storage array.
Without giving any exact figures, Zstor promises access times similar to those for server-located flash, which must mean in the low hundreds or tens of microseconds.
The Zstor NV24P illustrates the current limits of NVMe, a PCIe-based standard that potentially allows flash storage to break through the limits imposed by SCSI and its use in the disk-era SAS and SATA protocols.
In other words, NVMe is a direct slot-in for PCIe flash in the server with no performance hit. With NVMf, using for example RDMA (which connects hosts as if directly to memory) as here, it can also provide near server SSD performance.
But it cannot provide storage addressing and provisioning, RAID, features such as data protection, data reduction etc that have been traditionally provided by a controller.
And so this is effectively a DAS solution, albeit one with its uses, because adding a controller would lop off the performance benefits NVMe brings as the controller carried out the processing needed for those functions.
This controller bottleneck is an obstacle in the road to fulfilling the potential of NVMe in a true shared storage array format.
We await the vendor that can build in the functionality brought by a controller but with the power to overcome any hit to performance.
The top five in storage continue to see declining revenues, but IBM’s storage business seems to have fared worse than those of its competitors.
This week Storage Newsletter published an aggregation of financial results that included the top five storage array makers.
It found the following in terms of ranking, with all vendors where comparisons were possible showing a decline in revenues.
#1 – EMC, with storage revenue of $16.3 billion in 2015 and no equivalent figure for 2016 due to Dell EMC not publishing those figures.
#2 – NetApp, with revenue of $6.123 billion in 2015 and $5.546 billion in 2016, a decline of 9% year-on-year.
#3 – Hitachi Data Systems, with revenue of $4.079 billion in 2015 and no figures for 2016 noted. Revenue decline for 2014 to 2015 was recorded as 4%.
#4 – HPE, with 2015 revenue of $3.180 billion and 2016 revenue of $3.065 billion, a decline of 4% year-on-year.
#5 – IBM, with 2015 revenue of $2.4 billion, down to $2.184 billion in 2016, a decline of 9%.
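The year-on-year declines can be checked directly from the revenue figures for the three vendors with numbers for both years. This is a minimal sketch: the helper name is illustrative, and the figures are entered as quoted, in the same unit for both years, so the percentages do not depend on the unit.

```python
def yoy_change_pct(previous, current):
    """Year-on-year change as a percentage of the previous year's figure."""
    return (current - previous) / previous * 100


# (2015 revenue, 2016 revenue) as quoted in the rankings
revenues = {
    "NetApp": (6.123, 5.546),
    "HPE": (3.180, 3.065),
    "IBM": (2.4, 2.184),
}

changes = {
    vendor: round(yoy_change_pct(prev, curr))
    for vendor, (prev, curr) in revenues.items()
}

for vendor, change in changes.items():
    print(f"{vendor}: {change}% year-on-year")
```

Rounded to whole percentages, the results agree with the declines quoted: 9% for NetApp, 4% for HPE and 9% for IBM.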
That’s just a snapshot of what we know already. The “traditional” storage array market is in a period of long-term decline, with hyper-converged, hyperscale and converged systems making good progress. See here for a wider analysis.
An interesting comparison is to put these rankings against those of four years ago.
Here it becomes apparent that a big loser is IBM.
In 2011 we noted the following rankings from Gartner figures:
* EMC: $6.279 billion and 32% market share
* IBM: $3 billion and 14.2%
* NetApp: $2.45 billion and 11.5%
* HP: $2.07 billion and 9.8%
* HDS: $1.99 billion and 9.4%
Aside from EMC’s gathering of an even greater proportion of storage revenues, the big change there is IBM’s drop down the rankings.
Why that happened may be the subject of future blogs, but I’m very happy to hear your views on the subject in the meantime.
Go back 10 or 20 years and direct-attached disk was the norm. IE, just disk in a server.
It all became a bit unfashionable as the virtualisation revolution hit datacentres. Having siloed disk in servers was inherently inefficient and server virtualisation demanded shared storage to lessen the I/O blender effect.
So, shared storage became the norm for primary and secondary storage for many workloads.
But in recent years, we saw the rise of so-called hyperscale computing. Led by the web giants this saw self-contained nodes of compute and storage aggregated in grid-like fashion.
Unlike enterprise storage arrays these are constructed from commodity components and an entire server/storage node swapped out if faulty, with replication etc handled by the app.
The hyperscale model is aimed at web use cases and in particular the analytics – Hadoop etc – that go with it.
Hyperscale, in turn, could be seen as the inspiration for the wave of hyper-converged combined server and storage products that has risen so quickly in the market of late.
Elsewhere, however, the need for very high performance storage has spawned the apparently somewhat paradoxical direct-attached storage array.
Key to this has been the ascendance of NVMe, the PCIe-based card interconnect that massively boosts I/O performance over the spinning disk-era SAS and SATA to something like matching the potential of flash.
From this vendors have developed NVMe over fabric/network methods that allow flash plus NVMe connectivity over rack-scale distances.
Key vendors here are EMC with its DSSD D5, E8 with its D24, Apeiron, Mangstor, plus Excelero and Pavilion Data Systems.
What these vendors offer is very high performance storage that acts as if it is direct-attached in terms of its low latency and ability to provide large numbers of IOPS.
In terms of headline figures – supply your own pinches of salt – they all claim IOPS in the up to 10 million range and latency of <100μs.
That’s made possible by taking the storage fabric/network out of the I/O path and profiting from the benefits of NVMe.
In some cases vendors are taking the controller out of the data path too to boost performance.
That’s certainly the case with Apeiron – which does put some processing in HBAs in attached servers but leaves a lot to app functionality – and seems to be so with Mangstor.
EMC’s DSSD has dual “control modules” that handle RAID (proprietary “Cubic RAID”) and presumably DSSD’s internal object-based file layout. E8 appears to run some sort of controller for LUN and thin provisioning.
EMC and Mangstor run on proprietary drives while E8 and Apeiron use commodity cards.
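One way to read the headline figures quoted above (up to 10 million IOPS and latency under 100μs) is through Little's law, which relates throughput, latency and the number of I/Os in flight. The sketch below assumes both figures hold at the same time and that the latency is a mean. Vendors rarely say either, so treat the result as an illustration, not a measurement.

```python
def outstanding_ios(iops, latency_seconds):
    """Little's law: mean I/Os in flight = throughput x mean latency."""
    return iops * latency_seconds


headline_iops = 10_000_000   # "up to 10 million" IOPS, as claimed
headline_latency = 100e-6    # 100 microseconds, the upper bound claimed

in_flight = outstanding_ios(headline_iops, headline_latency)
print(f"I/Os in flight needed to sustain the headline rate: {in_flight:.0f}")
```

On those assumptions around 1,000 I/Os must be in flight at any moment to sustain the headline rate, which is why such numbers usually depend on many hosts or deep queues, not a single lightly loaded server.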
A question that occurs to me about this new wave of “shared DAS” is: Does it matter whether the controller is taken out of the equation?
I tend to think that as long as the product can deliver raw IOPS in great numbers then possibly not.
But, we’d have to ask how the storage controller’s functions are being handled. There may be implications.
A storage controller has to handle – at a minimum – protocol handling and I/O. On top of that are LUN provisioning, RAID, thin provisioning, possibly replication, snapshots, data deduplication and compression.
All these vendors have dispensed with the last two of these, and Mangstor and Apeiron have ditched most of the rest, with Apeiron, for example, offloading much to server HBAs and the app’s own functionality.
So, a key question for potential customers should be how the system handles controller-type functionality. Any processing over and above the fundamentals has to be done somewhere and potentially hits performance, so is there over-provisioning of flash capacity to keep performance up while the controller saps it?
Another question is, despite the blistering performance possible with these shared NVMe-based DAS systems, will it be right for leading/bleeding edge analytics environments?
The workloads aimed at – such as Hadoop but also Splunk and Spark – are intensely memory hungry and want their working dataset all in one place. If you’re still having to hit storage – even the fastest “shared” storage around – will it make the grade for these use cases or should you be spending money on more memory (or memory supplement) in the server?
There was a dodgy* old joke about a glass of beer that re-filled itself when you had drunk it. The unwritten premise was that that’s what everyone (well, men in the 1970s, I presume) would want if they could get it.
But, what would storage managers wish for if they could get it? Something similar, one would think, given the ever-present headache of data growth. IE, storage capacity that is easily scalable, usually upwards, but downwards too when you need it. IE, cloud storage.
Well, Zadara Storage – which makes software-defined storage that can be used in the cloud – asked 400 people who manage storage in the UK, US, and Germany what their #1 wish in 2017 would be.
The largest chunk (33.25%) answered “cloud storage that scales up or down according to my organisation’s needs”, with no appreciable difference in results between the three countries.
That wish was expressed by about three times more people than opted for “new storage hardware to hold my organisation’s data”, although there was a significant difference between the two sides of the Atlantic here, with 10% of UK and Germany respondents desirous of more in-house capacity while 16% of those in the US wanted more hardware.
The main takeaway, I think, is that easily scalable storage is the key thing storage managers want.
Perhaps more profound is the assumption that that can only be found in the cloud.
This looks like a harbinger of things to come and suggests that the rise of the cloud is inevitable.
Currently, a lack of guarantees over latency and availability (no-one can guarantee against a cable getting dug up, for example) means that, while the cloud is becoming more popular, it is not trusted for the most performance-hungry storage operations.
Despite that, the survey tells us customers want storage that can scale easily and that the cloud is where it will likely come from.
In the long term this will bring major changes in data storage, with cloud providers profiting from economies of scale in terms of buying storage hardware, and with capacity delivered via the cloud for all but perhaps the most sensitive workloads.
* The joke, as I remember it, told in Britain in the 1970s, took a poke at the Irish. In it, an Irishman was offered three wishes. He asked for a glass of beer that re-filled itself. For his second wish, he asked for another.
Given the issues around data portability between cloud providers the joke could be successfully adapted to one about a storage manager offered cloud storage capacity that scaled itself. IE, you’d be crazy to want another one.
And it is predicted by Gartner that that rise will continue, with hyper-converged product sales set to more than double by 2019 to around $5 billion.
As that happens, according to the analyst house, hyper-converged infrastructure will increasingly break out from hitherto siloed applications and become a mainstream platform for application delivery.
Part of that evolution surely has to be the inclusion of NVMe connectivity, which should be a shoo-in for hyper-converged.
NVMe is another rapidly rising storage star. It’s a standard based on PCIe, and offers drives vastly improved bandwidth and performance compared with existing SAS and SATA connections.
In other words it’s a super-rapid way of connecting drives via PCIe slots.
Surely though, widespread adoption is just a matter of time. With its PCIe connectivity, literally slotting in, NVMe offers the ability to push hyper-converged utility and scalability to wider sets of use cases than currently.
There are some vendors that focus on their NVMe/hyper-converged products, such as X-IO (Axellio), Scalable Informatics, and DataON, but NVMe as standard in hyper-converged is almost certainly a trend waiting to happen.