In case you missed it, there’s been an entertaining exchange going on between EMC, NetApp and even IBM bloggers over a bug in NetApp’s SnapLock software.
It all started when EMC’er Scott Waterhouse of The Backup Blog got his hands on a notification from NetApp to customers urging an upgrade to OnTap 7.2.5 to resolve a vulnerability in SnapLock’s WORM functionality. Waterhouse didn’t go into much detail about exactly what the bug was and quoted selectively from the customer-notification document:
“…versions of Data ONTAP prior to 7.2.5 with SLC have been found to have vulnerabilities that could be exploited to circumvent the WORM retention capability.” They go on to say: “NetApp cannot stand by the SnapLock user agreement unless the upgrade is performed.”
Now this is a really big deal. This is not a trivial little upgrade to OnTAP. This is a big one.
Predictably, he then segued into a sales pitch–“Maybe it is time to explore an alternative?”–without giving much more information about what the problem actually was, or why exactly the upgrade between dot releases of OnTap isn’t trivial.
Waterhouse’s take was then picked up by Mark Twomey, aka StorageZilla, who led with the headline, “NetApp SnapLock Badly Broken.” Twomey also emphasized the fear angle, writing, “Right now none of those who aren’t running 7.2.5 or above are not compliant and it turns out they never were” without divulging further details about the problem.
This is about where I came in. I tried pinging Twomey to no avail; I also tried hitting up some of the folks on Toasters, the NetApp users forum, to see what they’d heard. I planned to ping NetApp as well, but if the bug was as bad as the EMC’ers were making it out to be, I didn’t expect them to be willing to talk about it.
They surprised me by contacting me before I could get to them, and last Friday chief technical architect Val Bercovici gave me NetApp’s side of the story, telling me, “We expanded our testing on SnapLock to a third class of protection from tampering with the WORM feature.”
The first two classes, which had already been tested, concerned protection against malicious end-user removal of data, as well as protection from malicious administrative actions. The third and most recent class tested against was a case “where knowledge of the source code combined with some other products that are out there could be used for data deletion” inside SnapLock. Bercovici also didn’t want to give all the gory details, saying the vulnerability had not been exploited in the field, and NetApp wanted to keep it that way.
“It’s a highly unusual case, and in any event would be an audited deletion from the system,” Bercovici said. “It’s a level of testing EMC has never done” with Centera, he added.
Not quite “not compliant and never were”. NetApp bloggers were all over the EMC bloggers last week about the tone of their blog posts. It had begun to seem like the EMC-NetApp rivalry had faded a bit, as both companies go up against new competitors and find themselves with bigger fish to fry. But this was just like old times.
Things have gotten so moody so fast that blogger Tony Pearson from NetApp big brother IBM felt the need to tell EMC to pick on someone their own size:
I was going to comment on the ridiculous posts by fellow bloggers from EMC about SnapLock compliance feature on the NetApp, but my buddies at NetApp had already done this for me, saving me the trouble. . . .The hysterical nature of writing from EMC, and the calm responses from NetApp, speak volumes about the cultures of both companies.
But wait, there’s more. Remember how I mentioned heading over to see what was being discussed about this on Toasters? While there I ran across a thread that mentioned OnTap 7.2.5, and contained another message from NetApp to its customers:
“Please be aware that we are investigating a couple of issues with quotas in Data ONTAP 7.2.5. As a precautionary measure, we have removed Data ONTAP 7.2.5 from the NOW site as we investigate the issues. We will provide an update as soon as more information is available.”
According to Bercovici, OnTap 7.2.5, issued as a bugfix for SnapLock, had its *own* bug, this time one that caused quota panic in some filers. In other words, the bugfix NetApp issued for what it said was an esoteric issue spawned another bug, and this time it caused some filers to ‘blue-screen’, according to the Windows analogy Bercovici used to describe the problem to me.
Version 7.2.5.1, which purportedly fixes both bugs, has since come out. As far as I’m concerned, the whole SnapLock bug was a tempest in a teapot, but NetApp still came out of this whole thing with egg on its face, as 7.2.5 introduced a severe and immediate problem in what seems like a well-intentioned effort to protect customers from an obscure corner-case hack. Also, they wound up with multiple EMC bloggers doing the Web equivalent of throwing chairs at them, a la Springer. As they say, no good deed goes unpunished. . . .