By Nicole Laskowski, Senior News Writer
SAN DIEGO – Security, ethics and privacy hover on the outskirts of the big data discussion; they are not ignored, but they don’t often appear in the spotlight, either. That’s why it was surprising to see an entire session devoted to all three at this week’s Gartner Catalyst conference. Sure, the session came at the end of the first day during a quick, 30-minute time slot; still, analysts Ramon Krikken and Ian Glazer were bucking a trend by giving these topics a forum.
Even better, the Gartner Catalyst presentation had a razor-sharp focus, highlighting three big data mistakes and providing tips on how to best protect data. The big takeaway: Big data requires unflinching honesty, whether in defining what it is, deciding how best to protect it or estimating what it might be worth to the company.
1: Big data “weight”
Because businesses still struggle to define big data, they tend to grasp onto data volume, ignoring its other facets, such as the speed and variety of the data, according to Krikken. That lack of understanding can lead businesses down a path of rationalization when it comes to security and privacy: Either they think they don’t have enough data to warrant new or different kinds of data security measures or they make the big data mistake of thinking they have so much data, the weight alone provides a kind of protection. “It’s like a big haystack and [attackers are] looking for a needle, so our data is fine,” Krikken said, repeating what he’s heard from some of his clients.
That thinking will get businesses into trouble, Krikken said, because even with “small” data, “it can still have some big properties.”
2: Everything and nothing is new
Sometimes businesses see big data and think they need to scrap what they have and start fresh with new technologies, vendors and processes; sometimes, businesses fall prey to believing that what they have in place is big-data ready, according to Glazer. The same holds true for security and privacy measures, where Glazer has observed “fairly watered down” recommendations. A classic big data mistake involves risk assessment. He compared the typical company risk assessment to a dentist saying “OK, rinse.”
“This is not a useful instruction,” he said. “It doesn’t give an indication of where I am in the process.”
Big data requires a mix of old and new measures. Before embarking on a big data initiative, businesses should already have data governance and security measures in place, Glazer said. So the introduction of big data doesn’t require a new governance model; it just means building out what’s already there. “When it comes to where the rubber meets the road, I may have an implementation detail that’s specific to my Hadoop environment,” he said. “But I’m still within the data protection program that I, frankly, always have been building upon.”
3: Privacy vs. utility
This was the stickiest of the three paradoxes — and, not surprisingly, the one with an ethical bent. There’s a tendency to want to scrub the data of any personally identifying characteristics, but sometimes too much scrubbing of the data can render it useless.
“The problem is once somebody loses anonymity, it cannot be regained,” said Krikken.
Glazer gave an example of how geolocation data (which he called some of the most “privacy invasive data” out there) can be more unique than a fingerprint. Krikken countered the argument with a story about how scrubbed data from a small-scale study covered up a correlation between the arthritis drug Vioxx and heart attacks.
So businesses have to make choices: How much privacy can they provide without losing all of the data’s utility, while still staying clear of a legal breach that could damage their reputation? It’s a tightrope walk, the analysts said, and one that may have to be made on a case-by-case and data-set-by-data-set basis.
Plotting the terrain
One way to have this conversation, on both a legal and a cultural level, is to visualize the terrain. Create a chart of the privacy landscape that places risk on one axis and geographic region on another, Glazer said. The chart should include a line of demarcation between high risk, which indicates a lack of compliance or an action that could hurt the brand, and low risk, which indicates legal compliance or feelings of good will toward the brand. Think about the controls deployed, best practices and how risk may vary from one region to the next, and plot them on the chart.
A tool like this can help businesses evaluate how they stack up against the competition, but it can also help to remove the surprise of regulatory wrongdoing or actions that may harm the company’s reputation. With big data, “we don’t know the totality of the data sets we’re working with,” Glazer said. “And especially when we use external data, we don’t want a surprise about expectations of the use of that kind of information to a specific geography.”
On the flip side, it can also help businesses identify cases where a certain amount of risk may actually be worth it.
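For readers who want to try the exercise, the chart described above can be roughed out in a few lines of Python. This is only a minimal sketch, not the analysts' own tool: the 0-to-10 risk scale, the threshold of 5.0, the `RegionRisk` and `render_chart` names and the two placeholder regions are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed cutoff between low and high risk on a hypothetical 0-10 scale.
RISK_THRESHOLD = 5.0


@dataclass
class RegionRisk:
    region: str
    risk: float  # 0 = clearly compliant, 10 = likely violation or brand damage


def classify(entry, threshold=RISK_THRESHOLD):
    """Label an entry as 'high' or 'low' risk relative to the demarcation line."""
    return "high" if entry.risk > threshold else "low"


def render_chart(entries, threshold=RISK_THRESHOLD, width=20, max_risk=10.0):
    """Return a text chart: one row per region, with '|' marking the demarcation line."""
    marker = round(threshold / max_risk * width)
    rows = []
    for entry in entries:
        filled = round(entry.risk / max_risk * width)
        bar = ["#" if i < filled else " " for i in range(width)]
        bar.insert(marker, "|")  # draw the line between low and high risk
        rows.append(f"{entry.region:<10}{''.join(bar)} {classify(entry, threshold)}")
    return "\n".join(rows)


# Placeholder scores for illustration only; these are not real assessments.
entries = [RegionRisk("Region A", 2.0), RegionRisk("Region B", 7.5)]
print(render_chart(entries))
```

Each row is one region, the bar length is its risk score, and the vertical mark is the line of demarcation. In practice the scores would come from a company's own review of controls, regulations and customer expectations in each region.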