The term Big Data has been flying around the enterprise for the last two years or so, creating a lot of excitement while also raising many concerns, especially in the realm of compliance.
Basically, Big Data is a catchphrase that encompasses storage technology and the tools, processes and procedures that allow an organization to work with large data sets and, more importantly, perform Web analytics on that data. Examples of Big Data solutions at work include Google Analytics, the Human Genome Project and Amazon’s product recommendation engine.
In other words, Big Data Web analytics are quite prevalent and are now popping up even in the small- and medium-sized business world. But there is a dark side that demands consideration: for Big Data analytics to work properly, tools and users must have unfettered access to large amounts of data, and therein lies the problem.
Compliance is all about protecting data, maintaining transactional continuity and concealing information from unauthorized parties. Big Data, on the other hand, is all about exposing data and mining that data for information. It almost seems that compliance and Big Data are polar opposites. Does that mean the two are mutually exclusive?
Not exactly. With some proper planning and comprehensive security, Big Data actually complements compliance. One of the primary tenets of compliance is the ability to retrieve interrelated data for e-discovery purposes, often a time-consuming and expensive undertaking that is driven by a legal request.
Here, Big Data proves to be a valuable tool because it allows users to quickly retrieve information for e-discovery purposes. The mining process also lets users create relationships between data while mining for other information that is applicable to an e-discovery request.
For example, data that is not normally related can be retrieved using ad hoc queries that build temporary relationships to combine filtered data sets. This allows an administrator to gather all information pertinent to a particular customer (including VoIP recordings, emails, IMs, documents, spreadsheets and so on) in a matter of minutes by leveraging the power of Big Data Web analytics. But to accomplish any of this, a Big Data platform must be in place. Luckily, open source solutions such as Apache’s Hadoop exist, and they significantly reduce startup costs.
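To make the idea of an ad hoc query concrete, here is a minimal Python sketch of filtering several otherwise unrelated data sets by customer and combining the matches into one temporary result. Everything in it is an assumption made for illustration: the `Record` fields, the `gather_for_customer` helper, the customer IDs and the sample data are hypothetical, and the data sits in memory. On a real Big Data platform such as Hadoop, the same filter-and-combine logic would run as a distributed job across far larger stores.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Record:
    """One item from any data source (hypothetical schema for illustration)."""
    source: str          # e.g. "email", "im", "voip"
    customer_id: str
    timestamp: datetime
    summary: str


def gather_for_customer(customer_id, *data_sets):
    """Filter each data set by customer and merge the matches chronologically."""
    matches = [
        record
        for data_set in data_sets
        for record in data_set
        if record.customer_id == customer_id
    ]
    return sorted(matches, key=lambda record: record.timestamp)


# Sample data sets that are not normally related to one another.
emails = [
    Record("email", "C-100", datetime(2012, 3, 1, 9, 30), "Contract question"),
    Record("email", "C-200", datetime(2012, 3, 2, 11, 0), "Invoice copy"),
]
ims = [
    Record("im", "C-100", datetime(2012, 3, 1, 10, 15), "Follow-up chat"),
]
voip_recordings = [
    Record("voip", "C-100", datetime(2012, 2, 28, 14, 0), "Support call"),
    Record("voip", "C-300", datetime(2012, 3, 3, 16, 45), "Sales call"),
]

# The ad hoc query: everything pertinent to one customer, in time order.
results = gather_for_customer("C-100", emails, ims, voip_recordings)
for record in results:
    print(record.timestamp.isoformat(), record.source, record.summary)
```

The relationship between the sources exists only for the duration of the query. Nothing links emails to VoIP recordings in storage; the shared customer identifier is what ties them together at retrieval time.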
Ultimately, all businesses needing to meet compliance requirements will turn to Big Data platforms, even if their data isn’t so big. Now is the time to look into platforms and solutions that power today’s Big Data analytics and to strengthen security so Big Data doesn’t become a big security problem.
Frank Ohlhorst is an award-winning technology journalist, professional speaker and IT business consultant with more than 25 years of experience in the technology arena. He has written for several leading technology publications, including Computerworld, TechTarget, PCWorld, ExtremeTech and Tom’s Hardware, and business publications including Entrepreneur and BNET. Ohlhorst was also executive technology editor at eWEEK and director of CRN Test Center.