By James Kobielus (@jameskobielus)
All things considered, big-data platforms are not any more or less secure than established, smaller-scale databases.
Where security is concerned, the sheer volume of your data is not the key factor to lose sleep over. Instead, the chief security vulnerabilities in your big-data strategy may have more to do with the unfamiliarity of the platforms and the need to harmonize disparate legacy data-security systems.
Platform unfamiliarity is potentially a big-data security vulnerability. If you’re simply scaling out an existing DBMS, your current security tools and practices will probably continue to work well. However, many big-data deployments involve deploying a new platform (such as Hadoop, NoSQL, or an in-memory database) that you have never used before and for which your existing security tools and practices are either useless or ill-suited. If the big-data platform is new to the market, you may have difficulty finding a sufficient range of commercial security tools, or, for that matter, anybody with experience using them in a high-stakes production setting.
Platform heterogeneity is also a potential big-data security vulnerability. Some organizations implement big data as a consequence of consolidating inconsistent, siloed data sets. In the process of amassing your big-data repository from heterogeneous precursors, you will need to focus on the thorny issue of harmonizing disparate legacy security tools. These tools may support a wide range of security functions, including authentication, access control, encryption, intrusion detection and response, event logging and monitoring, and perimeter and application access control.
At the same time as you’re consolidating into a big-data platform, you will need to harmonize the disparate security policies and practices associated with the legacy platforms. Your big-data consolidation plans must be overseen and vetted by security professionals at every step. And, ultimately, your consolidated big-data platform must be certified according to all controlling enterprise, industry, and government mandates.
And if you want to approach big-data security holistically, your strategy should also include tools and procedures for unified (ideally real-time) monitoring of security events across disparate data sets. You should consider implementing a logical big-data mart for security information and event management (SIEM) to identify threats across your disparate big-data platforms, consolidated and otherwise.
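To make the idea of unified monitoring concrete, here is a minimal sketch in Python. It assumes two hypothetical log layouts (one for a Hadoop-style audit log, one for a NoSQL store); the field names, the `SecurityEvent` schema, and the failure threshold are all illustrative assumptions rather than any real product's format. The point is only that records from disparate platforms are mapped into one common shape before a cross-platform rule is applied.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SecurityEvent:
    """Common schema that every platform-specific record is mapped into."""
    timestamp: datetime
    platform: str
    user: str
    action: str
    success: bool


def from_hadoop_audit(record: dict) -> SecurityEvent:
    # Hypothetical layout: epoch milliseconds, "ugi" for the user, "allowed" as a bool.
    return SecurityEvent(
        timestamp=datetime.fromtimestamp(record["ts_ms"] / 1000, tz=timezone.utc),
        platform="hadoop",
        user=record["ugi"],
        action=record["cmd"],
        success=record["allowed"],
    )


def from_nosql_log(record: dict) -> SecurityEvent:
    # Hypothetical layout: ISO 8601 timestamp with offset, "status" as a string.
    return SecurityEvent(
        timestamp=datetime.fromisoformat(record["time"]),
        platform="nosql",
        user=record["principal"],
        action=record["operation"],
        success=record["status"] == "OK",
    )


def users_failing_on_multiple_platforms(events, min_failures=3):
    """Flag users with at least min_failures denied actions spread over more than one platform."""
    failures = defaultdict(list)
    for event in events:
        if not event.success:
            failures[event.user].append(event)
    flagged = {}
    for user, failed in failures.items():
        platforms = {e.platform for e in failed}
        if len(failed) >= min_failures and len(platforms) > 1:
            flagged[user] = sorted(platforms)
    return flagged


# Made-up sample records, purely for illustration.
hadoop_records = [
    {"ts_ms": 1700000000000, "ugi": "alice", "cmd": "open", "allowed": False},
    {"ts_ms": 1700000001000, "ugi": "alice", "cmd": "open", "allowed": False},
    {"ts_ms": 1700000002000, "ugi": "bob", "cmd": "listStatus", "allowed": True},
]
nosql_records = [
    {"time": "2023-11-14T22:13:25+00:00", "principal": "alice",
     "operation": "find", "status": "DENIED"},
    {"time": "2023-11-14T22:13:30+00:00", "principal": "bob",
     "operation": "insert", "status": "OK"},
]

events = [from_hadoop_audit(r) for r in hadoop_records]
events += [from_nosql_log(r) for r in nosql_records]
flagged = users_failing_on_multiple_platforms(events)
print(flagged)
```

A rule like this only becomes possible once every source speaks the same schema: neither platform's log alone shows three denials for the same user.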
To the extent that you’re managing customer, finance, and other system-of-record data on your big-data clusters, you should certainly consider the need for strong SIEM. If you don’t, your enterprise’s chief information security officer will almost certainly, at some point, ask why you haven’t.