Scream all you want about the cloud as a term; I can at least define that one for you in simple or more complex terms. But Big Data? Well, it’s just a rough approximation involving a large (I mean we’re talking very big indeed) number. As one speaker said at the Gilbane Conference this past week in Boston (and he was joking), it’s humongous. Why yes it is. It’s bigger than small.
But for all of the hype and the lack of clarity around the term, I get that it is a real trend no matter how squishy the term itself may be. Just the very idea of big data has to send shivers down the spines of data center operators everywhere, wondering if they have the chops — the right kind of databases, the storage capacity, the servers — to deal with big data, however you define that.
And that’s where the cloud comes in because by its very nature the cloud could be the missing link for large data sets. It’s elastic, meaning it can scale up to meet demand. It’s cost-effective because instead of doing it all yourself, you’re buying services from someone who is spreading the cost among customers based on actual usage.
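To put some rough numbers on that elasticity argument, here is a minimal Python sketch. Everything in it is made up for illustration: the function names, the hourly demand figures and the price of 10 cents per server-hour are all hypothetical, and it assumes (generously) that a cloud server-hour costs the same as one you run yourself, which often won't be true.

```python
def fixed_cost(hourly_demand, cents_per_server_hour):
    """Cost of keeping enough servers for the busiest hour running all day."""
    peak_servers = max(hourly_demand)
    return peak_servers * len(hourly_demand) * cents_per_server_hour


def elastic_cost(hourly_demand, cents_per_server_hour):
    """Cost of paying only for the servers actually in use each hour."""
    return sum(hourly_demand) * cents_per_server_hour


# Hypothetical day: 2 servers for 21 hours, then 20 servers for a
# 3-hour big data crunch. These numbers are invented, not measured.
demand = [2] * 21 + [20] * 3
price = 10  # hypothetical price in cents per server-hour

print("Provisioned for peak:", fixed_cost(demand, price), "cents")
print("Pay for actual usage:", elastic_cost(demand, price), "cents")
```

The spikier your demand, the bigger the gap between the two figures. If your load is flat all day, the two come out the same and the argument for elasticity gets a lot weaker.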
And it’s not just an infrastructure play. It’s also a way of buying and selling the data itself in big data sets, or in the case of governments, giving it away. Take data.gov for example. It’s a regular treasure trove of data. You don’t need to pull all that data in-house because the government is kind enough to host it for you, and to let you bring your data analysis tools to bear on it to find the nuggets that are most valuable to your business and to whatever question you are trying to answer.
Data.com (not to be confused with the federal government’s site) is a site owned by Salesforce.com, a company that knows a thing or two about cloud services. With data.com, Salesforce is attempting to be your Big Data dealer. It cultivates the data, hosts it and sells it by the pound (so to speak). You just have to ask the right questions and take advantage of its data.
Then there is the whole concept of Tim Berners-Lee’s semantic web or, as he likes to call it, a web of data. In this concept we share data sets much the same way we share documents via a hyperlink. Check out his 2009 TED talk on this subject.
It’s been a couple of years and we’ve yet to achieve that vision, but if we could, you can imagine some pretty exciting things happening as scientists share their research around a particular problem like AIDS research or a cancer cure. It could dramatically accelerate our learning if it came to pass (a big if, I know).
But there are more services out there offering data, lots of data, dare I say big data, as a service in the cloud (or on the web or whatever you wish to call it), and the cloud could be the glue that holds this whole thing together. That’s because chances are most companies, short of Google, Facebook or IBM, aren’t going to have the resources or computing power to do this alone.
And the cloud may be the answer to the problem, even if you don’t yet know you’re looking for a Big Data solution, or even what that entails.