Vendors are still plunking down big bets on Hadoop. Last week, Cloudera Inc., the first-ever Hadoop distributor, rolled out the beta release of Cloudera Enterprise 5. It’s the latest version of the company’s big data enterprise platform, and it introduces what Cloudera is calling the “enterprise data hub.” Long story short: Welcome to another illustration of how to design for big data.
“We’ve seen a significant new trend emerging,” Mike Olson, chief strategy officer and former CEO of Cloudera, said at Strata Conference + Hadoop World 2013 in New York. “Hadoop is moving from the periphery to the architectural center of the data center.”
Powered by Apache Hadoop 2, Cloudera’s latest offering is built to help businesses execute on that trend. The platform includes a lot of bells and whistles, but it’s the concept of the “data hub” that’s snagging most of the attention. And for good reason: Hadoop as the heart of the data center suggests that the technology has not only the maturity but also the momentum to move into the mainstream.
Olson, for one, is a believer. “No matter your mission, no matter your business, you need to look at all your data, [and] analyze it together in new ways in order to make the very best possible decision,” he said. “You need a data hub.”
The beauty of Hadoop is its ability to ingest and process all kinds of structured and unstructured data quickly; that doesn’t change. Cloudera’s platform adds security, compliance and governance features to the mix, characteristics that are well-suited to — and much needed in — the enterprise. The platform enables businesses to store data in “full fidelity” for as long as they need, and also provides search, query and data discovery capabilities.
But, as Olson pointed out, “that ain’t a hub — not yet. A hub is at the center; it needs spokes; it has to connect to the other infrastructure you already rely on.”
His keynote provided another opportunity for businesses to hear that Hadoop (via Cloudera, in this case) isn’t killing off the relational database. Instead, the enterprise data hub is a hybrid architecture that marries the old and the new and “pragmatically extends the value of existing investments while enabling fundamentally new ways of delivering value from data,” according to the press release.
The phrase “hybrid architecture” is key, but it’s not unique. Gartner Inc. and Enterprise Management Associates Inc. have been talking about “extending” the enterprise data warehouse (EDW) with analytical, cloud or big data platforms for a couple of years now. Their frameworks encourage businesses to let data live where it performs best and to virtualize views into the data.
Olson is introducing a platform that still embraces a centralized location for data, but the location has shifted. Rather than requiring businesses to primp, clean and format data for the EDW up front, the Cloudera platform acts as an initial staging ground, and more, where data can be stored, aggregated, indexed, tagged, you name it, and then fed to the appropriate place, such as a relational database.
“This idea, this enterprise data hub, these capabilities — this is far too powerful of an idea, this is far too correct an architecture not to win,” Olson said. “You will see this hub emerge at the center of most enterprises that manage data for a living.”
On paper, an enterprise data hub looks great; in practice, well, that usually turns out to be a different story. Just ask the EDW.