One commonly held belief about big data is that it provides better and deeper answers to business questions. “And it does,” Ken Rubin, director of analytics at Facebook Inc., said at the recent Strata Conference + Hadoop World in New York.
But, Rubin argues, it’s a belief that also needs to be challenged. While businesses focus their energies on the answers big data will provide, they risk overlooking the important role that questions play. Put another way, if IT spends time solving problems that don’t ultimately provide any value to the business, the richness and depth of their resulting hard-won insights hardly matter. But pinpointing (and articulating) what those questions should be doesn’t necessarily come naturally.
“You can use science and technology and statistics to figure out what the answers are, but it’s still an art to figure out what the right questions are,” Rubin said.
That’s one of the reasons why Facebook, in addition to hiring people who have both technical and business savvy, sends its employees to “Data Camp.” “This two-week, intensive, immersive program teaches them everything they need to know about analytics,” Rubin said.
And everyone is on the bus: Product managers, designers, engineers and members of the finance and operations departments all attend Data Camp. During that two-week stint, they learn how to use relevant tools and technologies, but they also learn how to ask the right questions and how to “frame business questions in such a way that you can use data to get the answer,” Rubin said.
It may sound a little elementary, but giving employees this type of language lesson creates a lingua franca across the organization, which can help eliminate what some CIOs have cited as a barrier between IT and the business.
Another tip: Figure out the best organizational structure for analysts — centralized, decentralized or some combination of the two. Facebook implemented what it calls an embedded model, a hybrid between a centralized and decentralized structure, where business units roll up to a central team but analysts “physically sit with the organization they’re providing a service for,” Rubin said.
“The benefit there is that you get your common standards and processes from centralization, but you also get great alignment and connection to the goals,” he said.
Vendors are still plunking down big bets on Hadoop. Last week, Cloudera Inc., the first-ever Hadoop distributor, rolled out the beta release of Cloudera Enterprise 5. It’s the latest version of the company’s big data enterprise platform, which introduces what Cloudera is calling the “enterprise data hub.” Long story short: Welcome to another illustration of how to design for big data.
“We’ve seen a significant new trend emerging,” Mike Olson, chief strategy officer and former CEO of Cloudera, said at Strata Conference + Hadoop World 2013 in New York. “Hadoop is moving from the periphery to the architectural center of the data center.”
Powered by Apache Hadoop 2, Cloudera’s latest offering is built to help businesses execute on that trend. The platform includes a lot of bells and whistles, but it’s the concept of the “data hub” that’s snagging most of the attention. And for good reason: Hadoop as the heart of the data center suggests that the technology has not only the maturity but also the momentum to move into the mainstream.
Olson, for one, is a believer. “No matter your mission, no matter your business, you need to look at all your data, [and] analyze it together in new ways in order to make the very best possible decision,” he said. “You need a data hub.”
The beauty of Hadoop is its ability to ingest and process all kinds of structured and unstructured data quickly; that doesn’t change. Cloudera’s platform adds security, compliance and governance features to the mix, characteristics that are well-suited to — and much needed in — the enterprise. The platform enables businesses to store data in “full fidelity” for as long as they need, and also provides search, query and data discovery capabilities.
But, as Olson pointed out, “that ain’t a hub — not yet. A hub is at the center; it needs spokes; it has to connect to the other infrastructure you already rely on.”
His keynote provided another opportunity for businesses to hear that Hadoop (via Cloudera, in this case) isn’t killing off the relational database. Instead, the enterprise data hub is a hybrid architecture that marries the old and the new and “pragmatically extends the value of existing investments while enabling fundamentally new ways of delivering value from data,” according to the press release.
The phrase hybrid architecture is key, but it’s not unique. Gartner Inc. and Enterprise Management Associates Inc. have been talking about “extending” the enterprise data warehouse (EDW) with analytical, cloud or big data platforms for a couple of years now. Their frameworks encourage businesses to let data live where it performs best and to virtualize views into the data.
Olson is introducing a platform that still embraces a centralized location for data, but the location has shifted. Rather than requiring businesses to primp, clean and format data for the EDW, the Cloudera platform acts as an initial staging ground (and beyond) where data can be stored, aggregated, indexed, tagged, you name it, and then fed to the appropriate place, such as a relational database.
“This idea, this enterprise data hub, these capabilities — this is far too powerful of an idea, this is far too correct an architecture not to win,” Olson said. “You will see this hub emerge at the center of most enterprises that manage data for a living.”
On paper, an enterprise data hub looks great; in practice, well, that usually turns out to be a different story. Just ask the EDW.
We probably didn’t need a study to tell us this, but in case there was any doubt, babies are bonkers for mobile devices. Tablets, smartphones, you name it, they want to rub their little mitts on it. Kinda gives a whole new meaning to “sticky user experience.” And that, in a way, is why an item about the Common Sense Media study is this week’s lead Searchlight item.
In 2011, about 10% of kiddies up to age 2 had used a mobile device. That number has jumped to 38% and shows no sign of slowing. This is the real digital generation; for them a great mobile experience will not be appreciated, it will be expected. And lest CIOs think they have plenty of time to work on this, remember: They grow up so fast!
Also this week: the mystery of the Google (?) barges, a cool new concept in data center temperature control, IT-themed Halloween costumes and more.
If you’re one of the thousands (millions?) of Americans who’ve run up against the mess that is the federal health insurance exchange website, you may be wondering, in colorful language, how such an important project could be such a disaster. But before you start cursing the coders and damning the developers, consider this: By all accounts, there was very little accounting for what was going on during the site-building process. In this week’s lead Searchlight item, MIT Sloan School of Management’s Center for Digital Business research fellow Michael Schrage explains that blame lies not with those who worked on the site but with the fact that there was no project governance to guide them. But not all the news out of Washington this week is so glum: This week’s Searchlight also highlights Congressional efforts to thwart devious data brokers and patent trolls. Keep reading to learn about Facebook’s dabblings in deep learning and Wikipedia’s hard lesson on the limits of collaboration.
Eric Schmidt, the executive chairman of Google Inc., admits it: The tablet caught him off guard. He made the confession during the 2013 Gartner Symposium/ITxpo in Orlando, Fla. Want a closer look? Here’s what he had to say during his keynote address about mobile enterprise computing, the tablet revolution and why businesses are going to need a whole new infrastructure.
I like to think of this in a historical context, having been a member of the enterprise software industry for so many decades. The first decade I would define as roughly the following: Sales teams sell seats on a per-seat basis; a sales person would typically have a million dollar quota; [that way] you could pay the sales person’s salary, pay a little to the engineers, and have a 20% margin. That’s roughly the model, and it was largely, in my view, developed by Oracle. They would often sell seats you didn’t deploy, and eventually, if you grew as a company, you would deploy them. You had a 5- or 10-year contract cycle with a 15% service model. That is largely the model that incumbents, if you will, of an enterprise are using today.
The second phase was the arrival of cloud computing co-existing with this intranet model. That term intranet was coined 25 years ago. So we had this model of the protected intranet, VPNs, gateways, software inside the corporation. And there was this other thing going on, which was inspired by companies such as Google and others, such as Amazon, in particular. I would argue that was a fair fight in the sense that there were a lot of benefits to the incumbents, the new guys were cheaper, better, faster in some ways, but they were different, they weren’t fully compatible, they weren’t fully integrated. And then something happened.
That’s why there’s this third phase. This third phase was really driven by tablets. I was actually surprised by this. I didn’t call this. To me, would the phone replace the worker in the corporation? I figured they would use the PC and the phone. But in fact it was the tablet revolution, and it looks to us like the majority of enterprise computing is being done on mobile devices, in particular on tablets.
That broke the model. It actually just broke it. One way to understand this, what does the new model look like? I’m afraid to say it, and I’m sorry to say it so bluntly, it looks like you’re going to have to dismantle much of that existing infrastructure and replace it by a model that actually works in this new tablet/phone/mobility model. It’s happening right before your eyes.
I’ll give you some examples. Somehow we as a group thought it was a clever idea to have VPNs from outside of the firewall, through the firewall and into the corporate network and we thought that would be secure. … The fact of the matter is, it’s crazy to imagine these open pathways through port 80 are secure. And indeed with modern tunneling what you can do, especially if you’re the Chinese, is get yourself into a downriver server — typically a Windows NT server, which is how it happened with most of the previous attacks because you haven’t upgraded those servers — write yourself a Windows NP certificate right through the VPN and off you go. It’s just a terrible architecture.
A much better architecture is to say we’re not going to have an intranet anymore. We’re going to have just the network and we’re going to make sure that any access is application to application.
The future is unwritten, sure, but there are still a few givens. Most likely, the sun will rise tomorrow, babies will cry and water will be wet. All the same, everyone loves a good bit of prognostication. And when it comes to peering into the tech looking glass, the good folks at the Gartner Symposium always deliver. This week’s Searchlight looks at the forecast as well as what folks were chatting about. This is just a tease, of course; we’ll be diving deeper really soon. Meanwhile, enjoy items from this week’s news that reflect on those Gartner hot topics of innovation, cloud and data privacy.
ORLANDO, FLA. — Live from the 2013 Gartner Symposium ITxpo, Microsoft CEO Steve Ballmer discusses why Microsoft licensing is so complicated. In his tenth and final appearance wearing the CEO hat at the annual event, Ballmer pounced on the query before it even finished leaving the lips of Gartner analyst Tiffani Bova. Not everyone in the audience of IT leaders liked the answer, but in typical Ballmer style, he earned plenty of points for charisma.
Since this is my last [Gartner Symposium ITxpo] Mastermind keynote … let me opine on this topic. What we’ve learned about licensing is that the best thing we can do by and large, most of the time, to make it simpler is to not change it. It turns out change is the number one problem with simplicity. The last major change we made, major by my standards, was about 10 years ago. I think we called it “licensing 6.0” and we made virtually every customer mad. They thought all of their prices had risen when that wasn’t even what we could tell our shareholders.
And why did we do licensing 6.0? We wanted to simplify. When we simplify you [the customers] have to go and parse through a number of new alternatives. Sitting here in the year 2013, I don’t look back — and our team is not looking backward — to simplifying licensing; what we’re doing is trying to look forward so that as we add Software as a Service cloud services options we’re not making things more complicated. We are trying to add those things seamlessly and to make sure that when we design our offerings for services that they are simpler to consume and to purchase than our software was.
You could say, “Well, how could cloud services possibly be complicated?” And the answer is, of course they can. If you just look today even at our Office 365 pricing, we’ll have customers who say, “I want these two services, but I don’t want these three others.”
So how do we keep our forward view of what to do with services? In a hybrid world of software and services, we will optimize our simplicity gene against that forward-looking services aspect as opposed to trying to retrofit.
Like a Hollywood ingenue, now that data’s star is on the rise it can never expect privacy again. OK, perhaps the “data is the new oil” analogy is a little stronger, but the point is that data (personal, private and all kinds in between) is a hot commodity that’s only going to get hotter as folks figure out how to spin it into gold. And we’re not just talking about your local supermarket keeping tabs on your favorite soda brand. Hackers, spies and governments alike are collecting information and hoarding it like some sort of techno Doomsday Preppers. And the thing is, as this week’s lead Searchlight item shows, even when a company thinks it’s doing all it can to deal with data security issues, it’s the tiny overlooked details that’ll come back to bite it. Also in this week’s Searchlight: new takes on energy efficiency for data centers and mobile devices, why some at Microsoft want Gates to go and more.
When it comes to chatter about BlackBerry’s fantastic tumble, there’s plenty being said about who did what wrong and at what point and why. But this week’s Searchlight looks at the whole sad story from a different angle: that it was inevitable and ultimately doesn’t matter because mobile innovation is dead. Wait a sec, dead? How could it be dead? This is all happening so fast! Exactly, says Wired’s Marcus Wohlsen. But Searchlight isn’t all sad this week: We’ve also got Google’s well-thought-out answer to Siri, a self-interest-soaked apology to the tech industry, bad BI practices to avoid and more.
There are so many product announcements — big and small — pretty much every hour of the day. Maybe that’s an exaggeration, but my inbox would beg to differ. That said, it’s easy for innovative items to pass by virtually unnoticed. This week’s lead Searchlight item focuses on one such announcement. While not the caliber of, say, a gold-colored iPhone, Box last week introduced the beta version of a new online content creation and collaboration tool that could (pundits propose) unseat — or at least compete with — the likes of Google Docs and possibly even Microsoft Word. But what’s really interesting is what this tool represents — another step toward the “appification” of the enterprise.
Also in this week’s Searchlight: Google can hardly be expected to focus on document collaboration tools when it’s busy warding off Death itself; the unrelenting audacity of cybercriminals and the new iPhone 5s; data centers’ unlikely new BFF and more.