As 2017 winds down, we invite you to take a look behind the big data curtain. There, you will find data engineers, data scientists, end-users and others working to move a big data concept into production. It doesn’t take much digging to find that more self-service capabilities are needed at each stage in the data life cycle.
That is among the take-aways from this latest edition of the Talking Data Podcast. In this and a subsequent episode, Ed Burns and I discuss recent user stories that graced the editorial pages of SearchBusinessAnalytics.com and SearchDataManagement.com – ones that speak to some of the outstanding trends of the year just winding down.
One of the telling threads we found was self-service; that is, self-service as it relates to ETL, as it relates to interactive data queries, and as it relates to cluster configuration. In the last case, we have as an example restaurant chain Panera Bread. The chain is among the companies with particularly aggressive web initiatives underway.
More and more, when lunchtime arrives, orders come in via cell phone. That can stress operational systems. Aware of this threat, Panera Bread built a Spark-Hadoop system to analyze computing needs for the processing involved in handling the lunchtime crush. It was the first in a series of Hadoop apps that Panera is spinning up quickly, after deciding to use automated container configuration software.
Panera announced earlier this year that annual digital sales had gone past $1 billion, and that projected digital sales could double by 2019. The ability to let individuals spin up big data jobs at will is going to become handier going forward, one of the company’s engineering leads said.
Self-service that empowers more individuals in the data pipeline is a fact of life that IT has generally come to accept. It seems now to be a big part of moving at the speed of innovation. Listen to this podcast and feel free to come back for seconds.